Computer vision is one of the most crucial advancements in artificial intelligence. It gives computers another medium through which to perceive the world, and it serves as the foundation for real-life applications such as autonomous cars, improved medical diagnosis, and facial recognition. Several supporting techniques have contributed to the field of computer vision, including:
Facial recognition: Facial recognition is a specialized technique to identify and differentiate specific individuals within images. It plays a crucial role in applications such as biometric authentication and surveillance systems.
Object detection: Object detection identifies specific objects within an image. It can even recognize multiple objects simultaneously using bounding boxes.
Image segmentation: This technique involves dividing an image into separate regions or segments for individual analysis. It enables precise examination of different parts of an image.
Edge detection: By focusing on identifying the outer boundaries of objects or landscapes, edge detection enhances image understanding. It is commonly used in image processing tasks like edge enhancement and feature extraction.
These are only some of the supporting techniques. Recent advancements in deep learning, neural networks, and artificial intelligence have propelled the field of computer vision to new heights, enabling remarkable applications that were once deemed impossible.
In this Answer, we'll look at how we can use these methods to track lanes on the road. Lane tracking is used primarily in fully and semi-autonomous vehicles, such as Tesla's Autopilot and the lane-assist systems found in many cars today.
In this implementation, we aim to perform lane detection on a video stream, effectively separating the lane markings from the background (the road). The process involves the following steps:
Convert the input image to grayscale and apply Gaussian blur
Perform edge detection using the Canny algorithm
Define the region of interest (ROI) containing the lane markings
Create a mask for the ROI and apply a logical AND operation to keep only the edges inside it
Detect line segments with the probabilistic Hough transform and overlay them on the original frame
This implementation uses the following libraries:
OpenCV
pip install opencv-python
NumPy
pip install numpy
```python
import cv2
import numpy as np

def detect_lanes(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)

    edges = cv2.Canny(blurred, 80, 150)

    height, width = edges.shape
    roi_vertices = [(0, height * 0.6), (width // 2, height * 0.25), (width, height * 0.6)]

    mask = np.zeros_like(edges)
    cv2.fillPoly(mask, np.array([roi_vertices], dtype=np.int32), 255)
    masked_edges = cv2.bitwise_and(edges, mask)

    lines = cv2.HoughLinesP(masked_edges, rho=1, theta=np.pi / 180, threshold=25,
                            minLineLength=10, maxLineGap=80)

    line_image = np.zeros_like(image)
    if lines is not None:
        for line in lines:
            x1, y1, x2, y2 = line[0]
            cv2.line(line_image, (x1, y1), (x2, y2), (0, 0, 255), thickness=5)

    result = cv2.addWeighted(image, 1, line_image, 1, 0)
    return result

cap = cv2.VideoCapture("https://player.vimeo.com/external/459184912.sd.mp4?s=e88d3fa809b08f64ca79e77a06e104680ddf04a0&profile_id=164&oauth2_token_id=57447761")

while True:
    ret, frame = cap.read()
    if not ret:
        break

    result_image = detect_lanes(frame)
    cv2.imshow('Lane Detection', result_image)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
```

Lines 1–2: Import the necessary libraries: OpenCV (cv2) for computer vision tasks and NumPy (np) for numerical operations.
Line 4: Define the detect_lanes() function, which performs lane detection on a single input image.
Line 5: Convert the input color image to grayscale using cv2.cvtColor().
Line 6: Apply a Gaussian blur to the grayscale image with cv2.GaussianBlur() to reduce noise.
Line 8: Perform edge detection on the blurred image with the Canny algorithm (cv2.Canny()) to highlight potential lane markings.
Lines 10–11: Define the region of interest (ROI) by specifying the vertices of a triangle covering the lower portion of the frame, where the lane markings appear.
Lines 13–14: Create a binary mask of zeros (black) with np.zeros_like(), matching the shape of the edge-detected image, then fill the ROI area with white using cv2.fillPoly() while keeping the rest of the mask black.
Line 15: Apply a bitwise AND between the edge-detected image and the mask with cv2.bitwise_and(). This keeps only the edges within the ROI, effectively extracting the lane markings.
Lines 17–18: Detect line segments in the masked edges using the probabilistic Hough Line Transform, cv2.HoughLinesP().
Line 20: Create a blank image, line_image, with the same shape as the input, filled with zeros. The detected lane lines will be drawn on it.
Lines 21–24: If any lines were detected, draw each one on line_image using cv2.line().
Lines 26–27: Blend the original image with line_image using cv2.addWeighted() and return the result: the original frame with the detected lane markings overlaid.
Line 29: Create a video capture object, cap, to read frames from the specified URL.
Lines 31–34: Loop over the video stream, reading one frame per iteration, and break out of the loop once no more frames can be read.
Lines 36–37: Call detect_lanes() on the current frame and display the result in a window titled "Lane Detection".
Lines 38–39: Exit the loop early if the q key is pressed.
Lines 41–42: Release the capture object and close all OpenCV windows.