Understanding computer vision starts with grasping the fundamental techniques that power image recognition and processing. The Hough transform is one such technique: a widely used method for detecting shapes in an image and a cornerstone of modern image processing.
The Hough transform is a popular feature extraction method in image analysis and digital image processing. Its main purpose is to find imperfect or partially visible instances of objects belonging to a particular class of shapes by means of a voting process: every edge pixel votes for all the shapes that could pass through it, and the shapes that collect the most votes are reported. This technique is widely used to detect simple shapes such as lines, circles, and ellipses.
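To make the voting process concrete, here is a minimal pure-Python sketch for straight lines. It is an illustration under simplifying assumptions, not OpenCV's implementation: the function name hough_vote, the one-degree angle steps, and the rounding of rho to whole pixels are all choices made for this example.

```python
import math
from collections import Counter

def hough_vote(points, theta_steps=180, rho_res=1.0):
    """Accumulate votes in (rho, theta) space for a list of (x, y) edge points."""
    votes = Counter()
    for x, y in points:
        # Each point votes once per angle, for the line through it at that angle.
        for t in range(theta_steps):
            theta = t * math.pi / theta_steps
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(rho / rho_res), t)] += 1
    return votes

# Ten points on the horizontal line y = 5, plus one unrelated point.
points = [(x, 5) for x in range(0, 100, 10)] + [(35, 40)]
votes = hough_vote(points)
(best_rho, best_theta_deg), count = votes.most_common(1)[0]
print(best_rho, best_theta_deg, count)  # prints: 5 90 10
```

The ten collinear points all agree on one cell (rho = 5, theta = 90 degrees), while the stray point spreads its votes across cells that nothing else supports. A vote threshold then separates real lines from noise.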
OpenCV is a collection of programming functions primarily designed for real-time computer vision tasks. It is highly optimized and extremely efficient. When combined with the Hough transform, OpenCV allows for robust and precise shape detection.
Before we dive into the Hough transform, it's important to have OpenCV installed. We can install it using pip:
pip install opencv-python
In the steps below, we use OpenCV to apply the Hough transform to a simple image and detect the straight lines in it.
In this step, we import all the necessary libraries. Here, cv2 is the OpenCV library, numpy is a library for handling arrays (which images essentially are), and matplotlib is for displaying images.
import cv2
import numpy as np
import matplotlib.pyplot as plt
Load the image and convert it into grayscale. The reason for conversion to grayscale is that edge detection, which will be used later, works more effectively on grayscale images than on color images.
input_image = cv2.imread('image.jpg')
gray = cv2.cvtColor(input_image, cv2.COLOR_BGR2GRAY)
Here, apply the Canny edge detection method, which helps identify areas of the image with abrupt pixel intensity changes, denoting edges.
edge_image = cv2.Canny(gray, 50, 150, apertureSize=3)
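The two numeric arguments, 50 and 150, are the lower and upper hysteresis thresholds. The following one-dimensional sketch shows the idea behind hysteresis only. The function hysteresis_1d is hypothetical, and the real Canny algorithm works on a 2-D gradient image with additional steps such as non-maximum suppression.

```python
def hysteresis_1d(magnitudes, low, high):
    """Label samples as edges using two thresholds (1-D illustration only)."""
    n = len(magnitudes)
    # Samples at or above the high threshold are always kept.
    is_edge = [m >= high for m in magnitudes]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            # Skip samples already kept and samples below the low threshold.
            if is_edge[i] or magnitudes[i] < low:
                continue
            left = i > 0 and is_edge[i - 1]
            right = i < n - 1 and is_edge[i + 1]
            # A weak sample survives only if it touches a kept sample.
            if left or right:
                is_edge[i] = True
                changed = True
    return is_edge

gradient = [10, 60, 160, 70, 20, 80, 30]
print(hysteresis_1d(gradient, low=50, high=150))
# prints: [False, True, True, True, False, False, False]
```

The value 80 lies between the thresholds but is not connected to any strong sample, so it is discarded. Lowering the upper threshold would keep more isolated responses.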
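After performing edge detection, apply the Hough transform to the detected edges. The Hough transform returns an array with the parameters of the detected lines in the form of (rho, theta) pairs. If no line reaches the vote threshold, cv2.HoughLines returns None instead of an array.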
detected_lines = cv2.HoughLines(edge_image, 1, np.pi / 180, 200)
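In recent OpenCV versions, the Python binding returns the result as an array of shape (N, 1, 2), or None when nothing is found. A small helper like the one below can hide both details. The name iter_rho_theta is hypothetical, and the sample input is a hand-written nested list that mimics that shape rather than real detector output.

```python
def iter_rho_theta(detected_lines):
    """Yield (rho, theta) pairs; yield nothing when no lines were detected."""
    if detected_lines is None:
        return
    for line in detected_lines:
        # Each entry wraps a single (rho, theta) pair.
        rho, theta = line[0]
        yield float(rho), float(theta)

# Hand-written stand-in for the (N, 1, 2) result structure.
sample = [[[100.0, 0.0]], [[50.0, 1.5]]]
for rho, theta in iter_rho_theta(sample):
    print(rho, theta)

print(list(iter_rho_theta(None)))  # prints: []
```

With such a helper, the drawing loop can run unchanged on images where no line is detected.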
In this step, iterate through all the detected lines and draw them on a copy of the original image.
# Draw on a copy so the original image stays untouched
output_image = input_image.copy()
for line in detected_lines:
    for rho, theta in line:
        cos_theta = np.cos(theta)
        sin_theta = np.sin(theta)
        # Point on the line closest to the origin
        x_0 = cos_theta * rho
        y_0 = sin_theta * rho
        # Two points 1000 pixels away from (x_0, y_0) along the line
        x_1 = int(x_0 + 1000 * (-sin_theta))
        y_1 = int(y_0 + 1000 * (cos_theta))
        x_2 = int(x_0 - 1000 * (-sin_theta))
        y_2 = int(y_0 - 1000 * (cos_theta))
        cv2.line(output_image, (x_1, y_1), (x_2, y_2), (0, 0, 255), 2)
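The endpoint arithmetic can be checked in isolation. The sketch below repeats the same formulas with the standard math module. The function name line_endpoints and the half_length parameter are introduced here for illustration only.

```python
import math

def line_endpoints(rho, theta, half_length=1000):
    """Return two points on the line given in (rho, theta) form."""
    cos_theta = math.cos(theta)
    sin_theta = math.sin(theta)
    # (x_0, y_0) is the point on the line closest to the origin.
    x_0 = cos_theta * rho
    y_0 = sin_theta * rho
    # (-sin_theta, cos_theta) is a unit vector pointing along the line.
    x_1 = int(x_0 + half_length * (-sin_theta))
    y_1 = int(y_0 + half_length * cos_theta)
    x_2 = int(x_0 - half_length * (-sin_theta))
    y_2 = int(y_0 - half_length * cos_theta)
    return (x_1, y_1), (x_2, y_2)

# theta = 0 describes a vertical line; rho = 50 places it at x = 50.
print(line_endpoints(50, 0.0))  # prints: ((50, 1000), (50, -1000))
```

The value 1000 only needs to be large enough for the segment to span the image, so larger images may need a larger half_length.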
The OpenCV library reads images in BGR format by default, but matplotlib's imshow() function expects images in RGB format. So, convert the original and the processed image to RGB format before displaying them.
# Convert the processed image from BGR to RGB
output_image_rgb = cv2.cvtColor(output_image, cv2.COLOR_BGR2RGB)
# Convert the original image from BGR to RGB
original_image_rgb = cv2.cvtColor(cv2.imread('image.jpg'), cv2.COLOR_BGR2RGB)
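For a three-channel image, this conversion amounts to reversing the channel order of every pixel. The sketch below shows that on a tiny image stored as nested Python lists. The function bgr_to_rgb is a hypothetical stand-in for illustration, not a replacement for cv2.cvtColor.

```python
def bgr_to_rgb(image):
    """Reverse the channel order of every pixel in a nested-list image."""
    return [[tuple(reversed(pixel)) for pixel in row] for row in image]

# One row with a pure-blue and a pure-red pixel, written in BGR order.
bgr_image = [[(255, 0, 0), (0, 0, 255)]]
print(bgr_to_rgb(bgr_image))  # prints: [[(0, 0, 255), (255, 0, 0)]]
```

Skipping this step does not raise an error, but reds and blues appear swapped in the displayed figure.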
Finally, use matplotlib to display the original and processed images side by side. The original image is displayed on the left, and the processed image with detected lines is on the right.
plt.figure(figsize=(15, 10))
plt.subplot(1, 2, 1)
plt.title('Original Image')
plt.imshow(original_image_rgb)
plt.axis('off')
plt.subplot(1, 2, 2)
plt.title('Image with Hough Lines')
plt.imshow(output_image_rgb)
plt.axis('off')
plt.show()
Here's the complete executable code implementing the above steps:
import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the image and convert it to grayscale
input_image = cv2.imread('image.jpg')
gray = cv2.cvtColor(input_image, cv2.COLOR_BGR2GRAY)

# Perform edge detection
edge_image = cv2.Canny(gray, 50, 150, apertureSize=3)

# Apply the Hough Transform
detected_lines = cv2.HoughLines(edge_image, 1, np.pi / 180, 200)

# Draw the detected lines on the image
output_image = input_image.copy()
for line in detected_lines:
    for rho, theta in line:
        cos_theta = np.cos(theta)
        sin_theta = np.sin(theta)
        x_0 = cos_theta * rho
        y_0 = sin_theta * rho
        x_1 = int(x_0 + 1000 * (-sin_theta))
        y_1 = int(y_0 + 1000 * (cos_theta))
        x_2 = int(x_0 - 1000 * (-sin_theta))
        y_2 = int(y_0 - 1000 * (cos_theta))

        cv2.line(output_image, (x_1, y_1), (x_2, y_2), (0, 0, 255), 2)

# Convert the processed image from BGR to RGB
output_image_rgb = cv2.cvtColor(output_image, cv2.COLOR_BGR2RGB)

# Convert the original image from BGR to RGB
original_image_rgb = cv2.cvtColor(cv2.imread('image.jpg'), cv2.COLOR_BGR2RGB)

# Display original and output images
plt.figure(figsize=(15, 10))

plt.subplot(1, 2, 1)
plt.title('Original Image')
plt.imshow(original_image_rgb)
plt.axis('off')

plt.subplot(1, 2, 2)
plt.title('Image with Hough Lines')
plt.imshow(output_image_rgb)
plt.axis('off')

plt.show()
Here’s a line-by-line breakdown of the code:
Lines 1–3: We import the required libraries. OpenCV (cv2) is used for image processing, NumPy (np) for numerical operations, and matplotlib.pyplot (plt) for visualizing the images.
Lines 6–7: The image is loaded using cv2.imread and then converted to grayscale using cv2.cvtColor. Grayscale is used because Canny edge detection works on intensity gradients, so a single-channel image is its usual input.
Line 10: We perform edge detection using the Canny algorithm, which detects a wide range of edges in an image. The two numbers 50 and 150 are the lower and upper thresholds for the hysteresis procedure in the Canny algorithm.
Line 13: This line applies the Hough transform to edge_image, the result of edge detection on the grayscale image. The function describes each detected line in polar form using two parameters, rho and theta. The arguments 1 and np.pi / 180 set the distance resolution (1 pixel) and the angle resolution (1 degree, expressed in radians) of the Hough space. The value 200 is the threshold: the minimum number of votes a candidate needs to be reported as a detected line.
Lines 16–28: These lines iterate through each line detected by the Hough transform and draw it onto a copy of the original image. rho is the perpendicular distance from the image origin to the line, and theta is the angle of that perpendicular. Together, these two values fully describe a line in image space. The np.cos and np.sin functions convert them into the point on the line closest to the origin (x_0, y_0), and the two endpoints are then placed 1000 pixels away from that point in either direction along the line.
Line 31: Here, we convert the BGR image to RGB. OpenCV loads images in BGR format by default, but matplotlib displays images in RGB. Thus, we need to convert the images.
Line 34: We load the original image and convert it to RGB.
Lines 37–47: These lines of code visualize the original and processed images side by side using matplotlib. plt.figure is used to create a new figure, plt.subplot is used to add a subplot to the figure, plt.title is used to set a title for the subplot, plt.imshow is used to display an image in the subplot, and plt.axis('off') is used to turn off the axis.
Line 49: plt.show() is used to display the figure with the two plots.
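The distance and angle resolutions passed on line 13 determine how finely the Hough space is divided. As a rough sketch, the function below estimates the number of cells for a given image size, assuming rho ranges from minus the image diagonal to plus the image diagonal and theta covers half a turn. The name accumulator_shape is hypothetical, and OpenCV's internal layout may differ.

```python
import math

def accumulator_shape(width, height, rho_res=1.0, theta_res=math.pi / 180):
    """Estimate the (rho bins, theta bins) of a Hough accumulator."""
    # No line through the image can be farther from the origin than the diagonal.
    diagonal = math.hypot(width, height)
    n_rho = int(round(2 * diagonal / rho_res)) + 1
    n_theta = int(round(math.pi / theta_res))
    return n_rho, n_theta

print(accumulator_shape(640, 480))  # prints: (1601, 180)
```

Coarser resolutions shrink the accumulator and speed up voting, at the cost of merging nearby lines into the same cell.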
The Hough transform is a simple, yet powerful way to find lines in images. Using Python and OpenCV, even beginners can apply this method and start their journey in the exciting world of image processing. Play around with the parameters to better understand how they impact the results.