Eye tracking using OpenCV
Today, computer vision is woven into a wide range of industrial applications. From gaming to medical evaluation, it has become a crucial part of these industries. Techniques such as object detection, image segmentation, and facial recognition have enabled many new methods across different applications.
What is computer vision?
Computer vision is a field of artificial intelligence dedicated to equipping machines with the ability to interpret and comprehend visual information. It involves developing advanced algorithms and methodologies to process, analyze, and extract meaningful information from visual data across a wide variety of tasks. Its applications span diverse industries, such as autonomous vehicles, healthcare, and surveillance. Among these applications, one intriguing example is eye tracking using OpenCV, which enables machines to monitor and analyze human eye movements.
In this Answer, we will see how to implement eye tracking on a video using the OpenCV and dlib libraries and their functions.
Program implementation
In this implementation, we are going to detect a face and, from it, estimate the area where the eyes plausibly lie. The basic flow of the program is as follows:
Convert each frame of the video into grayscale
Detect a face in the grayscale frame using the dlib library's get_frontal_face_detector() method
Define the region of interest (ROI) dimensions that could plausibly include the eyes
For each ROI, convert that area into a thresholded binary image, which helps in finding contours
For each contour found, draw a circle around the calculated eye position
Complete code
import cv2
import dlib

face_detector = dlib.get_frontal_face_detector()

cap = cv2.VideoCapture("https://player.vimeo.com/external/434418689.sd.mp4?s=90c8280eaac95dc91e0b21d16f2d812f1515a883&profile_id=165&oauth2_token_id=57447761")

while True:
    ret, frame = cap.read()

    if not ret:
        break

    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    faces = face_detector(gray)

    for face in faces:
        x, y, w, h = face.left(), face.top(), face.width(), face.height()

        roi_left_eye = gray[y + int(0.1 * h):y + int(0.4 * h), x + int(0.1 * w):x + int(0.5 * w)]
        roi_right_eye = gray[y + int(0.1 * h):y + int(0.4 * h), x + int(0.5 * w):x + int(0.9 * w)]

        _, thresh_left_eye = cv2.threshold(roi_left_eye, 30, 255, cv2.THRESH_BINARY)
        _, thresh_right_eye = cv2.threshold(roi_right_eye, 30, 255, cv2.THRESH_BINARY)

        contours_left, _ = cv2.findContours(thresh_left_eye, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
        contours_right, _ = cv2.findContours(thresh_right_eye, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

        for contour in contours_left:
            x_c, y_c, w_c, h_c = cv2.boundingRect(contour)
            center_x = x + int(0.1 * w) + x_c + w_c // 2
            center_y = y + int(0.1 * h) + y_c + h_c // 2
            radius = max(w_c, h_c) // 3
            cv2.circle(frame, (center_x, center_y), radius, (0, 255, 0), 2)

        for contour in contours_right:
            x_c, y_c, w_c, h_c = cv2.boundingRect(contour)
            center_x = x + int(0.5 * w) + x_c + w_c // 2
            center_y = y + int(0.1 * h) + y_c + h_c // 2
            radius = max(w_c, h_c) // 3
            cv2.circle(frame, (center_x, center_y), radius, (0, 255, 0), 2)


    cv2.imshow('Eye Tracking', frame)

    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()
Code explanation
Line 1 – 2: Import the cv2 and dlib libraries.
Line 4: Obtain the frontal face detector using the dlib.get_frontal_face_detector() function and store it in the variable face_detector. A standalone sketch of this detector appears after this walkthrough.
Line 6: Create a video capture object named cap to read frames from the specified video URL.
Line 8: Start an infinite loop to process frames from the video continuously.
Line 9 – 12: Read the next frame from the video and break out of the loop if no frame is returned.
Line 14: Convert the current frame frame from BGR color to grayscale using cv2.cvtColor() and store the result in the variable gray.
Line 16 – 18: Detect the faces in the grayscale frame gray using the face_detector, store the resulting face rectangles in faces, and start a loop to process each detected face.
Line 19: Extract the coordinates (x, y), width (w), and height (h) of the current face rectangle.
Line 21 – 22: Define regions of interest (ROIs) corresponding to the left and right eyes. The ROIs are obtained by taking fixed percentages of the width and height of the face rectangle; this arithmetic is worked through in a sketch after this walkthrough.
Line 24 – 25: Apply binary thresholding with cv2.threshold() to the left and right eye ROIs to convert them into the binary images thresh_left_eye and thresh_right_eye.
Line 27 – 28: Find contours in the binary images using cv2.findContours() and store them in contours_left and contours_right. These two steps are sanity-checked on a synthetic image in a sketch after this walkthrough.
Line 30 – 42: Loop over each contour in contours_left and contours_right. For each contour, calculate the center and radius of an enclosing circle from its bounding rectangle and draw the circle on the original frame using cv2.circle().
Line 45: Display the current frame with the detected eye circles using cv2.imshow().
Line 47 – 48: Exit the loop when the q key is pressed.
Line 50 – 51: Release the video capture object and close all OpenCV windows using cap.release() and cv2.destroyAllWindows(), respectively. This is done after the loop ends.
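For readers who want to probe the face detector in isolation, here is a minimal sketch that runs dlib's frontal face detector on a single image and prints each detected face rectangle. The file name sample_face.jpg is a hypothetical placeholder; substitute any readable image.

import cv2
import dlib

# Hypothetical image path, used purely for illustration.
image = cv2.imread("sample_face.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

detector = dlib.get_frontal_face_detector()
faces = detector(gray)  # list-like container of dlib.rectangle objects

for face in faces:
    # Each rectangle exposes its position and size through methods.
    print(face.left(), face.top(), face.width(), face.height())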
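To make the ROI percentages concrete, the next sketch computes the slice bounds for a hypothetical face rectangle; the numbers are illustrative and not taken from the video.

# Hypothetical face rectangle for illustration.
x, y, w, h = 100, 80, 200, 200

# Both eye ROIs use the vertical band from 10% to 40% of the face height.
top, bottom = y + int(0.1 * h), y + int(0.4 * h)       # 100, 160

# Left eye spans 10%-50% of the face width; right eye spans 50%-90%.
left_a, left_b = x + int(0.1 * w), x + int(0.5 * w)    # 120, 200
right_a, right_b = x + int(0.5 * w), x + int(0.9 * w)  # 200, 280

# NumPy indexing is rows first, then columns:
# roi_left_eye  = gray[top:bottom, left_a:left_b]
# roi_right_eye = gray[top:bottom, right_a:right_b]
print((top, bottom), (left_a, left_b), (right_a, right_b))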
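Finally, the thresholding and contour steps can be sanity-checked on a synthetic image. This sketch builds a black canvas with a single bright square, applies the same cv2.threshold() and cv2.findContours() calls as the main program, and recovers the square's bounding box.

import cv2
import numpy as np

# Synthetic grayscale "ROI": black background with one bright 20x20 square.
roi = np.zeros((60, 60), dtype=np.uint8)
roi[20:40, 15:35] = 200

# Same call pattern as the main program: pixels above 30 become 255.
_, thresh = cv2.threshold(roi, 30, 255, cv2.THRESH_BINARY)

# Outer contours of the white regions.
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

for contour in contours:
    x_c, y_c, w_c, h_c = cv2.boundingRect(contour)
    print(x_c, y_c, w_c, h_c)  # expected output: 15 20 20 20

As a design note, cv2.minEnclosingCircle() is an alternative at this step: it returns a center and radius directly, instead of deriving the radius from the bounding rectangle as the main program does.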