Object Detection in Images
Explore how to implement object detection in images using the Azure Computer Vision API. Learn to detect objects along with their location coordinates and confidence scores. Understand how to use the detect_objects and detect_objects_in_stream functions to analyze images from both URLs and local files, and how to visualize detected objects with bounding boxes.
Introduction
In this lesson, we’re going to build a script to detect objects in an image using the Computer Vision API. The Computer Vision API detects the objects in an image and returns the
Implementation
We’ll use the detect_objects() function from the ComputerVisionClient class. This function accepts the URL of an image and detects the objects in it along with their locations. The response is in JSON format and contains each identified object, its location coordinates, and the confidence score with which it was identified. If an error occurs on the Computer Vision side, the function reports an error code with an error message.
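To make the response shape concrete, here is a small sketch that parses a payload of the kind described above. The sample values are invented for illustration, and the field names (objects, rectangle with x, y, w, h, object, confidence) are written as we understand the service's JSON layout, so treat them as an assumption to check against a real response.

```python
import json

# Illustrative payload only: the values are invented, not real API output.
SAMPLE_RESPONSE = """
{
  "objects": [
    {"rectangle": {"x": 25, "y": 43, "w": 172, "h": 140},
     "object": "dog", "confidence": 0.92},
    {"rectangle": {"x": 210, "y": 60, "w": 90, "h": 120},
     "object": "cat", "confidence": 0.71}
  ]
}
"""


def summarize_objects(raw_json):
    """Return (name, confidence, (x, y, w, h)) for each detected object."""
    payload = json.loads(raw_json)
    summary = []
    for item in payload.get("objects", []):
        rect = item["rectangle"]
        summary.append((item["object"], item["confidence"],
                        (rect["x"], rect["y"], rect["w"], rect["h"])))
    return summary


for name, confidence, box in summarize_objects(SAMPLE_RESPONSE):
    print(f"{name}: confidence={confidence:.2f}, box(x, y, w, h)={box}")
```

When the Python SDK is used instead of raw JSON, the same information should be available as attributes on each returned object (for example object_property, confidence, and rectangle.x).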
Later in the lesson, we’ll also explore the detect_objects_in_stream() function. It works the same way as detect_objects(). The only difference is that detect_objects_in_stream() accepts the image as a stream of bytes, so we can read an image from the local machine and pass its data to this function to identify the objects in it.
Now let’s move on to the implementation of this functionality.
Explanation
The explanation of main.py is given below:
- From lines 1-4, we import the required packages.
- In lines 6-9, we create an instance of the ComputerVisionClient class and pass the endpoint and subscription key to its constructor.
- In line 11, we define the URL of the image to analyze. (You can replace this URL with the URL of your own image.)
- In line 13, we call the detect_objects() function on the client object that we created in the previous step and pass it the image URL.
- In line 15, we check whether the Computer Vision resource identified any objects.
- If it identified at least one object, then in line 20, we call the draw_rectangle() function to draw a rectangle over each identified object. Then, in lines 22-26, we print the type of each object, the confidence score with which it was identified, and the coordinates at which it was detected.
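The steps above can be sketched roughly as follows. This is not the lesson's main.py, only a minimal outline under stated assumptions: the helper names make_client, describe_objects, and detect_from_url are hypothetical, the endpoint and key must come from your own Computer Vision resource, and the drawing step is left out.

```python
def make_client(endpoint, subscription_key):
    """Create the client. SDK imports are local so the helpers below work without it."""
    from azure.cognitiveservices.vision.computervision import ComputerVisionClient
    from msrest.authentication import CognitiveServicesCredentials

    return ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))


def describe_objects(result):
    """Build one printable line per detected object (empty list if none were found)."""
    lines = []
    for obj in result.objects or []:
        rect = obj.rectangle
        lines.append(
            f"{obj.object_property}: confidence={obj.confidence:.2f}, "
            f"x={rect.x}, y={rect.y}, w={rect.w}, h={rect.h}"
        )
    return lines


def detect_from_url(client, image_url):
    """Call detect_objects() on the image URL and print what was found."""
    result = client.detect_objects(image_url)
    lines = describe_objects(result)
    if not lines:
        print("No objects detected.")
    for line in lines:
        print(line)
    return result
```

Calling detect_from_url(make_client(endpoint, key), image_url) would then mirror the flow described in the list: create the client, send the URL, check for results, and print each object's type, confidence, and coordinates.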
The explanation of draw.py is given below:
- From lines 1-3, we import the required packages.
- From lines 5-11, we create the coordinates used to draw a rectangle on top of each identified object.
- From lines 13-21, we create a draw_rectangles() function, which first reads the image from the URL and then iterates over all the identified objects. Each object is marked with a red bounding box.
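As a rough illustration of the drawing step, the sketch below converts an (x, y, w, h) rectangle into corner coordinates and outlines each one in red. It assumes the Pillow library for drawing, which may differ from what the lesson's draw.py uses; the helper to_corners and the out_path parameter are hypothetical names introduced here.

```python
import io
import urllib.request


def to_corners(rect):
    """Convert an (x, y, w, h) rectangle into (left, top, right, bottom)."""
    x, y, w, h = rect
    return (x, y, x + w, y + h)


def draw_rectangles(image_url, rects, out_path="detected.png"):
    """Download the image, outline each rectangle in red, and save the result."""
    from PIL import Image, ImageDraw  # assumption: Pillow is installed

    with urllib.request.urlopen(image_url) as response:
        image = Image.open(io.BytesIO(response.read())).convert("RGB")

    draw = ImageDraw.Draw(image)
    for rect in rects:
        draw.rectangle(to_corners(rect), outline="red", width=3)
    image.save(out_path)
    return out_path
```

The conversion matters because the service reports a top-left corner plus a width and height, while most drawing routines expect two opposite corners.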
This is how we can use Computer Vision’s object detection capabilities.
To analyze a local image instead, the only difference is in lines 11, 13, and 15 of main.py, where we read the image from the local system and then use the detect_objects_in_stream() function.
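A minimal sketch of the local-file variant might look like this. The helper name detect_from_file is hypothetical, and client is assumed to be a ComputerVisionClient instance created as before.

```python
def detect_from_file(client, image_path):
    """Open a local image in binary mode and send its bytes for object detection."""
    with open(image_path, "rb") as image_stream:
        # The stream must stay open while the call runs, so the call sits inside the with block.
        return client.detect_objects_in_stream(image_stream)
```

The returned result can be handled the same way as the one from detect_objects(), since only the way the image is supplied changes.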