Trusted answers to developer questions

Related Tags

computer vision
object detection
segmentation

What is panoptic segmentation?

Muhammad Nabeel


Introduction

One of the most influential uses of computer vision is object detection. It is used in many real-world applications, such as security monitoring, where it saves a lot of human effort.

In object detection, we not only classify the object in the image but also draw a bounding box around it. Two outputs are generated:

  • Labels of the objects
  • Their bounding boxes

Each bounding box is commonly described by the x and y coordinates of the box's midpoint together with the box's width and height.
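A box stored as a midpoint plus its two side lengths is often converted to corner coordinates before it is drawn or compared with another box. The following is a minimal sketch of that conversion. The function name `center_to_corners` and the sample values are illustrative assumptions, not part of any particular detection library.

```python
from typing import Tuple


def center_to_corners(
    cx: float, cy: float, width: float, height: float
) -> Tuple[float, float, float, float]:
    """Convert a midpoint-format box to (x_min, y_min, x_max, y_max)."""
    half_w = width / 2.0
    half_h = height / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


# Hypothetical detection: midpoint (50, 40), 20 pixels wide, 10 pixels tall.
box = center_to_corners(50.0, 40.0, 20.0, 10.0)
print(box)
```

Detection frameworks differ in which format they use, so it is worth checking the convention before mixing boxes from different sources.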

In computer vision, segmentation divides an image into regions and assigns each region to a class. Panoptic segmentation combines instance segmentation and semantic segmentation. So, to understand panoptic segmentation, let's first look at those two tasks.

Instance segmentation

A single image can contain multiple instances of the same object class. Instance segmentation assigns a separate binary mask to each instance so that the instances can be told apart. A binary mask is the same size as the original image. It holds the value one at every pixel that belongs to the instance and zero everywhere else. Thus, we now have three outputs, as follows:

  • Labels of the objects
  • Their bounding boxes
  • A binary mask for each instance
The instance segmentation
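The idea of one binary mask per instance can be sketched with NumPy. This is only an illustration under simplifying assumptions: the image size and the rectangular instance regions are made up, whereas a real model would predict masks of arbitrary shape.

```python
import numpy as np

# Hypothetical 4x6 image with two instances of the same class.
# Each region is (row_start, row_end, col_start, col_end), end exclusive.
image_shape = (4, 6)
instance_regions = [(0, 2, 0, 2), (1, 4, 3, 6)]


def make_binary_mask(shape, region):
    """Return a mask of the image's size: 1 inside the instance, 0 elsewhere."""
    mask = np.zeros(shape, dtype=np.uint8)
    r0, r1, c0, c1 = region
    mask[r0:r1, c0:c1] = 1
    return mask


# One mask per instance, each the same size as the image.
masks = [make_binary_mask(image_shape, region) for region in instance_regions]

for index, mask in enumerate(masks, start=1):
    print(f"instance {index}: {int(mask.sum())} pixels")
```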

Semantic segmentation

Semantic segmentation assigns a class to every pixel in the image. Thus, even if multiple instances of an object class exist, they are treated as a single region of that class. The result is a prediction map that resembles a binary mask but covers all the classes at once: each pixel holds a value corresponding to its class label. Unlike object detection and instance segmentation, semantic segmentation does not produce bounding boxes or per-instance masks. Its outputs are as follows:

  • Labels of the classes present in the image
  • A prediction map with one class label per pixel
The semantic segmentation
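A prediction map is often obtained by taking, for each pixel, the class with the highest score. The sketch below assumes a model has already produced a score array of shape (number of classes, height, width). The class names and score values are invented for illustration.

```python
import numpy as np

# Hypothetical class list; index 0 is the first class, and so on.
CLASS_NAMES = ["background", "person", "car"]

# Made-up per-class scores for a 2x3 image: shape (num_classes, height, width).
scores = np.zeros((3, 2, 3))
scores[0] = 0.1            # weak background score everywhere
scores[1, 0, :2] = 0.9     # "person" is strong in the top-left pixels
scores[2, 1, 1:] = 0.8     # "car" is strong in the bottom-right pixels

# Each pixel gets the index of its highest-scoring class.
prediction_map = scores.argmax(axis=0)
print(prediction_map)
```

Note that the two "person" pixels share the same label: the map alone cannot say whether they belong to one person or two.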

Panoptic segmentation

Panoptic segmentation combines instance and semantic segmentation. It gives a unique label to each class as well as to each instance of that class. Thus, each pixel is now encoded with two values: the label of the class it belongs to and the number of the instance it belongs to.

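Panoptic segmentation distinguishes two types of classes. They are as follows:

  • Things: Countable objects, such as people, cats, dogs, and so on. Each one gets its own instance number.
  • Stuff: Uncountable regions, such as pavement, sky, and so on. These get a class label only, since they have no separate instances.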
The panoptic segmentation
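The two values per pixel can be sketched by merging a semantic map with a set of instance masks. This is a simplified illustration, not how any specific panoptic model works. The class IDs, the `THING_CLASSES` set, and the choice to store the result as a (height, width, 2) array are all assumptions made for the example.

```python
import numpy as np

# Hypothetical class IDs: 0 = sky (stuff), 1 = person (thing), 2 = road (stuff).
THING_CLASSES = {1}

semantic_map = np.array([
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [2, 2, 2, 2],
])

# One boolean mask per person instance, as in instance segmentation.
instance_masks = [
    np.array([[0, 0, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0]], dtype=bool),
    np.array([[0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]], dtype=bool),
]


def to_panoptic(semantic_map, instance_masks, thing_classes):
    """Return an array whose last axis holds (class label, instance number)."""
    instance_map = np.zeros_like(semantic_map)
    for instance_id, mask in enumerate(instance_masks, start=1):
        instance_map[mask] = instance_id
    # Stuff pixels carry no instance, so their instance number stays 0.
    is_thing = np.isin(semantic_map, list(thing_classes))
    instance_map[~is_thing] = 0
    return np.stack([semantic_map, instance_map], axis=-1)


panoptic = to_panoptic(semantic_map, instance_masks, THING_CLASSES)
print(panoptic[1, 0], panoptic[1, 3], panoptic[2, 1])
```

Here the two people share class label 1 but get instance numbers 1 and 2, while sky and road pixels keep instance number 0.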

Applications

The panoptic segmentation task is used in many fields that need detailed, human-like scene understanding. Some of its applications are as follows:

Autonomous vehicles: These capture images from multiple angles to understand the objects in their path and plan their route accordingly. Panoptic segmentation suits this task because it provides more detailed information than instance or semantic segmentation alone.

Medical imaging: Detecting anomalies is crucial in medical diagnostics, and manual diagnosis is prone to human error. Computer vision can help detect anomalies more consistently. Using panoptic segmentation, we can not only detect tumors but also estimate their size.
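Once a region has been segmented, its size can be estimated by counting the pixels in its mask and scaling by the physical size of a pixel. The sketch below assumes a 2D binary mask and a known pixel spacing in millimeters. The function name `mask_area_mm2`, the mask, and the spacing values are hypothetical.

```python
import numpy as np


def mask_area_mm2(mask, pixel_spacing_mm):
    """Estimate a region's area from its binary mask and per-pixel spacing."""
    row_mm, col_mm = pixel_spacing_mm
    pixel_count = int(np.asarray(mask).astype(bool).sum())
    return pixel_count * row_mm * col_mm


# Made-up 5x5 mask in which the segmented region covers 6 pixels.
tumor_mask = np.zeros((5, 5), dtype=np.uint8)
tumor_mask[1:4, 1:3] = 1

# Assumed spacing of 0.5 mm per pixel in both directions.
area = mask_area_mm2(tumor_mask, (0.5, 0.5))
print(f"estimated area: {area} mm^2")
```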

Digital image processing: Panoptic segmentation can support applications such as object tracking and text-to-image search, along with other tasks that require image manipulation.
