DBSCAN Clustering and Customer Segmentation
Density-based clustering is one of the most widely used clustering approaches, and it also helps detect outliers in the data. In this lesson, you'll learn more about it.
DBSCAN clustering
The acronym DBSCAN stands for Density-Based Spatial Clustering of Applications with Noise. It is built on the idea that clusters are areas of high density separated by areas of low density. Because of this, it can find clusters of any shape, unlike k-means clustering, which assumes clusters are spherical, equally dense, and not contaminated by outliers.
It marks as outliers, or noise, the points that lie alone in low-density regions (points whose nearest neighbors are too far away). It also assumes that the dataset contains noise. Clusters in density-based clustering satisfy the following properties:

All points in a cluster are mutually density-connected.

If a point is density-reachable from some point of the cluster, it is also part of the cluster.
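To make the contrast with k-means and the noise labeling concrete, here is a minimal sketch using scikit-learn. The synthetic half-moon data, the three hand-placed outliers, and the values of eps and min_samples are all assumptions chosen for illustration. They do not come from the lesson's customer dataset.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons

# Two interleaved half-moons: clusters that are clearly not spherical.
# (Synthetic, illustrative data.)
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)

# Three hand-placed points far away from both moons, to act as outliers.
outliers = np.array([[3.0, 3.0], [-2.5, 2.5], [3.0, -2.5]])
X = np.vstack([X, outliers])

# eps and min_samples are hand-picked for this toy data (an assumption).
db = DBSCAN(eps=0.2, min_samples=5).fit(X)
labels = db.labels_  # one cluster index per point; -1 marks noise

n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = int(np.sum(labels == -1))
print("DBSCAN clusters:", n_clusters, "| noise points:", n_noise)
print("DBSCAN labels of the three outliers:", labels[-3:])

# k-means has no noise label: every point, outliers included,
# is assigned to one of the k clusters.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("k-means labels of the three outliers:", km.labels_[-3:])
```

In scikit-learn, points that DBSCAN treats as noise receive the label -1, while k-means always assigns each point to its nearest centroid.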
Working of DBSCAN clustering
DBSCAN works in the following way.

It starts by identifying the core samples (core points) in the dataset. A core sample is a point that has at least min_samples (MinPts) points within a distance of eps ($\epsilon$) of it.

Once we identify a core sample, we examine its neighbors and add those that also meet the core sample criterion to the cluster, repeating the check for each newly added core sample.

Then, the cluster is expanded to include non-core samples. These samples lie within a distance of eps ($\epsilon$) of a core sample but are not core samples themselves. They are also called border points in some literature.

Once we have identified all the clusters, along with their core and non-core samples, the remaining samples are considered noise or outliers.
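The steps above can be sketched in plain Python. This is a simplified illustration, not the implementation used by any library. The helper names (region_query, dbscan), the toy points, and the parameter values are hypothetical, and the sketch assumes that a point counts toward its own neighborhood, as scikit-learn does.

```python
import math

NOISE = -1        # label for outliers
UNVISITED = None  # label for points not processed yet

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_samples):
    labels = [UNVISITED] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            # Not a core sample. Provisionally noise; it may still be
            # claimed later as a border point of some cluster.
            labels[i] = NOISE
            continue
        # points[i] is a core sample: start a new cluster and expand it.
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point: reachable, not core
            if labels[j] is not UNVISITED:
                continue                # already assigned to a cluster
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is core too: keep expanding
    return labels

# Hypothetical toy data: two dense squares, one border point, one outlier.
points = [(2, 1),                              # border point of the first square
          (0, 0), (0, 1), (1, 0), (1, 1),      # first dense square
          (10, 10), (10, 11), (11, 10), (11, 11),  # second dense square
          (5, 5)]                              # isolated point
labels = dbscan(points, eps=1.5, min_samples=4)
print(labels)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The point (2, 1) has only three points within eps of it, so it is first marked as noise. It is then relabeled as a border point once the cluster around (1, 1) expands to reach it. The point (5, 5) is never reached from any core sample and stays labeled as noise.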