DBSCAN

Explore the DBSCAN clustering algorithm to identify high-density regions as clusters and low-density regions as noise. Understand core samples, neighbors, and how to configure parameters like epsilon and minimum samples in scikit-learn for scalable clustering without assuming cluster shapes.

Chapter Goals:

  • Learn about the DBSCAN algorithm

A. Clustering by density

The mean shift clustering algorithm from the previous chapter usually performs well and chooses a reasonable number of clusters on its own. However, it scales poorly because of its computation time, and it still assumes that clusters have a "blob"-like shape (although this assumption is weaker than the one K-means makes).
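To make the "blob"-shape limitation concrete, here is a minimal sketch that runs scikit-learn's `DBSCAN`, the density-based algorithm this lesson covers, on two interleaved half-moons. The dataset, the extra outlier point, and the `eps` and `min_samples` values are illustrative assumptions chosen for this toy example, not settings taken from the lesson.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

# Two interleaved half-moons: clusters that are clearly not "blob"-shaped.
# No jitter is added to the moons, so the outcome is easy to reason about.
moons, _ = make_moons(n_samples=200, random_state=0)

# Append one far-away point to act as an obvious outlier (illustrative only).
data = np.vstack([moons, [[3.0, 3.0]]])

# eps is the neighborhood radius; min_samples is the number of points
# (including the point itself) needed within eps to form a dense region.
# Both values are assumptions picked for this toy data.
dbscan = DBSCAN(eps=0.2, min_samples=5)
labels = dbscan.fit_predict(data)

# DBSCAN marks noise points with the label -1; other labels are cluster ids.
n_clusters = len(set(labels) - {-1})
n_noise = int(np.sum(labels == -1))
print(n_clusters, n_noise)
```

On this data the sketch should report two clusters and one noise point: each moon is recovered as its own cluster despite its curved shape, and the isolated point is labeled `-1`. Note that the number of clusters was never passed in; it falls out of the density parameters.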

Another clustering algorithm that also automatically chooses the number of clusters is ...