Hierarchical Clustering

Learn all about hierarchical clustering and how to cluster data with it using scikit-learn.

Hierarchical clustering is another popular unsupervised clustering algorithm that groups data points into clusters based on similarity. It works by building a hierarchy of clusters, most commonly by starting with individual data points and gradually merging them into larger clusters.

There are two types of hierarchical clustering: agglomerative (bottom-up) and divisive (top-down).

Agglomerative clustering

Agglomerative clustering is a hierarchical clustering algorithm that groups data points based on their pairwise distances or similarities. Unlike k-means, agglomerative clustering doesn’t require specifying the number of clusters in advance. Instead, it builds a hierarchy of clusters by iteratively merging the most similar or nearby data points or clusters.
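
To make this concrete, here is a minimal sketch using scikit-learn's AgglomerativeClustering on a small, assumed toy dataset. Passing a distance_threshold instead of n_clusters lets the number of clusters emerge from the data rather than being fixed up front:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Two small, well-separated groups of points (illustrative values).
X = np.array([[1.0, 1.1], [1.2, 0.9], [0.8, 1.0],
              [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]])

# With n_clusters=None and a distance_threshold, merging stops once the
# closest pair of clusters is farther apart than the threshold, so the
# number of clusters is discovered rather than specified in advance.
model = AgglomerativeClustering(n_clusters=None, distance_threshold=2.0)
labels = model.fit_predict(X)

print(labels)             # e.g., [0 0 0 1 1 1]
print(model.n_clusters_)  # 2: inferred from the data
```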

The algorithm starts by considering each data point as a separate cluster. It then repeatedly merges the two closest clusters based on a chosen linkage criterion, which determines the distance or similarity between clusters. The most commonly used linkage criteria are as follows:

  • Ward: This minimizes the variance of the clusters being merged, choosing the merge that produces the smallest increase in total within-cluster variance.

  • Complete: This uses the maximum distance between all pairs of points in the two clusters being merged.

  • Average: This uses the average distance between all pairs of points in the two clusters being merged.

The choice of linkage criterion can have a significant impact on the clustering results, as it affects the shape and structure of the clusters.
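
As a quick way to see this effect, the sketch below (on an assumed synthetic dataset from make_blobs) fits the same data once per linkage criterion and prints the resulting labels, which can differ between criteria:

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs

# Synthetic blobs, purely for illustration.
X, _ = make_blobs(n_samples=30, centers=3, random_state=42)

for linkage in ("ward", "complete", "average"):
    model = AgglomerativeClustering(n_clusters=3, linkage=linkage)
    labels = model.fit_predict(X)
    print(f"{linkage:>8}: {labels[:10]}")  # first ten labels per criterion
```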

Agglomerative clustering continues merging clusters until all data points are grouped into a single cluster or until a stopping criterion is met. This stopping criterion is typically a specified number of clusters or a distance threshold above which no further merges are performed. The resulting hierarchy of clusters can be represented as a dendrogram, which illustrates the merging process and allows for different levels of granularity in cluster identification.
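
One common way to build and draw such a dendrogram is with SciPy's hierarchy utilities, sketched here on an assumed synthetic dataset:

```python
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.datasets import make_blobs

# Small synthetic dataset, assumed purely for illustration.
X, _ = make_blobs(n_samples=20, centers=3, random_state=0)

# linkage() records every merge and its distance; method="ward" matches
# the default criterion of scikit-learn's AgglomerativeClustering.
Z = linkage(X, method="ward")

dendrogram(Z)
plt.xlabel("Sample index")
plt.ylabel("Merge distance")
plt.show()
```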

Agglomerative clustering offers several advantages. It can handle clusters of different sizes, shapes, and densities. It can also reveal nested clusters, capturing the hierarchical structure of the data. Agglomerative clustering is flexible and can be applied to various types of data and distance/similarity metrics. Additionally, it provides a natural way to explore the data at different levels of granularity, allowing for the identification of both fine-grained and coarse-grained clusters.
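
For instance, a hierarchy built once can be cut at several granularities without refitting. The sketch below (again on assumed synthetic data) uses SciPy's fcluster to extract a coarse two-cluster view and a finer six-cluster view from the same linkage matrix:

```python
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=20, centers=3, random_state=0)
Z = linkage(X, method="ward")  # build the hierarchy once

# Cut the same tree at two granularities: a coarse 2-cluster view and a
# finer 6-cluster view, with no refitting required.
print(fcluster(Z, t=2, criterion="maxclust"))
print(fcluster(Z, t=6, criterion="maxclust"))
```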
