Clustering Algorithms Comparison

Learn how three popular clustering algorithms compare.

An overview of three clustering algorithms

Clustering is an unsupervised learning technique that divides the dataset into distinct groups such that the points within each cluster are more similar to each other than they are to points in other clusters. There are many different clustering algorithms, and each has its own strengths and weaknesses. Here’s a comparison of three popular clustering algorithms: k-means, DBSCAN, and agglomerative clustering.

K-means

K-means is a centroid-based (i.e., distance-based) algorithm: each point is assigned to a cluster based on its distance to the cluster centroids. K-means starts by randomly selecting k initial centroids and then assigns each data point to the cluster corresponding to the nearest centroid. The centroids are then updated to the mean of the points in the corresponding cluster, and the process is repeated until convergence. One of the main advantages of k-means is that it’s computationally efficient and easy to implement. However, it can get stuck in local optima, and it’s sensitive to the initial centroids chosen. Additionally, it assumes that the clusters are spherical and equally sized, which might not always be the case.
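
As a rough illustration of this assign-and-update loop, here is a minimal sketch using scikit-learn’s KMeans on synthetic data. The dataset, the choice of k = 3, and the parameter values are assumptions made for demonstration, not part of the lesson’s own code:

```python
# Minimal k-means sketch (illustrative; dataset and parameters are assumed).
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Generate a toy dataset with three roughly spherical clusters.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=42)

# Fit k-means with k=3; multiple restarts (n_init) help avoid poor local optima.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

print("Cluster centroids:\n", kmeans.cluster_centers_)
print("Inertia (sum of squared distances to nearest centroid):", kmeans.inertia_)
```

Note that `n_init` reruns the algorithm from several random initializations and keeps the best result, which directly addresses the sensitivity to initial centroids mentioned above.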

Example

Let’s try a scenario where k-means performs better than DBSCAN and agglomerative clustering:
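
One way such a scenario could be set up (a sketch under assumed data and parameter choices, not the lesson’s exact code) is to generate spherical, equally sized clusters without a sharp density gap between them. This plays to k-means’ assumptions, while DBSCAN’s outcome depends heavily on its `eps` setting; the adjusted Rand index used here for scoring is also an assumption for illustration:

```python
# Illustrative comparison on data that suits k-means' assumptions
# (spherical, equally sized clusters). Parameters are assumed values.
from sklearn.cluster import KMeans, DBSCAN, AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Three spherical clusters of equal size and spread.
X, y_true = make_blobs(n_samples=300, centers=3, cluster_std=1.5, random_state=0)

models = {
    "k-means": KMeans(n_clusters=3, n_init=10, random_state=0),
    "DBSCAN": DBSCAN(eps=0.5, min_samples=5),
    "agglomerative": AgglomerativeClustering(n_clusters=3),
}

for name, model in models.items():
    labels = model.fit_predict(X)
    # Adjusted Rand Index compares predicted labels against the true blob labels.
    print(f"{name}: ARI = {adjusted_rand_score(y_true, labels):.3f}")
```

On data like this, k-means typically recovers the clusters well, whereas DBSCAN with an untuned `eps` may merge clusters or label many points as noise; the exact scores will vary with the parameters chosen.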
