Customer Segmentation
Learn how to segment customer bases using k-means clustering.
There are a number of unsupervised clustering algorithms, but k-means is one of the simplest. It segments an unlabeled dataset into a predetermined number of groups. The input parameter k is the number of clusters we would like to form, and it has to be chosen with care. If k is too small, distinct groups are merged, and a centroid can end up between the natural groups rather than inside one of them. If k is too large, some natural groups are split across several clusters.
Implementing k-means clustering
The k-means algorithm follows these steps:
1. Choose the number of clusters (k).
2. Randomly initialize a centroid for each cluster.
3. Assign each observation to the cluster whose centroid is closest, according to the chosen similarity or distance measure.
4. Compute a new centroid for each cluster.
5. Repeat steps 3 and 4 as long as the centroids keep changing.
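The steps above can be sketched in plain Python. This is a minimal illustration under several assumptions that the lesson does not specify: squared Euclidean distance is the distance measure, the initial centroids are k observations drawn at random from the data, and the names `k_means`, `squared_distance`, `max_iter`, and `seed`, as well as the toy `customers` data, are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of numeric tuples) into k groups.

    Returns (centroids, labels), where labels[i] is the index of the
    cluster that points[i] was assigned to.
    """
    rng = random.Random(seed)
    # Step 2: pick k distinct observations as the initial centroids
    # (an assumption; other initialization schemes exist).
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(max_iter):
        # Step 3: assign each observation to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Step 4: recompute each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        # Step 5: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical customers described by two made-up features.
customers = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),   # low-value group
    (8.0, 8.0), (8.2, 7.9), (7.9, 8.1),   # high-value group
]
centroids, labels = k_means(customers, k=2)
print(centroids)
print(labels)
```

On this toy data the two groups are far apart, so the first three customers end up in one cluster and the last three in the other. Which cluster receives index 0 depends on the random initialization. On real data, the result can also depend on the initialization, so it is common to run the algorithm several times with different seeds.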