Clustering Countries by Population

Learn how to cluster data based on an indicator using sklearn.

We’ll start with a single, familiar indicator (population) and then make the example interactive. The goal is to group countries into clusters based on their population.

Let’s start with a practical situation. Imagine we were asked to group countries by population, and we need to end up with two groups: countries with high populations and countries with low populations.

  • How do we do that?
  • Where do we draw the line(s)?
  • What does the total population have to be in order for it to qualify as high?

Imagine that we were then asked to group countries into three or four groups based on their population.

  • How would we update our clusters?

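KMeans clustering is well suited to this kind of problem: we choose the number of groups, and the algorithm derives the boundaries from the data instead of us picking thresholds by hand.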

KMeans using one dimension

Let’s now do the same exercise with KMeans using one dimension, and then combine that with our knowledge of mapping.
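
As a minimal sketch of the one-dimensional case, the snippet below clusters a handful of countries with sklearn's KMeans, first into two groups and then into three. The country names and population figures are made-up placeholders (in millions), not real statistics, and the helper cluster_by_population is a hypothetical name introduced only for this illustration. The mapping step is not covered here.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical populations in millions (made-up values, not real statistics).
populations = {
    "Country A": 2,
    "Country B": 5,
    "Country C": 9,
    "Country D": 210,
    "Country E": 260,
    "Country F": 1300,
}


def cluster_by_population(populations, n_clusters):
    """Return {country: cluster label} using one-dimensional KMeans."""
    names = list(populations)
    # KMeans expects a 2-D array of shape (n_samples, n_features),
    # so the single indicator becomes one column.
    X = np.array([populations[name] for name in names], dtype=float).reshape(-1, 1)
    # n_init and random_state are set explicitly so repeated runs agree.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = model.fit_predict(X)
    return dict(zip(names, (int(label) for label in labels)))


# Moving from two groups to three only requires changing n_clusters.
two_groups = cluster_by_population(populations, n_clusters=2)
three_groups = cluster_by_population(populations, n_clusters=3)

print(two_groups)
print(three_groups)
```

The labels are arbitrary integers: they identify which countries belong together, not which cluster is "high" or "low". To name the clusters, one option is to compare the fitted cluster centers and rank them by value.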
