Usage of clustering in exploratory analysis

Clustering is useful for exploratory analysis across many domains, including health, finance, and information technology, so thousands of clustering algorithms and approaches have been proposed over the past half century. These methods can be broadly grouped into two types based on how the underlying model operates: partitional and hierarchical, as depicted in the figure below.

Partitional clustering

Partitional clustering aims to partition the data into disjoint subsets, each of which forms a cluster. There are three types of partitional clustering: centroid-based, distribution-based, and density-based. Some experts argue that these three should each be treated as a separate group alongside hierarchical clustering, giving four basic groups of clustering methods. Hybrid methods that combine these approaches also exist, which makes classification even trickier.
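To make the idea of disjoint subsets concrete, here is a minimal sketch of a centroid-based method. It uses k-means, which is our choice of illustration and is not named in the text above. The function name kmeans, the initial_centroids argument, and the sample points are all assumptions made for this example.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

def kmeans(points, initial_centroids, iterations=20):
    """Tiny k-means sketch: returns one label per point plus the final centroids."""
    # Assumption: the caller supplies starting centroids, which keeps the run deterministic.
    centroids = [tuple(c) for c in initial_centroids]
    k = len(centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its single nearest centroid,
        # so the resulting clusters are disjoint by construction.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its current members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids

# Hypothetical toy data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centroids = kmeans(points, initial_centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels)
```

Each point receives exactly one label, which is what makes the result a partition. Distribution-based and density-based methods would assign points differently, and real projects would normally rely on an established library rather than a hand-written loop like this.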

Hierarchical clustering (HC)

HC approaches aim to construct a partition tree, each node of which corresponds to a subset of the data. In this tree, the root node comprises the whole dataset. The clusters (the nodes of the tree) are nested: each child node is contained within its parent node. We will provide a bit more detail on each group below.
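The nesting described above can be sketched with a small bottom-up (agglomerative) procedure. This is only one way to build such a tree. The single-linkage distance rule, the function name agglomerative, and the sample points are assumptions chosen for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

def agglomerative(points):
    """Bottom-up sketch: repeatedly merge the two closest clusters."""
    # Each cluster is a sorted tuple of point indices; start with one leaf per point.
    clusters = [(i,) for i in range(len(points))]
    merges = []  # (child_a, child_b, parent) triples, in merge order

    def gap(pair):
        # Assumption: single linkage, i.e. cluster distance = closest pair of members.
        a, b = pair
        return min(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        pairs = [(a, b) for idx, a in enumerate(clusters) for b in clusters[idx + 1:]]
        a, b = min(pairs, key=gap)
        parent = tuple(sorted(a + b))  # the parent node contains both children
        clusters = [c for c in clusters if c not in (a, b)] + [parent]
        merges.append((a, b, parent))
    return merges

# Hypothetical toy data: two tight pairs of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.5)]
merges = agglomerative(points)
root = merges[-1][2]  # the last merge produces the root of the tree
print(root)
```

The final parent holds every point index, matching the root node that comprises the whole dataset, and every recorded child is a subset of its parent. Cutting the tree at different depths yields coarser or finer groupings of the same data.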
