K-Means Clustering

Explore the k-means clustering algorithm, which partitions data into clusters by minimizing variance. Learn its iterative approach through Lloyd's algorithm and the improved initialization offered by k-means++. Understand variants such as k-medoids, which uses actual data points as centers, and k-center clustering, which aims to minimize cluster diameter. Gain the skills to apply these clustering methods to real datasets and to tell them apart.

We'll cover the following...

Objective
- Variance computation code example
K-means is NP-hard
Lloyd’s algorithm
- K-means++
K-means variants
- K-medoids clustering
- K-center clustering
Conclusion

In machine learning, the study of clustering traditionally starts with the popular partitional algorithm called $k$-means clustering. This algorithm divides the data into $k$ clusters based on a similarity score. The objective is to minimize the total variance of the $k$ clusters. The number of clusters, $k$, must be specified in advance.

Note: The choice of similarity metric is a hyperparameter.

Objective

K-means clustering aims to partition a dataset into $k$ clusters such that the total variance of the clusters is minimized. This means we want the data points within each cluster to be as close to each other as possible.
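To make the objective concrete, here is a minimal sketch of how the total variance of a given clustering could be computed. It assumes squared Euclidean distance as the dissimilarity measure and takes "total variance" to mean the sum of squared distances from each point to the mean of its cluster (not divided by cluster size). The function name `within_cluster_variance` and the toy data are hypothetical and chosen only for illustration.

```python
def within_cluster_variance(points, labels, k):
    """Sum of squared Euclidean distances from each point to its cluster mean.

    Assumption: clusters are labeled 0..k-1 and points are equal-length
    sequences of numbers.
    """
    total = 0.0
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if not members:
            continue  # an empty cluster contributes nothing
        dim = len(members[0])
        # The cluster mean (centroid), computed coordinate by coordinate.
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


# Toy data: two visibly separated groups in 2D (made up for illustration).
X = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]

good = within_cluster_variance(X, [0, 0, 1, 1], k=2)  # groups nearby points
bad = within_cluster_variance(X, [0, 1, 0, 1], k=2)   # mixes the two groups

print(good, bad)  # 4.0 100.0
```

The assignment that groups nearby points yields a much smaller value than the one that mixes the two groups, which is the quantity k-means tries to minimize over all possible assignments.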

Given a set of $n$ data points $D=\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n\}$ ...
