Unsupervised Learning
Understand the importance of unsupervised learning tasks such as clustering and dimensionality reduction and implement them using the sklearn library.
Unsupervised learning aims to find patterns and structure within the given data. Learning algorithms in this category work on the input features alone: no corresponding output labels are provided.
The figure above differentiates supervised and unsupervised learning. Supervised learning (left) shows two-dimensional data points (
There are two main types of unsupervised learning:
Clustering (grouping of the data)
Dimensionality reduction
Clustering
Clustering algorithms group data points into clusters based on the similarity of their features. Suppose a given dataset has only two features and the labels of the input data are not known. A clustering algorithm finds similar data points and groups them together. At the testing stage, the trained model compares the test point against every cluster and assigns it to the cluster whose data points are most similar to it.
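The train-then-assign workflow described above can be sketched with sklearn. This is a minimal illustration, not a definitive implementation: the toy data, the variable names, and the choice of two clusters are assumptions made for the example, and it uses `KMeans`, the algorithm introduced in the next section.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy dataset: six points with two features each,
# forming two well-separated groups. No labels are provided.
X_train = np.array([
    [1.0, 1.2], [0.8, 1.0], [1.1, 0.9],   # group near (1, 1)
    [8.0, 8.1], [8.2, 7.9], [7.9, 8.3],   # group near (8, 8)
])

# n_clusters=2 is an assumption that matches the toy data;
# random_state is fixed only to make the run repeatable.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
model.fit(X_train)  # training uses the features only

# Cluster index assigned to each training point.
train_clusters = model.labels_

# Testing stage: each test point is assigned to the cluster
# whose center is nearest to it.
X_test = np.array([[1.0, 1.0], [8.0, 8.0]])
test_clusters = model.predict(X_test)

print("Training clusters:", train_clusters)
print("Test clusters:", test_clusters)
```

The cluster indices themselves (0 or 1) are arbitrary. What matters is that each test point lands in the same cluster as the training points it resembles.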
K-means clustering
K-means is one of the most popular clustering algorithms. It’s an iterative algorithm that, in each iteration, tries to find the best cluster for each training data point. It proceeds as follows:
We decide the value of
...