K-Means on Two-Dimensional Data

Explore how to apply K-Means clustering on a two-dimensional dataset using Python. Learn to preprocess data, create clusters without labels, and visualize results with scatter plots. This lesson introduces the essentials of unsupervised learning and cluster analysis for two features, preparing you for more complex multidimensional clustering.

K-means in Python

We do not need to code the above algorithm ourselves because it is available in sklearn.cluster. We will first cluster a dummy dataset. It has three columns: feature_1, feature_2, and label. The dataset has 4 classes, which means each row has a label of 0, 1, 2, or 3. First, we will draw a scatter plot of the two features.
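
The contents of dummy.csv are not reproduced in this text. To follow along locally, a stand-in with the same three columns can be generated. The sketch below is an assumption, not the lesson's actual data: it uses make_blobs from sklearn.datasets, and the sample count, spread, random seed, and the file name dummy_standin.csv are all illustrative choices.

```python
import pandas as pd
from sklearn.datasets import make_blobs

# Hypothetical stand-in for dummy.csv: 200 points scattered around 4 centers.
# n_samples, cluster_std, and random_state are illustrative, not from the lesson.
features, labels = make_blobs(
    n_samples=200, centers=4, n_features=2, cluster_std=1.0, random_state=42
)

# Use the column names described in the lesson.
standin = pd.DataFrame(features, columns=['feature_1', 'feature_2'])
standin['label'] = labels  # integer class label per row, one of 0, 1, 2, 3

# Written under a different name so a real dummy.csv is never overwritten.
standin.to_csv('dummy_standin.csv', index=False)
print(standin.head())
```

To use it, pass 'dummy_standin.csv' to pd.read_csv in place of 'dummy.csv'.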

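import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv('dummy.csv')
print(df.head())
X = df.drop(columns=['label'])  # keep only feature_1 and feature_2
Y = df['label']  # true labels, set aside and not used for clustering
plt.scatter(x=X['feature_1'], y=X['feature_2'])
plt.xlabel('feature_1')
plt.ylabel('feature_2')
plt.show()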

We drop the label column in line 6 because in reality, we do ...
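
K-means itself never looks at labels, so the two feature columns are all it needs. As a minimal sketch of what a call into sklearn.cluster might look like, the example below runs KMeans on synthetic two-feature data. The data comes from make_blobs rather than dummy.csv, and the n_init and random_state values are assumptions chosen for illustration; only n_clusters=4 is taken from the four classes described above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic two-feature data standing in for feature_1 and feature_2.
points, _ = make_blobs(n_samples=200, centers=4, n_features=2, random_state=42)

# n_clusters=4 matches the four classes in the lesson's description;
# n_init and random_state are illustrative choices.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(points)  # one cluster index per row

print(kmeans.cluster_centers_.shape)  # (4, 2): one 2-D centroid per cluster
print(np.unique(cluster_ids))  # the distinct cluster indices that were assigned
```

Note that the cluster indices are arbitrary: cluster 0 found by K-means need not correspond to label 0 in the dataset.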