Customer Segmentation with K-Means Clustering

In this project, we’ll learn how to group customers based on similarities and differences using an unsupervised clustering model in Python. We’ll also visualize the resulting clusters in 3D.

You will learn to:

Load data into a DataFrame and perform exploratory data analysis.

Perform data preprocessing including handling missing values and feature engineering.

Create an unsupervised learning model to segment customers.

Visualize the results of the clustering algorithm as an interactive graph.


Machine Learning

Data Science

Data Visualization


Hands-on experience with Python

Familiarity with unsupervised machine learning

Basic understanding of scikit-learn





Project Description

Customer segmentation aims to group customers into segments so businesses can tailor their marketing efforts, refine product offerings, and enhance the overall customer experience. This approach enables companies to move beyond one-size-fits-all strategies and instead deliver targeted and personalized interactions, ultimately leading to increased customer satisfaction and loyalty.

In this project, we’ll tackle customer segmentation with the k-means clustering algorithm and visualize the resulting clusters to assess how close together or well separated they are. We’ll use the Online Retail dataset from the UCI Machine Learning Repository, which contains the online purchase history of a UK-based store for its wholesale customers from 2010–2011. The pandas library handles the data preprocessing tasks, while scikit-learn and Plotly serve as the primary tools for clustering and visualization.
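To make the idea concrete, here is a minimal sketch of k-means with scikit-learn. It does not use the Online Retail data: the feature matrix is synthetic, the two groups are deliberately well separated, and the choice of two clusters is an assumption made only for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for per-customer features (NOT the Online Retail data):
# two well-separated groups in a three-feature space.
group_a = rng.normal(loc=[10.0, 2.0, 50.0], scale=1.0, size=(50, 3))
group_b = rng.normal(loc=[200.0, 20.0, 5000.0], scale=1.0, size=(50, 3))
features = np.vstack([group_a, group_b])

# k-means relies on Euclidean distance, so features on very different
# scales are standardized first.
scaled = StandardScaler().fit_transform(features)

# n_clusters=2 is assumed here because the toy data has two groups.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(scaled)

print(np.bincount(labels))
```

Each row receives an integer cluster label. With real customer data the groups are rarely this clean, which is why the number of clusters is chosen separately rather than assumed.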

By combining the power of data-driven techniques with the insights gained from customer segmentation, businesses can refine their strategies and foster stronger connections with their customer base. This project will demonstrate the practical application of k-means clustering and showcase the value of leveraging real-world datasets to extract meaningful insights for business decision-making. Through this analysis, we’ll provide a clear roadmap to implement customer segmentation strategies for improved marketing outcomes and customer satisfaction.

Project Tasks


Getting Started

Task 0: Get Started

Task 1: Import Libraries

Task 2: Load the Dataset

Task 3: Explore the Dataset


Data Preprocessing

Task 4: Drop Unnecessary Columns

Task 5: Treat Missing Values


Feature Engineering

Task 6: Calculate the Total Price per Item

Task 7: Calculate Recency of the Purchase

Task 8: Convert the Column’s Data Type

Task 9: Calculate the Purchase Frequency of a Customer

Task 10: Calculate the Monetary Value per Customer


k-Means Clustering

Task 11: Prepare the Data

Task 12: Find the Optimal Number of Clusters

Task 13: Cluster the Data

Task 14: Explore the Clusters

Task 15: Visualize the Clusters
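
The feature engineering and clustering tasks above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's solution: the transactions are made up, the column names are assumed to match the Online Retail dataset, and the `snapshot` reference date, the integer conversion of `CustomerID`, and the range of cluster counts tried are all hypothetical choices.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Made-up transactions; column names are assumed to follow the Online Retail dataset.
df = pd.DataFrame({
    "InvoiceNo": ["A1", "A2", "A3", "B1", "C1", "C2", "D1", "E1", "X1"],
    "CustomerID": [1, 1, 1, 2, 3, 3, 4, 5, None],
    "InvoiceDate": pd.to_datetime([
        "2011-01-05", "2011-03-10", "2011-06-01", "2011-02-15",
        "2011-04-20", "2011-06-10", "2010-12-20", "2011-05-30", "2011-01-01",
    ]),
    "Quantity": [2, 1, 3, 10, 4, 6, 1, 20, 5],
    "UnitPrice": [10.0, 20.0, 5.0, 2.5, 7.0, 3.0, 100.0, 1.5, 9.0],
})

# Treat missing values: rows without a customer cannot be assigned to a segment.
df = df.dropna(subset=["CustomerID"]).copy()
df["CustomerID"] = df["CustomerID"].astype(int)  # assumed type conversion

# Total price per line item.
df["TotalPrice"] = df["Quantity"] * df["UnitPrice"]

# Hypothetical reference date: one day after the last purchase in the data.
snapshot = df["InvoiceDate"].max() + pd.Timedelta(days=1)

# Recency, frequency, and monetary value per customer.
rfm = df.groupby("CustomerID").agg(
    Recency=("InvoiceDate", lambda s: (snapshot - s.max()).days),
    Frequency=("InvoiceNo", "nunique"),
    Monetary=("TotalPrice", "sum"),
)

# Elbow method: record the within-cluster sum of squares for several k.
scaled = StandardScaler().fit_transform(rfm)
inertias = {}
for k in range(1, 5):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(scaled)
    inertias[k] = km.inertia_

print(rfm)
print(inertias)
```

Plotting `inertias` against `k` and looking for the point where the curve flattens is one common way to pick the number of clusters. For the final visualization, the three columns of `rfm` map naturally onto the axes of a 3D scatter plot, for example with Plotly's `plotly.express.scatter_3d`, colored by cluster label.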