Explore the Dataset
Explore the retail transaction dataset to understand its structure, identify outliers, and examine key features such as Quantity, UnitPrice, and Country. Learn to clean the data and narrow it to what matters for building accurate machine learning models that predict customer spending.
In this lesson, we’ll start working on a machine learning problem: predicting how much money an individual customer will spend in the coming year. To do so, we’ll use a dataset that contains the transaction records of an online retail shop.
Here is some basic information about the dataset.
Feature Details
Feature | Data Type | Details |
InvoiceNo | Integer | Invoice number of the transaction |
StockCode | String | Unique code of individual product |
Description | String | Description of a product |
Quantity | Integer | Quantity sold in a transaction |
InvoiceDate | String | Date and time of the transaction |
UnitPrice | Integer | Unit price of a product |
CustomerID | String | Unique identifier of a customer |
Country | String | Country of residence of customer |
Let’s familiarize ourselves with the dataset through exploratory data analysis (EDA).
Data exploration
Import the dataset first. We should always check the dimensions and the data types of each ...
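As a minimal sketch of this first step, the snippet below loads a CSV with pandas and prints its dimensions and column data types. The file name and contents of the real dataset are not given here, so the three rows in `SAMPLE_CSV` are made-up stand-ins (not real transactions) that only follow the column names from the feature table. Reading `StockCode` and `CustomerID` as strings is an assumption based on that table.

```python
import io

import pandas as pd

# Hypothetical stand-in rows (NOT from the real dataset), included only so
# this sketch runs without the course's data file.
SAMPLE_CSV = """InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
100001,A100,SAMPLE MUG,6,2011-01-04 10:00:00,2,C001,United Kingdom
100001,B200,SAMPLE LAMP,1,2011-01-04 10:00:00,15,C001,United Kingdom
100002,A100,SAMPLE MUG,12,2011-01-05 14:30:00,2,C002,France
"""

# With the real file, pass its path to read_csv instead of the StringIO buffer.
# Assumption: identifier-like columns are kept as strings, as in the table.
df = pd.read_csv(
    io.StringIO(SAMPLE_CSV),
    dtype={"StockCode": str, "CustomerID": str},
)

# Dimensions of the dataset as a (rows, columns) tuple.
print(df.shape)

# Data type pandas inferred (or was told to use) for each column.
print(df.dtypes)
```

Comparing the printed data types against the feature table is a quick way to spot columns that were read differently than expected, such as dates that arrive as plain strings.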