Explore the Dataset
Explore the Telco customer churn dataset by analyzing its features and target label. Learn to use Python libraries to visualize data relationships and identify key factors influencing churn. This lesson helps you understand patterns related to gender, tenure, contract type, and additional services to prepare for effective churn prediction models.
In this lesson, we’ll explore the dataset used to predict customer churn. We will use a semiprocessed Telco dataset (telco_customer_churn.csv) that contains the subscription details of 7043 customers. The dataset has 20 features (both numerical and categorical), and the target label is the Churn column, which indicates whether a customer terminates the contract with the Telco company in the following month. Let’s get familiar with the dataset.
Feature Details
Feature | Data Type | Details |
CustomerID | string | Identifier of a customer |
Gender | string | Indicates the gender of the customer (Male, Female) |
SeniorCitizen | string | Indicates if the customer is above 65 (Yes, No) |
Partner | string | Indicates if the customer has a partner (Yes, No) |
Dependents | string | Indicates if the customer has dependents (Yes, No) |
Tenure | int64 | Number of months the customer has been with the company |
PhoneService | string | Whether the customer subscribes to phone service (Yes, No) |
MultipleLines | string | Whether the customer has multiple phone lines (Yes, No) |
InternetService | string | Whether the customer subscribes to internet service (No, DSL, Fiber optic) |
OnlineSecurity | string | Whether the customer subscribes to online security (No, No internet service, Yes) |
OnlineBackup | string | Whether the customer subscribes to online backups (No, No internet service, Yes) |
DeviceProtection | string | Indicates if the customer subscribes to device protection (No, No internet service, Yes) |
TechSupport | string | Indicates if the customer subscribes to tech support (No, No internet service, Yes) |
StreamingTV | string | Indicates if the customer has a TV streaming service (No, No internet service, Yes) |
StreamingMovies | string | Indicates if the customer has a movie streaming service (No, No internet service, Yes) |
Contract | string | Indicates the customer’s contract type (Month-to-Month, One Year, Two Year) |
PaperlessBilling | string | Indicates if the customer opted for paperless billing (Yes, No) |
PaymentMethod | string | Customer's payment method (4 types) |
MonthlyCharges | float64 | Customer's total monthly service charges |
TotalCharges | float64 | Customer's total quarterly charges |
Churn | int64 | Whether the customer terminated the subscription this quarter (0, 1) |
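Once the data is in a pandas DataFrame, the data types listed in the table can be checked programmatically. The following is a minimal sketch rather than the lesson's own code: the helper `dtype_kind` and the three-row `sample` frame are hypothetical and exist only to make the example runnable. The column names and types come from the table above, with int64 and float64 treated as the broader "integer" and "float" kinds.

```python
import pandas as pd
from pandas.api import types as ptypes


def dtype_kind(series):
    """Map a pandas Series to a coarse type name: integer, float, or string."""
    if ptypes.is_integer_dtype(series):
        return "integer"
    if ptypes.is_float_dtype(series):
        return "float"
    return "string"


# Made-up rows for illustration only; these are NOT records from the dataset.
sample = pd.DataFrame(
    {
        "Contract": ["Month-to-Month", "One Year", "Two Year"],
        "Tenure": [1, 14, 60],
        "MonthlyCharges": [29.85, 56.95, 99.10],
        "Churn": [1, 0, 0],
    }
)

# Kinds implied by the feature table (int64 -> integer, float64 -> float).
expected = {
    "Contract": "string",
    "Tenure": "integer",
    "MonthlyCharges": "float",
    "Churn": "integer",
}

# Compare what pandas inferred with what the table documents.
actual = {col: dtype_kind(sample[col]) for col in sample.columns}
mismatches = {
    col: (expected[col], actual[col])
    for col in expected
    if actual[col] != expected[col]
}
print(mismatches or "all columns match the table")
```

Running the same comparison on the real DataFrame, with `expected` extended to all 21 columns, would flag any column whose loaded type differs from the documented one.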
Loading the Dataset
Let’s load the telco_customer_churn.csv dataset and take a look at the top five records to get a better understanding of the data.
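The loading step can be sketched as follows, assuming pandas is installed and the CSV file sits in the current working directory. The function name `load_churn_data` is a hypothetical helper introduced for this example, and the existence check only keeps the sketch from failing when the file is not present. The line numbers cited in the explanation refer to the lesson's original listing and need not match this sketch.

```python
from pathlib import Path

import pandas as pd


def load_churn_data(path):
    """Read the churn CSV at `path` into a pandas DataFrame."""
    return pd.read_csv(path)


# File name taken from the lesson; its location is an assumption.
csv_path = Path("telco_customer_churn.csv")

if csv_path.exists():
    df = load_churn_data(csv_path)
    # Shape is (number of rows, number of columns).
    print("Shape:", df.shape)
    # head() returns the first five rows by default.
    print(df.head())
else:
    print(f"{csv_path} not found; place the file next to this script.")
```

Wrapping the read in a function keeps the path in one place, so later steps can reuse the loader with a different file location.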
Explanation
Line 1 loads the necessary Python libraries.
Line 3 reads the Telco dataset (telco_customer_churn.csv) and creates a pandas DataFrame.
Lines 6 and 8 print the shape of the dataset and the top ...