Relationships Between Dependent and Independent Variables

Discover how to analyze relationships between features and the target variable in customer revenue prediction. Learn to use correlation matrices and scatterplots to identify key predictors and understand their impact. This lesson guides you in selecting optimal features for improving regression models and prepares you to evaluate the effects of outliers on prediction accuracy.

Relationships between features and the label

In the previous lesson, we started working with an online retail transaction dataset and applied feature engineering techniques to it. In this lesson, we’ll build on that by analyzing the wrangled dataset and examining how each feature relates to the dependent variable (label). This will help us choose a subset of features to optimize the regression model.

First, let's import the necessary libraries and the wrangled dataset.

Python
import pandas as pd
import numpy as np
import datetime as dt
import matplotlib.pyplot as plt
import seaborn as sns

# Load the wrangled dataset, using customer_id as the row index
df_retail = pd.read_csv('wrangled_transactions.csv', header=0, index_col='customer_id')

# Preview the first five rows and check the (rows, columns) dimensions
print(df_retail.head())
print(df_retail.shape)
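As a rough illustration of how a feature's relationship with the label can be quantified, the sketch below ranks numeric features by their Pearson correlation with a label column. It is only a minimal example under stated assumptions: the helper `rank_features_by_correlation` and the column names `feature_a`, `feature_b`, and `label` are hypothetical, and the data is synthetic rather than taken from the wrangled retail dataset.

```python
import numpy as np
import pandas as pd


def rank_features_by_correlation(df, label):
    """Return each numeric feature's Pearson correlation with `label`,
    ordered from strongest to weakest absolute correlation."""
    # Keep numeric columns only, since correlation is undefined for text columns
    numeric = df.select_dtypes(include="number")
    # Take the label's column of the correlation matrix and drop its self-correlation
    corr = numeric.corr()[label].drop(label)
    # Sort by absolute value but keep the sign in the returned values
    order = corr.abs().sort_values(ascending=False).index
    return corr.reindex(order)


# Synthetic stand-in data; these column names are hypothetical
rng = np.random.default_rng(0)
n = 200
demo = pd.DataFrame({
    "feature_a": rng.normal(size=n),
    "feature_b": rng.normal(size=n),
})
# The label depends strongly on feature_a and not at all on feature_b
demo["label"] = 3.0 * demo["feature_a"] + rng.normal(scale=0.1, size=n)

ranking = rank_features_by_correlation(demo, "label")
print(ranking)
```

The same helper could be applied to `df_retail` by passing the name of its label column, which is not shown in this part of the lesson. Pearson correlation captures only linear relationships, so a low value does not by itself rule out a useful feature.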

Corre

...