Feature Engineering: Remove Redundant Features
Learn how to build a pairwise correlation matrix and remove redundant features.
In this lesson, we start with the encoded Telco dataset, which has 31 features and a target. Each of the 7,043 records describes a customer’s subscription details. Our task now is to analyze the relationships between the features so that we can identify and remove the redundant ones.
Pairwise correlation
A pairwise correlation matrix is a great way to identify inter-feature dependencies. It’s essentially a table that holds the correlation coefficient for every possible pair of features. Because a large table of numbers is hard to read, the matrix is often drawn as a heatmap.
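As a minimal sketch of the idea, the snippet below builds a pairwise correlation matrix with pandas. The data is synthetic and the column names (tenure, monthly_charges, total_charges, churn) are hypothetical stand-ins, not the lesson's actual encoded Telco columns.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the encoded dataset; all column names are
# hypothetical and chosen only for illustration.
rng = np.random.default_rng(0)
tenure = rng.integers(1, 73, size=200)
monthly = rng.uniform(20, 120, size=200)

df = pd.DataFrame({
    "tenure": tenure,
    "monthly_charges": monthly,
    "total_charges": tenure * monthly,  # derived from the two columns above
    "churn": rng.integers(0, 2, size=200),
})

# Pearson correlation coefficient for every pair of columns.
# The result is a square, symmetric DataFrame with 1.0 on the diagonal.
corr = df.corr(method="pearson")
print(corr.round(2))
```

Because total_charges is computed from tenure, that pair shows a clearly positive coefficient, which is the kind of dependency the matrix is meant to expose. To draw the matrix as a heatmap, one option is to pass corr to seaborn.heatmap, assuming seaborn is installed.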
Fitting a 32 x 32 matrix (31 features plus the target) into a single plot is difficult. Because drawing space is limited, we will focus on the 12 features that require some attention.
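The lesson does not state how the features that need attention are chosen. One common approach, sketched here as an assumption rather than the lesson's method, is to flag every pair whose absolute correlation exceeds a threshold and drop one feature from each flagged pair. The helper name highly_correlated_pairs, the 0.9 threshold, and the toy columns a, b, and c are all hypothetical.

```python
import numpy as np
import pandas as pd

def highly_correlated_pairs(df, threshold=0.9):
    """Return (feature_a, feature_b, |r|) for each pair with |r| >= threshold.

    Hypothetical helper; the threshold is an illustrative assumption.
    """
    corr = df.corr().abs()
    # Keep only the strict upper triangle so each pair is listed once
    # and the diagonal (self-correlation) is ignored.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack()
    # The comparison also discards any NaN entries left by the mask.
    pairs = pairs[pairs >= threshold]
    return [(a, b, float(r)) for (a, b), r in pairs.items()]

# Toy data: b is an exact multiple of a, so the two are redundant.
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5, 6],
    "b": [2, 4, 6, 8, 10, 12],
    "c": [3, 1, 4, 1, 5, 9],
})

pairs = highly_correlated_pairs(df, threshold=0.9)

# Drop the second feature of each flagged pair (an arbitrary choice).
to_drop = sorted({b for _, b, _ in pairs})
reduced = df.drop(columns=to_drop)
print(to_drop, list(reduced.columns))
```

Which member of a flagged pair to drop is a judgment call. In practice it is usually decided by domain knowledge or by each feature's relationship with the target, not by column order as in this sketch.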