Feature Engineering: Remove Redundant Features

Learn how to build a pairwise correlation matrix and remove redundant features.

In this lesson, we start with the encoded Telco dataset, where we have 31 features and a target. Each of the 7,043 records describes a customer’s subscription details. Now we must identify and remove any features that are redundant by analyzing the relationships between them.

Pairwise correlation

A pairwise correlation matrix is a great way to identify interfeature dependencies. It’s essentially a table describing the correlation coefficient for all possible pairs of values. Since the matrix is not intuitive for visualization, heatmaps are often used to depict the matrix.

Fitting a matrix of 32 x 32 is not an easy job. Because of the limited drawing space, we will focus on the 12 features that require some attention.

Get hands-on with 1200+ tech skills courses.