Effect of Feature Scaling

Scale the features for model training.

Recall that KNN is a distance-based algorithm, so feature scaling is an important step, and one that we intentionally skipped earlier. Let’s see whether feature scaling improves the model’s performance.

Effect of feature scaling on KNN

KNN classifies a test data point by majority vote among the k training data points nearest to it. Because "nearest" is measured by distance, the scale of the features matters: variables on a larger scale contribute far more to the distance between observations, and therefore to the KNN classifier's decisions, than variables on a smaller scale. To deal with this issue, we first scale the features so that they are all on the same scale.
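To make this concrete, here is a minimal sketch using made-up numbers (the ages, incomes, and the helper nearest_to_first are hypothetical and not part of the lesson's dataset). It checks which of two points is nearest to a query point, first on the raw features and then after standardizing each column by hand.

```python
import numpy as np

# Hypothetical data: column 0 is age (years), column 1 is income (dollars).
X = np.array([
    [25.0, 50_000.0],  # row 0: the query point
    [60.0, 50_500.0],  # row 1: very different age, similar income
    [26.0, 50_800.0],  # row 2: similar age, slightly more different income
])

def nearest_to_first(data):
    """Return the index (1 or 2) of the row closest to row 0 (Euclidean distance)."""
    dists = np.linalg.norm(data[1:] - data[0], axis=1)
    return int(np.argmin(dists)) + 1

# On raw features, the income column dominates the distance.
raw_nearest = nearest_to_first(X)

# Standardize each column to zero mean and unit variance, then repeat.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
scaled_nearest = nearest_to_first(X_scaled)

print("Nearest on raw features:   ", raw_nearest)
print("Nearest on scaled features:", scaled_nearest)
```

On the raw features, row 1 is the nearest neighbor because its income differs by only 500, even though its age is 35 years away. After standardization, both columns contribute comparably, and row 2 becomes the nearest neighbor.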

scikit-learn has built-in functionality for feature scaling. We need to import StandardScaler from sklearn.preprocessing and create a StandardScaler() object. Let’s do this.
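The following is a minimal sketch of these steps, not the lesson's exact code. It assumes synthetic stand-in data from make_classification, a 70/30 train-test split, and k = 5; all variable names are illustrative. The scaler is fitted on the training data only and then applied to both splits, so that no information from the test set leaks into the scaling.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption; the lesson's dataset is not shown here).
X, y = make_classification(n_samples=300, n_features=4, random_state=0)
X[:, 0] *= 1000  # stretch one feature so the columns sit on very different scales

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Create the scaler, learn the mean and standard deviation from the training set,
# and apply the same transformation to the test set.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train KNN on the scaled features (k = 5 is an arbitrary choice for this sketch).
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train_scaled, y_train)
accuracy = knn.score(X_test_scaled, y_test)
print(f"Test accuracy with scaled features: {accuracy:.3f}")
```

After fit_transform, each training column has a mean of approximately 0 and a standard deviation of approximately 1, so no single feature dominates the distance calculation.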
