Robust Scaling

Explore how to apply robust scaling using scikit-learn's RobustScaler to preprocess data without being affected by outliers. Learn the importance of median and interquartile range in achieving reliable scaling for machine learning projects.

Chapter Goals:

  • Learn how to scale data without being affected by outliers

A. Data outliers

An important aspect of data that we have to deal with is outliers. In general terms, an outlier is a data point that lies significantly farther from the rest of the data than those points lie from one another. For example, if we had watermelons weighing 5, 4, 6, 7, and 20 pounds, the 20-pound watermelon would be an outlier. ...
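To make the watermelon example concrete, here is a minimal sketch that scales those five weights using the median and interquartile range (IQR), first by hand with NumPy and then with scikit-learn's `RobustScaler`. It assumes the scaler's default settings (centering on the median, scaling by the 25th to 75th percentile range); the variable names are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# Watermelon weights in pounds from the example; 20 is the outlier.
# RobustScaler expects a 2-D array of shape (n_samples, n_features).
weights = np.array([5, 4, 6, 7, 20], dtype=float).reshape(-1, 1)

# The mean is pulled toward the outlier, while the median is not.
mean = weights.mean()        # 8.4
median = np.median(weights)  # 6.0

# Interquartile range: 75th percentile minus 25th percentile.
q1, q3 = np.percentile(weights, [25, 75])  # 5.0 and 7.0
iqr = q3 - q1                              # 2.0

# Manual robust scaling: subtract the median, divide by the IQR.
manual = (weights - median) / iqr

# The same computation with RobustScaler, assuming default settings.
scaler = RobustScaler()
scaled = scaler.fit_transform(weights)

print(scaled.ravel())  # [-0.5 -1.   0.   0.5  7. ]
```

The four typical watermelons land between -1 and 0.5, while the 20-pound outlier stays far away at 7. The outlier still shows up in the output, but it did not distort the center or spread used to scale the other points.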