Scaling Data, Pipelines, and Interaction Features in Scikit-Learn
Explore how to scale data using MinMaxScaler and incorporate scaling into logistic regression models with scikit-learn pipelines. Understand the importance of pipelines in proper cross-validation and learn to engineer interaction features to enhance model performance while managing overfitting risks.
Scaling data
Compared to the synthetic data we were just working with, the case study data is relatively large. If we want to use L1 regularization, then according to the scikit-learn documentation, we ought to use the saga solver. However, this solver is not robust to unscaled data, so we need to be sure to scale the features. Scaling is a good idea whenever we use regularization, so that all the features are on the same scale and are penalized equally by the regularization process.
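As a minimal sketch of this setup, here is how a logistic regression with the saga solver and an L1 penalty might be instantiated (the `C` and `max_iter` values are illustrative assumptions, not taken from the case study):

```python
from sklearn.linear_model import LogisticRegression

# saga supports the l1 penalty, but it converges reliably
# only when the features are on similar scales
lr = LogisticRegression(penalty='l1', solver='saga',
                        C=1.0,          # inverse regularization strength (assumed value)
                        max_iter=1000)  # assumed value; raise if convergence warnings appear
```

This model would then be fit only after the features have been scaled.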
A simple way to give all the features the same scale is to transform each one by subtracting its minimum and dividing by its range (maximum minus minimum). This maps each feature so that it has a minimum of 0 and a maximum of 1. To instantiate the MinMaxScaler transformer that performs this transformation, we can use the following code:
from sklearn.preprocessing import MinMaxScaler
min_max_sc = MinMaxScaler()
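To see the effect of the transformation, here is a small illustration on made-up data (the array values are only an example, not the case study data):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Two features on very different scales
X = np.array([[1.0, 200.0],
              [3.0, 400.0],
              [5.0, 600.0]])

min_max_sc = MinMaxScaler()
# Applies (x - min) / (max - min) column by column
X_scaled = min_max_sc.fit_transform(X)

# After scaling, each column spans exactly [0, 1]
print(X_scaled)
```

Note that the scaler learns the per-feature minimum and range from the data passed to `fit_transform`, which is why, as discussed next, it belongs inside a pipeline so that these statistics are learned from the training data only.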
Pipelines
Previously, we used a logistic regression model ...