Pipeline

Explore how to integrate multiple machine learning steps such as feature selection, dimension reduction, and model training into one efficient workflow using Scikit-Learn's Pipeline and FeatureUnion. Understand how chaining transformers and estimators improves clarity and reduces rework in your projects.

We'll cover the following...

- Combine features from different spaces
- Connect different steps as pipeline

A full Machine Learning project involves many steps: data cleaning, data processing, feature transformation, dimension reduction, feature extraction, model building, model training, model evaluation, and so on. From a data flow perspective, the output of one step is often the input of the next. Connecting these steps into a single workflow makes the process clearer, reduces rework, and improves efficiency.

sklearn provides a very useful module, pipeline, that allows you to chain multiple transformers and a final estimator into one object. This is useful because data processing often follows a fixed sequence of steps, such as feature selection, normalization, and classification. The module is small and contains only a few classes and helper functions. Let’s see how to use it.
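As a minimal sketch of the idea, the example below chains feature selection, normalization, and classification into a single Pipeline object. The iris dataset, the step names ("select", "scale", "clf"), and the specific choices of SelectKBest, StandardScaler, and LogisticRegression are assumptions made for illustration; any compatible transformers and final estimator could be substituted.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative dataset (an assumption, not part of the lesson text).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Each step is a (name, estimator) pair. Every step except the last
# must be a transformer; the last step can be any estimator.
pipe = Pipeline(steps=[
    ("select", SelectKBest(score_func=f_classif, k=2)),  # feature selection
    ("scale", StandardScaler()),                         # normalization
    ("clf", LogisticRegression(max_iter=1000)),          # classification
])

# fit() runs fit_transform on each transformer in order, then fits the
# classifier on the transformed training data.
pipe.fit(X_train, y_train)

# score() applies the already-fitted transformers to the test data
# before evaluating the classifier.
accuracy = pipe.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

Because the whole chain behaves like one estimator, the transformers are fitted only on the training data, and the same fitted transformations are reused at prediction time.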


Combine features from different spaces