Search⌘ K
AI Features

Ensemble Learning: Part 1

Explore the concepts of ensemble learning in machine learning, including voting classifiers with hard and soft voting approaches, voting regressors, stacking, and blending. Understand how combining multiple models improves prediction performance and how to implement these techniques effectively.

Ensemble learning

Ensemble learning involves methods that combine the decisions of several models to improve the overall performance of predictive models. Kaggle competition winners commonly use ensemble methods.

Family of ensemble models

Voting classifier

In the voting classifier ensemble, multiple models, such as logistic regression and support vector machines, predict the label for an instance. This prediction is taken as a vote for the label. The label predicted by the majority of the classifiers has the maximum votes and is used as the final label.

Hard voting

In hard voting, the final label is decided based on the label predicted by the majority of the models. Suppose we have three models and the following predictions:

  • Model 1 predicts Class 1
  • Model 2 predicts Class 1
  • Model 3 predicts Class 2

Then hard voting would select the final class label to be 1. In case of a tie, hard voting selects the label based on the ascending vote order. If we have 2 models:

  • Model 1 predicts Class 2
  • Model 2 predicts Class 1

Then, hard voting will assign the class label to be 1.

Soft voting

In soft voting, we follow the steps below:

  • We assign a weight to each of the models.
  • We take the predicted class probabilities from each model.
  • Then, we multiply the predicted class probabilities by the model weight and take an average of all the products.
  • The final class label is chosen with the highest average probability.

Let’s assume we have four three-class classifiers, where we assign weights to each classifier in the following manner:

w1=2w_1 = 2, w2=3w_2 = 3 ...