Lasso (L1) and Ridge (L2) Regularization
Explore how lasso (L1) and ridge (L2) regularization methods work in logistic regression models. Understand their roles in penalizing coefficients to reduce overfitting, perform feature selection, and improve model generalization. Learn about practical considerations like scaling features, solver choice, and intercept treatment to effectively apply these techniques in Python using scikit-learn.
Before applying regularization to a logistic regression model, let’s take a moment to understand what regularization is and how it works. The two ways of regularizing logistic regression models in scikit-learn are called lasso (also known as L1 regularization) and ridge (also known as L2 regularization). When instantiating a LogisticRegression object in scikit-learn, you can set penalty='l1' or penalty='l2'. These are called “penalties” because the effect of regularization is to add a penalty, or a cost, for having larger values of the coefficients in a fitted logistic regression model.
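For example, the penalty type is chosen when the model object is instantiated. Here is a minimal sketch (the C values shown are scikit-learn’s defaults, included only for illustration); note that only some solvers, such as 'liblinear' and 'saga', support the L1 penalty:

```python
from sklearn.linear_model import LogisticRegression

# Ridge (L2) regularization -- the scikit-learn default penalty
ridge_model = LogisticRegression(penalty='l2', C=1.0)

# Lasso (L1) regularization -- requires a solver that supports it,
# such as 'liblinear' or 'saga'
lasso_model = LogisticRegression(penalty='l1', C=1.0, solver='liblinear')
```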
As we’ve already learned, coefficients in a logistic regression model describe the relationship between the log odds of the response and each of the features. Therefore, if a coefficient value is particularly large, then a small change in that feature will have a large effect on the prediction.
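In symbols, writing p for the predicted probability of the positive class and x_1, …, x_k for the features, the fitted model has the form:

$$\ln\!\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$$

A one-unit change in x_j shifts the log odds by β_j, so a large coefficient means even a small change in that feature produces a large shift in the predicted probability.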
When a model is being fit and is learning the relationship between features and the response variable, the fitting procedure searches for the coefficient values that minimize a cost function (for logistic regression, the log loss). Left unconstrained, this search can produce large coefficients that fit noise in the training data, which is a hallmark of overfitting. Regularization adds the penalty to this cost function, so large coefficient values are only retained if they improve the fit by more than they cost in penalty.
Log-loss equation with lasso penalty
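A standard way to write this penalized cost, using λ for the strength of the penalty (scikit-learn exposes the inverse of this quantity as the C parameter), is:

$$J(\boldsymbol{\beta}) = -\frac{1}{n}\sum_{i=1}^{n}\Big[\,y_i \ln p_i + (1 - y_i)\ln(1 - p_i)\,\Big] \;+\; \lambda \sum_{j=1}^{k} \lvert \beta_j \rvert$$

where p_i is the predicted probability for the i-th sample. The first term is the familiar log loss; the second is the lasso penalty, which grows with the absolute size of the coefficients. Note that the intercept β_0 is typically excluded from the penalty.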
Lasso and ridge regularization use different mathematical formulations to accomplish this goal. These methods work by making changes to the cost function that is minimized during model fitting: lasso adds a penalty proportional to the sum of the absolute values of the coefficients (the L1 norm), while ridge adds a penalty proportional to the sum of their squares (the L2 norm). One practical consequence of this difference is that lasso can shrink some coefficients exactly to zero, effectively performing feature selection, whereas ridge shrinks all coefficients toward zero without eliminating any.
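The sketch below illustrates this difference on a dataset chosen only for demonstration (it is not from the original text). It fits the same model with each penalty and counts how many coefficients are driven exactly to zero; the features are standardized first, since the penalty is sensitive to feature scale:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

for penalty in ('l1', 'l2'):
    # liblinear supports both penalties; C=0.1 is an illustrative,
    # fairly strong regularization setting
    model = make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty=penalty, C=0.1, solver='liblinear')
    )
    model.fit(X, y)
    coefs = model.named_steps['logisticregression'].coef_.ravel()
    n_zero = np.sum(coefs == 0)
    print(f"{penalty}: {n_zero} of {coefs.size} coefficients are exactly zero")
```

With a fairly strong penalty (small C), the L1 run typically zeroes out several coefficients while the L2 run keeps all of them nonzero, which is why lasso is often used for feature selection.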