Learn about regression penalties.

Removing predictors from the model can be seen as setting their coefficients to zero.

Ridge penalty

Instead of forcing coefficients to be exactly zero, we can penalize them when they stray too far from zero. This continuously shrinks them toward zero, decreasing model complexity while keeping all variables in the model. This is what ridge regression does. Ridge is especially good at improving the least squares estimate when multicollinearity is present in the data. Ridge regression adds the sum of the squared $\beta$ values (excluding the intercept $\beta_0$) to the loss function:

$$\text{Loss} = \sum_{i=1}^{n}\left(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\right)^2 + \lambda \sum_{j=1}^{p} \beta_j^2$$
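As a minimal sketch of this idea, the closed-form ridge solution can be computed directly with NumPy. The data here (two nearly identical predictors) is synthetic and chosen only to illustrate multicollinearity; the penalty strength `lam = 10.0` is an arbitrary illustrative value, not a recommendation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data with two strongly correlated (multicollinear) predictors.
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(scale=0.01, size=100)  # x2 is almost a copy of x1
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(scale=0.5, size=100)

def ridge(X, y, lam):
    # Closed-form ridge estimate: (X'X + lam*I)^{-1} X'y
    # (intercept omitted for brevity; lam = 0 recovers least squares)
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

beta_ols = ridge(X, y, 0.0)     # plain least squares
beta_ridge = ridge(X, y, 10.0)  # penalized estimate

# Under multicollinearity the least squares coefficients are unstable
# and can grow large; ridge keeps them small but nonzero.
print("OLS:  ", beta_ols)
print("Ridge:", beta_ridge)
```

Note that the ridge coefficients are shrunk toward zero but never reach it exactly, which is precisely the behavior the paragraph above describes.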
