Effect of Regularization

Learn about the effects of regularization on standardized data.

Let's explore the effect of regularization using a wine quality dataset. An important characteristic of this data, and a key reason for working with it, is the presence of multicollinearity among its features.

Overview

The wine quality dataset is technically a multiclass classification problem, since the target column holds a wine quality class represented by an integer label. However, for learning purposes, we treat it as a linear regression problem. Multicollinearity can lead to a variety of problems, including the following (a quick check for multicollinearity is sketched after the list):

  • The estimated effect of a predictor variable (feature) will depend on which other variables are included in the model.

  • The estimated effects of predictors can vary wildly depending on which observations are included, and small changes in the data can result in very different estimates.

  • With very high multicollinearity, the matrix inversion used to compute the coefficients becomes numerically unstable, so the computed inverse may be inaccurate.

  • We can no longer interpret a coefficient on a variable as the effect on the target of a one-unit increase in that variable, holding the other variables constant. When predictors are strongly correlated, one variable cannot change without the correlated variables changing along with it.
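
To see multicollinearity concretely, we can compute the variance inflation factor (VIF) for each feature. This is a minimal sketch, not part of the lesson's own code: the file name "winequality-red.csv", its semicolon separator, and the statsmodels-based check are all assumptions based on the UCI distribution of this dataset.

```python
import pandas as pd
from statsmodels.api import add_constant
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Assumed file name and separator for the UCI red wine quality data.
df = pd.read_csv("winequality-red.csv", sep=";")

# VIF is computed against a design matrix with an intercept column.
X = add_constant(df.drop(columns="quality"))

# VIF measures how much a coefficient's variance is inflated by
# correlation with the other features; values above ~10 are a common
# warning sign of problematic multicollinearity.
vif = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
    index=X.columns,
)
print(vif.drop("const").sort_values(ascending=False))
```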

Ridge regression is best suited to dealing with multicollinearity. Lasso also handles multicollinearity between variables, but in a more drastic way: it zeroes out the coefficient of the less useful variable. The lasso is particularly useful when there are redundant or unimportant features in the data. Suppose we have 1,000 features in a dataset; lasso can perform feature selection automatically by forcing the coefficients of the least important features to zero.
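
The following is a minimal sketch contrasting the two penalties on standardized features. The alpha values are arbitrary illustrative choices, not tuned settings, and the file name is the same assumption as above.

```python
import pandas as pd
from sklearn.linear_model import Ridge, Lasso
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("winequality-red.csv", sep=";")  # assumed file name
features = df.drop(columns="quality")

# Regularization penalizes coefficient size, so features should be
# on comparable scales before fitting.
X = StandardScaler().fit_transform(features)
y = df["quality"]

# Ridge shrinks correlated coefficients toward each other; lasso tends
# to keep one feature from a correlated group and zero out the rest.
ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

coefs = pd.DataFrame(
    {"ridge": ridge.coef_, "lasso": lasso.coef_},
    index=features.columns,
)
print(coefs)  # some lasso entries come out exactly 0.0
```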

Read data

Let's read the data file into a DataFrame.
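
Here is a minimal sketch of the read step, again assuming the UCI red wine quality file name and its semicolon separator:

```python
import pandas as pd

# The UCI wine quality files use ";" as the separator (an assumption here).
df = pd.read_csv("winequality-red.csv", sep=";")
print(df.shape)
print(df.head())
```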
