An Introductory Guide to Data Science and Machine Learning

Regularization (Lasso, Ridge, and ElasticNet Regression)

Learn about regularization, a technique that helps us deal with overfitting in Machine Learning models.

We'll cover the following...

Regularization

How high variance (overfitting) can be reduced

Ridge Regression

Ridge Regression in Scikit-Learn

Lasso Regression

Lasso Regression in Scikit-Learn

Elastic Net Regression

Implementation in Scikit-Learn

Regularization

Overfitting describes a model that performs well on the training dataset but fails to generalize to unseen or test data. Such a model is also said to suffer from high variance. Overfitting on the training data can be illustrated as:

$J(w) \approx 0$

In other words, the predicted values are so close to the actual training values that the training cost goes to zero: the model has memorized the training data instead of learning a pattern that generalizes.
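To make the $J(w) \approx 0$ picture concrete, here is a minimal sketch in Python with NumPy. The sine-shaped target, the noise level, and the degree-7 polynomial model are assumptions chosen purely for illustration; none of them come from this lesson. Because the model has as many weights as there are training points, it can pass through every training point, so the training cost is practically zero while the cost on unseen data is not.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    """Assumed ground truth, sin(2*pi*x), plus Gaussian noise (illustrative only)."""
    return np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, size=x.shape)

def features(x, degree=7):
    """Polynomial features 1, x, x^2, ..., x^degree."""
    return np.vander(x, degree + 1, increasing=True)

def cost(w, x, y):
    """J(w) = 1/(2m) * sum of squared errors."""
    residuals = features(x) @ w - y
    return float(residuals @ residuals) / (2 * len(y))

# 8 training points and 8 weights: enough capacity to memorize the training set.
x_train = np.linspace(0.0, 1.0, 8)
y_train = target(x_train)

# Unseen data drawn from the same assumed process.
x_test = rng.uniform(0.0, 1.0, size=200)
y_test = target(x_test)

# Ordinary least squares fit with no regularization.
w, *_ = np.linalg.lstsq(features(x_train), y_train, rcond=None)

train_cost = cost(w, x_train, y_train)
test_cost = cost(w, x_test, y_test)
print(f"training cost J(w): {train_cost:.2e}")
print(f"test cost J(w):     {test_cost:.2e}")
```

The training cost should print as a number that is zero for all practical purposes, while the test cost stays well above it. That gap between the two is the signature of high variance.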

How high variance (overfitting) can be reduced

Collect more training data so that the data has more variety in it.
Use Regularization, which is the focus of this part of the lesson.
Employ good Feature Selection techniques.
Use techniques specific to Deep Learning that are designed to reduce high variance.
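As a rough illustration of the first strategy, the sketch below keeps the model fixed and only changes the amount of training data. The target function, noise level, polynomial degree, and sample sizes are all assumed values picked for the demonstration. It reports the generalization gap, meaning the test cost minus the training cost, for a small and a large training set.

```python
import numpy as np

rng = np.random.default_rng(42)
DEGREE = 7        # fixed model capacity (8 weights); an assumed value
NOISE_STD = 0.3   # assumed noise level

def make_targets(x):
    """Noisy samples of an assumed target function, sin(2*pi*x)."""
    return np.sin(2 * np.pi * x) + rng.normal(0.0, NOISE_STD, size=x.shape)

def features(x):
    """Polynomial features 1, x, ..., x^DEGREE."""
    return np.vander(x, DEGREE + 1, increasing=True)

def cost(w, x, y):
    """J(w) = 1/(2m) * sum of squared errors."""
    residuals = features(x) @ w - y
    return float(residuals @ residuals) / (2 * len(y))

def generalization_gap(m_train, x_test, y_test):
    """Fit on m_train points and return test cost minus training cost."""
    x_train = np.linspace(0.0, 1.0, m_train)
    y_train = make_targets(x_train)
    w, *_ = np.linalg.lstsq(features(x_train), y_train, rcond=None)
    return cost(w, x_test, y_test) - cost(w, x_train, y_train)

# One shared test set so that both fits are judged on the same unseen data.
x_test = rng.uniform(0.0, 1.0, size=2000)
y_test = make_targets(x_test)

gap_small = generalization_gap(8, x_test, y_test)
gap_large = generalization_gap(500, x_test, y_test)
print(f"gap with   8 training points: {gap_small:.4f}")
print(f"gap with 500 training points: {gap_large:.4f}")
```

With only 8 points the model can memorize the training set, so the gap is wide. With 500 points the same model can no longer fit the noise in every point, and the training and test costs end up close to each other.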

Now, we will look at how various regularization techniques are used to overcome overfitting.

Ridge Regression

The following steps demonstrate how the cost function is modified in Ridge Regression, sometimes called L2-Regularization.

$J(w) = \frac{1}{2m}\left[\sum_{i=1}^{m}(\hat{y}^i-y^i)^2 + \lambda \sum_{j=1}^{n}w_j^2\right]$ ...
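The Ridge cost function above can be transcribed almost term by term into NumPy. This is only a sketch: the function name `ridge_cost`, the optional unpenalized intercept `b`, and the toy data are assumptions made for illustration and are not part of the lesson.

```python
import numpy as np

def ridge_cost(w, X, y, lam, b=0.0):
    """J(w) = 1/(2m) * [ sum_i (y_hat_i - y_i)^2 + lam * sum_j w_j^2 ].

    `b` is a hypothetical intercept term; it is left out of the penalty.
    """
    m = X.shape[0]
    y_hat = X @ w + b
    squared_error = np.sum((y_hat - y) ** 2)
    l2_penalty = lam * np.sum(w ** 2)
    return float(squared_error + l2_penalty) / (2 * m)

# Toy data (assumed) that the weights [1, 2] fit exactly.
X = np.array([[1.0, 2.0],
              [2.0, 0.0],
              [3.0, 1.0],
              [4.0, 3.0]])
y = np.array([5.0, 2.0, 5.0, 10.0])
w = np.array([1.0, 2.0])

print(ridge_cost(w, X, y, lam=0.0))  # 0.0  (no penalty, perfect fit)
print(ridge_cost(w, X, y, lam=2.0))  # 1.25 (penalty 2 * (1 + 4) = 10, divided by 2m = 8)
```

Even weights that fit the training data perfectly pay a cost once $\lambda > 0$, and that cost grows with the squared size of the weights. This is how the extra term discourages the large weights that tend to accompany overfitting.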
