
Overfitting and Underfitting

Explore the concepts of overfitting and underfitting in supervised learning. Learn to identify when a model is too simple or too complex, how these issues impact training and testing performance, and strategies to balance bias and variance for better generalization on unseen data.

In machine learning, the ultimate goal isn't just to memorize past data; it's to make accurate predictions on new, unseen data. If a model performs perfectly on the data it was trained on but fails when deployed in the real world, it’s useless.

The concepts of overfitting and underfitting describe two major failure modes when attempting to achieve this crucial ability, which we refer to as generalization.

What is overfitting?

Overfitting is a modeling error where the model learns the training data (including its accidental irregularities or noise) too well, failing to capture the broad, underlying pattern. This results in a model that performs exceptionally well on the training data but poorly on any new data.

We can relate this to rote learning in a student: if a student only memorizes the solution to a specific practice problem, they might ace that problem, but if the problem is slightly changed (new data), they fail because they missed the underlying mathematical principle (the pattern).

  • Cause: We choose a model that is too flexible (has too many parameters) relative to the size and complexity of the training data. For example, a 10-degree polynomial is far more flexible than a 2-degree polynomial.

  • Result: High performance on training data, but low performance (high error) on testing data.

The concept of model flexibility and its trade-off is often best seen through the lens of Polynomial Regression, where we use higher powers of the input feature ($x, x^2, x^3, \dots$) to fit a curve.
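To see this flexibility concretely, here is a minimal numpy-only sketch (using numpy.polynomial.Polynomial, with data drawn from the $x \sin(x)$ target used later in this lesson): training error can only go down as the degree grows, which is exactly what makes high-degree fits prone to chasing noise.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 20)
y = x * np.sin(x) + rng.normal(size=x.size)  # noisy samples of the target

# A more flexible (higher-degree) polynomial always drives training error lower
mse = {}
for degree in (1, 2, 10):
    p = Polynomial.fit(x, y, degree)        # least-squares polynomial fit
    mse[degree] = np.mean((y - p(x)) ** 2)  # training-set MSE
    print(f"degree {degree:2d}: training MSE {mse[degree]:.2f}")
```

Low training error alone says nothing about generalization; that is the point the rest of this lesson develops.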

The illustration above shows the three stages as the complexity of the polynomial function increases:

| Stage | Complexity (Polynomial Degree) | Description | Loss on Data |
| --- | --- | --- | --- |
| Underfitting | Low (e.g., degree 1) | The line is too simple and misses the fundamental trend of the data points. | High training and testing loss |
| Good fit (Generalization) | Medium (e.g., degree 7) | The curve matches the main trend without following every minor point. | Low training and testing loss |
| Overfitting | High (e.g., degree 10) | The curve becomes erratic, bending sharply to pass exactly through every training point, including the noise. | Very low training loss, high testing loss |

Try it yourself

We will now implement a small model using Linear Regression and Polynomial Features to clearly visualize the effect of model complexity (polynomial degree) on training and testing error.

Our goal is to approximate the function $f(x) = x \sin(x)$ with a polynomial curve.

Import packages and define function

First, we import the necessary libraries and define the target function $f(x) = x \sin(x)$.

Python 3.5
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression as LR
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
# Keeping the seed value constant for reproducible results
np.random.seed(1)
# Function we are trying to approximate: f(x) = x * sin(x)
def f(x):
    return x * np.sin(x)

Generating and splitting data

We generate 100 synthetic data points and split them into two distinct sets: a training set (for the model to learn from) and a testing set (for evaluating the model’s performance on unseen data).

Python
n_total = 100
x_total = np.linspace(0, 10, n_total)
y = f(x_total) + np.random.randn(n_total)
X = x_total[:,np.newaxis]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.8, random_state=42)
idx = np.argsort(X_train[:,0])
X_train[:,0], y_train = X_train[idx,0], y_train[idx]
n_train, n_test = X_train.shape[0], X_test.shape[0]

The code above works as follows:

  • Line 2: np.linspace(0, 10, n_total) creates 100 evenly spaced $x$-coordinates between 0 and 10.

  • Line 3: The true output $f(x)$ is mixed with np.random.randn(n_total) (noise from a standard normal distribution) to create realistic, noisy data ($y$).

  • Line 4: np.newaxis reshapes the 1-D array x_total (shape: (100,)) into a 2-D column vector (shape: (100, 1)), the format most ML models in scikit-learn require.

  • Line 5: train_test_split divides the data into a training set (for learning) and a testing set (for unbiased evaluation). With test_size=0.8, only $20\%$ of the points (20 points) are used for training, which makes overfitting easier to see later.
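As a quick sanity check on the split described above (the targets here are placeholders; only the shapes matter):

```python
import numpy as np
from sklearn.model_selection import train_test_split

n_total = 100
X = np.linspace(0, 10, n_total)[:, np.newaxis]  # 2-D column vector, shape (100, 1)
y = np.random.randn(n_total)                    # placeholder targets for the shape check

# test_size=0.8 keeps only 20% of the 100 points for training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.8, random_state=42)
print(X_train.shape, X_test.shape)  # (20, 1) (80, 1)
```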

Fitting the polynomial model

We now construct the model. We use make_pipeline to combine two steps into one: first, creating the polynomial features, and second, running the linear regression.

Python
model = make_pipeline(PolynomialFeatures(degree), LR())  # degree is chosen by the user in the full program
model.fit(X_train, y_train)

make_pipeline is a utility that chains multiple steps into a single estimator. Here, it ensures that when data is fed to the model, it is first transformed into polynomial features (e.g., $x$ becomes $1, x, x^2, x^3, \dots$) before being passed to the LinearRegression model for learning.
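For instance, PolynomialFeatures with degree=3 expands each single-feature sample into the columns $1, x, x^2, x^3$:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0], [3.0]])         # two samples of a single feature x
poly = PolynomialFeatures(degree=3)  # expands x into 1, x, x^2, x^3
expanded = poly.fit_transform(X)
print(expanded)                      # rows: [1, 2, 4, 8] and [1, 3, 9, 27]
```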

Computing loss and plotting

Finally, we calculate the MSE for both the training set and the crucial testing set and plot the results to visualize the model’s fit.

Python
# Computing MSE for training and testing data
train_loss = np.sum((model.predict(X_train)-y_train)**2)/n_train
test_loss = np.sum((model.predict(X_test)-y_test)**2)/n_test
# Plotting
fig, ax = plt.subplots()
ax.plot(x_total, f(x_total), linewidth=1, label="ground truth", color = "green")
ax.scatter(X_train[:,0], y_train,label="training points", color = '#65b2ff', edgecolors='black')
ax.plot(X_train[:,0], model.predict(X_train), label=f"degree {degree}",linestyle='--', color = "darkorange")
plt.grid(color = 'b', linestyle = '--', linewidth = 0.3)
plt.xlabel(r'$x$',fontsize=20)
plt.ylabel(r'$\hat y$',fontsize=20)
plt.title(f'MSE(Train) = {train_loss:.2f} :: MSE(Test) = {test_loss:.2f}', fontsize=20)
ax.legend(loc="lower center")
ax.set_ylim(-20, 10)
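The manual sum-of-squares formula above is exactly what scikit-learn's mean_squared_error computes; a quick check on toy values:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.5, 2.0, 2.0])

# Same formula as in the snippet: sum of squared errors divided by n
manual = np.sum((y_pred - y_true) ** 2) / y_true.size
library = mean_squared_error(y_true, y_pred)
print(manual, library)  # both equal (0.25 + 0.0 + 1.0) / 3
```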

Putting it all together

Enter the degree of the polynomial you want to fit to the data:

  • Try a small degree: You will see the curve is very stiff and misses the sinusoidal shape. Both MSE (Train) and MSE (Test) will be high. This is underfitting.

  • Try a medium degree: The curve will follow the overall sinusoidal pattern well. Both MSE (Train) and MSE (Test) will be low and close to each other. This is the sweet spot for a good fit (generalization).

  • Try a high degree (e.g., 11 or 15): The curve will wiggle drastically to hit every single training point. MSE (Train) will be very low, but MSE (Test) will be very high because the wiggles poorly represent the true underlying function. This is overfitting.
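Instead of typing degrees one at a time, the same experiment can be scripted as a sweep. The sketch below reuses the lesson's data-generation code; degrees 1, 7, and 15 are illustrative picks for the three regimes:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

np.random.seed(1)

def f(x):
    return x * np.sin(x)

X = np.linspace(0, 10, 100)[:, np.newaxis]
y = f(X[:, 0]) + np.random.randn(100)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.8, random_state=42)

results = {}
for degree in (1, 7, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    tr = np.mean((model.predict(X_tr) - y_tr) ** 2)  # training MSE
    te = np.mean((model.predict(X_te) - y_te) ** 2)  # testing MSE
    results[degree] = (tr, te)
    print(f"degree {degree:2d}: train MSE {tr:.2f}, test MSE {te:.2f}")
```

Degree 1 should show high error on both sets (underfitting), degree 7 low error on both (good fit), and degree 15 a very low training error with a much larger testing error (overfitting).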

Python 3.10.4
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression as LR
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
# Keeping the seed value constant to observe effect of model complexity
np.random.seed(1)
# Target function for the synthetic data: f(x) = x * sin(x)
def f(x):
    return x * np.sin(x)
# Reading the degree value; try increasing or decreasing it to see the effect
degree = int(input())
# Generating 100 synthetic data points, which have sinusoidal distribution, and making its splits
n_total = 100
x_total = np.linspace(0, 10, n_total)
y = f(x_total) + np.random.randn(n_total)
X = x_total[:,np.newaxis]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.8, random_state=42)
idx = np.argsort(X_train[:,0])
X_train[:,0], y_train = X_train[idx,0], y_train[idx]
n_train, n_test = X_train.shape[0], X_test.shape[0]
# Constructing an automated workflow for training the model
model = make_pipeline(PolynomialFeatures(degree), LR())
model.fit(X_train, y_train)
# Computing MSE for training and testing data
train_loss = np.sum((model.predict(X_train)-y_train)**2)/n_train
test_loss = np.sum((model.predict(X_test)-y_test)**2)/n_test
# Plotting
fig, ax = plt.subplots()
ax.plot(x_total, f(x_total), linewidth=1, label="ground truth", color = "green")
ax.scatter(X_train[:,0], y_train,label="training points", color = '#65b2ff', edgecolors='black')
ax.plot(X_train[:,0], model.predict(X_train), label=f"degree {degree}",linestyle='--', color = "darkorange")
plt.grid(color = 'b', linestyle = '--', linewidth = 0.3)
plt.xlabel(r'$x$',fontsize=20)
plt.ylabel(r'$\hat y$',fontsize=20)
plt.title(f'MSE(Train) = {train_loss:.2f} :: MSE(Test) = {test_loss:.2f}', fontsize=20)
ax.legend(loc="lower center")
ax.set_ylim(-20, 10)
plt.show()

Generalization

To avoid overfitting, a model must achieve good generalization: the ability to adapt to unseen data. We can think of generalization as the model's performance after deployment, when new data comes in.

Consider a face recognition-based access control system. During registration, a camera captures face images of an authorized person from different viewpoints; these make up the training data. After deployment, the same person's face is captured again, likely producing an image that isn't identical to any in the training data. The system should still recognize the new face image despite small, novel variations in viewpoint or lighting.

The following illustration highlights overfitting and generalization in a single frame:

In the illustration above, the blue dots represent the actual data points the model learns from.

The green curve shows good generalization: it captures the smooth underlying pattern without chasing noise, so it performs well on new data. The red curve shows overfitting: a model that bends too sharply to match every fluctuation in the training set.

Although it fits the training data perfectly, it memorizes noise and performs poorly on unseen data.

What is underfitting?

The other extreme is underfitting, where the model fails to perform well even on the training data. While overfitting makes the model fit too closely, underfitting is the model's inability to grasp the relationship between input and output; both the training and validation/testing errors are large.

In the image, the blue dots represent the actual training data, following a non-linear, U-shaped pattern. The red line shows an underfitted model that is too simple to capture this pattern, resulting in high bias (meaning the model is too simple to capture the underlying patterns in the data) and poor performance on both training and test data. The green curve shows a model with the right level of complexity: it captures the underlying U-shaped trend, learning the true relationship and performing well on both seen and unseen data.

Underfitting vs. overfitting (Bias-variance tradeoff)

In machine learning, a model commonly fails in one of two ways: underfitting or overfitting. An ideal model is neither; a sweet spot exists where both training and testing errors are low.

When a model is underfitted, it’s said to have high bias. When the model is overfitted, it has high variance.

  • Bias measures the error introduced by approximating a real-world problem (which may be complex) with a simpler model (e.g., trying to fit a curve with a straight line). High bias leads to underfitting.

  • Variance measures how much a model’s predictions change with small differences in the training data. High variance indicates overfitting: the model fits the training data too closely but performs poorly on new, unseen data.

The goal is to achieve the lowest combined error by finding a balance between bias and variance. To clearly differentiate these two failure modes and understand the sweet spot, let’s look at a side-by-side comparison:

| Feature | Underfitting (High Bias) | Overfitting (High Variance) | Ideal Fit (Sweet Spot) |
| --- | --- | --- | --- |
| Model complexity | Too simple (e.g., linear) | Too complex (e.g., high-degree polynomial) | Just right |
| Training error | High | Very low | Low |
| Testing error | High | Very high | Lowest |
| Learning failure | Misses the fundamental pattern (underlying trend) | Learns the noise (accidental irregularities) | Learns the true pattern |
| Visual example | Straight line missing a curve | Wobbling curve hitting every point | Smooth curve fitting the trend |
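The variance half of the trade-off can be made concrete: refit the same model on freshly resampled training sets and watch how much its prediction at one fixed point moves. A sketch, assuming the lesson's $x \sin(x)$ target and numpy-only polynomial fits:

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 20)
x0 = 5.0  # fixed query point at which we track the model's prediction

def predictions_at(degree, n_repeats=200):
    """Refit on freshly resampled noisy training sets; return predictions at x0."""
    preds = []
    for _ in range(n_repeats):
        y = x * np.sin(x) + rng.normal(size=x.size)  # new noise, same true function
        preds.append(Polynomial.fit(x, y, degree)(x0))
    return np.array(preds)

for degree in (1, 10):
    var = predictions_at(degree).var()
    print(f"degree {degree:2d}: prediction variance {var:.3f}")
```

The simple (high-bias) model barely reacts to the resampled noise, while the flexible (high-variance) model's prediction swings far more from one training set to the next.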

How to address overfitting and underfitting

Addressing overfitting and underfitting issues requires different strategies: improving model complexity, regularization, or adjusting the training data can help achieve better generalization. Selecting the right approach ensures the model balances bias and variance for optimal performance on unseen data.

| Issue | Techniques to Address |
| --- | --- |
| Underfitting | Increase model complexity (more features, deeper trees, larger networks); reduce regularization; feature engineering or adding relevant data |
| Overfitting | Apply regularization (L1/L2, dropout); reduce model complexity; use more training data or data augmentation; early stopping; cross-validation |

Conclusion

The key lesson from overfitting and underfitting is that the goal of machine learning is not just to minimize training error, but to achieve strong generalization, the ability to perform well on unseen data.

Underfitting occurs when a model is too simple to capture the underlying pattern, leading to high errors on both training and testing data. Overfitting happens when a model is too complex, memorizing noise in the training set and performing poorly on new data.

The ideal model strikes a balance between bias and variance, capturing the true relationship while ignoring noise. Techniques like regularization help manage model complexity and prevent overfitting when training data is limited.