What is overfitting?

If the goal were only to fit the training data, we could choose a model with many parameters (a large parameter space) and fit the training data almost exactly. A model with many parameters is more flexible and can adapt to complex patterns; for example, a degree-10 polynomial is more flexible than a degree-2 polynomial. But when the training data contains accidental irregularities, often termed noise, a highly flexible model chases the noise to fit the data exactly and often misses the underlying pattern.

Overfitting is a modeling error in which the model aligns too closely with the training data and fails to generalize to unseen data. Such a model performs exceptionally well on the training data but poorly on other data. A real-life analogy is rote learning: a student who memorizes the solution to one specific problem will certainly do well on that exact problem, but if the problem is changed even slightly, the student will struggle.
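
We can make the polynomial example concrete with a small experiment. Below is a minimal sketch (assuming NumPy is available; the data, noise level, and sample sizes are illustrative choices) that fits degree-2 and degree-10 polynomials to noisy samples of a quadratic pattern and compares training and test error:

```python
import numpy as np

rng = np.random.default_rng(0)

# Underlying pattern: y = x^2, plus random noise (illustrative values).
x_train = np.linspace(-1, 1, 15)
y_train = x_train**2 + rng.normal(scale=0.1, size=x_train.size)

x_test = np.linspace(-1, 1, 100)
y_test = x_test**2 + rng.normal(scale=0.1, size=x_test.size)

for degree in (2, 10):
    # Fit a polynomial of the given degree to the training data.
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")
```

Typically, the degree-10 fit achieves a lower training error but a higher test error than the degree-2 fit: the flexible model has chased the noise, which is exactly the overfitting behavior described above.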

In machine learning, we try to limit overfitting as much as we can.

Note: We want our model to perform well on the training data, but we also want it to avoid overfitting.