Overfitting and Underfitting

Explore the concepts of overfitting and underfitting in supervised learning. Understand how model complexity impacts training and testing errors, and learn to balance bias and variance for better generalization. This lesson demonstrates these effects with a practical polynomial regression implementation and introduces strategies to prevent overfitting.

In machine learning, the ultimate goal isn't just to memorize past data; it's to make accurate predictions on new, unseen data. If a model performs perfectly on the data it was trained on but fails when deployed in the real world, it’s useless.

The concepts of overfitting and underfitting describe two major failure modes when attempting to achieve this crucial ability, which we refer to as generalization.
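As a concrete illustration, here is a minimal sketch of how generalization is usually measured: fit a model on one split of the data, then compare its error on that training split against its error on a held-out test split it has never seen. This assumes scikit-learn and a small synthetic dataset; it is not code from this lesson.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Synthetic data: a smooth underlying pattern (sine) plus random noise.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=200)

# Hold out 30% of the data to estimate performance on "new, unseen" examples.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = LinearRegression().fit(X_train, y_train)

train_error = mean_squared_error(y_train, model.predict(X_train))
test_error = mean_squared_error(y_test, model.predict(X_test))
print(f"Training MSE: {train_error:.3f} | Test MSE: {test_error:.3f}")
```

A large gap between the two numbers is the practical signal that the model is not generalizing.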

What is overfitting?

Overfitting is a modeling error where the model learns the training data (including its accidental irregularities or noise) too well, failing to capture the broad, underlying pattern. This results in a model that performs exceptionally well on the training data but poorly on any new data.

We can relate this to rote learning by a student: if a student only memorizes the solution to a specific practice problem, they might ace that exact problem, but if the problem is slightly changed (new data), they fail because they missed the underlying mathematical principle (the pattern).

  • Cause: We choose a model that is too flexible (has too many parameters) relative to the size and complexity of the training data. For example, a 10th-degree polynomial is far more flexible than a 2nd-degree polynomial.

  • Result: High performance on the training data, but low performance (high error) on the testing data.

The concept of model flexibility and its trade-off is often best seen through the lens of Polynomial Regression, where we use higher powers of the input feature ($x, x^2, x^3, \dots$) to increase the model's flexibility.
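Below is a hedged sketch of that idea, assuming scikit-learn and synthetic noisy data rather than the lesson's exact dataset: the same data is fit with polynomial models of increasing degree, and the high-degree model tends to chase the noise, producing very low training error but noticeably higher test error.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Small noisy dataset: few points make a flexible model easier to overfit.
rng = np.random.default_rng(42)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=60)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Compare an inflexible, a reasonable, and a very flexible model.
for degree in (1, 2, 10):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    train_mse = mean_squared_error(y_train, model.predict(X_train))
    test_mse = mean_squared_error(y_test, model.predict(X_test))
    print(f"degree={degree:2d}  train MSE={train_mse:.3f}  test MSE={test_mse:.3f}")
```

The degree-1 model typically shows high error on both splits (underfitting), while the degree-10 model shows a much larger training-versus-testing gap (overfitting); a moderate degree usually balances the two.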