Overfitting Explained

Learn what overfitting is, the major reasons behind overfitting, and how it differs from underfitting.

About this chapter

Overfitting has been creating problems throughout the course. In this chapter, we’ll finally solve the problem of overfitting in our network.

To refresh our memory, a system that overfits is like a student who learns by rote. They might be good at solving familiar problems from textbooks, but they will struggle when confronted with new problems. Likewise, an overfitting system can be good at classifying its training data, and then fail when classifying data it has not seen before.

One strategy to work around overfitting is to split our data into training, validation, and test sets. We should use the training set to train the system, the validation set to tune its performance, and the test set for a final check-up. That way, we can test the system on previously unseen data and get a reliable, overfitting-free measure.
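As a rough sketch of that split, the snippet below shuffles a dataset and cuts it into three parts with NumPy. The function name split_dataset, the 60/20/20 proportions, and the placeholder arrays are assumptions made for illustration, not something this course prescribes.

```python
import numpy as np

def split_dataset(X, Y, train_fraction=0.6, validation_fraction=0.2, seed=0):
    """Shuffle the examples, then cut them into training, validation, and test sets."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X))
    train_end = round(len(X) * train_fraction)
    validation_end = train_end + round(len(X) * validation_fraction)
    # np.split cuts the shuffled indices at the two boundaries
    train, validation, test = np.split(indices, [train_end, validation_end])
    return (X[train], Y[train]), (X[validation], Y[validation]), (X[test], Y[test])

# Hypothetical data: 100 examples with 3 input variables each
X = np.arange(300).reshape(100, 3)
Y = np.arange(100)

(X_train, Y_train), (X_validation, Y_validation), (X_test, Y_test) = split_dataset(X, Y)
print(len(X_train), len(X_validation), len(X_test))  # 60 20 20
```

Shuffling before splitting keeps each set representative of the whole dataset. The same indices select rows from both X and Y, so every example stays paired with its label.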

That testing strategy works, but it’s a short-term fix. It does not eliminate overfitting. It just prevents overfitting from polluting our metrics. Unfortunately, overfitting has worse consequences than imprecise metrics. Like the student mentioned earlier, an overfitting system is good at memorizing but bad at generalizing. We ran into this problem when we built a deep network: it reached perfect accuracy on the training set, but it did worse than its shallow counterpart on the validation set.
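The gap between memorizing and generalizing can be seen in a small experiment. The sketch below is not our neural network. It is a stand-in that fits polynomials to a handful of made-up noisy points, on the assumption that curve fitting shows the same effect in miniature. The data, the noise level, and the two polynomial degrees are all hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(42)

def noisy_line(x):
    # Hypothetical ground truth: a straight line plus random noise
    return 2 * x + rng.normal(0, 0.3, len(x))

# Eight training points and eight separate validation points
x_train = np.linspace(-1, 1, 8)
y_train = noisy_line(x_train)
x_validation = rng.uniform(-1, 1, 8)
y_validation = noisy_line(x_validation)

def mean_squared_error(coefficients, x, y):
    return float(np.mean((np.polyval(coefficients, x) - y) ** 2))

errors = {}
for degree in (1, 7):
    # Least-squares polynomial fit on the training points only
    coefficients = np.polyfit(x_train, y_train, degree)
    errors[degree] = (
        mean_squared_error(coefficients, x_train, y_train),
        mean_squared_error(coefficients, x_validation, y_validation),
    )
    print(f"degree {degree}: training error {errors[degree][0]:.4f}, "
          f"validation error {errors[degree][1]:.4f}")
```

A degree-7 polynomial has enough freedom to pass through all eight training points, so its training error drops to essentially zero. That is memorization: the curve has bent itself around the noise, and it will typically do worse than the straight line on the validation points.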

We’ll now investigate the causes of overfitting and its subtler consequences. Later on, we’ll apply a few regularization techniques to reduce overfitting. With those techniques, we’ll finally deliver on the power of our deep neural network.

Before we talk about reducing overfitting, let’s get to know the concept in detail. What’s happening under the hood of an overfitting neural network? To fully grok overfitting, we should also understand its opposite: underfitting. In this lesson, we’ll take a look at underfitting as well. Let’s start by investigating the causes of overfitting.

The causes of overfitting

So far, we’ve explained overfitting with vague metaphors like “a student memorizing the textbooks.” It’s time to look more closely at how supervised learning works and understand how overfitting really happens.

Imagine that we want to predict the number of customers at a hot dog stand, starting from a bunch of historical samples. As usual, we split the samples into training, validation, and test sets. For example, the following graph shows a slice of training data that spans the first three weeks of January. The horizontal axis is the day, and the vertical axis is the average number of customers per hour:
