Occam’s Razor
Explore the principle of Occam's Razor in supervised learning to understand why simpler models are preferred when performance is similar. Learn how this guides the bias-variance trade-off to reduce overfitting and underfitting, helping you choose models that generalize better on unseen data.
In the previous lesson, we saw that overly complex models tend to overfit (high variance), whereas overly simple models tend to underfit (high bias). This trade-off raises a fundamental question: if two models perform equally well on unseen data, which one should we choose? The answer lies in the philosophical principle known as Occam’s Razor, which guides us toward selecting the most straightforward explanation possible.
What is Occam’s Razor?
Occam’s Razor is a problem-solving principle credited to the 14th-century English philosopher William of Ockham.
It states that of two or more competing theories, the simpler one is to be preferred.
In the context of machine learning, this translates directly: a simple model should be preferred over a complex one, provided both achieve similar predictive performance (generalization). Choosing the simpler model offers practical benefits: it is easier to interpret, faster to train, and less likely to overfit future data.
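To make this concrete, here is a minimal sketch (not taken from this lesson) using scikit-learn: a linear model and a high-degree polynomial model are evaluated with cross-validation on roughly linear data, and when their scores come out similar, Occam’s Razor tells us to keep the linear one. The dataset, polynomial degrees, and tolerance below are illustrative assumptions.

```python
# Minimal sketch: prefer the simpler model when generalization is similar.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(-3, 3, size=(100, 1)), axis=0)
y = 0.5 * X.ravel() + rng.normal(scale=0.3, size=100)  # roughly linear data plus noise

simple = make_pipeline(PolynomialFeatures(degree=1), LinearRegression())    # few parameters
complex_ = make_pipeline(PolynomialFeatures(degree=10), LinearRegression()) # many parameters

simple_score = cross_val_score(simple, X, y, cv=5, scoring="r2").mean()
complex_score = cross_val_score(complex_, X, y, cv=5, scoring="r2").mean()

print(f"Simple  (degree 1)  mean CV R^2: {simple_score:.3f}")
print(f"Complex (degree 10) mean CV R^2: {complex_score:.3f}")

# Occam's Razor: if the scores are close (0.02 is an illustrative tolerance),
# choose the simpler model.
if abs(simple_score - complex_score) < 0.02:
    print("Similar generalization -> prefer the simpler (degree-1) model.")
```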
What is a simple model?
A machine learning model is a function that maps input features to predictions, and its simplicity is usually judged by how much flexibility that function has, for example, by the number of parameters it must learn. A model with fewer parameters (such as a straight line) is simpler than one with many parameters (such as a high-degree polynomial).
For example, let’s take a task for which we have two models, $M_1$ and $M_2$. This is shown in the figure below:
The machine learning task ...
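The figure itself isn’t reproduced here, but the comparison it makes can be sketched in code. As a minimal illustration, assume $M_1$ is a degree-1 (linear) model and $M_2$ is a degree-15 polynomial model fit to the same data; counting the parameters each must learn gives a rough, concrete measure of their relative simplicity. The specific degrees and dataset are assumptions for illustration, not the figure’s exact contents.

```python
# Hedged sketch: compare two hypothetical models for the same regression task
# by how many coefficients each one learns. Fewer parameters, with similar
# accuracy, is what Occam's Razor favors.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(100, 1))
y = 0.5 * X.ravel() + rng.normal(scale=0.3, size=100)

def n_parameters(degree: int) -> int:
    """Count the coefficients (plus intercept) a polynomial model of the
    given degree learns after fitting."""
    model = make_pipeline(
        PolynomialFeatures(degree=degree, include_bias=False),
        LinearRegression(),
    )
    model.fit(X, y)
    linreg = model.named_steps["linearregression"]
    return linreg.coef_.size + 1  # +1 for the intercept

print("M1 (degree 1)  parameters:", n_parameters(1))   # simple: 2 parameters
print("M2 (degree 15) parameters:", n_parameters(15))  # complex: 16 parameters
```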