Embrace Pessimism

Get an intuitive introduction to the gradient descent algorithm.

Limitations of the mathematical model

We can see that the brute-force approach isn’t practical at all. In fact, it gets worse very quickly as we add network layers, nodes, or possibilities for weight values.
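To get a feel for how quickly this grows, here is a minimal sketch that counts the weight combinations a brute-force search would have to try. It assumes a fully connected network with no bias terms and a fixed number of candidate values per weight. The function names and the example layer sizes are ours, chosen purely for illustration.

```python
def count_weights(layer_sizes):
    """Number of weights in a fully connected network (no biases assumed)."""
    # Every node in one layer links to every node in the next layer.
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


def count_combinations(layer_sizes, values_per_weight):
    """Number of distinct weight settings a brute-force search must test."""
    # Each weight independently takes one of `values_per_weight` values.
    return values_per_weight ** count_weights(layer_sizes)


# Hypothetical example: three layers of three nodes, ten candidate values per weight.
small = count_combinations([3, 3, 3], 10)

# Adding just one more layer of three nodes multiplies the work enormously.
bigger = count_combinations([3, 3, 3, 3], 10)

print(count_weights([3, 3, 3]), small)
print(count_weights([3, 3, 3, 3]), bigger)
```

Even with only ten candidate values per weight, one extra small layer multiplies the number of combinations by a factor of a billion in this example.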

This puzzle resisted mathematicians for years, and was only really solved in a practical way in the 1960s–70s. There are different opinions about who did it first or made the key breakthrough, but the important point is that this discovery led to the explosion of modern neural networks, which can carry out some very impressive tasks.

So, how do we solve such a difficult problem? Believe it or not, we’ve already got the tools to do it ourselves. We covered all of them earlier. So, let’s get on with it.

The first thing we must do is embrace pessimism.

The mathematical expressions showing how all the weights result in a neural network’s output are too complex to easily untangle. There are too many weight combinations to test them one by one and find the best.

There are even more reasons to be pessimistic. There might not be enough training data to properly teach a network. The training data might have errors, so our assumption that it is true and is something to learn from is flawed. The network itself might not have enough layers or nodes to model the correct solution to the problem.

What this means is that we must take an approach that is realistic and recognizes these limitations. If we do that, we might find an approach that isn’t mathematically perfect, but actually gives us better results because it doesn’t make false, idealistic assumptions.

Going downhill for an optimal solution

Let’s illustrate what we mean by this. Let’s imagine a very complex landscape with peaks and valleys, and hills with treacherous bumps and gaps. It’s dark, and we can’t see anything. We know we’re on the side of a hill, and we need to get to the bottom. We don’t have an accurate map of the landscape. We do have a flashlight. What do we do? We’ll probably use the flashlight to look at the area close to our feet. We can’t use it to see much further anyway, and certainly not the entire landscape. We can see which bit of earth seems to be going downhill and take small steps in that direction. In this way, we can slowly work our way down the hill, step by step, without having a full map and without having worked out our journey beforehand.
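The walk described above can be sketched in a few lines of code. This is only an illustration under our own assumptions: the landscape is a simple one-dimensional bowl-shaped function called `height`, the "flashlight" is a numerical estimate of the slope just around our feet, and the step size and number of steps are arbitrary choices. None of these names or values come from the lesson itself.

```python
def height(x):
    """A made-up landscape: a single valley whose lowest point is at x = 3."""
    return (x - 3.0) ** 2 + 1.0


def local_slope(f, x, reach=1e-4):
    """Estimate the slope near x by looking a tiny distance to either side."""
    return (f(x + reach) - f(x - reach)) / (2 * reach)


def walk_downhill(f, start, step_size=0.1, steps=100):
    """Repeatedly take a small step in whichever direction goes downhill."""
    x = start
    for _ in range(steps):
        # A positive slope means the ground rises to the right, so step left,
        # and vice versa. Steeper ground gives a larger step.
        x -= step_size * local_slope(f, x)
    return x


# Start somewhere on the hillside and walk down without a map.
bottom = walk_downhill(height, start=-2.0)
print(bottom, height(bottom))
```

Notice that the code never looks at the whole landscape. At each step it only checks the ground nearby and moves a little way downhill, which is all the flashlight allows.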
