Gradient Descent Algorithm
Explore the gradient descent algorithm, a key method for minimizing error functions in neural networks by iteratively adjusting link weights. Understand step-size control, the challenge of local minima, and how this optimization process improves model accuracy.
Find the minimum on the curve
Imagine that a complex landscape is a mathematical function. The gradient descent method gives us the ability to find the minimum without having to understand that complex function well enough to work it out mathematically. If a function is so difficult that we can't easily find the minimum using algebra, we can use this method instead. It might not give us the exact answer, because we approach the answer in steps, improving our position bit by bit. Still, that is better than having no answer at all. We can keep refining the answer with ever smaller steps toward the actual minimum, until we're happy with the accuracy we've achieved.
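To make this concrete, here is a minimal sketch of the idea in Python. Everything in it (the example function, the starting point, and the step-size schedule) is an illustrative assumption, not something from the text: we estimate the slope numerically, step downhill, and shrink the steps as we refine.

```python
# A minimal gradient descent sketch (illustrative assumptions throughout):
# we never solve for the minimum algebraically; we just feel out the slope
# numerically and step downhill, shrinking the steps as we go.

def f(x):
    # Stand-in for a "complex landscape" we can evaluate but not solve.
    return (x - 3.0) ** 2 + 2.0

def slope(f, x, h=1e-6):
    # Numerical estimate of the gradient: no algebra required.
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.0      # arbitrary starting position on the landscape
step = 0.1   # initial step size
for _ in range(100):
    x -= step * slope(f, x)   # move downhill, against the slope
    step *= 0.98              # ever smaller steps as we refine

print(f"approximate minimum near x = {x:.4f}, f(x) = {f(x):.4f}")
```

Because the slope is estimated numerically, the loop never needs to work out the function's algebra at all, which is exactly the appeal of the method.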
What's the link between this really cool gradient descent method and neural networks? If the complex, difficult function is the error of the network, then going downhill to find the minimum means we're minimizing the error, and therefore improving the network's output.
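As a sketch of that link, here is a toy model (an assumed example, not the book's actual network): the squared error of a one-weight "network" plays the role of the complex function, and each step nudges the weight downhill so the error shrinks.

```python
# Toy sketch of the link (assumed example, not the book's network):
# the "complex function" is the squared error of a one-weight model,
# and gradient descent nudges the weight to shrink that error.

x, target = 2.0, 10.0   # one training input and its desired output
w = 0.5                 # the single link weight we want to improve
lr = 0.05               # learning rate: how big each downhill step is

for _ in range(50):
    output = w * x                     # the network's (tiny) forward pass
    error = (target - output) ** 2     # the error landscape we descend
    grad = -2 * (target - output) * x  # dE/dw, worked out by calculus
    w -= lr * grad                     # step downhill: the error goes down

print(f"w = {w:.4f}, output = {w * x:.4f} (target {target})")
```

Each pass computes the output, measures the squared error, and steps the weight against the error's slope; after a few dozen steps the output sits close to the target.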
Let’s look at this gradient descent idea with a super simple example so we can understand it properly.
The following graph shows a simple function. If this were a function where ...