Root Mean Square Propagation (RMSProp)
Explore how RMSProp improves gradient descent by adapting learning rates with an exponential moving average of squared gradients. This lesson helps you understand its advantages over AdaGrad and how RMSProp accelerates convergence in non-convex optimization problems.
Root Mean Square Propagation (RMSProp) is an adaptive learning rate optimization algorithm designed to address the shortcomings of the gradient descent algorithm.
The limitation of AdaGrad is that its adaptive learning rate decreases monotonically with time and can become so small that training effectively stalls before convergence. RMSProp, on the other hand, adapts the learning rate without this monotonic decrease.
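To make this difference concrete, here is a minimal sketch contrasting the two accumulators on a toy gradient sequence (the variable names, the decay value 0.9, and the gradient values are illustrative assumptions, not taken from any particular library):

```python
# AdaGrad accumulates every squared gradient, so its effective learning
# rate eta / sqrt(accumulator) can only shrink over time.
adagrad_sum = 0.0

# RMSProp keeps an exponential moving average instead, so old gradients
# fade out and the effective learning rate can recover.
rmsprop_avg = 0.0
beta = 0.9  # decay rate (a commonly used value)

for g in [1.0, 0.5, 0.1, 0.1, 0.1]:          # toy gradient sequence
    adagrad_sum += g ** 2                                   # grows monotonically
    rmsprop_avg = beta * rmsprop_avg + (1 - beta) * g ** 2  # tracks recent gradients
    print(f"AdaGrad accumulator: {adagrad_sum:.3f}, RMSProp average: {rmsprop_avg:.3f}")
```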
How does RMSProp work?
The key idea behind RMSProp is to keep track of only the recent squared gradients rather than all of them, as in AdaGrad. This is achieved with an exponentially weighted moving average of the squared gradients. By using an exponential moving average, RMSProp avoids the issue of continually shrinking learning rates.
The update rule of RMSProp at a time step $t$ is:

$$v_t = \beta\, v_{t-1} + (1 - \beta)\, g_t^2$$

$$\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{v_t} + \epsilon}\, g_t$$

where $g_t$ is the gradient of the loss with respect to the parameters $\theta_t$, $v_t$ is the exponentially weighted moving average of the squared gradients, $\beta$ is the decay rate (typically around 0.9), $\eta$ is the learning rate, and $\epsilon$ is a small constant added for numerical stability.
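The following is a minimal NumPy sketch of this update applied to a toy quadratic objective (the function name `rmsprop_update`, the objective, and all hyperparameter values are illustrative assumptions, not a reference implementation):

```python
import numpy as np

def rmsprop_update(theta, grad, v, lr=0.01, beta=0.9, eps=1e-8):
    """One RMSProp step: an EMA of squared gradients scales the learning rate."""
    v = beta * v + (1 - beta) * grad ** 2             # v_t = beta * v_{t-1} + (1 - beta) * g_t^2
    theta = theta - lr * grad / (np.sqrt(v) + eps)    # theta_{t+1} = theta_t - eta * g_t / (sqrt(v_t) + eps)
    return theta, v

# Toy usage: minimize f(theta) = theta_0^2 + 10 * theta_1^2
theta = np.array([2.0, 2.0])
v = np.zeros_like(theta)
for _ in range(500):
    grad = np.array([2 * theta[0], 20 * theta[1]])    # gradient of f
    theta, v = rmsprop_update(theta, grad, v, lr=0.05)

print(theta)  # both coordinates end up close to 0
```

Note that because each parameter has its own moving average $v_t$, the steeper coordinate (scaled by 10) and the flatter one are rescaled independently, which is what lets RMSProp make steady progress on poorly conditioned or non-convex objectives.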