Quasi-Newton Methods

Learn how to approximate Newton’s method when Hessians are difficult or expensive to compute.

Quasi-Newton methods are a class of second-order optimization algorithms that are particularly useful when the function’s Hessian is difficult to compute or not available.

Consider the update rule of the Newton algorithm at time $t$ as follows:

$$x_t = x_{t-1} - H^{-1}(x_{t-1})\, \nabla f(x_{t-1})$$

Here, $H(x)$ is the Hessian of the function $f(x)$ at the point $x$. Recall that the complexity of computing the Hessian is of the order $O(m^2)$, as opposed to the gradient, which is of the order $O(m)$. This means that as the dimensionality $m$ of $x$ increases, the cost of computing the Hessian grows quadratically. Furthermore, the update rule in Newton’s method involves the inverse of the Hessian matrix, which can also become computationally expensive to calculate, especially for high-dimensional problems.
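As a rough sketch of what this looks like in code, the snippet below applies one Newton step to a hypothetical quadratic objective (the matrix `A`, the vector `b`, and the dimensionality are illustrative assumptions, not part of the lesson). The point is that the Hessian has $m \times m$ entries, and the step additionally requires solving a linear system with it:

```python
import numpy as np

m = 500                                # dimensionality of x
rng = np.random.default_rng(0)
M = rng.standard_normal((m, m))
A = M @ M.T + m * np.eye(m)            # symmetric positive definite matrix
b = rng.standard_normal(m)

# Hypothetical quadratic objective f(x) = 0.5 * x^T A x - b^T x, chosen
# only because its gradient and Hessian have simple closed forms.
def grad(x):
    return A @ x - b                   # m entries

def hessian(x):
    return A                           # m x m entries to form and store

x = np.zeros(m)
# One Newton step: x_t = x_{t-1} - H^{-1}(x_{t-1}) * grad f(x_{t-1}).
# Solving the linear system avoids forming H^{-1} explicitly, but a dense
# solve still costs roughly O(m^3) on top of the O(m^2) Hessian storage.
x = x - np.linalg.solve(hessian(x), grad(x))
```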

The key idea behind quasi-Newton methods is to approximate the inverse of the Hessian using only first-order derivative information, i.e., the gradient. This makes these methods more computationally efficient than Newton’s method, especially for problems with many variables. The update rule of quasi-Newton methods is the same as Newton’s method, simply replacing the Hessian inverse $H^{-1}(x_{t-1})$ with its approximation $B_{t-1}$ as follows:

$$x_t = x_{t-1} - B_{t-1}\, \nabla f(x_{t-1})$$
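A minimal sketch of the resulting step, assuming a generic approximation `B` of the inverse Hessian (the identity matrix below is a common but hypothetical initial guess, and the toy gradient is made up purely for illustration):

```python
import numpy as np

def quasi_newton_step(x, grad, B):
    # x_t = x_{t-1} - B_{t-1} * grad f(x_{t-1}): only a matrix-vector
    # product with the stored approximation B; no Hessian is ever formed.
    return x - B @ grad(x)

# With B_0 = I, the first step reduces to a plain gradient step; quasi-Newton
# methods then refine B from observed gradient changes (see BFGS below).
grad = lambda x: np.array([2.0 * x[0], 4.0 * x[1]])   # gradient of x0^2 + 2*x1^2
x = quasi_newton_step(np.array([1.0, 1.0]), grad, np.eye(2))
print(x)
```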

The Broyden-Fletcher-Goldfarb-Shanno (BFGS) method

One of the most popular quasi-Newton methods is the BFGS method. The BFGS algorithm starts with an initial guess of the true Hessian inverse and then, in each subsequent iteration, refines that approximation into a better estimate using an update rule.

The update rule ensures that the updated approximation of the Hessian inverse satisfies the secant equation, which is a necessary condition for the updated approximation to be a good estimate of the true Hessian inverse. The secant equation is given as follows:

$$B_{t+1}\left(\nabla f(x_{t+1}) - \nabla f(x_t)\right) = x_{t+1} - x_t$$

where:

  • $B_{t+1}$ is the updated approximation of the inverse Hessian at iteration $t+1$.
  • $x_{t+1} - x_t$ is the step between the two most recent iterates.
  • $\nabla f(x_{t+1}) - \nabla f(x_t)$ is the corresponding change in the gradient.
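The sketch below shows one common way of writing the BFGS inverse-Hessian update in code, together with a small illustrative run; the quadratic objective, the fixed iteration count, and the function names are assumptions made only to keep the example self-contained:

```python
import numpy as np

def bfgs_update(B, s, y):
    """BFGS update of the inverse-Hessian approximation B.

    Here s = x_{t+1} - x_t and y = grad f(x_{t+1}) - grad f(x_t); the
    returned matrix satisfies the secant equation B_new @ y == s.
    """
    rho = 1.0 / (y @ s)
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ B @ V.T + rho * np.outer(s, s)

# Illustrative run on a small quadratic f(x) = 0.5 * x^T A x - b^T x.
A = np.array([[1.0, 0.2], [0.2, 0.8]])
b = np.array([1.0, 1.0])
grad = lambda x: A @ x - b

x = np.zeros(2)
B = np.eye(2)                          # initial guess for the inverse Hessian
for _ in range(20):
    g = grad(x)
    if np.linalg.norm(g) < 1e-10:      # stop once the gradient vanishes
        break
    x_new = x - B @ g                  # quasi-Newton step, B in place of H^{-1}
    s, y = x_new - x, grad(x_new) - grad(x)
    B = bfgs_update(B, s, y)
    x = x_new

print("secant equation holds:", np.allclose(B @ y, s))
print("final x:", x, " exact minimizer:", np.linalg.solve(A, b))
```

The update above is the standard BFGS formula written with the intermediate matrix $V = I - \rho\, s y^\top$; practical implementations typically add a line search to choose the step length rather than always taking the full step.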