Newton's Method

Learn about one of the most famous second-order optimization algorithms.

Second-order optimization methods use the second derivatives of the target function at each iteration; that's the only difference from first-order iterative methods. In exchange, these methods require the target function to be not just differentiable but twice differentiable. That is, both its first and second derivatives must be well defined.
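As a quick illustration (this example is ours, not part of the lesson), consider a function that is differentiable everywhere but not twice differentiable:

```latex
% Illustrative example (not from the lesson): differentiable but not twice differentiable.
f(x) = x\,\lvert x \rvert, \qquad
f'(x) = 2\,\lvert x \rvert, \qquad
f''(0) \text{ does not exist}
```

The derivative has a corner at x = 0, so the second derivative fails to exist there. A function like this meets the requirement of first-order methods but not that of second-order methods.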

In return for this stronger requirement, when they apply, second-order methods typically converge faster than first-order methods. We'll learn about the classic version of Newton's method, probably the most famous second-order optimization algorithm and the seed of many variants designed to overcome some of its limitations.
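To make the speed claim concrete, here is a minimal one-dimensional sketch (our own illustrative example, not the lesson's implementation): on a simple quadratic, the classic Newton update uses both the first and second derivative and lands on the minimizer in a single step, while plain gradient descent only approaches it step by step. The specific function and the learning rate below are arbitrary choices.

```python
# Illustrative sketch: Newton's method vs. gradient descent on a quadratic.
# The example function is f(x) = 5 (x - 3)^2, with its minimum at x = 3.

def f_prime(x):
    return 10.0 * (x - 3.0)   # first derivative of f

def f_second(x):
    return 10.0               # constant second derivative of f

def newton_step(x):
    # Second-order update: uses both f' and f''.
    return x - f_prime(x) / f_second(x)

def gradient_descent(x, steps, lr=0.05):
    # First-order update: uses only f'.
    for _ in range(steps):
        x = x - lr * f_prime(x)
    return x

x0 = 10.0
print("Newton, 1 step:            ", newton_step(x0))            # exactly 3.0
print("Gradient descent, 20 steps:", gradient_descent(x0, 20))   # close to 3.0, but not exact
```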

But first, let’s talk about second-order derivatives and their meaning.

Note: Newton’s method is also known as the Newton-Raphson method.

Interpretation of second-order derivatives

It’s time to give these derivatives the attention they deserve. We’ve mentioned the Hessian here and there, but we’ve mostly ignored it, and that’s a little unfair: second-order derivatives measure an important property of a function, its curvature.

In the same way that the first derivative and the gradient tell us where a function is increasing or decreasing, the second derivative and the Hessian tell us how the function curves.

A univariate function whose second derivative is positive everywhere is convex; if the second derivative is negative everywhere, the function is concave.
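As a small numerical check (an illustration of ours, with assumed example functions), we can estimate second derivatives with a central finite difference and look at their sign: a positive value indicates upward curvature, a negative one downward curvature.

```python
# Illustrative sketch: estimate f''(x) numerically and inspect its sign.
import math

def second_derivative(f, x, h=1e-4):
    # Central finite-difference approximation of f''(x).
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

def convex_example(x):
    return math.exp(x)   # exp(x) curves upward everywhere: f''(x) = exp(x) > 0

def concave_example(x):
    return -x * x        # -x^2 curves downward everywhere: f''(x) = -2 < 0

for x in (-1.0, 0.0, 2.0):
    print(f"x = {x:+.1f}   exp''(x) ~ {second_derivative(convex_example, x):+.4f}"
          f"   (-x^2)''(x) ~ {second_derivative(concave_example, x):+.4f}")
```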
