The Newton-Conjugate Gradient (NCG) method is an iterative optimization approach for finding the minimum of a scalar-valued objective function. A scalar-valued objective function is a mathematical function that takes one or more input variables and returns a single real number; in other words, it maps a vector of inputs to a scalar value.
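For example, the following toy function (our own illustration, not one used later in this answer) maps a two-dimensional input vector to a single real number:

```python
def objective(theta):
    # Takes a vector of two inputs and returns one scalar value
    return theta[0]**2 + 3 * theta[1]**2
```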
The NCG method combines the strengths of Newton's method and the conjugate gradient method, providing efficient convergence for a broad range of optimization problems.
Let’s expand on the mathematical formulation of the NCG method.
Consider an objective function $f(\theta)$, where $\theta \in \mathbb{R}^n$ is the vector of parameters to be optimized. Newton's method updates the parameters as:

$$\theta_{k+1} = \theta_k - H(\theta_k)^{-1}\,\nabla f(\theta_k)$$

Where:

$\theta_k$ is the parameter vector at iteration $k$.
$\nabla f(\theta_k)$ is the gradient of $f$ evaluated at $\theta_k$.
$H(\theta_k)$ is the Hessian matrix of second-order partial derivatives of $f$ at $\theta_k$.

Combining the Newton step with the conjugate gradient method, the NCG method avoids forming $H(\theta_k)^{-1}$ explicitly. Instead, the Newton system

$$H(\theta_k)\,p_k = -\nabla f(\theta_k)$$

is solved approximately for the search direction $p_k$ using conjugate gradient iterations, and the parameters are updated as $\theta_{k+1} = \theta_k + \alpha_k p_k$, where the step size $\alpha_k$ is chosen by a line search.
Let’s understand the NCG method with the help of an example:
Find the minimum value of the following objective function using the NCG method:

$$f(\theta_0, \theta_1) = \theta_0^2 + \theta_1^2 - 2\,\theta_0\theta_1 + 5$$
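The original answer shows an interactive surface plot of this function at this point. A minimal matplotlib sketch along the following lines (our own reconstruction, not the original widget's code) produces the same picture:

```python
import numpy as np
import matplotlib.pyplot as plt

# Evaluate f(theta_0, theta_1) = theta_0^2 + theta_1^2 - 2*theta_0*theta_1 + 5 on a grid
t0, t1 = np.meshgrid(np.linspace(-3, 3, 100), np.linspace(-3, 3, 100))
f = t0**2 + t1**2 - 2 * t0 * t1 + 5

# Draw the surface; the valley runs along the line theta_0 = theta_1
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.plot_surface(t0, t1, f, cmap='viridis')
ax.set_xlabel('theta_0')
ax.set_ylabel('theta_1')
ax.set_zlabel('f')
plt.show()
```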
From the plot, we can estimate where the minimum value might be located. Next, we will precisely determine the minimum using the NCG method in the code below.
The following steps are involved in the optimization process:
Calculate the gradient: $\nabla f(\theta) = \begin{bmatrix} 2\theta_0 - 2\theta_1 \\ 2\theta_1 - 2\theta_0 \end{bmatrix}$
Calculate the Hessian matrix: $H = \begin{bmatrix} 2 & -2 \\ -2 & 2 \end{bmatrix}$
Calculate the Newton direction by solving $H\,p = -\nabla f(\theta)$ with conjugate gradient iterations.
Update the parameters using the NCG update equation $\theta_{k+1} = \theta_k + \alpha_k p_k$.
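To make these steps concrete, here is a minimal hand-rolled sketch of the method. The newton_cg helper, its tolerances, and the backtracking line search are our own illustrative choices rather than a canonical implementation:

```python
import numpy as np

def newton_cg(f, grad, hess, theta0, tol=1e-8, max_iter=50):
    """Minimal Newton-CG sketch: solve H p = -g approximately with
    conjugate gradient at each iteration, then take a damped step."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        g = grad(theta)
        if np.linalg.norm(g) < tol:              # gradient small: done
            break
        H = hess(theta)
        # Inner CG loop: solve H p = -g without ever forming H^{-1}
        p = np.zeros_like(theta)
        r = -g                                   # residual of H p = -g at p = 0
        d = r.copy()
        for _ in range(theta.size):
            Hd = H @ d
            curvature = d @ Hd
            if curvature <= 1e-12:               # guard against non-positive curvature
                break
            alpha = (r @ r) / curvature
            p = p + alpha * d
            r_new = r - alpha * Hd
            if np.linalg.norm(r_new) < tol:
                break
            d = r_new + ((r_new @ r_new) / (r @ r)) * d
            r = r_new
        if not p.any():                          # CG made no progress
            break
        # Backtracking (Armijo) line search for the step size
        t = 1.0
        while t > 1e-10 and f(theta + t * p) > f(theta) + 1e-4 * t * (g @ p):
            t *= 0.5
        theta = theta + t * p
    return theta

# The example objective, its gradient, and its Hessian
def Function(theta):
    return theta[0]**2 + theta[1]**2 - 2 * theta[0] * theta[1] + 5

def Gradient(theta):
    return np.array([2*theta[0] - 2*theta[1], 2*theta[1] - 2*theta[0]])

def Hessian(theta):
    return np.array([[2.0, -2.0], [-2.0, 2.0]])

theta_min = newton_cg(Function, Gradient, Hessian, [2.0, -1.0])
print("Optimal parameters:", theta_min)
print("Minimum value:", Function(theta_min))
```

Starting from [2, -1], a single Newton step lands on the line $\theta_0 = \theta_1$, where the gradient vanishes.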
Luckily, we can use the built-in minimize function from Python's scipy library to optimize an objective function with the Newton-Conjugate Gradient method. Let's optimize our objective function in the following code:
```python
import numpy as np
from scipy.optimize import minimize

# Objective function to be minimized
def Function(theta):
    return theta[0]**2 + theta[1]**2 - 2 * theta[0] * theta[1] + 5

# Calculate gradient of the function
def Gradient(theta):
    return np.array([2*theta[0] - 2*theta[1], 2*theta[1] - 2*theta[0]])

# Calculate Hessian matrix
def Hessian(theta):
    return np.array([[2, -2], [-2, 2]])

# Apply the Newton-Conjugate method
result = minimize(Function, [1, 1], method='Newton-CG', jac=Gradient, hess=Hessian)

# Print the optimal parameters and minimum value
print("Optimal parameters:", result.x)
print("Minimum value:", result.fun)
```
Line 1–2: We import the numpy library and the minimize function from scipy.optimize.
Line 5–6: We define our objective function.
Line 9–10: We define a function to compute the gradient of the objective function.
Line 13–14: We define a function that returns the Hessian matrix of the objective function.
Line 17: This line optimizes the objective function via the Newton-Conjugate Gradient method using the minimize function provided by the scipy library.
Line 20–21: We print the optimal parameters and the minimum value of the function.
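Note that for this particular function, $f(\theta) = (\theta_0 - \theta_1)^2 + 5$, every point with $\theta_0 = \theta_1$ is a minimizer. The starting point [1, 1] already has a zero gradient, so the optimizer converges immediately and should report optimal parameters [1, 1] with a minimum value of 5.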
While NCG is a powerful optimization technique, it has the following limitations:
Computational cost: Each NCG iteration requires the Hessian matrix (or Hessian-vector products) and an inner conjugate gradient solve of the Newton system, which can be computationally expensive for large-scale optimization problems; the first sketch after this list shows one way to reduce this cost.
Storage requirements: If the Hessian matrix is dense, the memory required to store it grows quadratically with the number of parameters. Storing and manipulating the matrix becomes memory-intensive for problems with a large number of parameters.
Non-convexity: The NCG method doesn't work well on non-convex objective functions; it might converge to a local minimum instead of the global minimum, as the second sketch after this list demonstrates.
Line-search strategy: The NCG method typically uses a line search to find the step size in each iteration. The choice of line-search strategy can affect the convergence and performance of the method.
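Cost and storage pressure can often be eased by supplying a Hessian-vector product instead of the full Hessian. SciPy's minimize accepts one through its hessp argument for the Newton-CG method; here is a minimal sketch for our example function (we start from [2, -1] rather than [1, 1], since the gradient at [1, 1] is already zero):

```python
import numpy as np
from scipy.optimize import minimize

def Function(theta):
    return theta[0]**2 + theta[1]**2 - 2 * theta[0] * theta[1] + 5

def Gradient(theta):
    return np.array([2*theta[0] - 2*theta[1], 2*theta[1] - 2*theta[0]])

# Hessian-vector product: returns H @ p without ever materializing H,
# which needs O(n) memory instead of O(n^2) for n parameters
def HessProduct(theta, p):
    return np.array([2*p[0] - 2*p[1], 2*p[1] - 2*p[0]])

result = minimize(Function, [2, -1], method='Newton-CG',
                  jac=Gradient, hessp=HessProduct)
print("Optimal parameters:", result.x)
print("Minimum value:", result.fun)
```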
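To see the non-convexity issue concretely, here is a toy example of our own: a one-dimensional quartic with two local minima. Depending on the starting point, Newton-CG settles into different basins, and only one of them contains the global minimum:

```python
import numpy as np
from scipy.optimize import minimize

# A non-convex function with two local minima
def g(x):
    return x[0]**4 - 3 * x[0]**2 + x[0]

def g_grad(x):
    return np.array([4 * x[0]**3 - 6 * x[0] + 1])

def g_hess(x):
    return np.array([[12 * x[0]**2 - 6]])

# Different starting points converge to different local minima
for x0 in (-2.0, 2.0):
    res = minimize(g, [x0], method='Newton-CG', jac=g_grad, hess=g_hess)
    print(f"start at {x0:+.1f} -> x = {res.x[0]:+.4f}, g(x) = {res.fun:.4f}")
```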