What is the scipy.optimize.minimize() function?
The scipy library allows us to find the minimum value of an objective function, a real-valued function that is to be minimized or maximized, using the scipy.optimize.minimize() function. But why do we need to find the minimum value in the first place? Minimization is a common problem in machine learning and data science. For example, if we want to cut down costs, we can write an equation that models our costs and then find the inputs that give its minimum value.
Getting started
Let’s look at how to make the relevant imports to start using the minimize() function.
Imports
To begin using the scipy.optimize.minimize() function, we need Python installed along with the numpy and scipy libraries. Then, we need to import the scipy.optimize module. There are two ways of importing the minimize() function: we can import it directly from scipy.optimize, or we can reference it through the optimize alias.
from scipy.optimize import minimize
import scipy.optimize as optimize
The objective function to be minimized
Next, we select a function to minimize. For this answer, we’ll work on minimizing a quadratic function.
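Concretely, the code later in this answer minimizes the quadratic 2*x**2 - 3*x + 4. As a quick sanity check on what every solver should report, we can find the minimum of this convex quadratic analytically by setting its derivative to zero:

f(x) = 2x^2 - 3x + 4
f'(x) = 4x - 3 = 0  =>  x = 0.75
f(0.75) = 2(0.5625) - 2.25 + 4 = 2.875

So each method below should converge to roughly x = 0.75 with an objective value of about 2.875.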
Applying the minimize() function
Let’s go over the various arguments that can be passed to the minimize() function:
We pass the objective function to be minimized as the first argument. The objective function receives its parameters as a 1-D array and returns a scalar value.
The second argument is the initial guess, an ndarray of shape (n,), where n is the number of independent variables. An initial guess is required so that the solver has a starting point from which to explore the function space until it converges to the actual minimum.
The method argument specifies the type of solver used to solve the minimization problem. For this answer, we'll play around with seven different methods to fetch our minimum value:
CG: This is the conjugate gradient method to determine the minimum value of the objective function.
BFGS: The BFGS method is a quasi-Newton method that builds an approximation to the objective function's second-order derivative (the Hessian) from gradient evaluations and uses it to find the minimum.
Newton-CG: The Newton conjugate gradient method combines Newton's method, which uses second-order information to locate a function's lowest value, with a conjugate gradient solver for the Newton step, which keeps each iteration inexpensive.
L-BFGS-B: The L-BFGS-B algorithm, an extension of the BFGS method, also relies on an approximation of the second-order derivative, but it saves memory by storing only a few vectors, hence the name limited-memory BFGS.
TNC: The truncated Newton conjugate gradient algorithm computes only part of the Hessian information at each step, which is less computationally expensive.
COBYLA: This method uses a linear approximation of both the objective and constraint functions.
SLSQP: Sequential least squares programming minimizes a function of several variables, optionally subject to bounds and constraints, although we'll use an unconstrained single-variable function in our example.
There are many optional arguments, and one of them is the options dictionary. Passing disp: True means all convergence messages will be printed. We can also cap the maximum number of iterations using the maxiter key.
scipy.optimize.minimize(objective_function, initial_guess, method=..., options={"disp": True})   # method and options are optional
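For instance, a minimal call that picks a solver explicitly, prints convergence messages, and caps the number of iterations might look like the sketch below. The objective function, initial guess, and the iteration limit of 50 are illustrative choices for this sketch, not part of the code discussed later:

from scipy.optimize import minimize

# an illustrative objective: (x - 1)^2 has its minimum at x = 1
def objective_function(x):
    return (x[0] - 1.0) ** 2

initial_guess = [0.0]

# method and the options keys shown here (disp, maxiter) are all optional
result = minimize(objective_function, initial_guess,
                  method="BFGS",
                  options={"disp": True, "maxiter": 50})
print(result.x)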
The minimize() function in action
Here’s a Jupyter Notebook implementing all the methods discussed above. You’re encouraged to tweak the code by providing some optional arguments and rerun the cell.
from scipy.optimize import minimize
import scipy.optimize as optimize

def educative_quadratic_function(x):
    educatives_return_value = (2*x**2) - (3*x) + 4  # the objective: 2x^2 - 3x + 4
    return educatives_return_value

starting_ex_value = -1  # initial guess for x
function_name = ['CG', 'BFGS', 'Newton-CG', 'L-BFGS-B', 'TNC', 'COBYLA', 'SLSQP']
educatives_result = optimize.minimize(educative_quadratic_function, starting_ex_value, options={"disp": True})  # default solver
if educatives_result.success:
    print(f"value of x = {educatives_result.x} value of y = {educatives_result.fun}")
else:
    print("No minimum value")

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[0], options={"disp": True})  # CG
print(educatives_lowest)

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[1], options={"disp": True})  # BFGS
print(educatives_lowest)

def gradient(x):  # analytical gradient of the quadratic, required by Newton-CG
    gradient = (4*x) - 3
    return gradient
educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[2], jac=gradient, options={"disp": True})  # Newton-CG
print(educatives_lowest)

def get_my_scalar(x):  # wraps the quadratic so it returns a scalar value
    scalar_value = educative_quadratic_function(x[0])
    return scalar_value

educatives_lowest = minimize(get_my_scalar, starting_ex_value, method=function_name[3])  # L-BFGS-B
print(educatives_lowest)

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[4], options={"disp": True})  # TNC
print(educatives_lowest)

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[5], options={"disp": True})  # COBYLA
print(educatives_lowest)

educatives_lowest = minimize(educative_quadratic_function, starting_ex_value, method=function_name[6], options={"disp": True})  # SLSQP
print(educatives_lowest)

Code explanation
Lines 1–2: We import the minimize function from the scipy.optimize module. We use the alias optimize for scipy.optimize so that it's shorter to write throughout the code.
Lines 4–6: We declare a quadratic function, 2*x**2 - 3*x + 4, that returns the corresponding value of y for a given value of x.
Lines 8–14: We set the starting value of x, called starting_ex_value, to -1. We define a list called function_name containing all the method names that we'll pass as arguments to the minimize() function. These methods are discussed in the preceding section. The first time we call the optimize.minimize() function, on line 10, we pass it the objective function's name, educative_quadratic_function, the initial value of x, starting_ex_value, and a dictionary of optional arguments, options={"disp": True}, which allows all convergence messages to be displayed. The minimize() function returns an object whose attributes include the solution array x, a boolean flag called success, and a message that contains the reason for termination. The return value success indicates whether the optimizer terminated successfully. If success is equal to True, we display the values of x and y at the point of convergence. Otherwise, we display "No minimum value".
Lines 16–20: Next, we call the minimize() function twice, passing it an additional keyword argument, method. The function_name list, as mentioned before, contains the names of all the methods to be passed as arguments. First, we pass the CG method and then the BFGS method.
Lines 22–26: We define a function that calculates the gradient of our quadratic function for any value of x. We pass this gradient as the value of the keyword argument jac, short for Jacobian, on line 25.
Lines 28–30: We also define a function, get_my_scalar(), which returns a scalar value for a given value of x.
Lines 32–42: We pass the starting value of x and L-BFGS-B as method to the minimize() function. Lastly, we call the minimize() function for the remaining methods, namely TNC, COBYLA, and SLSQP.
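All of the calls above return the same kind of result object, so a convenient way to pull out only the numbers of interest is to read its attributes directly instead of printing the whole object. Here's a small sketch that reuses the quadratic and starting value from the notebook above; the choice of BFGS here is arbitrary:

from scipy.optimize import minimize

def educative_quadratic_function(x):
    return (2*x**2) - (3*x) + 4

starting_ex_value = -1

result = minimize(educative_quadratic_function, starting_ex_value, method="BFGS")
print("x:", result.x)              # solution array (location of the minimum)
print("fun:", result.fun)          # objective value at the solution
print("success:", result.success)  # True if the optimizer terminated successfully
print("message:", result.message)  # reason for termination
print("nit:", result.nit)          # number of iterations performed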