Introduction

The goal of optimization is to find the point that minimizes a function. We’ve seen that such a point may or may not exist. But how can we know whether it exists? And is there a method to find it?

So far, we’ve worked with the plot of the function. Although it’s a great help to see what a function looks like, determining the exact point that minimizes it just by looking at a plot is often impossible.

For example, look at the plot we saw in the problem of the soda can. In that problem, we had to find the smallest surface of a can that could contain a specific volume of soda.

We can’t determine the exact minimum of the function from the plot. If we hadn’t added the restriction $r \ge 5$, what would’ve been the minimum of the function? Is it at $x = 3.3$ or at $x = 3.95$?

Additionally, we can only see a function’s plot over a certain interval, never over the entire infinite extent of the x and y axes. The functions we’ve seen so far have been simple enough to give us an idea of what happens outside the interval we’re seeing. But what about a function with logarithms, exponentiation, and trigonometric operators? How can we be sure about the behavior of such a function outside the boundaries we’re seeing?

Sometimes we’ll also want to find the minimum of functions with more than one variable, which can’t even be plotted. How can we deal with them?

This is why we need to approach optimization from a different perspective. Luckily, we have some tools that help us solve optimization problems more robustly and accurately. One of those tools is calculus, specifically differential calculus.

Differential calculus is about derivatives and gradients. These are properties of functions that provide a lot of information without the need to plot those functions. In this lesson, we’ll introduce derivatives, and we’re going to apply them to analytically solve some of the problems we’ve seen so far.

Note: We assume learners know how to calculate basic derivatives. But it’s not a requirement to follow the course since we’ll make Python calculate those derivatives for us.

What’s a derivative?

When we plot a function, we’re drawing a line that obeys the formula of the function in a two-dimensional plane. When we look at that drawing from left to right, we can see the function going up and down according to its formula. But is there a way to predict when the function goes down and up without looking at its plot?

The answer to the last question is YES! We can learn this behavior of a function directly from its formula, and we can do it by determining its derivative.

The derivative of a function can be interpreted as the rate at which the function changes when $x$ changes. It also can be seen as the slope of the tangent to the function at a certain point. The figure below illustrates the meaning of a derivative.

As two points on a function get closer, the line between them becomes an approximation of the tangent to the function at the point where both converge. Note how the line that connects the two points in the first plot of the previous illustration changes as the points get closer. In the third plot, the points are so close that they have become a single point, and the line that connects them is the tangent to the function at that point.

The slope of a line determines the inclination of that line. The bigger the absolute value of the slope, the steeper the line. Also, the sign of the slope determines whether the line goes up (positive slope) or down (negative slope) when read from left to right.
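To make the idea of slope concrete, here is a minimal Python sketch. The helper names `slope`, `direction`, and `inclination_degrees` are hypothetical and chosen only for this illustration. It computes the slope of the line through two points, reads its direction from the sign, and converts the slope to an angle of inclination.

```python
import math


def slope(x1, y1, x2, y2):
    """Slope of the straight line through (x1, y1) and (x2, y2)."""
    return (y2 - y1) / (x2 - x1)


def direction(m):
    """Describe a line with slope m when read from left to right."""
    if m > 0:
        return "up"
    if m < 0:
        return "down"
    return "flat"


def inclination_degrees(m):
    """Angle between the line and the x axis, in degrees."""
    return math.degrees(math.atan(m))


rising = slope(0, 0, 2, 6)    # 3.0
falling = slope(0, 4, 2, 0)   # -2.0

for m in (rising, falling):
    print(m, direction(m), round(abs(inclination_degrees(m)), 1))
```

The line with slope 3 is steeper than the one with slope -2 because its absolute value is larger, while the sign alone tells us which of the two goes up and which goes down.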

The average rate of change of a function $f(x)$ between a fixed point $x_0$ and another point $x$ is the slope of the line that connects them:

$$\frac{f(x) - f(x_0)}{x - x_0}$$

Here, $x_0$ is an arbitrary value at which $f$ is defined, and $x \neq x_0$.

As we did in the previous illustration, we can bring $x$ so close to $x_0$ that they’re almost the same point. This can be formulated with the concept of a limit in calculus, and the result is the derivative of $f$ at $x_0$:

$$f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}$$

Since $x_0$ can be any point where this limit exists, $f'$ is itself a function.
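The limit can be explored numerically. The following is a small sketch, assuming the example function $f(x) = x^2$ and the point $x_0 = 3$ (both chosen here only for illustration). It evaluates the quotient for values of $x$ that get closer and closer to $x_0$.

```python
def f(x):
    # Example function chosen for this illustration
    return x ** 2


def difference_quotient(f, x, x0):
    """Slope of the line through (x0, f(x0)) and (x, f(x))."""
    return (f(x) - f(x0)) / (x - x0)


x0 = 3.0
for h in (1.0, 0.1, 0.01, 0.001):
    x = x0 + h  # x moves closer to x0 as h shrinks
    print(f"x = {x}: quotient = {difference_quotient(f, x, x0)}")
```

As the gap between $x$ and $x_0$ shrinks, the printed quotients approach 6, which is the slope of the tangent to $x^2$ at $x_0 = 3$. The quotient can't be evaluated at $x = x_0$ itself, since that would divide by zero, which is why the definition needs a limit.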

Those who aren’t familiar with these formulas shouldn’t worry. The point here is that the derivative of a function is another function, and we can denote it as $f'$ or as $\frac{\partial f} {\partial x}$ .
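For those who would like to see the limit at work, here is a short worked example, assuming the function $f(x) = x^2$ (chosen only for illustration). The numerator factors as a difference of squares, so the problematic term $x - x_0$ cancels before the limit is taken:

$$f'(x_0) = \lim_{x \to x_0} \frac{x^2 - x_0^2}{x - x_0} = \lim_{x \to x_0} \frac{(x - x_0)(x + x_0)}{x - x_0} = \lim_{x \to x_0} (x + x_0) = 2x_0$$

Because this holds for any $x_0$, the derivative of $x^2$ is the function $f'(x) = 2x$.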

This new function, called the derivative, gives us the slope of the tangent line to $f$ at a specific value of $x$ when we evaluate it at that value. See the following figure.
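As a sketch of this idea, the code below again assumes $f(x) = x^2$, whose derivative is $f'(x) = 2x$ by the standard power rule. The helper `tangent_line` is a hypothetical name used only here. It evaluates the derivative at a few points and builds the tangent line $y = m x + b$ at each of them.

```python
def f(x):
    # Example function chosen for this illustration
    return x ** 2


def f_prime(x):
    # Derivative of x**2, obtained from the power rule
    return 2 * x


def tangent_line(x0):
    """Return (slope, intercept) of the tangent to f at x0."""
    m = f_prime(x0)        # slope of the tangent at x0
    b = f(x0) - m * x0     # the line must pass through (x0, f(x0))
    return m, b


for x0 in (-1.0, 0.0, 2.0):
    m, b = tangent_line(x0)
    trend = "going up" if m > 0 else "going down" if m < 0 else "flat"
    print(f"x0 = {x0}: slope = {m}, intercept = {b}, f is {trend}")
```

Without plotting anything, the sign of the derivative at each point tells us whether the function is going up or down there, which is the question this lesson started with.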
