Introduction

Get introduced to derivatives and their importance for optimization.

The goal of optimization is to find the point that minimizes a function. We’ve seen that such a point may or may not exist. But how can we know whether it exists? Is there a method to find it?

So far, we’ve worked with the plot of the function. Although it’s a great help to see what a function looks like, determining the exact point that minimizes the function just by looking at a plot is often impossible.

For example, look at the plot we saw in the problem of the soda can. In that problem, we had to find the smallest surface area of a can that could contain a specific volume of soda.

We can’t determine the exact minimum of the function from the plot. If we hadn’t added the restriction $r \ge 5$, what would’ve been the minimum of the function? Is it at $x = 3.3$ or at $x = 3.95$?

Additionally, we can only see a function’s plot in a certain interval; we can’t see it along the infinite extent of the x and y axes. The functions we’ve seen so far have been simple enough to give us an idea of what happens outside the interval we’re seeing. But what about a function with logarithms, exponentials, and trigonometric functions? How can we be sure about the behavior of such a function outside the boundaries we’re seeing?

Sometimes we’ll also want to find the minimum of functions of several variables, many of which can’t even be plotted. How can we deal with them?

This is why we need to approach optimization from a different perspective. Luckily, we have some tools to help us to solve optimization problems more robustly and accurately. One of those tools is calculus, specifically differential calculus.

Differential calculus is about derivatives and gradients. These are properties of functions that provide a lot of information without the need to plot those functions. In this lesson, we’ll introduce derivatives, and we’re going to apply them to analytically solve some of the problems we’ve seen so far.

Note: We assume learners know how to calculate basic derivatives. But it’s not a requirement to follow the course since we’ll make Python calculate those derivatives for us.

What’s a derivative?

When we plot a function, we’re drawing a curve that obeys the formula of the function in a two-dimensional plane. When we look at that drawing from left to right, we can see the function going up and down according to its formula. But is there a way to predict when the function goes up and down without looking at its plot?

The answer is yes! We can learn this behavior of a function directly from its formula, and we can do it by determining its derivative.

The derivative of a function can be interpreted as the rate at which the function changes when $x$ changes. It can also be seen as the slope of the tangent to the function at a certain point. The figure below illustrates the meaning of a derivative.

As two points on a function get closer, the line between them becomes an approximation of the tangent to the function at the point where both converge. Note how the line that connects the two points in the first plot of the previous illustration changes as the points get closer. In the third plot, the two points are so close that they have merged into a single point, and the line that connects them is the tangent to the function at that point.

The slope of a line determines the inclination of that line. The larger the absolute value of the slope, the larger the angle of inclination, that is, the steeper the line. Also, the sign of the slope determines whether the line goes up (positive slope) or down (negative slope).
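The relation between slope and inclination can be checked with a few lines of Python. This is only a minimal sketch: the helper name `inclination_degrees` and the sample slopes are assumptions made for illustration, and the angle is measured against the x-axis using the arctangent of the slope.

```python
import math


def inclination_degrees(slope):
    """Angle, in degrees, between the x-axis and a line with the given slope."""
    return math.degrees(math.atan(slope))


# Sample slopes chosen for illustration only.
for slope in (0.5, 1, 3, -1):
    direction = "up" if slope > 0 else "down" if slope < 0 else "flat"
    angle = inclination_degrees(slope)
    print(f"slope {slope:>4}: angle {angle:6.1f} degrees, line goes {direction}")
```

A larger absolute slope gives an angle further from zero, and a negative slope gives a negative angle, which matches a line that goes down from left to right.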

The average rate of change of a function $f(x)$ with respect to $x$ between two points $x_0$ and $x$ can be written as:

$$\frac{\Delta f}{\Delta x} = \frac{f(x) - f(x_0)}{x - x_0}$$

Here, $x_0$ is an arbitrary value at which $f$ is defined.

As we did in the previous illustration, we can make $x$ so near to $x_0$ that they’re almost the same point. This can be formulated with the concept of a limit in calculus:

$$f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}$$

Since $x_0$ can be any point where this limit exists, $f'$ is itself a function.

Those who aren’t familiar with these formulas shouldn’t worry. The point here is that the derivative of a function is another function, and we can denote it as $f'$ or as $\frac{\partial f}{\partial x}$.
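To make the limit more concrete, the following sketch evaluates the quotient for points $x$ that get closer and closer to $x_0$. The function $f(x) = x^2$, the point $x_0 = 3$, and the helper name `difference_quotient` are assumptions chosen for illustration; they don’t come from the earlier problems.

```python
def f(x):
    # Example function chosen for illustration: f(x) = x^2.
    return x ** 2


def difference_quotient(func, x, x0):
    """Slope of the line through (x0, func(x0)) and (x, func(x))."""
    return (func(x) - func(x0)) / (x - x0)


x0 = 3.0
# Move x closer to x0 and watch the slope of the connecting line settle.
for step in (1.0, 0.1, 0.01, 0.001):
    x = x0 + step
    slope = difference_quotient(f, x, x0)
    print(f"x = {x:.3f}  slope = {slope:.4f}")
```

For this particular function, the printed slopes approach 6, which is the value of the exact derivative $2x$ at $x = 3$.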

This new function, called the derivative, when evaluated at a specific value of $x$, gives us the slope of the tangent line to $f$ at that specific $x$. See the following figure.

The derivative is the slope of the black and green lines at those tangent points. If the derivative is negative, the function will be decreasing (going down) around that point; if the derivative is positive, the function will be increasing (going up) around that point.
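The sign rule above can be tried out numerically without any plot. The sketch below is an illustration under stated assumptions: the function $f(x) = x^2 - 4x$ is a made-up example, and the derivative is approximated with a central difference (a standard numerical technique) instead of being computed from the formula. The helper names `numerical_derivative` and `trend`, the step `h`, and the tolerance `tol` are all hypothetical choices.

```python
def f(x):
    # Illustrative function (an assumption, not from the lesson): f(x) = x^2 - 4x.
    return x ** 2 - 4 * x


def numerical_derivative(func, x, h=1e-6):
    """Central-difference approximation of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)


def trend(func, x, tol=1e-4):
    """Describe the behavior of func around x from the sign of its derivative."""
    slope = numerical_derivative(func, x)
    if slope > tol:
        return "increasing"
    if slope < -tol:
        return "decreasing"
    return "approximately flat"


for point in (0.0, 5.0):
    slope = numerical_derivative(f, point)
    print(f"x = {point}: derivative is about {slope:.3f}, function is {trend(f, point)}")
```

With this example, the derivative is negative at $x = 0$, so the function is going down there, and positive at $x = 5$, so the function is going up there.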

But how can this be of any help when solving optimization problems?

In the following lesson, we’ll explore the applications of derivatives in optimization.