A Microbenchmark Example
Discover the benchmarking results of linear and binary search algorithms.
Benchmarking linear and binary search algorithms
We will wrap this chapter up by going back to our initial examples of linear search and binary search, demonstrating how they can be benchmarked using a benchmarking framework.
We began this chapter by comparing two ways of searching for an integer in a std::vector. If we knew that the vector was already sorted, we could use a binary search, which outperformed the simple linear search algorithm. We will not repeat the definition of the functions here, but the declaration looked like this:
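A sketch of what those declarations might look like follows. The exact signatures and bodies from earlier in the chapter may differ; minimal definitions are included here only to make the example self-contained (the binary version is sketched on top of std::binary_search):

```cpp
#include <algorithm>
#include <vector>

// Hypothetical signatures; the chapter's originals may differ slightly.
bool linear_search(const std::vector<int>& vals, int key) {
  for (const auto& v : vals) {   // scan every element until key is found
    if (v == key) {
      return true;
    }
  }
  return false;
}

bool binary_search(const std::vector<int>& vals, int key) {
  // Requires vals to be sorted; delegates to the standard algorithm.
  return std::binary_search(vals.begin(), vals.end(), key);
}
```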
The difference in the execution time of these functions is very obvious once the input is sufficiently large, but it will serve as a good enough example for our purpose. We will begin by only measuring linear_search(). Then, when we have a working benchmark in place, we will add binary_search() and compare the two versions.
In order to make a testing program, we first need a way to generate a sorted vector of integers. A simple implementation, as follows, will be sufficient for our needs:
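One plausible implementation (the helper name gen_vec is an assumption) fills the vector with std::iota:

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: returns a sorted vector containing 0, 1, ..., n - 1.
auto gen_vec(int n) {
  auto v = std::vector<int>(n);
  std::iota(v.begin(), v.end(), 0);  // fill with consecutive integers from 0
  return v;
}
```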
The vector that is returned will contain all integers between 0 and n - 1. Once we have that in place, we can create a naive test program like this:
We are searching for the value n, which we know isn't in the vector, so the algorithm will exhibit its worst-case performance using this test data. That's the good part of this test. Other than that, it is afflicted with many flaws that will make this benchmark useless:
- Compiling this code with optimizations enabled will most likely remove the code entirely, because the compiler can see that the results from the functions are not being used.
- We don't want to measure the time it takes to create and fill the std::vector.
- By only running the linear_search() function once, we will not achieve a statistically stable result.
- It's cumbersome to test for different input sizes.
Let's see how these problems can be addressed by using a microbenchmarking support library. There are various tools/libraries for benchmarking, but we will use Google Benchmark.
Microbenchmark of linear_search()
Here is how a simple microbenchmark of linear_search() might look when using Google Benchmark:
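A sketch of such a benchmark follows (the function name bm_linear_search is an assumption, and linear_search()/gen_vec() are assumed from earlier; the program must be linked against the Google Benchmark library):

```cpp
#include <benchmark/benchmark.h>
#include <numeric>
#include <vector>

// Assumed from earlier in the chapter.
bool linear_search(const std::vector<int>& vals, int key) {
  for (const auto& v : vals) {
    if (v == key) {
      return true;
    }
  }
  return false;
}

auto gen_vec(int n) {
  auto v = std::vector<int>(n);
  std::iota(v.begin(), v.end(), 0);
  return v;
}

static void bm_linear_search(benchmark::State& state) {
  auto n = 1024;         // hardcoded for now; parameterized later
  auto v = gen_vec(n);   // setup code: not included in the measurement
  for (auto _ : state) { // only the body of this loop is measured
    benchmark::DoNotOptimize(linear_search(v, n));
  }
}
BENCHMARK(bm_linear_search);

BENCHMARK_MAIN();
```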
Note: Benchmark values can be different every time we run the program.
That's it! The only thing we haven't addressed yet is that the input size is hardcoded to 1024; we will fix that shortly. Compiling and running this program will generate something like this:
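The output format resembles the following (the benchmark name follows our assumed naming, and the measured values are elided since they vary from machine to machine and run to run):

```
-------------------------------------------------------------
Benchmark                   Time             CPU   Iterations
-------------------------------------------------------------
bm_linear_search          ... ns          ... ns          ...
```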
The rightmost column reports the number of times the loop had to execute before a statistically stable result was achieved. The state object passed to our benchmarking function determines when to stop. The average time per iteration is reported in two columns: Time is the wall-clock time, and CPU is the time spent on the CPU by the main thread. In this case, they were the same, but if linear_search() had been blocked waiting for I/O (for example), the CPU time would have been lower than the wall-clock time.
Another important thing to note is that the code that generates the vector is not included in the reported time. The only code that is being measured is the code inside this loop:
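That measured loop, extracted from the benchmark function (v and n are the locals set up before the loop), is just:

```cpp
for (auto _ : state) {
  benchmark::DoNotOptimize(linear_search(v, n));
}
```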
The boolean value returned from our search functions is wrapped inside benchmark::DoNotOptimize(). This is the mechanism used to ensure that the returned value is not optimized away, which could make the entire call to linear_search() disappear.
Now let's make this benchmark a little more interesting by varying the input size. We can do that by passing arguments to our benchmarking function using the state object. Here is how to do it:
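A sketch of the parameterized version follows (names and helpers are assumptions, as before; links against Google Benchmark). The input size is read from the state object with state.range(0), and the sizes to run are registered on the benchmark:

```cpp
#include <benchmark/benchmark.h>
#include <numeric>
#include <vector>

bool linear_search(const std::vector<int>& vals, int key) {
  for (const auto& v : vals) {
    if (v == key) {
      return true;
    }
  }
  return false;
}

auto gen_vec(int n) {
  auto v = std::vector<int>(n);
  std::iota(v.begin(), v.end(), 0);
  return v;
}

static void bm_linear_search(benchmark::State& state) {
  auto n = static_cast<int>(state.range(0));  // input size supplied below
  auto v = gen_vec(n);
  for (auto _ : state) {
    benchmark::DoNotOptimize(linear_search(v, n));
  }
}
// Start at 64 and double until 256: runs with n = 64, 128, 256.
BENCHMARK(bm_linear_search)->RangeMultiplier(2)->Range(64, 256);

BENCHMARK_MAIN();
```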
This will start with an input size of 64 and double the size until it reaches 256. On our machine, the test generated the following output:
As a final example, we will benchmark both the linear_search() and the binary_search() functions using a variable input size and also trying to let the framework estimate the time complexity of our functions. This can be done by providing the input size to the state object using the SetComplexityN() function. The complete microbenchmark example looks like this:
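A sketch of that complete program follows. The benchmark names, the helper functions, and the size range 64 to 4096 are assumptions; the call state.SetComplexityN(n) records the input size, and ->Complexity() asks the framework to fit a big-O curve to the measurements:

```cpp
#include <algorithm>
#include <benchmark/benchmark.h>
#include <numeric>
#include <vector>

bool linear_search(const std::vector<int>& vals, int key) {
  for (const auto& v : vals) {
    if (v == key) {
      return true;
    }
  }
  return false;
}

bool binary_search(const std::vector<int>& vals, int key) {
  return std::binary_search(vals.begin(), vals.end(), key);
}

auto gen_vec(int n) {
  auto v = std::vector<int>(n);
  std::iota(v.begin(), v.end(), 0);
  return v;
}

static void bm_linear_search(benchmark::State& state) {
  auto n = static_cast<int>(state.range(0));
  auto v = gen_vec(n);
  for (auto _ : state) {
    benchmark::DoNotOptimize(linear_search(v, n));
  }
  state.SetComplexityN(n);  // lets the framework estimate the complexity
}

static void bm_binary_search(benchmark::State& state) {
  auto n = static_cast<int>(state.range(0));
  auto v = gen_vec(n);
  for (auto _ : state) {
    benchmark::DoNotOptimize(binary_search(v, n));
  }
  state.SetComplexityN(n);
}

BENCHMARK(bm_linear_search)->RangeMultiplier(2)->Range(64, 4096)->Complexity();
BENCHMARK(bm_binary_search)->RangeMultiplier(2)->Range(64, 4096)->Complexity();

BENCHMARK_MAIN();
```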
When executing the benchmark, the performance results may vary on each run due to a number of factors, such as system load, memory allocation, and other concurrent processes.
The output aligns with our initial results, where we concluded that the algorithms exhibit linear runtime and logarithmic runtime, respectively. If we plot the values, we can clearly see the linear and logarithmic growth rates of the functions.
The following figure was generated using Python with Matplotlib:
We now have a lot of tools and insights for finding and improving the performance of our code. We cannot stress enough the importance of measuring and setting goals when working with performance.