Time Related Tips
Discover how to accurately measure code execution time with Python's timeit() function, and apply performance tips such as checkpointing intermediate results, speeding up lookups with sets and dictionaries, minimizing costly function calls, and handling large integers with care. This lesson helps you write faster and more maintainable Python programs.
Time it
A popular saying among programmers is “Make it right, then make it fast.” Once our program is correct and we have several working solutions, we want the fastest one. Having several options is common in Python, despite the principle from The Zen of Python that says: “There should be one-- and preferably only one --obvious way to do it.”
An inaccurate way to time our code is to call the time() function from the module of the same name before and after the code fragment of interest. The resolution of time() is coarse, often comparable to or longer than the fragment's execution time. The right tool for the job is the timeit() function in the timeit module. The function takes several parameters, the first two being the most important.
- The first parameter is the code fragment that we want to time. We must provide it to the function as a string; otherwise, Python will execute it before calling the timing function. The code must include the definitions of all involved variables, because the function evaluates the parameters in a clean environment that doesn't include any previously defined functions and variables or previously imported modules. The function evaluates the code number times, where number is an optional parameter, and returns the total execution time.
- The second parameter is another string that contains the initialization code. The timeit() function executes this code once before it starts collecting statistics. We can use this parameter to import modules or define global variables.
We may wonder what the best way to square a number is. Do we multiply it by itself, use the exponentiation operator (**), or call math.pow()? In cases like this, we should turn to The Zen of Python, which says: “In the face of ambiguity, refuse the temptation to guess.” Use timeit()!
Note: If we apply the first two code fragments to an integer, the results will be quite different. For more details, see the Beware of Large and Slow Ints section at the end of this lesson.

The timeit() function returns the total execution time in seconds. It executes the statement number times (1,000,000 by default); divide the returned time by number to get the per-statement execution time.
Let’s run the below code.
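A sketch of the comparison using timeit() (the base value 1.23 and the repetition count are our choices):

```python
import timeit

# Time three ways of squaring a float; timeit() returns the total
# time in seconds for `number` executions of the statement.
n = 1_000_000
t_mul = timeit.timeit('x * x', setup='x = 1.23', number=n)
t_pow = timeit.timeit('x ** 2', setup='x = 1.23', number=n)
t_math = timeit.timeit('math.pow(x, 2)',
                       setup='import math; x = 1.23', number=n)

print(f'x * x:          {t_mul:.17f}')
print(f'x ** 2:         {t_pow:.17f}')
print(f'math.pow(x, 2): {t_math:.17f}')
```

On a typical machine, the plain multiplication comes out fastest, but the exact numbers vary from run to run.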
All seventeen digits after the decimal point notwithstanding, multiplication looks notably faster, closely followed by the exponentiation operator. math.pow() is the slowest.
Multiplying a number by itself is faster than calling math.pow().
Checkpoint saves time
There's nothing more frustrating than irrecoverably deleting a half-finished Ph.D. dissertation file, or a program crashing after several hours (or days) of heavy-duty computations. Frequent backups help with the former. Checkpointing helps with the latter.
Checkpointing is a form of real-time backup and can be easily implemented in Python via pickling.
To create a checkpoint, save the values of the variables that were expensive to compute into files; if the program crashes, we can restore the values from the files instead of recomputing them.
For example, our program may call the fictitious and very time-consuming functions foo() and bar():
result1 = foo(original_data)
result2 = bar(result1)
We may want to add one or two checkpoints (two, if we want to treat the final result in the same way as the intermediate result). We should create a directory for the checkpoint data, and check whether the results already exist before recalculating them. If a checkpoint covers multiple results, store them as a tuple or one after another in the same file; pickle supports several objects per file.
import pickle

try:
    with open('checkpoints/result1.p', 'rb') as infile:
        result1 = pickle.load(infile)
except FileNotFoundError:
    result1 = foo(original_data)
    with open('checkpoints/result1.p', 'wb') as outfile:
        pickle.dump(result1, outfile)

try:
    with open('checkpoints/result2.p', 'rb') as infile:
        result2 = pickle.load(infile)
except FileNotFoundError:
    result2 = bar(result1)
    with open('checkpoints/result2.p', 'wb') as outfile:
        pickle.dump(result2, outfile)
This solution extends to any number of intermediate stages and can easily save lots of time and effort.
This tip is quite similar to CacheIt. As mentioned, checkpointing is a form of caching.
Optimize lookups
Some say (and introductory computer science books agree) that searching through data is one of the most frequent operations in computing, if not the most frequent one. If we need a whole engine for something (a search engine), that something must be important.
Use of in operator
The Python way to search for an item in some data set (typically in a collection, such as a list, a tuple, a set, or a dictionary) is to use the in operator. The operator looks up an item and checks if it’s in the collection either as a value or as an object. We shouldn’t worry about a single lookup in a small collection. As the collection size and the number of lookups grow, however, performance may become an issue.
List lookups are the slowest. A list is a sequential data structure. To check if a metaphorical needle of value 1001 is in the list haystack, Python must scan through the list and compare the needle with each list item. In the worst case, it may need to scan the whole list. The longer the list, the longer it takes to scan through. On an average computer, checking if a nonexistent needle is in a haystack of 1,000 integer numbers takes around 9.1 microseconds:
Let’s try it here.
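A sketch of the worst-case list lookup (our haystack holds 0 through 999, so the needle 1001 is never found and the whole list is scanned):

```python
import timeit

# The needle 1001 is absent, so every lookup scans all 1,000 items.
n = 100_000
t_list = timeit.timeit('1001 in haystack',
                       setup='haystack = list(range(1000))',
                       number=n)
print(f'{t_list / n * 1e6:.2f} microseconds per list lookup')
```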
Tuples are not any better. They are immutable but still sequential, so apparently mutability doesn't matter:
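The same measurement with a tuple (a sketch, using the same haystack contents as before):

```python
import timeit

# A tuple is immutable, but membership testing is still a linear scan.
n = 100_000
t_tuple = timeit.timeit('1001 in haystack',
                        setup='haystack = tuple(range(1000))',
                        number=n)
print(f'{t_tuple / n * 1e6:.2f} microseconds per tuple lookup')
```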
Dictionaries and sets offer a significant improvement. For a collection of 1,000 distinct numbers, they outperform lists by a factor of 450 to 500, and the speedup grows with the collection size:
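A sketch that measures the speedup directly (collection size and loop count are our choices; the exact ratios depend on the machine):

```python
import timeit

n = 100_000
t_list = timeit.timeit('1001 in haystack',
                       setup='haystack = list(range(1000))', number=n)
t_set = timeit.timeit('1001 in haystack',
                      setup='haystack = set(range(1000))', number=n)
t_dict = timeit.timeit('1001 in haystack',
                       setup='haystack = dict.fromkeys(range(1000))',
                       number=n)
print(f'set is about {t_list / t_set:.0f}x faster than list')
print(f'dict is about {t_list / t_dict:.0f}x faster than list')
```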
Hash tables
How is this possible? Python dictionaries and sets are implemented as hash tables, which have an almost constant lookup time. A hash table consists of bins (buckets) arranged in an array. A bucket's location is calculated from the needle's value by calling a hash function. Both the function call and the array lookup take constant time. If the needle is present in the set, it must be stored in that specific bucket; if it's not in that bucket, it's not in the set. Simple and fast.
The choice of the appropriate collection type may become a choice between a feasible program and a program that doesn’t practically terminate. On a practical note, if we’re given a list of values (say, a list of stopwords for a natural language-processing job), and we plan to search the list in a loop, we should invest in converting the list into a set and then search the set:
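A minimal sketch with a made-up stopword list (the words and variable names are illustrative):

```python
# A hypothetical stopword list; a real NLP job might have hundreds of words.
stopwords = ['a', 'an', 'the', 'and', 'or', 'in', 'on', 'of']

# One-time investment: convert the list into a set for fast lookups.
stopword_set = set(stopwords)

words = 'the cat sat on a mat'.split()
content_words = [w for w in words if w not in stopword_set]
print(content_words)  # ['cat', 'sat', 'mat']
```

Every lookup in the loop now takes near-constant time instead of scanning the whole list.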
This is also true for the not in operator.
Avoid function calls, they’re costly
It takes time to call a function, even one without parameters. On an average computer, a call to a function without parameters takes about 55 nanoseconds. For example, consider a function that takes no parameters and calculates the sum of two numbers:
def sum_of_two():
    a = 4
    b = 5
    return a + b
Let’s calculate the execution time for the above function call.
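A sketch of the measurement; conveniently, timeit() also accepts a callable, which saves us from spelling the call out as a string:

```python
import timeit

def sum_of_two():
    a = 4
    b = 5
    return a + b

# Time one million calls; divide by the count for the per-call cost.
n = 1_000_000
t = timeit.timeit(sum_of_two, number=n)
print(f'about {t / n * 1e9:.0f} ns per call')
```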
On top of that, passing each parameter costs another 3 nanoseconds. While minor, it adds up, especially when we call a function in a loop.
Now, let’s pass parameters to the function above and calculate the execution time difference.
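A sketch of the parameterized version; the lambda wrapper adds a little overhead of its own, but the argument passing is now part of what we measure:

```python
import timeit

def sum_of_two(a, b):
    return a + b

n = 1_000_000
# The lambda forwards the arguments, so each timed iteration
# includes the cost of passing two parameters.
t = timeit.timeit(lambda: sum_of_two(4, 5), number=n)
print(f'about {t / n * 1e9:.0f} ns per call with two parameters')
```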
This increasing delay means that we should not create and call functions without a need.
- An arithmetic addition expression is faster (and better) than a function add(a, b) that adds two numbers.
- A function with ten parameters is somewhat slower than the same function that takes one tuple of ten items.
- Multiplying a number by itself is faster than calling math.pow().
How do we know when to avoid a function call? Ask these questions:

- Is there a way to solve the problem without that call?
- Is the functionless solution maintainable and readable?
- Is the functionless solution much faster?

If the answer to all of them is yes, avoid the call. Otherwise, call the function, because our job is to make it right first and only then make it fast.
Build, then print
Function calls are costly in terms of execution time, but few function calls are as expensive as calls to print() and other input/output functions. Displaying visual output involves interacting with the operating system (slow on its own) and scrolling the console window (also slow). Compare the following two code fragments that accomplish precisely the same goal: displaying 1,000 letter a's on one line.
Let's try both: printing letter by letter, and building the string first and then printing it at once.
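A sketch of both fragments, wrapped in functions so that timeit() can run them (the repetition count is our choice):

```python
import timeit

# Fragment 1: a thousand print() calls, each suppressing the newline.
def print_each():
    for _ in range(1000):
        print('a', end='')
    print()

# Fragment 2: build the whole line first, then print it once.
def build_then_print():
    print('a' * 1000)

t1 = timeit.timeit(print_each, number=10)
t2 = timeit.timeit(build_then_print, number=10)
```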
The first fragment addresses the problem literally. It displays the letter 1,000 times without going to the next line, and finally breaks the line. The second fragment first builds a string that contains 1,000 a's and then displays it at once.
The second fragment takes less time to execute on an average computer.
Python string functions and operators are very efficient. If performance is important, we should construct as much output as possible as one string and display it with as few calls to print() as possible.
Waste space, save time
Python includes a strong collection of built-in data types and data structures. More data types and data structures come with the standard library. There seems to be a suitable container for any occasion.
There are cases when two containers work better (faster) together than one alone, though.
Consider a situation when we must remove duplicates from a list but preserve the list order (and keep only the first occurrence of each element). The naive solution of converting the list to a set and then back to a list fails because sets are unordered.
In the CPython implementation of Python 3.6, and in all implementations of Python 3.7 and above, dictionaries are guaranteed to preserve insertion order. We can construct a dictionary using the original list items as keys and some constant, like 1, as the value. If the list has duplicate items, only the first occurrence of each key is preserved:
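A sketch using dict.fromkeys(), which maps every item to the same constant value:

```python
items = [3, 1, 3, 2, 1, 4]

# Dictionary keys are unique and keep insertion order, so duplicates
# vanish while the first-occurrence order survives.
unique = list(dict.fromkeys(items, 1))
print(unique)  # [3, 1, 2, 4]
```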
What if we use an earlier version of Python? Checking whether the next item is already in the output list is time-consuming, as lists have notoriously slow lookup times. Instead, we can combine a set for tracking the duplicates with a list for collecting the unique values:
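A sketch of the two-container approach (variable names are ours):

```python
items = [3, 1, 3, 2, 1, 4]

seen = set()   # fast membership checks
unique = []    # preserves the original order
for item in items:
    if item not in seen:
        seen.add(item)
        unique.append(item)
print(unique)  # [3, 1, 2, 4]
```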
Two parallel data structures require more space but provide superior performance. Still, it's best to simply use a newer version of Python.
Beware of large and slow ints
Unlike C, C++, Java, and other popular programming languages, Python supports unbounded (very large and very small) integer numbers. The numbers smaller than or equal to sys.maxsize (9,223,372,036,854,775,807 on a 64-bit CPU) are represented natively as CPU words. Larger numbers use a more complicated implementation.
Let's check sys.maxsize.
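A quick check (a sketch; the exact value depends on the platform):

```python
import sys

# The largest "native" integer: 9,223,372,036,854,775,807 on a 64-bit CPU.
print(sys.maxsize)
```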
Python also supports numbers even larger than sys.maxsize. Thanks to this feature, we can perform arbitrary-precision computations with integer numbers. For example, we can have a googol, which is a one followed by 100 zeros:
Let’s run the below code.
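A sketch (the variable name googol is ours):

```python
import sys

googol = 10 ** 100  # a one followed by 100 zeros
print(googol)
print(googol > sys.maxsize)  # True
```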
Let’s add one to it and then check:
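Integer arithmetic stays exact, no matter how large the operands get:

```python
googol = 10 ** 100
print(googol + 1)             # the exact value, ending in ...0001
print((googol + 1) - googol)  # 1
```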
Unbounded integer numbers aren't free, though. Integer arithmetic always produces exact results, while floating-point arithmetic is approximate; the evaluation of an integer expression may take much more time but provides perfect accuracy.
Let’s compare the two operations:
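A sketch comparing exact integer exponentiation with approximate floating-point exponentiation (the base and exponent are our choices; 3.0 ** 500 still fits in a float, so no overflow occurs):

```python
import timeit

n = 10_000
# Same expression, but an int base forces exact big-integer arithmetic,
# while a float base uses fast (approximate) hardware floating point.
t_int = timeit.timeit('x ** 500', setup='x = 3', number=n)
t_float = timeit.timeit('x ** 500', setup='x = 3.0', number=n)
print(f'int:   {t_int:.4f} s')
print(f'float: {t_float:.4f} s')
```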
We typically don't need exact large results. Unless we do, we should stick to floating-point numbers and enjoy the speedy execution.
Quiz
Which data structures work as hash tables in Python? (Multi-select)
sets
lists
arrays
dictionaries