Introduction
Explore the concept of code vectorization and how to speed up Python programs using NumPy. Understand practical examples of summing lists with NumPy's vectorized functions, and see how this approach outperforms pure Python methods in speed and adaptability.
What is code vectorization?
Code vectorization means that the problem you’re trying to solve is inherently vectorizable and only requires a few NumPy tricks to make it faster. This does not mean it is easy or straightforward, but at least it does not require totally rethinking your problem (as will be the case in the Problem vectorization chapter). Still, it may take some experience to see where code can be vectorized.
Example
Let’s illustrate this through a simple example in which we want to add two lists of integers element by element.
Solution 1: Pure Python
One simple way using pure Python is:
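A minimal sketch of such a pure Python solution, assuming the two lists have the same length. The function name add_python and the sample values are illustrative assumptions, not taken from the lesson:

```python
def add_python(Z1, Z2):
    # Pair up the elements of both lists and add each pair in a Python-level loop
    return [z1 + z2 for z1, z2 in zip(Z1, Z2)]


# Illustrative input values (assumed for this sketch)
Z1 = [1, 2, 3]
Z2 = [4, 5, 6]

result = add_python(Z1, Z2)
print(result)  # [5, 7, 9]
```

Note that zip stops at the shorter list, so lists of unequal length are silently truncated rather than rejected.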
Solution 2: Use np.add(list1,list2)
This first naive solution can be vectorized very easily using NumPy:
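A minimal sketch of the vectorized version, using np.add as named in the heading above. The function name add_numpy and the sample values are illustrative assumptions:

```python
import numpy as np


def add_numpy(Z1, Z2):
    # np.add converts list inputs to arrays and adds them element-wise in compiled code
    return np.add(Z1, Z2)


# Illustrative input values (assumed for this sketch)
Z1 = [1, 2, 3]
Z2 = [4, 5, 6]

result = add_numpy(Z1, Z2)
print(result)  # [5 7 9]
```

The result is a NumPy array rather than a list; call result.tolist() if a plain list is needed.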
Compare the time of the two approaches
Unsurprisingly, benchmarking the two approaches shows that the second method is faster, by about one order of magnitude.
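One way to run such a comparison is with the standard timeit module. This is a sketch under stated assumptions: the function names, the input size n, and the repeat count are all illustrative, and the inputs to np.add are converted to arrays beforehand so that list-to-array conversion is not included in the timing. Actual numbers depend on the machine and the input size, so none are shown here.

```python
import random
import timeit

import numpy as np


def add_python(Z1, Z2):
    # Element-wise addition with a Python-level loop
    return [z1 + z2 for z1, z2 in zip(Z1, Z2)]


def add_numpy(Z1, Z2):
    # Element-wise addition delegated to NumPy
    return np.add(Z1, Z2)


n = 1000  # assumed input size
Z1 = random.sample(range(n), n)
Z2 = random.sample(range(n), n)

# Convert once, outside the timed region (an assumption of this sketch)
A1, A2 = np.array(Z1), np.array(Z2)

repeats = 1000  # assumed number of timed calls
t_python = timeit.timeit(lambda: add_python(Z1, Z2), number=repeats)
t_numpy = timeit.timeit(lambda: add_numpy(A1, A2), number=repeats)

print(f"add_python: {t_python / repeats * 1e6:.2f} microseconds per call")
print(f"add_numpy:  {t_numpy / repeats * 1e6:.2f} microseconds per call")
```

If plain lists are passed to add_numpy instead of arrays, the conversion cost is paid on every call and the gap between the two methods can shrink considerably.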
Not only is the second approach faster, but it also naturally adapts to the shape of Z1 and Z2. This is why we did not simply write Z1 + Z2: if Z1 and Z2 were both lists, + would concatenate them instead of adding them element-wise. In the first Python method, the inner + is interpreted differently depending on the type of the two objects, so that for two nested lists we get the following outputs:
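The three behaviours can be reproduced with a short sketch. The nested list values and the function names add_python and add_numpy are assumptions chosen for illustration:

```python
import numpy as np


def add_python(Z1, Z2):
    # The inner + concatenates when the zipped items are themselves lists
    return [z1 + z2 for z1, z2 in zip(Z1, Z2)]


def add_numpy(Z1, Z2):
    # np.add treats the nested lists as 2-D arrays
    return np.add(Z1, Z2)


# Illustrative nested lists (assumed values)
Z1 = [[1, 2], [3, 4]]
Z2 = [[5, 6], [7, 8]]

outer_concat = Z1 + Z2
inner_concat = add_python(Z1, Z2)
numeric_sum = add_numpy(Z1, Z2)

print(outer_concat)  # [[1, 2], [3, 4], [5, 6], [7, 8]]
print(inner_concat)  # [[1, 2, 5, 6], [3, 4, 7, 8]]
print(numeric_sum)
# [[ 6  8]
#  [10 12]]
```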
Summing up, Z1 + Z2 concatenates the two lists, the pure Python method concatenates the inner lists pairwise, and np.add computes what is (numerically) expected.
Solve this Quiz!
Which of the following is a good approach when designing a solution to a problem?
Think of a brute force Python solution
Think of vectorization using NumPy tricks
Now that you have learned code vectorization, let’s move on to an exercise in the next lesson.