Using Threads

Learn how to use threads in Python to run functions concurrently. Understand thread management, including daemon threads and the join method, and recognize the performance constraints imposed by Python's Global Interpreter Lock. See in practice how a CPU-bound task behaves when it is spread across several threads.

What are threads?

Threads in Python are a good way to run a function concurrently with other functions. If your system does not support multiple processors, the threads will be executed one after another as scheduled by the operating system. However, if multiple CPUs are available, threads could be scheduled on multiple processing units, once again as determined by the operating system.

By default, there is only one thread, the main thread, and it is the thread that runs your Python application. To start another thread, Python provides the threading module.
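As a quick check of the point above, the sketch below asks the threading module which thread is the main one and starts a second thread that reports its own name. The helper collect_thread_names and the thread name worker-1 are made up for this illustration, and Python 3 is assumed.

```python
import threading


def collect_thread_names():
    """Return the main thread's name and the name seen inside a second thread."""
    seen = []

    def record_name():
        # This runs inside the new thread, so current_thread() is the worker.
        seen.append(threading.current_thread().name)

    # "worker-1" is an arbitrary, illustrative name.
    worker = threading.Thread(target=record_name, name="worker-1")
    worker.start()
    worker.join()  # wait so that `seen` is filled before it is read
    return threading.main_thread().name, seen[0]


if __name__ == "__main__":
    main_name, worker_name = collect_thread_names()
    print(main_name, worker_name)
```

Unless it has been renamed, the main thread is called MainThread; the second name is whatever was passed to the Thread constructor.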

To run the following code snippet, press the Run button and enter the command python2 threading-start.py.

To change the source code in the playground, edit it and press Run again, then wait for the four-step container creation process to complete. Click on the terminal and issue the command again to rerun the program.

import threading

def print_something(something):
    print(something)

t = threading.Thread(target=print_something, args=("hello",))
t.start()
print("thread started")
t.join()
Starting a new thread

If you run the above example multiple times, you will notice that the output might be different each time. On my laptop, doing this gives the following:

$ python2 threading-start.py
hellothread started
$ python2 threading-start.py
hello
thread started
$ python2 threading-start.py
hello
thread started

If you expected the same output every time, remember that there is no guarantee regarding the order in which threads execute.

Once the second thread is started, the main thread waits for it to complete by calling its join method. Using join is a handy way to make sure no threads are left behind.
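The waiting behavior of join can be observed with is_alive. The following is a minimal sketch, assuming Python 3; slow_task and join_demo are illustrative names, and the sleep stands in for real work. It also uses the optional timeout argument of join, which stops waiting after the given number of seconds without stopping the thread.

```python
import threading
import time


def slow_task():
    time.sleep(0.5)  # stand-in for real work


def join_demo():
    """Return whether the thread is alive after a timed join and after a full join."""
    t = threading.Thread(target=slow_task)
    t.start()

    t.join(timeout=0.05)  # stop waiting after 50 ms; the thread keeps running
    alive_after_timeout = t.is_alive()

    t.join()  # block until the thread has finished
    alive_after_join = t.is_alive()
    return alive_after_timeout, alive_after_join


if __name__ == "__main__":
    print(join_demo())
```

The first value should be True because the task outlives the short timeout, and the second should be False because a join without a timeout returns only once the thread is done.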

If you do not join all your threads, the main thread may reach the end of the program before the other threads have finished. When this happens, the program appears to be blocked while it waits for them, and it may not respond even to a simple KeyboardInterrupt signal.

Note: We have demonstrated this effect using Python 2. However, in Python 3, the threading effect is not easily observable using small-scale examples.

Threads as daemons

To avoid a program that hangs on unfinished threads, and because your program might not be in a position to wait for them, you can configure threads as daemons. When a thread is a daemon, Python treats it as a background thread and terminates it as soon as the main thread exits.

Python 3.5
import threading

def print_something(something):
    print(something)

t = threading.Thread(target=print_something, args=("hello",))
t.daemon = True
t.start()
print("thread started")

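In the above example, there is no longer a need to call the join method, since the thread is set to be a daemon. The trade-off is that a daemon thread is stopped abruptly when the main thread exits, so it may not get the chance to finish its work.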
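The difference between a daemon and a non-daemon thread is easiest to see at interpreter exit. The sketch below, which assumes Python 3.7 or later, runs a small child program twice in a separate Python process: once with a regular thread and once with a daemon thread. The names CHILD_TEMPLATE and run_child are invented for this illustration, and the half-second sleep is an arbitrary stand-in for work that outlives the main thread.

```python
import subprocess
import sys
import textwrap

# Child program: the worker outlives the main thread by about half a second.
CHILD_TEMPLATE = textwrap.dedent("""
    import threading
    import time

    def worker():
        time.sleep(0.5)
        print("worker done")

    t = threading.Thread(target=worker, daemon={daemon})
    t.start()
    print("main done")
""")


def run_child(daemon):
    """Run the child program in a fresh interpreter and return its output lines."""
    code = CHILD_TEMPLATE.format(daemon=daemon)
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return proc.stdout.splitlines()


if __name__ == "__main__":
    print("regular thread:", run_child(daemon=False))
    print("daemon thread: ", run_child(daemon=True))
```

With a regular thread, the interpreter waits for the worker before exiting, so both lines are printed. With a daemon thread, the process ends as soon as the main thread is done and the worker's message never appears.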

Running multiple threads

The program below is a simple example, which sums one million random integers eight times, spread across eight threads at the same time.

To run the following code snippet, press the Run button and wait until all the commands finish running. Then, enter the command time python multithreading-worker.py.

import random
import threading

results = []

def compute():
    results.append(sum(
        [random.randint(1, 100) for i in range(1000000)]))

workers = [threading.Thread(target=compute) for x in range(8)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
print("Results: %s" % results)
Summing random integers across eight threads

Running the above example with the time command prints the results along with statistics about execution time and CPU usage.

The program ran on an idle dual-core CPU, which means that Python could have used up to 200% CPU. However, it was unable to do so: even with eight threads running at once, CPU usage stuck at just above 100%, which is only slightly more than 50% of the hardware's capabilities.
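The same ceiling can be estimated from inside Python by comparing the CPU time consumed by the process with the wall-clock time that elapsed. The following is a rough sketch, assuming Python 3 on a standard CPython build with the GIL; run_workers and its parameters are illustrative, and the workload is smaller than in the program above so that it finishes quickly.

```python
import random
import threading
import time


def compute(results, count):
    # Same CPU-bound work as the program above, with a configurable size.
    results.append(sum(random.randint(1, 100) for _ in range(count)))


def run_workers(num_threads=8, count=200_000):
    """Run the workers and return (results, cpu_seconds, wall_seconds)."""
    results = []
    workers = [
        threading.Thread(target=compute, args=(results, count))
        for _ in range(num_threads)
    ]
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time of the whole process, all threads
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return results, cpu_seconds, wall_seconds


if __name__ == "__main__":
    results, cpu_seconds, wall_seconds = run_workers()
    print("Results: %s" % results)
    # A ratio near 1.0 corresponds to roughly 100% of a single core.
    print("CPU/wall ratio: %.2f" % (cpu_seconds / wall_seconds))
```

On a multi-core machine, a ratio that stays close to 1.0 regardless of the number of threads points to the same limit that the time command reports. Exact figures vary by machine, and a CPython build without the GIL would behave differently.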

The bottleneck

The following figure illustrates that bottleneck: to access all of the system’s CPUs, you need to go through CPython’s GIL.

Again, as discussed in the previous chapter, the GIL limits the performance of CPython when executing multiple threads. Threads are therefore most useful for work that spends its time waiting, such as input/output on slow networks or files, rather than for CPU-bound computation in pure Python. Such tasks can run in parallel without blocking the main thread.
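To illustrate the input/output case, the sketch below simulates slow network calls with time.sleep, which releases the GIL while it waits, much as a blocking read on a socket does. The names fake_download and run_concurrently are hypothetical, and Python 3 is assumed.

```python
import threading
import time


def fake_download(delay, results, index):
    # Sleeping releases the GIL, as waiting on a slow socket or file would.
    time.sleep(delay)
    results[index] = delay


def run_concurrently(delays):
    """Run one simulated download per delay and return (results, elapsed seconds)."""
    results = [None] * len(delays)
    threads = [
        threading.Thread(target=fake_download, args=(delay, results, index))
        for index, delay in enumerate(delays)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = run_concurrently([0.2] * 5)
    print("Completed %d downloads in %.2f seconds" % (len(results), elapsed))
```

Run one after another, five waits of 0.2 seconds would take about a second. With threads, the waits overlap, so the total should be close to the longest single wait.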