Speeding Up Python with Concurrency, Parallelism and asyncio

What are concurrency and parallelism, and how do they apply to Python?

There are many reasons your applications can be slow. Sometimes this is due to poor algorithmic design or the wrong choice of data structure. Sometimes, however, it’s due to forces outside of our control, such as hardware constraints or the quirks of networking. That’s where concurrency and parallelism fit in. They allow your programs to do multiple things at once, either at the same time or by wasting the least possible time waiting on busy tasks.

Whether you’re dealing with external web resources, reading from and writing to multiple files, or need to use a calculation-intensive function multiple times with different parameters, this post should help you maximize the efficiency and speed of your code.

First, we’ll delve into what concurrency and parallelism are and how they fit into the realm of Python using standard libraries such as threading, multiprocessing, and asyncio. The last portion of this post will compare Python’s implementation of async/await with how other languages have implemented it.

You can find all the code examples from this post in the concurrency-parallelism-and-asyncio repo on GitHub.

To work through the examples in this post, you should already know how to work with HTTP requests.

Objectives

By the end of this post, you should be able to answer the following questions:

What is concurrency?
What is a thread?
What does it mean when something is non-blocking?
What is an event loop?
What’s a callback?
Why is the asyncio method always a bit faster than the threading method?
When should you use threading, and when should you use asyncio?
What is parallelism?
What’s the difference between concurrency and parallelism?
Is it possible to combine asyncio with multiprocessing?
When should you use multiprocessing vs asyncio or threading?
What’s the difference between multiprocessing, asyncio, and concurrent.futures?
How do I test asyncio with pytest?

Concurrency

What is concurrency?

An effective definition for concurrency is “being able to perform multiple tasks at once”. This is a bit misleading though, as the tasks may or may not actually be performed at exactly the same time. Instead, a process might start, then once it’s waiting on a specific instruction to finish, switch to a new task, only to come back once it’s no longer waiting. Once one task is finished, it switches again to an unfinished task until they have all been performed. Tasks start asynchronously, get performed asynchronously, and then finish asynchronously.

If that was confusing to you, let’s instead think of an analogy: Say you want to make a BLT. First, you’ll want to throw the bacon in a pan on medium-low heat. While the bacon’s cooking, you can get out your tomatoes and lettuce and start preparing (washing and cutting) them. All the while, you continue checking on and occasionally flipping over your bacon.

At this point, you’ve started a task, and then started and completed two more in the meantime, all while you’re still waiting on the first.

Eventually you put your bread in a toaster. While it’s toasting, you continue checking on your bacon. As pieces get finished, you pull them out and place them on a plate. Once your bread is done toasting, you apply to it your sandwich spread of choice, and then you can start layering on your tomatoes, lettuce, and then, once it’s done cooking, your bacon. Only once everything is cooked, prepared, and layered can you place the last piece of toast onto your sandwich, slice it (optional), and eat it.

Because it requires you to perform multiple tasks at the same time, making a BLT is inherently a concurrent process, even if you are not giving your full attention to each of those tasks all at once. For all intents and purposes, for the next section, we will refer to this form of concurrency as just “concurrency.” We’ll differentiate it later on in this post.

Because the time spent waiting on one task can be used to make progress on another, concurrency is great for I/O-intensive processes – tasks that involve waiting on web requests or file read/write operations.

In Python, there are a few different ways to achieve concurrency. The first we’ll take a look at is the threading library.

For our examples in this section, we’re going to build a small Python program that grabs a random music genre from Binary Jazz’s Genrenator API 5 times, prints the genre to the screen, and puts each one into its own file.

To work with threading in Python, the only import you’ll need is threading, but for this example, I’ve also imported urllib to work with HTTP requests, time to determine how long the functions take to complete, and json to easily convert the JSON data returned from the Genrenator API.

Let’s start with a simple function:

import json
from urllib.request import Request, urlopen


def write_genre(file_name):
    """
    Uses genrenator from binaryjazz.us to write a random genre to the
    name of the given file
    """

    # Build the request with an explicit User-Agent header
    req = Request(
        "https://binaryjazz.us/wp-json/genrenator/v1/genre/",
        headers={"User-Agent": "Mozilla/5.0"},
    )
    genre = json.load(urlopen(req))

    with open(file_name, "w") as new_file:
        print(f'Writing "{genre}" to "{file_name}"...')
        new_file.write(genre)

Examining the code above, we’re making a request to the Genrenator API, loading its JSON response (a random music genre), printing it, then writing it to a file.

Without the “User-Agent” header you will receive a 304.
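As a rough sketch of how the header and a failed request might be handled, the snippet below builds the same request and wraps the call in a try/except. The helper names build_request and fetch_genre, the 10-second timeout, and the choice to return None on failure are assumptions made for illustration; they are not part of the original program.

```python
import json
from urllib.error import URLError
from urllib.request import Request, urlopen

GENRE_URL = "https://binaryjazz.us/wp-json/genrenator/v1/genre/"


def build_request(url=GENRE_URL):
    # Hypothetical helper: attach the User-Agent header to the request
    return Request(url, headers={"User-Agent": "Mozilla/5.0"})


def fetch_genre(url=GENRE_URL):
    # Hypothetical helper: return the genre, or None if the request fails.
    # HTTPError is a subclass of URLError, so HTTP error statuses land here too.
    try:
        with urlopen(build_request(url), timeout=10) as response:
            return json.load(response)
    except URLError as error:
        print(f"Request failed: {error}")
        return None


# Inspect the request without touching the network.
# Request stores header names capitalized, hence "User-agent".
req = build_request()
print(req.get_header("User-agent"))
```

Calling fetch_genre() performs the actual network request, so it is left out of the top-level code here.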

What we’re really interested in is the next section, where the actual threading happens:

threads = []

for i in range(5):
    thread = threading.Thread(
        target=write_genre,
        args=[f"./threading/new_file{i}.txt"]
    )
    thread.start()
    threads.append(thread)

for thread in threads:
    thread.join()

We start with an empty list, then loop 5 times. On each pass we create a new thread, start it, and append it to our “threads” list. Finally, we iterate over the list one last time to join each thread.

Explanation: Creating threads in Python is easy.

To create a new thread, use threading.Thread(). You can pass into it the kwarg (keyword argument) target with a value of whatever function you would like to run on that thread. But pass in the function itself, not the result of calling it (meaning, for our purposes, write_genre and not write_genre()). To pass arguments, pass in “kwargs” (which takes a dict of your kwargs) or “args” (which takes an iterable containing your args – in this case, a list).

Creating a thread is not the same as starting a thread, however. To start your thread, use {the name of your thread}.start(). Starting a thread means “starting its execution.”

Lastly, when we join threads with thread.join(), all we’re doing is ensuring the thread has finished before continuing on with our code.
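To make the create/start/join distinction concrete, here is a minimal sketch that avoids the network entirely. The greet function and the results list are hypothetical stand-ins for write_genre; only threading.Thread, start(), join(), and is_alive() come from the standard library.

```python
import threading

results = []


def greet(name, punctuation="!"):
    # Runs on the worker thread once the thread is started
    results.append(f"Hello, {name}{punctuation}")


# Creating the thread does not run anything yet.
# Note target=greet (the function itself), not target=greet().
thread = threading.Thread(
    target=greet,
    args=["world"],                 # positional arguments, as an iterable
    kwargs={"punctuation": "?"},    # keyword arguments, as a dict
)
alive_before_start = thread.is_alive()  # False: created but not started

thread.start()  # begins executing greet("world", punctuation="?")
thread.join()   # blocks here until greet has returned

alive_after_join = thread.is_alive()    # False again: the thread has finished
print(results)  # ['Hello, world?']
```

Because join() waits for the thread to finish, it is safe to read results on the line after it.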

Threads

But what exactly is a thread?

A thread is a way of allowing your computer to break up a single process/program into many lightweight pieces that execute in parallel. Somewhat confusingly, Python’s standard implementation of threading limits threads to only being able to execute one at a time due to something called the Global Interpreter Lock (GIL). The GIL is necessary because CPython’s (Python’s default implementation) memory management is not thread-safe. Because of this limitation, threading in Python is concurrent, but not parallel. To get around this, Python has a separate multiprocessing module not limited by the GIL that spins up separate processes, enabling parallel execution of your code. Using the multiprocessing module is nearly identical to using the threading module.

More info about Python’s GIL and thread safety can be found on Real Python and Python’s official docs.
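One way to see that threads are pieces of a single process is to check the process ID from inside each thread and have them all update the same object. This is only an illustrative sketch; the work function, the counter dict, and the choice of four threads are assumptions.

```python
import os
import threading

main_pid = os.getpid()
seen_pids = []
counter = {"value": 0}
lock = threading.Lock()


def work():
    # Every thread reports the process it is running in
    seen_pids.append(os.getpid())
    # The lock makes the read-modify-write on the shared counter safe
    with lock:
        counter["value"] += 1


threads = [threading.Thread(target=work) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(set(seen_pids) == {main_pid})  # True: all threads share one process
print(counter["value"])              # 4: all threads updated the same object
```

Separate processes, by contrast, each get their own process ID and their own copy of memory, which is why multiprocessing is not constrained by a single interpreter's GIL.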

We’ll take a more in-depth look at multiprocessing in Python shortly.

Before we show the potential speed improvement over non-threaded code, I took the liberty of also creating a non-threaded version of the same program (again, available on GitHub). Instead of creating a new thread and joining each one, it instead calls write_genre in a for loop that iterates 5 times.

To compare speed benchmarks, I also imported the time library to time the execution of our scripts:

Starting...
Writing "binary indoremix" to "./sync/new_file0.txt"...
Writing "slavic aggro polka fusion" to "./sync/new_file1.txt"...
Writing "israeli new wave" to "./sync/new_file2.txt"...
Writing "byzantine motown" to "./sync/new_file3.txt"...
Writing "dutch hate industrialtune" to "./sync/new_file4.txt"...
Time to complete synchronous read/writes: 1.42 seconds

Upon running the script, we see that it takes my computer around 1.42 seconds (along with classic music genres such as “dutch hate industrialtune”). Not too bad.
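The timing harness for a synchronous run like this one might look roughly as follows. To keep the sketch runnable offline, a hypothetical slow_task function that sleeps for 0.05 seconds stands in for write_genre; the real script is in the GitHub repo and may differ.

```python
import time


def slow_task(i):
    # Hypothetical stand-in for write_genre: sleep instead of doing network/file I/O
    time.sleep(0.05)
    return i


print("Starting...")
start = time.perf_counter()

# Call the function 5 times, one after another; each call waits for the previous one
completed = [slow_task(i) for i in range(5)]

elapsed = time.perf_counter() - start
print(f"Time to complete synchronous read/writes: {round(elapsed, 2)} seconds")
```

Since each call blocks until it returns, the total time is roughly the sum of the individual waits.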

Now let’s run the version we just built that uses threading:

Starting...
Writing "college k-dubstep" to "./threading/new_file2.txt"...
Writing "swiss dirt" to "./threading/new_file0.txt"...
Writing "bop idol alternative" to "./threading/new_file4.txt"...
Writing "ethertrio" to "./threading/new_file1.txt"...
Writing "beach aust shanty français" to "./threading/new_file3.txt"...
Time to complete threading read/writes: 0.77 seconds

The first thing that might stand out to you is that the functions were not completed in order: 2 - 0 - 4 - 1 - 3.

This is because of the asynchronous nature of threading: as one function waits, another one begins, and so on. Because we’re able to continue performing tasks while we’re waiting on others to finish (either due to networking or file I/O operations), you may also have noticed that we cut our time roughly in half: 0.77 seconds. While this might not seem like a lot now, it’s easy to imagine the very real case of building a web application that needs to write much more data to a file or interact with much more complex web services.
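Both effects, out-of-order completion and the shorter total time, can be reproduced without the network by simulating the waits. In this sketch the DELAYS values and the simulated_io function are assumptions chosen for illustration; real response times vary from run to run.

```python
import threading
import time

# Hypothetical wait times in seconds, standing in for network/file latency
DELAYS = [0.30, 0.10, 0.20]

finish_order = []


def simulated_io(index, delay):
    # While one thread sleeps, the others are free to run
    time.sleep(delay)
    finish_order.append(index)


start = time.perf_counter()
threads = [
    threading.Thread(target=simulated_io, args=(i, d))
    for i, d in enumerate(DELAYS)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded_elapsed = time.perf_counter() - start

# Tasks finish in order of their wait time, not their start order
print(finish_order)
# Total time is close to the longest wait, not the sum of all waits
print(f"{threaded_elapsed:.2f}s threaded vs {sum(DELAYS):.2f}s if run one by one")
```

The threads are started in order 0, 1, 2, but the one with the shortest wait finishes first, which mirrors the shuffled file numbers in the output above.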

So, if threading is so great, why don’t we end the post here?

Because there are even better ways to perform tasks concurrently.

testdriven.io
