Awaiting Python

Meeshkan ML
8 min read · Nov 27, 2018

Python introduced the asyncio module back in Python 3.4, and the async/await syntax followed in Python 3.5. However, the original asyncio developers were not part of the documentation process, leading to a plethora of awaitable questions.

Writing documentation for code you didn’t write yourself is always a tedious process. Unfortunately, more often than not, it also leads to bad documentation. Even Guido van Rossum shared his feelings about the lacking documentation for the asyncio module.

Some background

A lot has changed since the original, “provisional” API was released in Python 3.4 (back in March 2014!). Without going into too much detail, the async/await syntax arrived in version 3.5, the library lost its provisional status in version 3.6, and it has been continuously updated and upgraded with each new release. Unfortunately, backward compatibility means that there is also a bucket-load of misnomers. In turn, this leads to articles such as “I don’t understand Python’s Asyncio”, or “I’m too Stupid for AsyncIO”.

While not without its drawbacks, asyncio has proven to be an important tool for us at Meeshkan. What troubled me is that, in our dev team, I’m probably the one with the least expertise in JavaScript. I come from a machine learning background, from low-level C and OS methodologies. Naturally, when my colleagues were talking about Futures, Promises, and awaiting them, I lowered my head and just took it as “oh well, some web talk again”.

However, if you’re anything like me and you lack general experience with the async/await pattern from languages built for asynchronicity, then follow me while I try to explain my key takeaways from asyncio.

But why

Python has offered threading and multiprocessing out of the box since way back in the day. For OS-oriented people, this sounds very attractive. While multiprocessing is often overkill and introduces the more difficult problem of inter-process communication, threading vs asynchronicity seemed like a reasonable debate (in many languages). It’s important to note that although threads in CPython are real OS-level threads, they do not run Python code in parallel: the GIL (Global Interpreter Lock; also highly recommended reading for all Pythonistas) lets only one thread execute Python bytecode at any given moment. The tl;dr is that you have no control over which thread will be running at any given moment, so time can be wasted switching to threads that have nothing to do. Abu Ashraf Masnun summarized all of this perfectly in his Async Python: The Different Forms of Concurrency.

The different elements

After that long introduction, let’s get down to it. Fast, because we’re all tired here and we came here for a reason. Let’s define the following concepts:
- Event Loops: These are your drivers. They allow juggling between the different tasks.
- Coroutines: Your “async methods”. A function defined with async def is a coroutine function; calling it doesn’t run its body, but returns a coroutine object that the event loop can run. Personally, they remind me of decorators, with a more confusing syntax.
- Tasks: Your very own asynchronous scheduler. In other words, once you have an event-loop, you’d probably want to schedule things for it to run. Once you schedule them, they become tasks.
- Future: An object whose goal is to synchronize things again once the task is done. It’s a temporary object that will 1) hold on to the result of the task once it’s done and 2) call any callback you registered, once it is done. A Task is a kind of Future (bam, we only have to care about 3 elements instead of 4!).
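
To see the four pieces next to each other, here is a minimal sketch. The coroutine name `add` and its arguments are made up for illustration, and I create the loop explicitly with asyncio.new_event_loop() so the snippet behaves the same on newer Python versions.

```python
import asyncio

async def add(a, b):
    # A coroutine function: calling it only builds a coroutine object.
    await asyncio.sleep(0)  # hand control back to the event loop once
    return a + b

loop = asyncio.new_event_loop()              # the event loop (the driver)
try:
    task = loop.create_task(add(1, 2))       # the coroutine object becomes a Task
    is_future = isinstance(task, asyncio.Future)  # a Task is a kind of Future
    result = loop.run_until_complete(task)   # run the loop until the task is done
finally:
    loop.close()

print(is_future, result)
```

The Task holds on to the return value of the coroutine, which is why run_until_complete can hand it back once the loop is done with it.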

Yes yes, but I need code to understand

Of course you do — we all do. I sure as hell needed it.

But there are plenty of “how to” examples exemplifying the use of different asyncio methods. Personally, I felt that those give a good idea of how to use these functions in a broad sense, but fail to explain what each individual line of code actually does. Hence, let’s look into some common and important method calls.

Event loop handling functions include:
asyncio.get_event_loop() will return the current event loop for the thread. If no such event loop exists, it will create one for you (newer Python versions deprecate this implicit creation). However, this functionality seems to only work on the main thread at the moment, so for any new threads (if you want to support multiple event loops), you will have to use …
asyncio.new_event_loop() followed by asyncio.set_event_loop() to create and set a new event loop for any non-main threads. I would carefully question the need for additional event loops, as only one thread runs Python code at a time anyway. Separating event loops within threads might make sense when your application starts growing out of control, since it might be easier to batch off event loops and their respective tasks, but more often than not, this is not the case. The downside of sticking to a single event loop while also using threads is that you have to pass that loop around, so each thread knows which event loop to use.

Once a loop exists, we can run it forever (until stopped with loop.stop()) with loop.run_forever(). Note, however, that run_forever() is a blocking statement.

Finally, once we’re done with our event loop, we can close it with loop.close(). We can also query its state via self-explanatory methods such as loop.is_running() and loop.is_closed(). We’ll get to running stuff with our loop very soon, I promise.
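
Here is a rough sketch of that life cycle in a non-main thread. The helper names (`stop_loop`, `run_loop_in_thread`) and the `state` dict are my own, purely for illustration; the asyncio calls are the ones discussed above.

```python
import asyncio
import threading

state = {}

def stop_loop(loop):
    # This callback runs inside the loop, so the loop reports itself as running.
    state["running_inside_callback"] = loop.is_running()
    loop.stop()  # makes run_forever() return

def run_loop_in_thread():
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)            # make it the current loop of this thread
    loop.call_later(0.01, stop_loop, loop)  # otherwise run_forever() never returns
    loop.run_forever()                      # blocks this thread until loop.stop()
    state["running_after_stop"] = loop.is_running()
    loop.close()
    state["closed"] = loop.is_closed()

thread = threading.Thread(target=run_loop_in_thread)
thread.start()
thread.join()
print(state)
```

Because run_forever() blocks, something has to be scheduled beforehand that eventually calls loop.stop(); here a delayed callback does that job.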

Coroutines, as mentioned, are simply functions that return.. well, not their result, but a coroutine object. That reminds me of decorators more than generators. They are defined just like any normal function in Python, except with the additional async prefix, a la async def my_first_coroutine(arg1, arg2, *args, **kwargs). Notice, however, that much like with decorators, calling one doesn’t run its body; it returns a simple coroutine object, which we can then use in the event loop.
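
A small sketch of that distinction, using a trimmed-down version of my_first_coroutine (the body is made up). The inspect helpers are only there to show which kind of object is which.

```python
import asyncio
import inspect

async def my_first_coroutine(arg1, arg2):
    return arg1 + arg2

coro = my_first_coroutine(1, 2)  # nothing has run yet
is_coro_function = inspect.iscoroutinefunction(my_first_coroutine)
is_coro_object = inspect.iscoroutine(coro)

loop = asyncio.new_event_loop()
try:
    value = loop.run_until_complete(coro)  # only now does the body execute
finally:
    loop.close()

print(is_coro_function, is_coro_object, value)
```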

What can we do with a coroutine object, you may ask?
First, inside any coroutine function we may use await. await suspends our function at that point and tells the event loop that other tasks can run in the meantime. The event loop will keep track of when this await statement is done… waiting. This is important because await cannot be used outside of coroutine functions!
Next, we can finally use our event loop to run something, using loop.run_until_complete(...). This also blocks until the coroutine ends. It’s like using await outside of a coroutine function. However, run_until_complete is designed for futures and tasks, and coroutine objects are implicitly wrapped in tasks when used this way. So let’s talk about…
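
To make the “other tasks can run meanwhile” part concrete, here is a hedged sketch with two made-up workers. asyncio.sleep and asyncio.gather are not covered in this post; I use them only as a convenient way to wait and to run two coroutines side by side.

```python
import asyncio

order = []

async def worker(name, delay):
    order.append(f"{name} start")
    await asyncio.sleep(delay)  # suspends this coroutine; the loop runs the other one
    order.append(f"{name} end")
    return name

async def main():
    # gather schedules both coroutines and waits for both results
    return await asyncio.gather(worker("slow", 0.05), worker("fast", 0.01))

loop = asyncio.new_event_loop()
try:
    results = loop.run_until_complete(main())  # blocks until main() finishes
finally:
    loop.close()

print(order)
print(results)
```

Both workers start before either one ends, which is the whole point: while one is awaiting, the event loop hands control to the other.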

Tasks. Tasks are, in my inexperienced opinion, the building blocks of asyncio. The most convenient way to create a proper task object is by 1) getting your event loop and 2) using loop.create_task(coroutine_object). This wraps the coroutine object in a task, schedules it to run on the event loop (i.e. once a loop is running with run_forever(), this is one way to assign tasks to it), and returns a task object which you can task.cancel() at any point.
Additionally, tasks offer similar methods to Future, supporting done(), result(), and most importantly add_done_callback(callback), allowing you to initiate synchronous methods once the task is done. Your callback will be called with the task object itself as its only argument, so call task.result() inside it to get the result. Unlike with Future, you should not call set_result() on a task, because the result is set by the coroutine.
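
A minimal sketch of these task methods follows. The coroutines `compute` and `never_finishes` and the callback `on_done` are hypothetical names for illustration.

```python
import asyncio

seen = []

async def compute():
    await asyncio.sleep(0.01)
    return 42

async def never_finishes():
    await asyncio.sleep(3600)

def on_done(finished_task):
    # The callback receives the task object, not the result.
    seen.append(finished_task.result())

loop = asyncio.new_event_loop()
try:
    task = loop.create_task(compute())
    task.add_done_callback(on_done)
    doomed = loop.create_task(never_finishes())

    value = loop.run_until_complete(task)
    finished = task.done()

    doomed.cancel()  # request cancellation; the loop still has to process it
    try:
        loop.run_until_complete(doomed)
    except asyncio.CancelledError:
        pass
    was_cancelled = doomed.cancelled()
finally:
    loop.close()

print(value, finished, seen, was_cancelled)
```

Note that cancel() only requests the cancellation. The task actually ends the next time the loop runs, which is why the sketch drives the loop once more afterwards.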

We’ve covered the important stuff and so far, to me, this feels like an-almost-complete-guide-for-the-inexperienced. We should cover several more nuances before we go to the wild and read asyncio examples.
Q: What if I want to call a synchronous function with asyncio?
A: You can add synchronous function calls using loop.call_* methods. These include loop.call_soon(callback, *args), loop.call_later(delay, callback, *args), loop.call_at(when, callback, *args). If you want to schedule the call from a different thread than the one running the loop, use the thread-safe option: loop.call_soon_threadsafe(callback, *args).
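
As a rough sketch, the three scheduling calls can be compared like this. The `record` function and the delays are arbitrary choices of mine.

```python
import asyncio

calls = []

def record(label):
    # A plain synchronous function; the loop calls it with the given argument.
    calls.append(label)

loop = asyncio.new_event_loop()
try:
    loop.call_later(0.05, record, "later")           # relative delay in seconds
    loop.call_soon(record, "soon")                   # next iteration of the loop
    loop.call_at(loop.time() + 0.01, record, "at")   # absolute time on the loop's clock
    loop.call_later(0.1, loop.stop)                  # end run_forever() afterwards
    loop.run_forever()
finally:
    loop.close()

print(calls)
```

call_at expects a time on the loop’s own clock (loop.time()), not a wall-clock timestamp.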
Q: My coroutine itself has a blocking call! What do I do?
A: If your coroutine has a blocking call, the event loop will also block. We have to use… you guessed it, another thread. The seams between the two (async and threads) are nigh endless. Long story short, you can schedule the blocking function to run in a different thread with loop.run_in_executor(None, func, *args). Note that only positional arguments are forwarded. This will return a Future object, which you can use to check done(), result() and of course cancel() and add_done_callback(callback).
With run_in_executor, you can also use await (it’s an asynchronous run after all!) to block your code from continuing, but not block your event loop from running. I have my own discontent about the way run_in_executor is formed, but then again, I guess if it ain’t broke, don’t fix it.
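
Here is a minimal sketch of that pattern. `blocking_io` and `heartbeat` are made-up names, time.sleep stands in for any blocking call, and asyncio.get_running_loop() (Python 3.7+) is simply my way of getting hold of the loop from inside a coroutine.

```python
import asyncio
import time

def blocking_io(seconds):
    time.sleep(seconds)  # a blocking call that would freeze the loop if run directly
    return "done"

async def heartbeat(ticks):
    for _ in range(3):
        ticks.append("tick")
        await asyncio.sleep(0.01)

async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    # None selects the loop's default thread pool executor
    future = loop.run_in_executor(None, blocking_io, 0.05)
    is_asyncio_future = isinstance(future, asyncio.Future)
    await heartbeat(ticks)   # the loop keeps running while the thread sleeps
    result = await future    # waits for the thread without blocking the loop
    return result, ticks, is_asyncio_future

loop = asyncio.new_event_loop()
try:
    result, ticks, is_asyncio_future = loop.run_until_complete(main())
finally:
    loop.close()

print(result, ticks, is_asyncio_future)
```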
Q: What if I want to use keyword arguments for my functions?
A: The loop.call_* methods and run_in_executor only forward positional arguments, so the documentation continuously points to functools.partial as the answer.
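
A short sketch of the functools.partial approach, with a hypothetical `greet` function that takes a keyword argument:

```python
import asyncio
import functools

def greet(name, punctuation="!"):
    return f"Hello, {name}{punctuation}"

async def main():
    loop = asyncio.get_running_loop()
    # Bind the keyword argument up front; the executor then calls bound()
    bound = functools.partial(greet, "asyncio", punctuation="?")
    return await loop.run_in_executor(None, bound)

loop = asyncio.new_event_loop()
try:
    greeting = loop.run_until_complete(main())
finally:
    loop.close()

print(greeting)
```

The same trick works for loop.call_soon and friends: wrap the function and its keyword arguments in a partial, and pass the partial as the callback.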
Q: I’d like to use asyncio but that requires overhauling all of my code!
A: That’s sometimes referred to as the async creep. Christian Medina covers it wonderfully in his article “Controlling Python Async Creep”.
Q: Great, where do I go from here?
A: I recommend also checking out Medina’s “Threaded Asynchronous Magic and How To Wield It” for some code examples, and the dev page from Python’s API on asynchronicity for some pitfalls and tips. For more complete theory and thoughts about asynchronous programming in Python, check out Nick Coghlan’s “Some Thoughts on Asynchronous Programming”.
Then have a look at Asyncio Event Loops Tutorial for some brief examples, “A guide to asynchronous programming in Python with asyncio” once you feel more competent (includes easy-to-consume examples), and finally, have a look at Dan’s Asyncio Cheatsheet.

I hope that with these in mind, asyncio is perhaps a bit clearer. There are great online guides, but it seemed most of them were either aimed at people with a web background, or tried to encapsulate previous iterations of asyncio. Most importantly, I really just hope you’ve gained something from this, even if only some links and a few smirks from the memes.
Go on and check out the Python asyncio API now, if you dare.

Meeshkan ML

Meeshkan Machine Learning is a machine learning company based in Helsinki, Finland. We’re hiring! https://thehub.fi/jobs/company/meeshkan