A Bestiary of Python’s Asyncio
This blog post is my collection of thoughts and notes on asyncio, a library in Python 3.5 and later that I’d like to understand better. I’m writing this at a pretty novice level, because it helps to understand what abstractions asynchronous programming takes advantage of. I encourage you to read more about operating systems or syscalls if you haven’t considered their relation to the Python interpreter, even if you never end up using them in practice.
In my rough understanding, asyncio is a library built into Python that adds asynchronous concurrency, so that one thread can better juggle several tasks.
In any modern operating system, a network request or IO request is abstracted away by parts of the OS. So if you tell Python, “Hey, I want to get this data from this address at this port using this protocol,” Python calls some wrapper around your operating system’s system call (if you don’t know system calls, they’re like functions programs can ask the OS to run, and they include things like reading from the filesystem, getting the time, or generating a secure random number). This is why Python’s socket API and file API look extremely similar to the APIs in C. Python is not in charge of turning your URL requests into Ethernet frames, or figuring out how to turn a file modification in a certain directory into a bunch of writes on a magnetic disk.
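To make the C resemblance a bit more concrete, here is a small sketch using a connected pair of local sockets, so no real network is involved. The variable names are my own; the calls (socketpair, sendall, recv, close) map closely onto the C functions of similar names.

```python
import socket

# socketpair() gives two already-connected sockets, much like C's socketpair(2).
left, right = socket.socketpair()

# sendall() and recv() are thin wrappers over the OS's send and recv calls.
left.sendall(b"ping")
data = right.recv(1024)  # read up to 1024 bytes from the other end

left.close()
right.close()
```

Python hands the bytes to the OS and gets bytes back; everything below that is the operating system's job.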
The idea behind asynchronous programming is to teach a single thread, which reads one line of code at a time, to go ahead and do something else when a slow operation starts. Just waiting is called blocking, and while it can be fine in some contexts (like if you want to download a few web pages once and parse their text), if your application is IO heavy, you leave your interpreter, and thus your CPU, sitting around bored. It would be better if your function could generously say, “Hey, I’m going to be a while, let somebody else take a turn.” Then, the interpreter can look at what else needs attention and keep your application quick and responsive.
That’s what coroutines allow. A coroutine function is declared like so:
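Here is a minimal sketch of a coroutine function. The name slow_reply and its return value are made up for illustration, and the delay is a parameter only so the example is easy to try with a shorter wait.

```python
import asyncio

async def slow_reply(delay=60):
    # "async def" is what makes this a coroutine function.
    # "await" pauses this coroutine without blocking the thread,
    # so the event loop is free to run something else meanwhile.
    await asyncio.sleep(delay)
    return "Sorry for the wait!"
```

Calling slow_reply() does not run it; it only creates a coroutine object. To actually run it you hand it to an event loop, for example with asyncio.run(slow_reply()) on Python 3.7+, or loop.run_until_complete(slow_reply()) on older versions.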
To a computer, a minute is a lifetime. If this were a function in a chatbot, imagine how many requests would have been lost if we had used time.sleep(), which blocks the whole thread for the full minute.
But because we used await, the event loop knows to put this function down and not come back to it until 60 seconds have passed. During that time, the event loop can look for something else to do.
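To see the event loop “looking for something else to do,” here is a small sketch with two coroutines sharing one thread. The names nap, main, and the log list are my own, the delays are shortened to fractions of a second, and I’m assuming Python 3.7+ for asyncio.run.

```python
import asyncio

async def nap(name, delay, log):
    log.append(f"{name} starts")
    await asyncio.sleep(delay)  # hand control back to the event loop
    log.append(f"{name} done")

async def main():
    log = []
    # gather() schedules both coroutines on the same event loop
    # and waits until both have finished.
    await asyncio.gather(nap("slow", 0.2, log), nap("quick", 0.05, log))
    return log

events = asyncio.run(main())
print(events)
# → ['slow starts', 'quick starts', 'quick done', 'slow done']
```

The quick coroutine starts and finishes while the slow one is still asleep, which is exactly what would not happen with time.sleep().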
Coroutines are pretty similar to generators, as they’re functions that do something, and then yield control and sometimes a partial or current result. In fact, before Python 3.5 introduced async and await, coroutines were written as generators using “yield from”; the newer syntax is meant to be more intuitive.
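For comparison, here is a plain generator (countdown is just an illustrative name). It pauses at each yield and hands a partial result back to whoever is driving it, which is the same “pause and resume” trick a coroutine relies on.

```python
def countdown(n):
    while n > 0:
        yield n  # pause here and hand the current value to the caller
        n -= 1

gen = countdown(3)
first = next(gen)  # run until the first yield
rest = list(gen)   # resume and collect whatever is left
print(first, rest)
# → 3 [2, 1]
```

The difference is mostly in who does the resuming: with a generator it’s your own code calling next(), while with a coroutine it’s the event loop.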
Futures left me a bit confused, since “future” is a term with a lot of meanings in everyday usage. But it helps a bit to know that futures are called “promises” in other languages, and that futures are more or less a way to keep track of some task you put on the event loop. They keep track of the task so you can call functions on it or cancel it, and they hold the result when it’s done.
For a “real world” example, if you take a silk shirt to the dry cleaner’s, they will give you a paper receipt with a number to keep track of the order. That slip is a future. You’ll use it to get the result of your dry cleaning (clean shirt!), and you can call the dry cleaner with that number if you have instructions (“Cancel it!” Or, “I’m allergic to that chemical!” Or, “I’ll be out of town, how long will you hold on to it?”).
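In code, the receipt might look something like this sketch. I’m assuming Python 3.7+ (for asyncio.run and asyncio.get_running_loop), and the names receipt and unwanted are made up. The “dry cleaner” is faked with call_later, which fills in the result a moment later.

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()

    # The paper slip: a future with no result yet.
    receipt = loop.create_future()
    # Stand-in for the dry cleaner: set the result after a short delay.
    loop.call_later(0.05, receipt.set_result, "clean shirt!")

    was_ready = receipt.done()  # nothing to pick up yet
    shirt = await receipt       # wait here until the result is set

    # A second order that we cancel before anything happens to it.
    unwanted = loop.create_future()
    unwanted.cancel()

    return was_ready, shirt, unwanted.cancelled()

outcome = asyncio.run(main())
print(outcome)
# → (False, 'clean shirt!', True)
```

Notice the future itself does no work. It’s only the slip of paper; something else has to set the result.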
But what if you don’t want to explicitly communicate with a future, and just want to let your function/coroutine work like a regular function, where you have input, output, and take a couple breaks in the middle? Enter tasks.
A task is a type of future with a coroutine attached, so that the event loop can tell you the status of the coroutine and give you a place to talk to it. In practice, a task is the future the event loop hands back when you schedule a coroutine on it. To me, a task feels even more like that dry cleaning receipt, in that it’s an object with references to something I’m trusting to the event loop! And while above, I explicitly made the future, here, the Asyncio library made it for me.
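As a rough sketch of what that can look like: wash_shirt is a made-up coroutine, the printed list exists only so the callback’s output is easy to inspect, and I’m assuming Python 3.7+ for asyncio.run and asyncio.get_running_loop.

```python
import asyncio

printed = []  # remembers what the callback saw

async def wash_shirt():
    await asyncio.sleep(0.05)
    return "clean shirt!"

def print_future(future):
    # Done-callback: the event loop calls this with the finished task.
    print(future.result())
    printed.append(future.result())

async def main():
    loop = asyncio.get_running_loop()
    # create_task wraps the coroutine in a Task and schedules it.
    task = loop.create_task(wash_shirt())
    task.add_done_callback(print_future)
    is_future = isinstance(task, asyncio.Future)  # a Task is a Future
    await task
    await asyncio.sleep(0)  # give any pending callbacks a turn
    return is_future

task_is_future = asyncio.run(main())
```

I never built the future myself this time; create_task did it and handed back the receipt.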
Above, we see that our coroutine gets wrapped in a task (which is a kind of future) when we call loop.create_task, and when we pass that task to print_future by a callback, print_future sees a future with our coroutine’s output in it.
So there’s a lot of somewhat confusing language in asyncio, but it’s not really that alien once you get the metaphors the developers were going for.