A simple introduction to Python’s asyncio
This is a no-buzzword, first-principles introduction to the asyncio library in Python.
If you’ve come here, it is likely that you have heard of words such as asynchronous, concurrency and parallelism. Before we start off with asyncio, let’s quickly get some basic things about these words right (via examples), so that we have a solid foundation to build upon.
Concurrency is like having two threads running on a single core CPU. Instructions from each thread could be interleaved, but at any given time, only one of the two threads is actively making progress.
Parallelism is like having two threads running simultaneously on different cores of a multi-core CPU.
It is important to note that parallelism implies concurrency but not the other way round.
Asynchronous is a higher-level programming concept, where you fire off some task, and decide that while you don’t have the result of that task, you are better off doing some other work instead of waiting.
When you do things asynchronously, you are, by definition, implying concurrency between those things.
Why asynchronous programming?
Why do we want to write asynchronous programs, you ask? Because it could increase the performance of your program many times over. Imagine you have a single-core machine you are running your app on. You receive a request, and you need to make two database queries to fulfil that request. Each query takes 50ms. With a synchronous program, you would make the second query only after completing the first, for a total time of 100ms. With an asynchronous program, you could fire off both queries one after the other without waiting in between, for a total time of roughly 50ms.
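To make the arithmetic concrete, here is a minimal sketch that previews asyncio before it is introduced properly. The name `fake_query` is hypothetical, and `asyncio.sleep(0.05)` merely stands in for a 50ms database round trip; a real driver would behave differently in detail. It assumes Python 3.7+ for `asyncio.run`.

```python
import asyncio
import time


async def fake_query(name, delay=0.05):
    # Hypothetical stand-in for a database query: the sleep simulates 50 ms of waiting.
    await asyncio.sleep(delay)
    return name


async def sequential():
    # The second query starts only after the first has finished.
    start = time.perf_counter()
    await fake_query("first")
    await fake_query("second")
    return time.perf_counter() - start


async def concurrent():
    # Both queries are in flight at the same time; gather waits for both.
    start = time.perf_counter()
    await asyncio.gather(fake_query("first"), fake_query("second"))
    return time.perf_counter() - start


seq_time = asyncio.run(sequential())
con_time = asyncio.run(concurrent())
print(f"sequential: {seq_time * 1000:.0f} ms")
print(f"concurrent: {con_time * 1000:.0f} ms")
```

On most machines the first figure should land near 100ms and the second near 50ms, though exact timings will vary.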
asyncio
Asyncio is all about writing asynchronous programs in Python. Asyncio is a beautiful symphony between an event loop, tasks and coroutines, all coming together so perfectly that it’s going to make you cry.
The Event Loop
This is what makes it all possible: a simple loop, that’s it. Well, not that simple. But here is how it works. The event loop is the orchestrator of the symphony. It runs tasks one after the other. At any given time, only one of the tasks is running.
As you can imagine, there is a lot of pressure on the active task, since other tasks are waiting for their turn. So, when the active task makes a blocking call, say a network request, and cannot make further progress it gives the control back to the event loop realising that some other task could possibly better utilise the event loop’s time. It also tells the event loop what exactly it is blocked upon, so that when the network response comes, the event loop can consider giving it time to run again.
The event loop time is precious. If you are not making progress, you should step off the loop, so that someone else can. Event loop is the measure of progress.
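The idea of a loop that runs one task at a time, and moves on whenever the active task steps off, can be sketched with plain generators. This is a toy model under stated assumptions, not how asyncio is implemented: `task` and `run_toy_loop` are hypothetical names, and the sketch leaves out the part where a task tells the loop what it is blocked on.

```python
from collections import deque


def task(name, steps, log):
    # A generator stands in for a task: each `yield` hands control back to the loop.
    for i in range(steps):
        log.append(f"{name} step {i}")
        yield


def run_toy_loop(tasks):
    # Round-robin scheduling: run one task until it yields, then move to the next.
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)          # only this task is running right now
            ready.append(current)  # it stepped off, so queue it for another turn
        except StopIteration:
            pass                   # the task finished, so drop it


log = []
run_toy_loop([task("A", 2, log), task("B", 3, log)])
print(log)
```

The printed log alternates between A and B until A runs out of steps, even though everything happens in one thread.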
The Coroutine & Task
Coroutines (co-operative routines) are a key element of the symphony. It is the coroutines, and their co-operative nature, that enable giving up control of the event loop when the coroutine has nothing useful to do. A coroutine is a stateful generalisation of the subroutine.
A subroutine is your good old-fashioned function or method. You invoke the subroutine to perform a computation. You may invoke it again, but it does not hold state between the two invocations. Every invocation is a fresh one and the same computation is performed.
A coroutine, on the other hand, is a cute little stateful widget. It looks like a subroutine, but it maintains state in between executions. In other words, when a coroutine “returns” (yields control) it simply means that it has paused its execution (with some saved state). So when you “invoke” (give control to) the coroutine subsequently, it would be correct to say that the coroutine has resumed its execution (from the saved state).
Coroutines look like normal functions, but in their behaviour they are stateful objects with resume()-like and pause()-like methods.
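The difference between a subroutine and a coroutine can be seen with a plain Python generator, which exposes pausing and resuming directly through `yield`, `next()` and `send()`. This sketch deliberately uses a generator rather than an `async def` coroutine, and the names `add_to_zero` and `running_total` are hypothetical.

```python
def add_to_zero(x):
    # A subroutine: every call starts fresh, nothing is remembered.
    total = 0
    total += x
    return total


def running_total():
    # A generator: `total` survives between resumptions.
    total = 0
    while True:
        x = yield total  # pause here; resume when the caller sends a value
        total += x


acc = running_total()
next(acc)             # start it: runs up to the first yield, then pauses
first = acc.send(10)  # resume with 10; it pauses again, handing back 10
second = acc.send(5)  # resume with 5; the saved state makes this 15

print(add_to_zero(10), add_to_zero(5))
print(first, second)
```

The two subroutine calls know nothing about each other, while the generator picks up where it left off each time it is resumed.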
In Python 3.5+, the way a coroutine pauses itself is using the await keyword. Inside a coroutine, when you await another coroutine, the awaiting coroutine pauses and the awaited coroutine starts running immediately. That is, an await other_coroutine inside a coroutine will pause it, and run the coroutine other_coroutine immediately.
Note that the event loop does not preempt a running coroutine. Only a coroutine can pause itself.
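A small sketch of what the absence of preemption means in practice. The names `greedy`, `polite` and `run_pair` are hypothetical; `asyncio.sleep(0)` is used as a way for a coroutine to pause itself for one turn of the loop. It assumes Python 3.7+ for `asyncio.run`.

```python
import asyncio


async def greedy(name, log):
    # Never awaits, so it runs from start to finish without giving up the loop.
    for i in range(2):
        log.append(f"{name}{i}")


async def polite(name, log):
    # Pauses itself after every step so that the other task can take a turn.
    for i in range(2):
        log.append(f"{name}{i}")
        await asyncio.sleep(0)


async def run_pair(worker):
    # Run two copies of the same worker on one loop and record the order of steps.
    log = []
    await asyncio.gather(worker("a", log), worker("b", log))
    return log


greedy_log = asyncio.run(run_pair(greedy))
polite_log = asyncio.run(run_pair(polite))
print(greedy_log)
print(polite_log)
```

The greedy pair runs back to back, while the polite pair interleaves, because the loop only switches tasks at the points where a coroutine chooses to pause.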
Below is a very simple example (Python 3.5+) of how coroutines cooperate with each other. We will use a pre-defined coroutine, asyncio.sleep, to help us simulate blocking tasks for this example, but it could be anything in a real-world scenario, like a network request, a db query etc.
Note that the code runs in a single thread and yet, the output will have interleaved print statements. This happens because when a coroutine gets blocked, it steps off the loop, so that the other one can run (yay! asynchronous programming with asyncio).
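The original listing is not reproduced in this text, so here is a minimal sketch consistent with the description. The names coroutine_1 and coroutine_2 come from the text; everything else is an assumption: the `report` helper and the `log` parameter are hypothetical, the sleeps are shortened from 4 and 5 seconds to 0.4 and 0.5 so the sketch finishes quickly, and `asyncio.gather` with `asyncio.run` (Python 3.7+) is one possible way to put both coroutines on the loop. Line numbers cited in the notes that follow refer to the original listing, not to this sketch.

```python
import asyncio


def report(log, message):
    # Hypothetical helper: print the message and keep it for later inspection.
    print(message)
    log.append(message)


async def coroutine_1(log):
    report(log, "coroutine_1 is active on the event loop")
    report(log, "coroutine_1 yielding control")
    await asyncio.sleep(0.4)  # the text uses asyncio.sleep(4); shortened here
    report(log, "coroutine_1 resumed, exiting")


async def coroutine_2(log):
    report(log, "coroutine_2 is active on the event loop")
    report(log, "coroutine_2 yielding control")
    await asyncio.sleep(0.5)  # the text uses asyncio.sleep(5); shortened here
    report(log, "coroutine_2 resumed, exiting")


async def main():
    log = []
    # Run both coroutines on the same loop, in the same thread.
    await asyncio.gather(coroutine_1(log), coroutine_2(log))
    return log


log = asyncio.run(main())
```

Both coroutines print their first two messages before either one resumes, which is the interleaving described above.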
Some points to note
- Calling a coroutine definition does not execute it. It initialises a coroutine object. You await coroutine objects, not coroutine definitions, as you can see in line 8 and line 17 above.
- The event loop runs tasks, not coroutine objects directly. Tasks are a wrapper around coroutine objects. When you write await coroutine_object, the awaited coroutine starts running immediately as part of the task that is already running; it does not get a wrapper task of its own.
- asyncio.sleep is a coroutine as well, provided by the asyncio library. asyncio.sleep(2) initialises a coroutine object with a value of 2 seconds. When you await it, you give control to it. The sleep coroutine is smart and does not block the loop. It immediately releases control, simply asking the loop to wake it up after the specified time. When the time expires, it is given back control and it immediately returns, thereby unblocking its caller (in the above example, coroutine_1 or coroutine_2).
- The above example had three different types of coroutines that ran on the event loop: coroutine_1, coroutine_2 and asyncio.sleep. Four coroutine objects ran in total: coroutine_1() and coroutine_2() scheduled at line 25, asyncio.sleep(4) awaited at line 8 and asyncio.sleep(5) awaited at line 17. Only the first two are wrapped in tasks of their own; each sleep coroutine runs inside the task that awaits it.
- Another way to schedule tasks (though not immediately) on the loop is using the ensure_future() or the AbstractEventLoop.create_task() methods, both of which accept a coroutine object. Example code at the end demonstrates these methods.
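A few of these points can be checked directly. The sketch below assumes Python 3.7+ (for `asyncio.run`, `asyncio.create_task` and `asyncio.current_task`), and the names `inner` and `outer` are hypothetical. It shows that calling a coroutine definition only builds a coroutine object, and it compares which task a coroutine runs in when it is awaited directly versus when it is wrapped with `create_task`.

```python
import asyncio
import inspect


async def inner(calls):
    # Record that the body ran, and report which task it is running in.
    calls.append("ran")
    return asyncio.current_task()


async def outer():
    calls = []
    coro = inner(calls)                    # builds a coroutine object only
    ran_on_call = bool(calls)              # the body has not run yet
    is_object = inspect.iscoroutine(coro)
    me = asyncio.current_task()
    same_on_await = (await coro) is me     # awaited directly: same task
    new_task = asyncio.create_task(inner(calls))
    same_on_create = (await new_task) is me  # wrapped in its own task
    return ran_on_call, is_object, same_on_await, same_on_create


ran_on_call, is_object, same_on_await, same_on_create = asyncio.run(outer())
print(ran_on_call, is_object, same_on_await, same_on_create)
```

In current CPython this prints `False True True False`: a directly awaited coroutine runs inside the current task, while `create_task` gives it a task of its own.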
A more realistic yet simple example
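The listing for this section is not part of the text here. As a stand-in, here is a minimal sketch of a request handler that fires off two simulated database queries using the two scheduling methods mentioned earlier. The names `query` and `handle_request`, the table names and the delays are all hypothetical, and `asyncio.sleep` only simulates the wait for a real query. It assumes Python 3.7+ for `asyncio.run` and `asyncio.get_running_loop`.

```python
import asyncio


async def query(name, delay, events):
    # Hypothetical stand-in for a database query.
    events.append(f"{name} started")
    await asyncio.sleep(delay)
    events.append(f"{name} finished")
    return f"{name} rows"


async def handle_request():
    events = []
    loop = asyncio.get_running_loop()
    # Both calls accept a coroutine object and return a task straight away.
    users = loop.create_task(query("users", 0.05, events))
    orders = asyncio.ensure_future(query("orders", 0.03, events))
    # Neither query has started yet: tasks run only once this coroutine pauses.
    events.append("both scheduled")
    results = [await users, await orders]
    return events, results


events, results = asyncio.run(handle_request())
print(events)
print(results)
```

The event list starts with "both scheduled", which shows that scheduling a task does not run it immediately; both queries then overlap, so the handler takes about as long as the slower query.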
Python at ArchSaber
At ArchSaber, one of our aims has always been to dig deep insights out of the application code of our customers. A lot of our clients depend upon our APM solution for Python. As a result, we make great efforts to understand the intricacies of the language and the frameworks around it. We ourselves rely heavily on Python: a lot of our analytics engine and ML code is written in Python, through which we push real-time root cause analysis of our clients’ production issues.