Asyncio, why and how?
In this blog post, we are going to explore concurrency and parallel programming, using asyncio, Motor, and aiohttp as examples.
Concurrent programming:
Concurrency is a programming concept that involves breaking a problem down into smaller tasks that can make progress independently of each other, in order to make our program more efficient in terms of CPU usage and execution time.
Parallel programming:
Parallel programming is executing multiple tasks at the same time using different CPU cores. In other words, parallel programming is one of the methods of implementing concurrency.
So what type of problems can concurrency solve?
CPU-bound problem: in simple words, a problem where our program spends most of its time computing on the CPU.
I/O-bound problem: a problem where our program spends most of its time waiting for input/output. This usually refers to disk read/write operations, but it especially covers HTTP requests and DB calls.
Which problem does asyncio solve?
asyncio is a Python library that helps us solve the I/O-bound problem. It lets developers use a single-threaded program more efficiently by creating coroutines and having full control over when they run.
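To make the distinction concrete, here is a minimal sketch of why asyncio helps with I/O-bound work but not with CPU-bound work. The coroutine names and the events list are made up for illustration. A coroutine that computes without ever awaiting holds the single thread until it finishes, while a coroutine that awaits hands control back to the event loop.

```python
import asyncio

events = []  # records the order in which things happen


async def io_bound():
    events.append("io start")
    await asyncio.sleep(0)  # stands in for an I/O wait: yields to the event loop
    events.append("io done")


async def cpu_bound():
    events.append("cpu start")
    total = sum(i * i for i in range(200000))  # pure computation, never yields
    events.append("cpu done")
    return total


async def main():
    await asyncio.gather(io_bound(), cpu_bound())


loop = asyncio.new_event_loop()
loop.run_until_complete(main())
loop.close()
print(events)  # ['io start', 'cpu start', 'cpu done', 'io done']
```

The I/O-style coroutine gives up control at its await, so the computation runs in the gap. The computation itself never gives up control, so nothing else can run until it is done.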
Note that with Python 3.5+ the syntax is:
async/await
whereas older Python versions use:
@asyncio.coroutine/yield from
(This older decorator-based style was later deprecated and was removed in Python 3.11.)
Well, let’s look at some concrete examples.
The following examples use Python 3.6.
In the first example we will simulate an I/O operation using asyncio.sleep:
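import asyncio


async def print_after_sleep(msg, s):
    # asyncio.sleep suspends this coroutine without blocking the event loop.
    await asyncio.sleep(s)
    print("Running Task", msg)


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    # Schedule three tasks with different (simulated) I/O durations.
    tasks = [asyncio.ensure_future(print_after_sleep(1, 2)),
             asyncio.ensure_future(print_after_sleep(2, 1)),
             asyncio.ensure_future(print_after_sleep(3, 3))]
    # Run the loop until all three tasks have finished.
    loop.run_until_complete(asyncio.wait(tasks))
    loop.close()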
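The output is shown below. While one task is waiting on its simulated I/O operation, the event loop picks up other tasks that are ready to run, so the tasks finish in order of their sleep duration rather than in the order they were created: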
Running Task 2
Running Task 1
Running Task 3
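The benefit of this scheduling can also be measured. The following is a rough sketch, not a benchmark, that runs the same simulated I/O calls one after another and then concurrently. The helper names fake_io, run_sequentially, run_concurrently and timed are invented for this illustration.

```python
import asyncio
import time


async def fake_io(delay):
    await asyncio.sleep(delay)  # simulated I/O wait
    return delay


async def run_sequentially(delays):
    results = []
    for d in delays:
        # Each wait must finish before the next one starts.
        results.append(await fake_io(d))
    return results


async def run_concurrently(delays):
    # gather() schedules all coroutines at once and
    # returns their results in the order they were passed in.
    return await asyncio.gather(*(fake_io(d) for d in delays))


def timed(loop, coro):
    start = time.perf_counter()
    result = loop.run_until_complete(coro)
    return result, time.perf_counter() - start


loop = asyncio.new_event_loop()
delays = [0.2, 0.1, 0.3]
seq_result, seq_time = timed(loop, run_sequentially(delays))
con_result, con_time = timed(loop, run_concurrently(delays))
loop.close()

print("sequential: %.2fs, concurrent: %.2fs" % (seq_time, con_time))
```

The sequential run should take roughly the sum of the delays, while the concurrent run should take roughly as long as the longest single delay. Exact timings will vary from machine to machine.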
In the next example we will implement a non-blocking GET endpoint that looks up a customer by id, using aiohttp for the web server and Motor for MongoDB access.
from aiohttp import web
from bson.json_util import dumps
from motor.motor_asyncio import AsyncIOMotorClient

# Connect to a local MongoDB instance and select the database.
client = AsyncIOMotorClient('localhost', 27017)
db = client.customers_db


async def get_customer(request):
    # Read the id from the query string, e.g. GET /?customer_id=42
    customer_id = request.rel_url.query["customer_id"]
    db = request.app['db']
    # While this query is awaited, the event loop can serve other requests.
    result = await db.customers.find_one({'customer_id': customer_id})
    if not result:
        return web.Response(
            text="customer with id:{!r} doesn't exist.".format(customer_id))
    return web.Response(body=dumps(result),
                        content_type="application/json")


app = web.Application()
app['db'] = db  # make the database handle available to request handlers
app.add_routes([web.get('/', get_customer)])

if __name__ == '__main__':
    web.run_app(app)
Takeaways:
Concurrency is a programming paradigm while parallel programming is a specific implementation of concurrency, where multiple CPU cores are used simultaneously.
“The Python Global Interpreter Lock or GIL, in simple words, is a mutex (or a lock) that allows only one thread to hold the control of the Python interpreter. … [In a] multi-processing approach where you use multiple processes instead of threads[,] each Python process gets its own Python interpreter and memory space so the GIL won’t be a problem. Python has a multiprocessing module [for this.]” [1]
“Threads [in Python; so not parallel processes] can interact in ways that are subtle and hard to detect. These interactions can cause race conditions that frequently result in random, intermittent bugs that can be quite difficult to find. … An important point of asyncio is that the tasks never give up control without intentionally doing so. They never get interrupted in the middle of an operation. This allows us to share resources a bit more easily in asyncio than in threading. You don’t have to worry about making your code thread-safe.” [2]
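A small sketch can illustrate the point about tasks never giving up control unintentionally. The worker function and the shared counter are hypothetical names for this example. Ten tasks update one shared variable without any lock, and no update is lost because a task can only be switched out at an await.

```python
import asyncio

counter = 0  # shared state, deliberately not protected by a lock


async def worker(n):
    global counter
    for _ in range(n):
        current = counter      # read
        counter = current + 1  # write: no await in between, so no other task runs here
        await asyncio.sleep(0)  # explicit switch point: other tasks may run now


async def main():
    await asyncio.gather(*(worker(1000) for _ in range(10)))


loop = asyncio.new_event_loop()
loop.run_until_complete(main())
loop.close()
print(counter)  # 10000
```

This only holds as long as the read and the write stay on the same side of an await. If an await were placed between them, another task could run in the gap and updates could be lost, so shared state still needs care around switch points.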
“asyncio takes a very, very explicit approach to asynchronous programming: only code written in methods flagged as async can call any code in an asynchronous way. Which creates a chicken/egg problem: your async methods can only be called by other async methods, so how do you call the first one? The answer: you don’t. What you have to do instead is turn over control of the task to an event loop, after arranging for the loop to (sooner or later) invoke your async code.” [3]
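The chicken-and-egg situation described in this quote can be seen in a few lines. In this minimal sketch (greet is a made-up coroutine), calling an async function from ordinary code does not run it. It only creates a coroutine object, which then has to be handed to an event loop.

```python
import asyncio


async def greet(name):
    await asyncio.sleep(0)
    return "hello " + name


coro = greet("asyncio")  # nothing runs yet: this is only a coroutine object
is_coroutine = asyncio.iscoroutine(coro)

loop = asyncio.new_event_loop()
result = loop.run_until_complete(coro)  # the loop drives the coroutine to completion
loop.close()

print(is_coroutine, result)  # True hello asyncio
```

On Python 3.7 and later, asyncio.run(greet("asyncio")) wraps the same create-loop, run, close sequence in a single call.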
To use asyncio, you must use awaitables, which are objects that “can be used in an await expression,” new in Python 3.5, which can “suspend the execution of coroutine [and] can only be used inside a coroutine function.”[4] [5]
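As an illustration, the sketch below awaits the three kinds of awaitables most commonly met in asyncio code: a coroutine object, a Task, and a Future. The compute coroutine and the values used are invented for the example.

```python
import asyncio


async def compute(x):
    await asyncio.sleep(0)
    return x * 2


async def main(loop):
    # 1. A coroutine object, awaited directly.
    from_coroutine = await compute(1)

    # 2. A Task: a coroutine already scheduled on the event loop.
    task = asyncio.ensure_future(compute(2))
    from_task = await task

    # 3. A bare Future: a placeholder whose result is set later by other code.
    future = loop.create_future()
    loop.call_soon(future.set_result, 6)
    from_future = await future

    return from_coroutine, from_task, from_future


loop = asyncio.new_event_loop()
results = loop.run_until_complete(main(loop))
loop.close()
print(results)  # (2, 4, 6)
```

In each case the await suspends main until the awaited object has a result, and the event loop is free to run other work in the meantime.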
And a bonus link, on how Python async/await relates to JavaScript promises! “If you’re coming from a JavaScript background, it’s tempting to try to use the promises that you know and love with Python. [But it] turns out that promises are not pythonic: I should have used async/await instead. This post explains why async/await is a better idiom that you can use both in Python and JavaScript.” [6]
Conclusion:
In this blog post, we described the difference between concurrent and parallel programming. We also walked through examples showing how asyncio, and libraries built on top of it such as aiohttp and Motor, can be used to write non-blocking code.