Making an Unlimited Number of Requests with Python aiohttp + pypeln
This post is a continuation of the work in Paweł Miech’s Making 1 million requests with python-aiohttp and Andy Balaam’s Making 100 million requests with Python aiohttp. I will try to reproduce the setup from Andy’s blog, with some minor modifications due to API changes in the
aiohttp library. You should definitely read his post, but I’ll give a recap.
UPDATE: Since Andy’s original post,
aiohttp introduced another API change which limited the total number of simultaneous requests to
100 by default. I’ve updated the code shown here to remove this limit and increased the number of total requests to compensate. Apart from that, the analysis remains the same.
UPDATE 02/18/2020: Updated examples to
First of all we create a simple synthetic server named
server.py against which we will run our requests.
We also create a bash script named
timed.sh to measure the running time of our scripts, as well as their memory consumption and CPU usage.
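As a rough stand-in for the bash script, the same three measurements can be sketched in Python with the standard `resource` module (Unix only). The function name `timed` and the returned dictionary keys are assumptions made for this illustration; they are not part of the original script.

```python
import resource
import subprocess
import time


def timed(cmd):
    """Run `cmd` (a list of arguments) and measure the child process."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    returncode = subprocess.call(cmd)
    elapsed = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    # CPU time (user + system) consumed by the child we just waited for.
    cpu_seconds = (after.ru_utime - before.ru_utime) + (
        after.ru_stime - before.ru_stime
    )
    return {
        "returncode": returncode,
        # Peak resident set size of waited-for children:
        # kilobytes on Linux, bytes on macOS.
        "max_rss": after.ru_maxrss,
        "seconds": elapsed,
        "cpu_percent": 100.0 * cpu_seconds / elapsed if elapsed > 0 else 0.0,
    }
```

Something like `timed(["python", "client-task-pool.py", "100_000"])` would then report figures comparable to the ones quoted later in this post.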
Now we will create some clients using
aiohttp. All these clients will have an input number of requests given by
int(sys.argv[1]) and use a function called
fetch to make an async
GET request to the server followed by an async
read of the response.
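A minimal sketch of such a `fetch` coroutine is shown below. It assumes `session` is an `aiohttp.ClientSession` (or any object exposing the same `get` async context manager); the argument order is an assumption made for this illustration.

```python
async def fetch(url, session):
    # Issue an async GET; leaving the context manager releases the connection.
    async with session.get(url) as response:
        # Read the whole response body asynchronously and return it as bytes.
        return await response.read()
```

Each client would create a single session with `async with aiohttp.ClientSession() as session:` and pass it to every `fetch` call, so connections can be reused.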
Paweł Miech’s approach uses a semaphore before calling
fetch to limit the number of requests. Here is the version that Andy uses on his blog:
The flaw with this code is that while the HTTP requests are in fact limited, the number of tasks isn’t. What’s more, all the task objects are created before being gathered and run. This is inefficient in both memory and time, since you have to allocate space for all of these objects first, and while you do so no requests are being made.
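The flaw can be reproduced without a server. The sketch below is an illustration only: `fake_fetch` is a hypothetical stand-in that sleeps instead of making an HTTP request, and the `stats` dictionary is added purely to observe the behaviour.

```python
import asyncio


async def fake_fetch(i, stats):
    # Stand-in for the real HTTP request: count how many are in flight.
    stats["in_flight"] += 1
    stats["peak_in_flight"] = max(stats["peak_in_flight"], stats["in_flight"])
    await asyncio.sleep(0.001)
    stats["in_flight"] -= 1


async def bound_fetch(sem, i, stats):
    # The semaphore limits concurrent requests, not the number of tasks.
    async with sem:
        await fake_fetch(i, stats)


async def run(total, limit):
    stats = {"in_flight": 0, "peak_in_flight": 0, "tasks_created": 0}
    sem = asyncio.Semaphore(limit)
    tasks = []
    for i in range(total):
        # Every task object is allocated here, before any request starts,
        # because this loop never yields control to the event loop.
        tasks.append(asyncio.ensure_future(bound_fetch(sem, i, stats)))
    stats["tasks_created"] = len(tasks)
    await asyncio.gather(*tasks)
    return stats
```

Running `asyncio.run(run(total, limit))` shows that the number of in-flight requests never exceeds `limit`, while `tasks_created` always equals `total`. That second number is what grows without bound.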
Note: The only modification made here, which you will see throughout all the scripts, is that
ClientSession in the newer version of
aiohttp has to be called with
async with instead of
Andy’s approach tries to fix the memory issue by keeping a bounded list of the currently running tasks (limited by the
limit variable) and only adding a new task once a previous task finishes. Most of the action happens in the
first_to_finish async generator.
With this approach we are actually able to make an unbounded number of requests! There is a problem, however: the
first_to_finish function is computationally very inefficient, for two reasons:
- It uses
asyncio.sleep(0) to constantly monitor the currently running tasks
- It has to iterate over every running task and check whether it is done.
We will see later that this results in high CPU usage.
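The two points above can be seen in a simplified sketch of the polling idea. This is not Andy's exact code; it is a reconstruction from the description, written as an async generator, with a hypothetical `fake_fetch` that sleeps instead of making a request.

```python
import asyncio
from itertools import islice


async def first_to_finish(coros, limit):
    """Yield results as tasks finish, keeping at most `limit` tasks alive."""
    coros = iter(coros)
    running = [asyncio.ensure_future(c) for c in islice(coros, limit)]
    while running:
        # Busy-wait: hand control to the event loop, then poll again.
        await asyncio.sleep(0)
        for task in list(running):  # scan every running task on each pass
            if task.done():
                running.remove(task)
                # Replace the finished task with the next coroutine, if any.
                nxt = next(coros, None)
                if nxt is not None:
                    running.append(asyncio.ensure_future(nxt))
                yield task.result()


async def fake_fetch(i):
    # Stand-in for the real HTTP request.
    await asyncio.sleep(0.001)
    return i


async def run(total, limit):
    # A generator expression keeps coroutine creation lazy.
    coros = (fake_fetch(i) for i in range(total))
    return [result async for result in first_to_finish(coros, limit)]
```

Memory stays bounded because only `limit` tasks exist at any time, but the `while` loop never blocks: it spins through `asyncio.sleep(0)` and a full scan of `running` even when nothing has finished.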
From here on I will show the approach I took while developing the
io module for the pypeln library. The main idea is to also use a semaphore, but to use it to limit the creation of tasks rather than the requests themselves; this avoids having to continuously monitor the running tasks. We will encapsulate all the main logic in a class called
TaskPool, which we define below.
Much like a
TaskPool class has a
put method that creates tasks from coroutines, but uses a semaphore before actually calling
asyncio.ensure_future to trigger the task. However, instead of using the semaphore via
async with, we call the methods acquire and
release manually because we don’t know when the task will end. The trick to making this work efficiently is to attach a done callback to the task (asyncio’s
add_done_callback) so that it releases the semaphore for us. This is highly efficient because we don’t have to constantly scan the tasks to check whether they are done, as in Andy’s approach.
TaskPool is used as an async context manager so that when the context is being exited, we await on the remaining tasks using
asyncio.gather. This class can be imported from
pypeln by anyone who wants to make use of it; with it, we can solve the problem quite easily.
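A minimal sketch of a class matching this description is shown below. It follows the text rather than the exact pypeln source, so private names such as `_on_task_done` are assumptions, and `fake_fetch` is a hypothetical stand-in that sleeps instead of making a request.

```python
import asyncio


class TaskPool:
    def __init__(self, workers):
        self._semaphore = asyncio.Semaphore(workers)
        self._tasks = set()

    async def put(self, coro):
        # Wait for a free slot before creating the task, so at most
        # `workers` task objects exist at any time.
        await self._semaphore.acquire()
        task = asyncio.ensure_future(coro)
        self._tasks.add(task)
        # The callback frees the slot when the task ends; no polling needed.
        task.add_done_callback(self._on_task_done)

    def _on_task_done(self, task):
        self._tasks.remove(task)
        self._semaphore.release()

    async def join(self):
        # Wait for whatever is still running.
        await asyncio.gather(*self._tasks)

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        await self.join()


async def fake_fetch(i, stats):
    # Stand-in for the real HTTP request: count how many are in flight.
    stats["in_flight"] += 1
    stats["peak"] = max(stats["peak"], stats["in_flight"])
    await asyncio.sleep(0.001)
    stats["in_flight"] -= 1
    stats["done"] += 1


async def run(total, limit):
    stats = {"in_flight": 0, "peak": 0, "done": 0}
    async with TaskPool(limit) as pool:
        for i in range(total):
            await pool.put(fake_fetch(i, stats))
    return stats
```

Because `put` suspends the producing loop whenever the pool is full, the `for` loop itself is throttled. This is why the iterable feeding it can be arbitrarily long.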
Basically we just have to pass the
limit variable to the constructor of the
TaskPool and then inside the main loop
await on the
put method, passing it the
fetch coroutine. As in Andy’s approach, we are also able to make an unbounded number of requests if we so desire, thanks to the efficient resource management.
You can also use the
pypeln.asyncio_task.each function to simplify the code a bit
Here we have created a generator called
urls to make things clearer.
each iterates over
urls concurrently and runs the
fetch coroutine on each element (works much like the
map function from the standard library except it doesn’t return any values). We set the
workers parameter to
limit (under the hood this creates a
TaskPool which receives the parameter), and use the
on_done callbacks to handle the
aiohttp.ClientSession which is passed as a parameter to
I am going to run each of the clients described here in order with
100_000 requests (for the sake of time) using the
➜ bash timed.sh python client-async-sem.py 100_000
Memory usage: 352684KB Time: 154.87 seconds CPU usage: 38%
➜ bash timed.sh python client-async-as-completed.py 100_000
Memory usage: 57548KB Time: 154.91 seconds CPU usage: 100%
➜ bash timed.sh python client-task-pool.py 100_000
Memory usage: 58188KB Time: 153.40 seconds CPU usage: 36%
➜ bash timed.sh python client-pypeln-io.py 100_000
Memory usage: 63624KB Time: 154.39 seconds CPU usage: 37%
A few things to note:
- Paweł Miech’s semaphore approach (
client-async-sem.py) has much higher memory usage (roughly 6x that of the other clients in the runs above) and would blow up with a bigger number such as 100 million, although in this case its running time is good and its CPU usage is low.
- Andy’s continuous monitoring approach (
client-async-as-completed.py) uses the least memory, but its CPU consumption is excessive: it uses 100% of one of the cores.
- Both the pure TaskPool (
client-task-pool.py) and the pypeln (
client-pypeln-io.py) approaches have fairly similar metrics: they are about equally fast, memory efficient, and light on CPU. Judging by the numbers, they are possibly the best methods.
- If you truly want to make an unlimited number of requests you can use an iterable/generator that doesn’t terminate instead of
asyncio module and the new
async/await syntax enable us to create very powerful IO programs with Python that were once only in the grasp of languages like Erlang/Elixir, Go, or even Node.js. However, some things are hard to get right, especially since there is very little material out there, libraries for these kinds of tasks are only just being written, and the paradigm itself is quite different.
I hope this post is useful to those wanting to build high-performance IO applications in Python. Thanks to Andy Balaam for his post, which served as an inspiration when implementing my code, and for his feedback.
In the future I want to make a more realistic benchmark that involves downloading, resizing, and storing a huge number of images.