Making an Unlimited Number of Requests with Python aiohttp + pypeln

This post is a continuation of the work in Paweł Miech’s Making 1 million requests with python-aiohttp and Andy Balaam’s Making 100 million requests with Python aiohttp. I will be trying to reproduce the setup from Andy’s blog with some minor modifications due to API changes in the aiohttp library. You should definitely read his post, but I’ll give a recap.

UPDATE: Since Andy’s original post, aiohttp introduced another API change which limited the total number of simultaneous requests to 100 by default. I’ve updated the code shown here to remove this limit and increased the number of total requests to compensate. Apart from that, the analysis remains the same.

First of all, we create a simple synthetic server to run our requests against.
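The server snippet is not reproduced here, but a synthetic server of this kind can be sketched with nothing but asyncio streams; the port, the fixed 200 response, and the function names are assumptions of this sketch, not the original code:

```python
import asyncio

# A fixed, minimal HTTP response so the server does as little work as possible.
RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 2\r\n"
    b"Connection: close\r\n"
    b"\r\n"
    b"OK"
)

async def handle(reader, writer):
    # Read the request headers, answer with the canned response, hang up.
    await reader.readuntil(b"\r\n\r\n")
    writer.write(RESPONSE)
    await writer.drain()
    writer.close()

async def serve(host="127.0.0.1", port=8080):
    server = await asyncio.start_server(handle, host, port)
    async with server:
        await server.serve_forever()

# To run the server: asyncio.run(serve())
```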

We also create a bash script to measure the running time of our scripts as well as their memory consumption and CPU usage:
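The original script is not reproduced here; a plausible version, assuming GNU time is installed at /usr/bin/time, would be the following (the format string mirrors the measurements shown later in this post):

```shell
#!/bin/bash
# Run the given command and report its peak memory, wall time, and CPU usage.
# Assumes GNU time at /usr/bin/time; the bash builtin `time` cannot report memory.
/usr/bin/time --format "Memory usage: %MKB\tTime: %e seconds\tCPU usage: %P" "$@"
```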

Now we will create some clients using aiohttp. All these clients take the number of requests as input via int(sys.argv[1]) and use a function called fetch to make an async GET request to the server, followed by an async read of the response.

Paweł Miech’s approach uses a semaphore before calling fetch to limit the number of concurrent requests; here is the version that Andy uses on his blog:
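A sketch of that semaphore-limited client (the URL, the limit of 1000, and the helper names are assumptions; the aiohttp-specific wiring is shown only in comments so the helpers stay library-free):

```python
import asyncio

async def bounded_fetch(semaphore, session, url):
    # Acquire the semaphore so that at most `limit` requests run at once.
    async with semaphore:
        async with session.get(url) as response:
            await response.read()

async def run(session, url, count, limit=1000):
    semaphore = asyncio.Semaphore(limit)
    # Note: all `count` task objects are created up front and then gathered;
    # this is where the memory cost discussed in the text comes from.
    tasks = [
        asyncio.ensure_future(bounded_fetch(semaphore, session, url))
        for _ in range(count)
    ]
    await asyncio.gather(*tasks)

# Real usage (requires aiohttp):
#   async with aiohttp.ClientSession() as session:
#       await run(session, "http://localhost:8080/", int(sys.argv[1]))
```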

The flaw with this code is that while the HTTP requests are in fact limited, the number of tasks isn’t. What’s more, all the task objects are created before being gathered and run. This is inefficient in both memory and time, since you have to allocate space for all of these objects first, and while you do so no requests are being made.

Note: The only modification made here, which you will see throughout all the scripts, is that ClientSession in the newer version of aiohttp has to be used via async with instead of plain with.

Andy’s approach tries to fix the memory issue by keeping a bounded list with the current running tasks (limited by the limit variable) and only adding a new task once a previous task finishes. Most of the action happens in the first_to_finish async generator.
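Andy’s actual first_to_finish is an async generator; the polling idea it implements can be condensed into the following sketch (the names and the run_limited driver are illustrative, not Andy’s exact code):

```python
import asyncio

async def first_to_finish(tasks):
    # Busy-wait: yield control with sleep(0), then scan every running
    # task to see whether any has completed; this scan is the hot loop.
    while True:
        await asyncio.sleep(0)
        for task in tasks:
            if task.done():
                tasks.remove(task)
                return task

async def run_limited(make_coro, count, limit):
    # Keep at most `limit` tasks alive; start a new one only after
    # first_to_finish reports that an old one finished.
    tasks = []
    for i in range(count):
        if len(tasks) >= limit:
            finished = await first_to_finish(tasks)
            finished.result()  # re-raise if the task failed
        tasks.append(asyncio.ensure_future(make_coro(i)))
    await asyncio.gather(*tasks)
```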

With this approach we are actually able to make an unbounded number of requests! There is a problem, however: the first_to_finish function is computationally very inefficient, for two reasons:

  • It uses asyncio.sleep(0) to constantly poll the currently running tasks.
  • It has to iterate over every running task and check whether it is done.

We will see later that this results in high CPU usage.

From here on I will show the approach I took while developing the io module for the pypeln library. The main idea is to also use a semaphore, but to use it to limit the creation of tasks instead of requests; this avoids having to continuously monitor the running tasks. We will encapsulate the main logic in a class called TaskPool, which we define below.

Much like a Queue, this TaskPool class has a put method that creates tasks from coroutines, but it acquires a semaphore before actually calling asyncio.ensure_future to start the task. However, instead of using the semaphore via async with, we call the acquire and release methods manually because we don’t know when the task will end. The trick to getting this working efficiently is adding a done callback to the task (via add_done_callback) so that it releases the semaphore for us; this is highly efficient because we don’t have to constantly scan the tasks checking whether they are done, as in Andy’s approach.

TaskPool is used as an async context manager so that when the context is exited, we await the remaining tasks using asyncio.gather. This class can be imported from pypeln by whoever wants to make use of it; using it to our advantage, we can solve the problem quite easily.
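A simplified but working sketch of such a TaskPool follows; the real one lives in pypeln and may differ in its details:

```python
import asyncio

class TaskPool:
    """Run at most `workers` tasks at a time; put() blocks while the pool is full."""

    def __init__(self, workers):
        self._semaphore = asyncio.Semaphore(workers)
        self._tasks = set()

    async def put(self, coro):
        # Block here, before the task is even created, until a slot frees up.
        await self._semaphore.acquire()
        task = asyncio.ensure_future(coro)
        self._tasks.add(task)
        # The done callback releases the slot whenever the task finishes,
        # so no polling is needed.
        task.add_done_callback(self._on_task_done)

    def _on_task_done(self, task):
        self._tasks.remove(task)
        self._semaphore.release()

    async def __aenter__(self):
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Wait for whatever is still in flight before leaving the context.
        await asyncio.gather(*self._tasks)
```

With it, the client’s main loop becomes `async with TaskPool(limit) as pool:` followed by `await pool.put(fetch(session, url))` for each request.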

Basically we just have to pass the limit variable to the constructor of the TaskPool and then, inside the main loop, await the put method, passing it the fetch coroutine. As in Andy’s approach, we are able to make an unbounded/unlimited number of requests if we so desire, thanks to the efficient resource management.

You can also use the pypeln.asyncio_task.each function to simplify the code a bit.

Here we have created a generator called urls to make things clearer. each iterates over urls concurrently and runs the fetch coroutine on each element (it works much like the map function from the standard library, except that it doesn’t return any values). We set the workers parameter to limit (under the hood this creates a TaskPool which receives that parameter), and use the on_start and on_done callbacks to handle the aiohttp.ClientSession, which is passed as a parameter to fetch.
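The real each also accepts the on_start/on_done callbacks just described; stripped down to its concurrency core, its behaviour can be sketched like this (illustrative only, not pypeln’s actual code):

```python
import asyncio

async def each(coro_f, iterable, workers=1):
    """Apply coro_f to every element with at most `workers` coroutines
    in flight at once; like map, but no values are returned."""
    semaphore = asyncio.Semaphore(workers)
    tasks = set()

    def _release(task):
        tasks.discard(task)
        semaphore.release()

    for x in iterable:
        await semaphore.acquire()  # block while the pool is full
        task = asyncio.ensure_future(coro_f(x))
        tasks.add(task)
        task.add_done_callback(_release)

    await asyncio.gather(*tasks)  # wait for the stragglers
```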


I am going to run each of the clients described here in order with 100_000 requests (for the sake of time) using the timing script.

➜ bash python 100_000
Memory usage: 352684KB Time: 154.87 seconds CPU usage: 38%
➜ bash python 100_000
Memory usage: 57548KB Time: 154.91 seconds CPU usage: 100%
➜ bash python 100_000
Memory usage: 58188KB Time: 153.40 seconds CPU usage: 36%
➜ bash python 100_000
Memory usage: 63624KB Time: 154.39 seconds CPU usage: 37%

A few things to note:

  • Paweł Miech’s semaphore approach has a higher memory usage (almost 10x) and would blow up if we requested a much bigger number, e.g. 100 million, although for this case its time is good and its CPU usage is low.
  • Andy’s continuous-monitoring approach uses the least amount of memory, but its CPU consumption is excessive: it saturates 100% of one of the cores.
  • Both the pure TaskPool and the pypeln.asyncio_task.each approaches have fairly similar metrics: they are equally fast, memory efficient, and have low CPU usage; judging by the numbers, they are possibly the best methods.
  • If you truly want to make an unlimited number of requests, you can use an iterable/generator that doesn’t terminate instead of range.
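That last point can be as small as swapping range for itertools.count, e.g. with a hypothetical urls generator:

```python
import itertools

def urls():
    # Never terminates: yields a fresh URL for every request, forever.
    for i in itertools.count():
        yield "http://localhost:8080/{}".format(i)
```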


The asyncio module and the new async/await syntax enable us to create very powerful IO programs in Python that were once only within the grasp of languages like Erlang/Elixir, Go, or even Node.js. However, some things are hard to get right, especially since there is very little material out there, libraries for these kinds of tasks are only now being written, and the paradigm itself is quite different.

I hope this post is useful to those wanting to build high-performance IO applications in Python. Thanks to Andy Balaam for his post, which served as an inspiration when implementing my code, and for his feedback.

In the future I want to build a more real-world benchmark which involves downloading, resizing, and storing a huge number of images.