Selecting Gunicorn worker types for different Python web applications

6 min read · Mar 18, 2020

TL;DR: practical advice on selecting Gunicorn worker types for better performance.
1. Request per process: recommended for heavily CPU-bound applications where memory usage is not a big concern and the number of concurrent requests does not really matter (only parallel requests matter). The code stays simple, which is an advantage, but a badly handled third-party timeout can block the whole app.
2. Request per thread: recommended for heavily CPU-bound applications. Lower memory usage is a plus compared to the first option, but the source code needs to be thread-safe, and a third-party timeout is still a potential blocker.
3. Request per coroutine: good for I/O-bound applications that must handle a high number of concurrent requests. However, you may encounter some tricky bugs and potential issues with persistent connection handling.
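
As a rough sketch, the three options above map onto Gunicorn's `worker_class`, `workers`, `threads` and `worker_connections` settings. The helper name `gunicorn_settings` and the specific numbers (4 threads, 1000 connections) are illustrative assumptions, not recommendations from this article; the `2 * cores + 1` worker count follows the starting point suggested in the Gunicorn documentation, and the `gevent` worker class requires the gevent package to be installed.

```python
import multiprocessing


def gunicorn_settings(style, cpu_count=None):
    """Return illustrative Gunicorn settings for one of the three styles.

    `style` is a hypothetical label: "process", "thread" or "coroutine".
    """
    cores = cpu_count or multiprocessing.cpu_count()
    if style == "process":
        # sync workers: each worker process handles one request at a time
        return {"worker_class": "sync", "workers": 2 * cores + 1}
    if style == "thread":
        # gthread workers: each process serves requests from a thread pool
        return {"worker_class": "gthread", "workers": cores, "threads": 4}
    if style == "coroutine":
        # gevent workers: each process multiplexes many connections
        return {
            "worker_class": "gevent",
            "workers": cores,
            "worker_connections": 1000,
        }
    raise ValueError(f"unknown style: {style!r}")


if __name__ == "__main__":
    for name in ("process", "thread", "coroutine"):
        print(name, gunicorn_settings(name))
```

The returned keys are the names Gunicorn reads as module-level variables from a config file such as `gunicorn.conf.py`, so the values can be copied there directly (for example `worker_class = "gthread"` and `threads = 4`).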

Production Python web application stack

Normally, a Python web application is deployed in production with three main components:

A web server (like nginx): hosts static files and handles HTTP connections.
A WSGI HTTP server (like Gunicorn): spawns multiple processes of the web application and distributes the load among them.
A web application: the actual application logic, written using a web framework (such as Django or Flask).
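
To make the third component concrete, here is a minimal sketch of a framework-free WSGI application, the kind of callable that frameworks such as Django and Flask ultimately expose to Gunicorn. The names `app` and `call_app` and the response text are assumptions made for illustration; `call_app` only simulates what a WSGI server does when it invokes the application.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A minimal WSGI application: the callable a WSGI server invokes."""
    body = b"Hello from behind Gunicorn\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(path="/"):
    """Invoke `app` the way a WSGI server would, without any network."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a plausible request environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_app())
```

If this were saved as `hello.py` (a hypothetical file name), running `gunicorn hello:app` would serve it, with nginx typically placed in front as a reverse proxy.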

Written by Tuan Nhu Dinh