I Am Gunicorn

May 17, 2022

Most standard web servers, such as Apache, don’t know how to handle requests to Python applications. To solve this we need a piece of software that takes the requests received from a client and forwards them to our Python application, which generates a response for the server to return to the client. Historically there was no standard for how this should operate, which led to WSGI (Web Server Gateway Interface): a set of rules that allows any WSGI-compliant server to work with any WSGI-compliant Python application. The benefit is the freedom to swap out the WSGI-compliant server without having to alter the Python application, and vice versa. WSGI also helps web servers scale to handle thousands of requests to a Python application, and lets you easily tailor its functionality with middleware components for authentication, caching, filtering and so on.
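To make the middleware idea concrete, here is a minimal sketch of a WSGI middleware that wraps another WSGI application and rejects requests without an API key. The names require_api_key and hello_app, the X-Api-Key header and the "secret" value are all made up for illustration; only the (environ, start_response) calling convention comes from WSGI itself.

```python
def require_api_key(app, expected_key):
    """Wrap a WSGI app; reject requests that lack the expected X-Api-Key header."""
    def middleware(environ, start_response):
        # WSGI exposes request headers as HTTP_* keys in environ.
        if environ.get("HTTP_X_API_KEY") != expected_key:
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Key matches: hand the request to the wrapped application unchanged.
        return app(environ, start_response)
    return middleware


def hello_app(environ, start_response):
    """Hypothetical inner application that always says hello."""
    body = b"Hello, World!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


# The wrapped object is itself a WSGI callable, so a server can run it directly.
app = require_api_key(hello_app, "secret")
```

Because the middleware takes and returns a WSGI callable, neither the server nor the inner application needs to know it is there.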

Gunicorn is one implementation of a WSGI server for Python applications.

What is Gunicorn?

As mentioned above, Gunicorn is a WSGI-compliant web server for Python applications. It receives requests sent to the web server by a client and forwards them to the Python application or web framework (such as Flask or Django) so that the appropriate application code is run for each request.

To run a Python application with Gunicorn, the application needs to be WSGI compliant. This means it needs to provide a callable that takes a defined pair of arguments, which the server passes in with every request, for example: def app(environ, start_response).

With Gunicorn running, when it receives an HTTP request from a client it packages the HTTP details (such as the route, method and request body) into an environ variable, which it then passes to the callable app() function defined in the application’s code. Within the application or framework, these details are used to make sure that the code assigned to the requested route is run with all of the necessary information from the request. Once the application logic has run and generated a response, it is passed back to the client via Gunicorn.
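As a rough illustration of what arrives in environ, the sketch below reads the method, route and request body using the standard WSGI keys REQUEST_METHOD, PATH_INFO, CONTENT_LENGTH and wsgi.input. The names echo_request_app and call_app are hypothetical, and call_app only imitates what a server such as Gunicorn does on each request; a real server fills in many more environ keys.

```python
import io


def echo_request_app(environ, start_response):
    """Hypothetical WSGI app that reports what it found in environ."""
    method = environ["REQUEST_METHOD"]
    path = environ["PATH_INFO"]
    # CONTENT_LENGTH may be missing or empty when there is no request body.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    request_body = environ["wsgi.input"].read(length)

    body = f"{method} {path} with {len(request_body)} byte body\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, method, path, payload=b""):
    """Hypothetical helper: call a WSGI app by hand, the way a server would."""
    environ = {
        "REQUEST_METHOD": method,
        "PATH_INFO": path,
        "CONTENT_LENGTH": str(len(payload)),
        "wsgi.input": io.BytesIO(payload),
    }
    captured = {}

    def start_response(status, headers):
        # Record what the application asked the server to send.
        captured["status"] = status
        captured["headers"] = headers

    chunks = app(environ, start_response)
    return captured["status"], b"".join(chunks)
```

Calling call_app(echo_request_app, "POST", "/users", b"abc") returns the status string together with the joined response body, which is a handy way to exercise a WSGI callable without starting a server.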

All of this is done by Gunicorn workers, which are processes that pass requests through to the Python application and return the generated response. The number of workers Gunicorn uses is defined in its configuration, or on the command line when Gunicorn is started.
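For the configuration route, Gunicorn config files are plain Python. The sketch below assumes a file named gunicorn.conf.py in the directory Gunicorn is started from (the name recent Gunicorn versions look for by default; any file can be passed with -c). The bind address is only an example value, and the worker count uses the (2 x cores) + 1 starting point suggested in the Gunicorn documentation, which you should tune for your own workload.

```python
# gunicorn.conf.py (assumed file name; pass another path with `gunicorn -c`)
import multiprocessing

# Address and port to listen on (example value).
bind = "127.0.0.1:5000"

# Number of worker processes: (2 x cores) + 1 is a starting point, not a rule.
workers = multiprocessing.cpu_count() * 2 + 1
```

Settings given on the command line take precedence over the config file, so a --workers flag can still override this value for a one-off run.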

Gunicorn uses a pre-fork worker model. This means that a single master process spins up the workers that handle requests coming into the server (usually between 4 and 12 workers are all that is needed to handle thousands of requests per second). The master process only manages the workers, however, and knows nothing about how they operate. These worker processes can be sync or async:

sync workers handle a single request at a time, and the connection is closed when the response is sent back to the client.
async workers use greenlets to handle multiple requests at a time. However, unlike proper threads, greenlets are co-operative and sequential, so only one can be active at a time. Processing switches between them on demand from the server, with the state or stack of each greenlet saved for when it is later resumed.
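The pre-fork model described above can be sketched in a few lines with os.fork. This is a toy illustration rather than Gunicorn's actual code: prefork and work are hypothetical names, it only runs on Unix-like systems, and a real master process also handles signals, restarts dead workers and shares a listening socket with them.

```python
import os


def prefork(num_workers, work):
    """Fork num_workers children from one master; return {pid: exit_code}."""
    children = []
    for worker_id in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Child (worker) process: do the work, then exit immediately so it
            # never carries on running the master's code.
            exit_code = 1
            try:
                work(worker_id)
                exit_code = 0
            finally:
                os._exit(exit_code)
        # Master process: remember the worker's pid and keep forking.
        children.append(pid)

    # The master only supervises: it waits for workers, it does not do their job.
    results = {}
    for pid in children:
        _, status = os.waitpid(pid, 0)
        results[pid] = os.WEXITSTATUS(status)
    return results
```

Calling prefork(2, some_function) forks two workers that each run some_function with their worker number, and returns once both have exited.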

How do you Set Up Gunicorn?

Gunicorn can be installed manually from the command line with pip, or it can be listed as one of the application’s dependencies in requirements.txt:

pip install gunicorn

Within the application’s code you will also need to create the callable that will be Gunicorn’s entry point into the application. For example, you might define a wsgi.py file with one of the following, depending on whether or not you are using a web framework:

Without a web framework:

def app(environ, start_response):
    data = b"Hello, World!\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(data)))
    ])
    return iter([data])

Using the Flask web framework:

from myproject import app

if __name__ == "__main__":
    app.run()

When you boot up the WSGI server, you provide the callable that acts as the link between the WSGI server and the WSGI-compliant application, in the form module:callable (here wsgi:app). The number of workers is also specified when starting the server:

gunicorn --bind 127.0.0.1:5000 --workers=2 wsgi:app

The following is example output generated when starting Gunicorn. You can see the URL it is listening on, what type of workers it is using, and when it boots up those workers to handle requests:

[2021-11-19 23:07:57 +0000] [8760] [INFO] Starting gunicorn 20.1.0
[2021-11-19 23:07:57 +0000] [8760] [INFO] Listening at: http://127.0.0.1:5000 (8760)
[2021-11-19 23:07:57 +0000] [8760] [INFO] Using worker: sync
[2021-11-19 23:07:57 +0000] [8763] [INFO] Booting worker with pid: 8763

Useful Links

Below are some resources that I found really helpful when getting to grips with Gunicorn and putting together this quick guide:

Basics of WSGI

WSGI for Web Developers

Serving Flask Applications with Gunicorn and Nginx

Gunicorn Documentation

Written by Nigel Pain