A brief introduction to the worker types in Gunicorn and the scenarios each one suits

Genchi Lu
Jan 30, 2018 · 6 min read

In Python 2.7, Gunicorn provides several types of workers: sync, gthread, eventlet, gevent, and tornado.

I classify them into three categories according to how they handle requests:

  1. One process per request: with the sync worker type, gunicorn dedicates one process to each request.
  2. One thread per request: with the gthread worker type, gunicorn handles each request on a thread spawned within a worker process.
  3. Async IO: with the eventlet, gevent, or tornado worker types, gunicorn handles multiple requests in one process using async IO.

Below is a code snippet with two simple tasks: one sleeps 2 seconds to simulate an IO-bound task, the other just calculates multiplications to simulate a CPU-bound task. In the rest of this article, I briefly explain how each category of worker performs while running these two tasks.
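The original snippet was not preserved in this post, so here is a minimal sketch of what the two views might look like, assuming a Django app (the paths /ioTask and /cpuTask and the HelloWorld project name come from the logs below; the function bodies are my reconstruction):

# views.py -- a reconstruction, not the original snippet
import time

from django.http import HttpResponse


def io_task(request):
    # IO-bound task: sleep 2 seconds to simulate waiting on IO
    time.sleep(2)
    return HttpResponse("io task finished")


def cpu_task(request):
    # CPU-bound task: plain multiplication in a tight loop
    total = 0
    for i in range(10000000):
        total += i * i
    return HttpResponse("cpu task finished")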

One process per request

When the worker type is set to sync, gunicorn prepares its worker processes on startup. In theory, the maximum concurrency equals the number of worker processes. Below, on startup gunicorn prepares one process, with PID 569, to handle incoming requests. It can only handle one request at a time:

$> gunicorn -w 1 -k sync HelloWorld.wsgi:application -b 192.168.55.100:80
[2018-01-29 16:35:05 +0000] [564] [INFO] Starting gunicorn 19.7.1
[2018-01-29 16:35:05 +0000] [564] [INFO] Listening at: http://192.168.55.100:80 (564)
[2018-01-29 16:35:05 +0000] [564] [INFO] Using worker: sync
[2018-01-29 16:35:05 +0000] [569] [INFO] Booting worker with pid: 569

Then I use siege to test each task by sending two requests simultaneously. You can observe that the second request was blocked until the first one finished.

// Test IO-bound task
$> siege -c 2 -r 1 http://192.168.55.100/ioTask -v
** SIEGE 3.0.5
** Preparing 2 concurrent users for battle.
The server is now under siege...
HTTP/1.1 200   3.01 secs:      22 bytes ==> GET  /ioTask
// Next request was blocked
HTTP/1.1 200   5.02 secs:      22 bytes ==> GET  /ioTask

// Test CPU-bound task
$> siege -c 2 -r 1 http://192.168.55.100/cpuTask -v
** SIEGE 3.0.5
** Preparing 2 concurrent users for battle.
The server is now under siege...
HTTP/1.1 200   1.47 secs:      23 bytes ==> GET  /cpuTask
// Next request was blocked
HTTP/1.1 200   3.09 secs:      23 bytes ==> GET  /cpuTask

The advantage is strong error isolation: if one process crashes, it affects only the request that process was handling; other requests are unaffected.

The disadvantage is that a process costs more OS resources. As the number of processes grows, so does CPU and memory consumption, much of it unnecessary, so the sync worker type yields comparatively low concurrency.
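How many sync workers is reasonable? Gunicorn's documentation suggests roughly (2 x number of cores) + 1 as a starting point. A minimal config sketch (the file name gunicorn_conf.py is my own choice; pass it with gunicorn -c):

# gunicorn_conf.py -- a minimal sketch; the (2 x cores) + 1 rule of thumb
# comes from gunicorn's documentation, the file name is arbitrary
import multiprocessing

worker_class = "sync"
workers = multiprocessing.cpu_count() * 2 + 1

$> gunicorn -c gunicorn_conf.py HelloWorld.wsgi:application -b 192.168.55.100:80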

One thread per request

When the worker type is set to gthread, gunicorn also prepares worker processes on startup, and each process handles incoming requests on threads it spawns. In theory, the maximum concurrency equals the number of worker processes times the number of threads per process. Below, on startup gunicorn prepares one process, with PID 595, and the number of threads is set to 2, so theoretically it can handle 2 requests at a time:

$> gunicorn -w 1 -k gthread --threads=2 HelloWorld.wsgi:application -b 192.168.55.100:80
[2018-01-29 16:50:21 +0000] [590] [INFO] Starting gunicorn 19.7.1
[2018-01-29 16:50:21 +0000] [590] [INFO] Listening at: http://192.168.55.100:80 (590)
[2018-01-29 16:50:21 +0000] [590] [INFO] Using worker: gthread
[2018-01-29 16:50:21 +0000] [595] [INFO] Booting worker with pid: 595
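As a worked example of that multiplication (an illustrative command, not one of the test runs in this article), 2 worker processes with 4 threads each would give a theoretical ceiling of 2 x 4 = 8 concurrent requests:

$> gunicorn -w 2 -k gthread --threads=4 HelloWorld.wsgi:application -b 192.168.55.100:80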

Then I use siege to test each task by sending 4 requests simultaneously. You can observe that requests began to be blocked once the third request came in.

// Test IO-bound task
$> siege -c 4 -r 1 http://192.168.55.100/ioTask -v
** SIEGE 3.0.5
** Preparing 4 concurrent users for battle.
The server is now under siege...
HTTP/1.1 200   2.00 secs:      22 bytes ==> GET  /ioTask
HTTP/1.1 200   2.00 secs:      22 bytes ==> GET  /ioTask
// Next request was blocked
HTTP/1.1 200   4.01 secs:      22 bytes ==> GET  /ioTask
HTTP/1.1 200   4.01 secs:      22 bytes ==> GET  /ioTask

// Test CPU-bound task
$> siege -c 4 -r 1 http://192.168.55.100/cpuTask -v
** SIEGE 3.0.5
** Preparing 4 concurrent users for battle.
The server is now under siege...
HTTP/1.1 200   3.00 secs:      23 bytes ==> GET  /cpuTask
HTTP/1.1 200   3.32 secs:      23 bytes ==> GET  /cpuTask
// Next request was blocked
HTTP/1.1 200   5.20 secs:      23 bytes ==> GET  /cpuTask
HTTP/1.1 200   5.44 secs:      23 bytes ==> GET  /cpuTask

The advantage is higher concurrency than one process per request. But the OS still limits the number of threads, and too many threads would still burden the system.

Using async IO to handle each request

When the worker type is set to eventlet, gevent, or tornado, one process handles multiple requests using async IO: the process does not wait for an IO operation to complete, but keeps handling other requests until it does. In theory, its concurrency is unlimited. I use gevent as the example: on startup, gunicorn prepares one process, with PID 733, to handle incoming requests.

$> gunicorn -w 1 -k gevent HelloWorld.wsgi:application -b 192.168.55.100:80
[2018-01-29 17:11:03 +0000] [728] [INFO] Starting gunicorn 19.7.1
[2018-01-29 17:11:03 +0000] [728] [INFO] Listening at: http://192.168.55.100:80 (728)
[2018-01-29 17:11:03 +0000] [728] [INFO] Using worker: gevent
[2018-01-29 17:11:03 +0000] [733] [INFO] Booting worker with pid: 733

Using siege to send 10 concurrent requests to the IO-bound task simultaneously, no requests were blocked:

// Test IO-bound task
$> siege -c 10 -r 1 http://192.168.55.100/ioTask -v
** SIEGE 3.0.5
** Preparing 10 concurrent users for battle.
The server is now under siege...
HTTP/1.1 200   2.01 secs:      22 bytes ==> GET  /ioTask
HTTP/1.1 200   2.02 secs:      22 bytes ==> GET  /ioTask
HTTP/1.1 200   2.02 secs:      22 bytes ==> GET  /ioTask
HTTP/1.1 200   2.02 secs:      22 bytes ==> GET  /ioTask
HTTP/1.1 200   2.01 secs:      22 bytes ==> GET  /ioTask
HTTP/1.1 200   2.01 secs:      22 bytes ==> GET  /ioTask
HTTP/1.1 200   2.01 secs:      22 bytes ==> GET  /ioTask
HTTP/1.1 200   2.02 secs:      22 bytes ==> GET  /ioTask
HTTP/1.1 200   2.02 secs:      22 bytes ==> GET  /ioTask
HTTP/1.1 200   2.02 secs:      22 bytes ==> GET  /ioTask
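The reason the 2-second sleep does not block the worker is that gunicorn's gevent worker monkey-patches the standard library when it boots, turning time.sleep into a cooperative yield. A standalone sketch of the same effect (plain gevent, outside gunicorn):

# a standalone sketch: monkey-patched time.sleep yields to the event loop
from gevent import monkey
monkey.patch_all()  # the gevent worker does the equivalent for your app code

import time
import gevent


def io_task(n):
    time.sleep(2)  # cooperative: other greenlets run during the sleep
    print("request %d finished" % n)


start = time.time()
gevent.joinall([gevent.spawn(io_task, i) for i in range(10)])
print("10 requests in %.2f seconds" % (time.time() - start))  # ~2s, not ~20s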

But when sending 10 concurrent requests to the CPU-bound task simultaneously, it behaves just like sync: except for the first request, each request was blocked by the one before it.

// Test CPU-bound task
$> siege -c 10 -r 1 http://192.168.55.100/cpuTask -v
** SIEGE 3.0.5
** Preparing 10 concurrent users for battle.
The server is now under siege...
HTTP/1.1 200   1.61 secs:      23 bytes ==> GET  /cpuTask
// Next request was blocked
HTTP/1.1 200   3.20 secs:      23 bytes ==> GET  /cpuTask
HTTP/1.1 200   4.88 secs:      23 bytes ==> GET  /cpuTask
HTTP/1.1 200   6.38 secs:      23 bytes ==> GET  /cpuTask
HTTP/1.1 200   6.97 secs:      23 bytes ==> GET  /cpuTask
HTTP/1.1 200   8.60 secs:      23 bytes ==> GET  /cpuTask
HTTP/1.1 200  10.12 secs:      23 bytes ==> GET  /cpuTask
HTTP/1.1 200  11.74 secs:      23 bytes ==> GET  /cpuTask
HTTP/1.1 200  13.25 secs:      23 bytes ==> GET  /cpuTask
HTTP/1.1 200  14.72 secs:      23 bytes ==> GET  /cpuTask

It is very clear that async IO handles IO-bound requests efficiently, but it delivers low concurrency when handling CPU-bound requests.
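A standalone sketch of why this happens (again plain gevent, outside gunicorn): a greenlet only yields at IO or an explicit sleep, so a tight computation loop holds the event loop until it finishes, and the greenlets run one after another:

# a standalone sketch: CPU-bound greenlets never yield, so they serialize
from gevent import monkey
monkey.patch_all()

import time
import gevent


def cpu_task(n):
    total = 0
    for i in range(5000000):
        total += i * i  # pure computation: no yield point for other greenlets
    print("request %d finished at %.2f seconds" % (n, time.time() - start))


start = time.time()
gevent.joinall([gevent.spawn(cpu_task, i) for i in range(3)])
# the finish times step up roughly linearly: the tasks ran back to back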

Conclusion

I think we must define the scenario first before choosing a worker type:

If you want a stable system and hope that an exception in one request would not affect other requests, sync is what you need.

If the tasks in your app are mostly IO-bound, then async IO is good for you.

If most of the tasks in your app are CPU-bound, then you should consider gthread first.

Written by Genchi Lu

I am Genchi, a backend engineer in Taiwan.