Published in VeepeeTech, Jul 16, 2018

Samy AÏT-OUAKLI and Anne-Sophie TANGUY are both interns at the R&D Laboratory EPITECH of vente-privee and third-year students at EPITECH. Their work revolves around researching and developing projects proposed by vente-privee’s different product teams.

Bottle WSGI servers and Unix sockets in Python

What if you were to use Bottle’s servers with Unix sockets? Crazy, you would say. Now let’s say you wanted to increase these servers’ throughput. You might want to wrap them in something better that can handle more requests per second. If the solution you found requires Bottle’s WSGI servers to start on Unix sockets, you’re in the right place.

Single-process servers

We tried out some of the servers that run with Bottle. The tests were made with the load-testing framework Locust, which simulated 1000 users, each waiting between 500 ms and 900 ms between tasks.

Bjoern and Meinheld are both asynchronous servers. Cherrypy and Waitress, however, are multi-threaded.

The tests were run on an Amazon EC2 instance with 8 cores and 16 GB of RAM.

They all have quite a decent average response time, but they can’t handle many requests per second. This is where the trouble started. We needed more, way more requests handled per second.

While monitoring the CPU usage of the servers, we noticed that only one core was used (and at full capacity), even when the server was multi-threaded. Funnily enough, the multi-threaded servers (Cherrypy and Waitress in the previous example) were even slower than the others. That’s because of the GIL.

Have you heard of the GIL? It is Python’s Global Interpreter Lock, and it’s the one causing you so much trouble with your threads. This mechanism synchronizes the execution of threads in CPython so that only one native thread can execute Python bytecode at a time, which sharply limits the performance of CPU-bound threads on multicore systems.
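A minimal sketch of the effect, assuming a standard CPython build. The names “burn” and “timed” are hypothetical helpers made up for this illustration, and this is not the benchmark used in this article. It runs the same CPU-bound function through a thread pool or a process pool and reports the elapsed time.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def burn(n):
    """CPU-bound busy work: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def timed(executor_cls, jobs, n):
    """Run `jobs` copies of burn(n) in the given pool type.

    Returns (results, elapsed_seconds).
    """
    start = time.perf_counter()
    with executor_cls(max_workers=jobs) as pool:
        results = list(pool.map(burn, [n] * jobs))
    return results, time.perf_counter() - start
```

Called from a script’s “if __name__ == '__main__':” block, timed(ThreadPoolExecutor, 4, 2000000) should take about as long as running the four jobs one after another, while timed(ProcessPoolExecutor, 4, 2000000) can spread them over several cores. Exact timings depend on the machine, so none are shown here.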

First try with TCP/IP sockets

We thought that if the GIL was the problem, we should go for a program that encapsulates these servers and runs many of them thanks to multiprocessing, but we’ll see this part later.

We wanted this program to act as a proxy: get requests from clients, send them to the servers, get their answers and send them back to the clients. To do this we needed a communication channel between our program and the servers.

We opted for a TCP/IP socket at first. This way we could make our program portable. But this could lead to people getting access to this socket from outside and ruining the dispatching of the requests.

Second try with Unix sockets

We realized we should have used Unix sockets from the beginning instead of TCP/IP. This is where you tell yourself: Of course, silly. You should have done that from the start! You do not communicate over the network for this part.

But still, why would Unix sockets be better than TCP/IP ones? Unix sockets are an inter-process communication mechanism that allows bidirectional data exchange between processes running on the same machine. Unix domain sockets use the file system as their address name space, which means you can use Unix file permissions to control who may communicate through them. TCP/IP sockets on the same machine, by contrast, are connections looped back through localhost. In short, Unix sockets involve less copying of data and fewer context switches, so they should perform better overall.
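The file-system addressing can be sketched with nothing but the standard library. This is an illustration only: “roundtrip” and “serve_once” are hypothetical names, and the temporary socket path is an assumption. The server binds to a path, restricts it to its owner with ordinary file permissions, and answers one client.

```python
import os
import socket
import stat
import tempfile
import threading


def serve_once(server_sock):
    """Accept one connection and send back the upper-cased payload."""
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(conn.recv(1024).upper())


def roundtrip(message):
    """Send `message` over a Unix socket; return (reply, permission bits)."""
    with tempfile.TemporaryDirectory() as tmp:
        socket_path = os.path.join(tmp, "demo.sock")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(socket_path)      # creates the socket file
            os.chmod(socket_path, 0o600)  # only the owner may connect
            mode = stat.S_IMODE(os.stat(socket_path).st_mode)
            server.listen(1)
            worker = threading.Thread(target=serve_once, args=(server,))
            worker.start()
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(socket_path)  # the address is a path, not host:port
                client.sendall(message)
                reply = client.recv(1024)
            worker.join()
    return reply, mode
```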

Now comes the fun part: using Bottle’s WSGI servers with Unix sockets.

You might desperately try to find documentation on running these servers this way, but they are almost never used as such. The first solution is to read the source code; the second is to read this article.

Here are some examples of these servers starting on a Unix socket. Some of them let you add options through their run() or server() function, so we’ll leave the research of their specific options to you.

We take “socket_path” to be the path to your Unix socket (a simple string) and “wsgi_application” to be your WSGI application.

Here is a simple Hello World WSGI application so you can test them yourself:
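A minimal stand-in, written as a plain WSGI callable so that it needs no third-party package. With Bottle, the object returned by bottle.default_app() is a WSGI callable too and can be passed to the servers in the same way.

```python
def wsgi_application(environ, start_response):
    """Smallest useful WSGI app: answers every request with plain text."""
    body = b"Hello World"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```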

Meinheld (Asynchronous):

You had better encode your “socket_path” (that is, pass it as bytes) if you don’t want to encounter a “TypeError: args tuple or string(path)”.
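A minimal sketch, assuming Meinheld’s server.listen and server.run entry points. The names “encode_socket_path” and “serve_with_meinheld” are hypothetical helpers, and the import is deferred so that the file loads even where Meinheld is not installed.

```python
import os


def encode_socket_path(socket_path):
    """Return the path as bytes, the form Meinheld expects for Unix sockets."""
    return os.fsencode(socket_path)


def serve_with_meinheld(wsgi_application, socket_path):
    from meinheld import server  # third-party, imported lazily

    # A str path triggers the TypeError quoted above, hence the encoding.
    server.listen(encode_socket_path(socket_path))
    server.run(wsgi_application)  # blocks until the server stops
```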

Link to GitHub

Link to documentation

Waitress (Multi-threaded):
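A minimal sketch, assuming Waitress’s serve() function with its “unix_socket” and “unix_socket_perms” arguments. The helper names and the permission value chosen here are assumptions made for illustration.

```python
def waitress_options(socket_path, perms="600"):
    """Keyword arguments for waitress.serve(); perms is an octal string."""
    return {"unix_socket": socket_path, "unix_socket_perms": perms}


def serve_with_waitress(wsgi_application, socket_path):
    from waitress import serve  # third-party, imported lazily

    serve(wsgi_application, **waitress_options(socket_path))  # blocks
```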

Link to GitHub

Link to documentation

Cherrypy (Multi-threaded):

This version of Cherrypy still accepts Bottle’s WSGI applications. The version we used for this example is Cherrypy 8.9.1.
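A minimal sketch for the 8.x line, assuming that cherrypy.wsgiserver.CherryPyWSGIServer takes the socket path as its bind address. “serve_with_cherrypy” is a hypothetical helper name. Later Cherrypy releases moved this server into the separate Cheroot package.

```python
def serve_with_cherrypy(wsgi_application, socket_path):
    # cherrypy.wsgiserver only exists in older releases such as 8.9.1.
    from cherrypy import wsgiserver  # third-party, imported lazily

    # A string bind address is treated as a Unix socket path.
    server = wsgiserver.CherryPyWSGIServer(socket_path, wsgi_application)
    try:
        server.start()  # blocks until interrupted
    finally:
        server.stop()
```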

Link to GitHub (This is the latest Cherrypy)

Link to documentation

Cheroot (Multi-threaded):
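A minimal sketch, assuming cheroot.wsgi.Server accepts a string bind address as a Unix socket path, as its Cherrypy predecessor did. “serve_with_cheroot” is a hypothetical helper name.

```python
def serve_with_cheroot(wsgi_application, socket_path):
    from cheroot.wsgi import Server  # third-party, imported lazily

    server = Server(socket_path, wsgi_application)
    try:
        server.start()  # blocks until interrupted
    except KeyboardInterrupt:
        pass
    finally:
        server.stop()
```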

Link to GitHub

Link to documentation

Gunicorn (Multi-process):

With Gunicorn you will want a class that inherits from its Application class. Set the handler to your application and pass a config dictionary whose “bind” key holds “unix:” followed by your socket path. Once this class is set up, you can start it with its run method.
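The steps above can be sketched as follows, modeled on the custom application pattern from Gunicorn’s documentation. The helper names and the default of 4 workers are assumptions made for illustration.

```python
def gunicorn_options(socket_path, workers=4):
    """Config dictionary: 'bind' is 'unix:' followed by the socket path."""
    return {"bind": "unix:" + socket_path, "workers": workers}


def serve_with_gunicorn(wsgi_application, socket_path, workers=4):
    from gunicorn.app.base import BaseApplication  # third-party, lazy import

    class StandaloneApplication(BaseApplication):
        def __init__(self, app, options):
            self.application = app
            self.options = options
            super().__init__()

        def load_config(self):
            # Copy only the keys Gunicorn knows about into its config.
            for key, value in self.options.items():
                if key in self.cfg.settings and value is not None:
                    self.cfg.set(key.lower(), value)

        def load(self):
            return self.application

    options = gunicorn_options(socket_path, workers)
    StandaloneApplication(wsgi_application, options).run()  # blocks
```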

Link to GitHub

Link to documentation

Bjoern (Asynchronous):
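A minimal sketch, assuming the “unix:” address form described in Bjoern’s README. “bjoern_address” and “serve_with_bjoern” are hypothetical helper names.

```python
def bjoern_address(socket_path):
    """Bjoern takes Unix sockets as a single 'unix:<path>' string."""
    return "unix:" + socket_path


def serve_with_bjoern(wsgi_application, socket_path):
    import bjoern  # third-party, imported lazily

    bjoern.run(wsgi_application, bjoern_address(socket_path))  # blocks
```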

Link to GitHub

Eventlet (Asynchronous):
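A minimal sketch, assuming eventlet.listen with its “family” argument and eventlet.wsgi.server. “serve_with_eventlet” is a hypothetical helper name.

```python
import socket


def serve_with_eventlet(wsgi_application, socket_path):
    import eventlet  # third-party, imported lazily
    from eventlet import wsgi

    # family=AF_UNIX makes eventlet.listen bind to a path instead of host:port.
    listener = eventlet.listen(socket_path, family=socket.AF_UNIX)
    wsgi.server(listener, wsgi_application)  # blocks
```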

Link to GitHub

Link to documentation

Tornado (Asynchronous):
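A minimal sketch in the classic synchronous style, assuming Tornado’s WSGIContainer, HTTPServer.add_socket and tornado.netutil.bind_unix_socket. “serve_with_tornado” is a hypothetical helper name, and newer Tornado releases may prefer an asyncio-based startup.

```python
def serve_with_tornado(wsgi_application, socket_path):
    # Third-party imports, deferred so the file loads without Tornado.
    from tornado.httpserver import HTTPServer
    from tornado.ioloop import IOLoop
    from tornado.netutil import bind_unix_socket
    from tornado.wsgi import WSGIContainer

    server = HTTPServer(WSGIContainer(wsgi_application))
    server.add_socket(bind_unix_socket(socket_path))  # creates and binds the socket
    IOLoop.current().start()  # blocks
```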

Link to GitHub

Link to documentation

Twisted (Asynchronous):

Twisted will have you create and start its thread pool yourself. How rude of it. You then add an event trigger so that the pool stops at shutdown, create a factory from your WSGI application, call listenUNIX with your socket path and the newly created factory, and finally start the server with the reactor’s run method.
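The steps above can be sketched as follows, assuming Twisted’s ThreadPool, WSGIResource, Site and reactor.listenUNIX. “serve_with_twisted” is a hypothetical helper name.

```python
def serve_with_twisted(wsgi_application, socket_path):
    # Third-party imports, deferred so the file loads without Twisted.
    from twisted.internet import reactor
    from twisted.python.threadpool import ThreadPool
    from twisted.web.server import Site
    from twisted.web.wsgi import WSGIResource

    pool = ThreadPool()
    pool.start()
    # Stop the pool once the reactor has shut down.
    reactor.addSystemEventTrigger("after", "shutdown", pool.stop)

    factory = Site(WSGIResource(reactor, pool, wsgi_application))
    reactor.listenUNIX(socket_path, factory)
    reactor.run()  # blocks
```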

Link to GitHub

Link to documentation

Improvements with multiprocessing

After setting up all those servers within your project you might encounter a new issue. Let’s say you want to connect multiple instances of the same server to one Unix socket. At first you would say: no matter, a socket can handle multiple connections at once. That holds unless each server tries to create the socket itself and bind to the same path, which causes a bind error. For those servers you will need one Unix socket per server instance. The others, which check whether the socket is already bound, shouldn’t run into this issue.

Here is the list of those that will give you trouble:
Bjoern, Eventlet, Tornado and Twisted.
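The bind error itself can be reproduced with the standard library alone. In this sketch, “second_bind_errno” and “per_instance_paths” are hypothetical helpers, and the file naming scheme is an assumption. The first function binds two sockets to the same path, the second hands out one distinct path per server instance.

```python
import errno
import os
import socket
import tempfile


def second_bind_errno(socket_path):
    """Bind two sockets to one path; return the errno of the second bind."""
    first = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        first.bind(socket_path)
        try:
            second.bind(socket_path)
        except OSError as exc:
            return exc.errno
        return None
    finally:
        first.close()
        second.close()


def per_instance_paths(directory, count):
    """One socket path per server instance, to avoid the clash."""
    return [os.path.join(directory, "server-%d.sock" % i) for i in range(count)]
```

On Linux the second bind fails with errno.EADDRINUSE (“Address already in use”).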

Note that using Twisted with multiprocessing won’t be as easy as with the other ones.
In fact, Twisted has its own event-driven way of running sub-processes. You’ll need to use its spawnProcess API to handle the output from sub-processes. If you stick with Python’s multiprocessing library, you’ll need to develop your own queue-based way of getting output from a Twisted sub-process.

For the tests to come we need Locust to use multiple processes, since one machine was not enough to simulate enough users to stress test the program. For this you first need to launch Locust:

And then launch its child processes:
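A minimal sketch of the two launch steps, expressed as command lists that can be handed to subprocess.Popen. It assumes Locust’s “--master” and “--master-host” flags; the child-process flag is “--slave” in the Locust releases of 2018 and “--worker” in current ones. The function name, its parameters and the default master address are hypothetical.

```python
def locust_commands(locustfile, children, master_host="127.0.0.1", legacy=True):
    """Return (master_command, [child_commands]) for a distributed run."""
    child_flag = "--slave" if legacy else "--worker"
    master = ["locust", "-f", locustfile, "--master"]
    child = ["locust", "-f", locustfile, child_flag,
             "--master-host=" + master_host]
    return master, [list(child) for _ in range(children)]
```

Start the master command first, then each child command, for example with subprocess.Popen(command).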

Here is the result with Meinheld running on Terpan versus Meinheld alone, both running on Bottle:

This time Locust launched 7000 users, each waiting between 500 ms and 900 ms before sending a request:

The test was run on an 8-core EC2 instance with 16 GB of RAM.

As you can see, we managed to handle way more requests per second than Meinheld alone thanks to multiprocessing. Note that Terpan can use the whole capacity of the machine because it uses all the cores, whereas Meinheld uses the full capacity of a single core. This lets Terpan scale with the number of cores.

Some servers have no way to connect to a Unix socket. However, if you find more that can, do not hesitate to share them in the comments below.

We hope this saved you search time or gave you some ideas.

_________

Authors: Samy AÏT-OUAKLI and Anne-Sophie TANGUY
