WSGI Is Not Enough Anymore — Part I

This is the first part of a multi-part series discussing the limitations of WSGI-based Python web applications and the ways to overcome them.

In this post we will explore the factors that make WSGI a less attractive option for developing web applications with Python. The topics included are the WSGI-compatible synchronous model and its lack of support for communication protocols other than HTTP.

What is wrong with WSGI?

The Python ecosystem offers a wide range of frameworks for developing web applications. They come in all shapes and flavors, and with so much to choose from, it is easy to find the one best suited to the intended project.

Micro frameworks such as Flask and Bottle can get you up and running in no time. Others, such as Pyramid and Django, offer a full-stack experience, with the latter being the most popular and possibly the most complete. Most of these frameworks have a rather large community of users, and it's easy to find tutorials and documentation on how to make the most of each. Over the years several comparisons have been made between Python web frameworks in an effort to help developers identify the one that best fits their requirements and circumstances.

A common factor between these web frameworks (apart from all being written in Python) is that they all implement the WSGI standard for web applications written in Python.

The Web Server Gateway Interface (WSGI) is a specification which was first developed in 2003, and then revised in 2010, in order to create a standard for Python web frameworks to interact with web servers.

Such a standard decouples web application development from the production web server and allows more choices when deploying the app to production. Thus, after developing an app using Django or Flask, developers can deploy it with any production-grade server that implements the WSGI specification, such as uWSGI or Gunicorn.

WSGI makes it extremely easy to place a web server such as Apache or Nginx in front of the Python web application and proxy all relevant requests to the web application.
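To make the interface concrete, here is a minimal sketch of a WSGI application using only the standard library. The names `application` and `call_app` are illustrative assumptions, not part of any framework; `call_app` simply invokes the callable the way a WSGI server would, without opening a network socket.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI application: a callable taking environ and start_response."""
    body = f"Hello from {environ['PATH_INFO']}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # The return value is an iterable of byte strings.
    return [body]


def call_app(app, path="/"):
    """Hypothetical helper: call the app as a server would, with no network involved."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a valid minimal WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], dict(captured["headers"]), body


if __name__ == "__main__":
    print(call_app(application, "/hello"))
```

Because the server only needs this callable, the same code could be handed to any WSGI server. Assuming the file were saved as app.py, something like `gunicorn app:application` would serve it.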

The bad news is that WSGI comes with two major drawbacks:

  1. WSGI-compatible servers are synchronous
  2. WSGI-compatible servers only support the HTTP protocol

Let’s discuss these shortcomings in detail.

Synchronous server

By design, a WSGI server is synchronous: the thread handling a request is blocked until the application returns a response, and while it is blocked it cannot process any other request. That is why production WSGI implementations (which are multi-threaded) hand each request to its own thread in order to handle multiple requests at the same time and thus create some level of concurrency. The following figure illustrates this problem:

WSGI applications can certainly scale, since they can run in multithreaded and multiprocess environments. Yet creating and switching between threads is an expensive OS operation, and it makes little sense to create more processes than the number of cores available on a single server.

So what happens when we reach the thread limit? Nothing; we wait. Developers have eased this problem by scaling horizontally, which in simple terms means adding more application servers to the production environment. This is done by placing a load balancer or a reverse proxy, such as Nginx or ELB, in front of the Python application servers and distributing the load evenly between them.
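The waiting described above can be simulated with a fixed thread pool. This is only a rough sketch under stated assumptions: `handle_request` and `serve` are hypothetical names, and `time.sleep` stands in for a request that blocks on a database or another slow resource.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(latency):
    """Stand-in for a blocking request: the worker thread is busy until it returns."""
    time.sleep(latency)
    return "200 OK"


def serve(num_requests, num_workers, latency=0.1):
    """Run num_requests blocking requests on a fixed pool; return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        results = list(pool.map(handle_request, [latency] * num_requests))
    assert all(result == "200 OK" for result in results)
    return time.perf_counter() - start


if __name__ == "__main__":
    # With 8 requests, fewer workers means more requests queue up and wait.
    for workers in (2, 4, 8):
        print(workers, "workers:", round(serve(8, workers), 2), "seconds")
```

Once every worker is busy, the remaining requests sit in the queue, so the total time grows roughly with the number of requests divided by the number of workers.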

The time it takes for the application to process a single request and return a response can vary. It depends entirely on:

  1. The resources being used (databases, cache, etc.)
  2. The application logic
  3. The quality and efficiency of the code

Developers should always try to process requests as quickly and efficiently as possible and to divert long-running requests to asynchronous tasks.
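One way to picture diverting a long-running request is a WSGI application that only enqueues the work and answers immediately. The sketch below uses a standard-library queue and a background thread as a stand-in for a real task queue; the names `jobs`, `results`, and `worker` are assumptions made for illustration.

```python
import queue
import threading

jobs = queue.Queue()  # work waiting to be done outside the request cycle
results = []          # stand-in for wherever finished work would be stored


def worker():
    """Background thread: drains the queue so request threads do not have to wait."""
    while True:
        job = jobs.get()
        try:
            if job is None:  # sentinel value used to stop the worker
                return
            results.append(f"processed {job}")  # placeholder for the slow work
        finally:
            jobs.task_done()


def application(environ, start_response):
    """Enqueue the work and reply at once with 202 Accepted."""
    jobs.put(environ.get("PATH_INFO", "/"))
    body = b"queued\n"
    start_response("202 Accepted", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


if __name__ == "__main__":
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    statuses = []
    application({"PATH_INFO": "/report"}, lambda status, headers: statuses.append(status))
    jobs.join()     # wait for the background work (only needed for this demo)
    jobs.put(None)  # ask the worker to stop
    thread.join()
    print(statuses, results)
```

The request thread is freed as soon as the job is queued, but the work itself still consumes resources somewhere, which is why this eases the problem rather than removing it.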

There are many measures developers can take in order to increase the number of requests per second that Python web applications running under WSGI can handle. But true concurrency is not one of them. Eventually a single web server can only scale to a certain point before its computational resources are exhausted. The challenge of serving a very large number of simultaneous connections on a single server is often referred to as the C10K problem.

It is true that there are cases where large-scale web applications, serving extremely high volumes of traffic, are implemented as Python apps under WSGI.

But those web apps rely on several different kinds of optimizations, mainly horizontal scaling, without the use of concurrency, and therefore do not necessarily utilize their resources to their full extent.

A microservices architecture enables scaling out in a non-linear way: services that are identified as bottlenecks are scaled more than others.

HTTP is the only protocol

The second limitation of WSGI servers is that they only support the HTTP protocol. There is no denying that most of the web relies on HTTP for passing data between clients and servers, and for a good reason. HTTP is good enough for most web applications, be it a simple blog or an enterprise app.

But HTTP is a stateless protocol. This means that the server does not retain session information between requests coming from a single client. Each request/response cycle is independent. If there is any session information (such as user credentials), it is passed as part of each request in the form of cookies or request headers, and then matched against a cache or database.
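The cookie-and-lookup cycle can be sketched in a few lines. This is an illustration only: `SESSIONS` is a hypothetical in-memory dictionary standing in for a cache or database, and `login` and `current_user` are names invented for the example.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical in-memory session store; a real deployment would use a cache or database.
SESSIONS = {}


def login(username):
    """Create a session and return the Set-Cookie header value for the response."""
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {"user": username}
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True
    return cookie["session_id"].OutputString()


def current_user(environ):
    """Identify the user for one request, using only what that request carries."""
    cookie = SimpleCookie(environ.get("HTTP_COOKIE", ""))
    morsel = cookie.get("session_id")
    if morsel is None:
        return None
    session = SESSIONS.get(morsel.value)
    return session["user"] if session else None


if __name__ == "__main__":
    set_cookie = login("alice")
    # The browser would send back only the name=value pair on later requests.
    cookie_pair = set_cookie.split(";")[0]
    print(current_user({"HTTP_COOKIE": cookie_pair}))
    print(current_user({}))
```

Every request repeats the same lookup, because nothing about the previous request survives on the connection itself.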

Because HTTP is a request/response application protocol rather than a bi-directional transport, only the client can initiate communication with the server. Only the client is aware of the server’s existence and not the other way around.

So, any information that the server needs to push to its clients (for example, stock updates) has to be polled for by every client, creating a large overhead of requests. For web applications that rely on HTTP only and want data updated in real time, polling has to happen frequently.
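A back-of-the-envelope calculation shows how quickly polling adds up. The two functions below are hypothetical helpers, and the second assumes, for simplicity, that updates arrive at a steady rate.

```python
def polling_requests_per_second(num_clients, poll_interval_seconds):
    """Requests per second the server receives purely from polling."""
    return num_clients / poll_interval_seconds


def wasted_fraction(poll_interval_seconds, update_interval_seconds):
    """Share of polls that return nothing new, assuming evenly spaced updates."""
    polls_per_update = update_interval_seconds / poll_interval_seconds
    if polls_per_update <= 1:
        return 0.0
    return 1 - 1 / polls_per_update


if __name__ == "__main__":
    # Illustrative numbers only: 10,000 clients polling every 2 seconds,
    # while the data actually changes every 30 seconds.
    print(polling_requests_per_second(10_000, 2))
    print(round(wasted_fraction(2, 30), 3))
```

With these example numbers the server fields 5,000 requests per second, and 14 out of every 15 polls bring back nothing new.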

HTML5 introduced, among other things, web sockets, which create a bi-directional communication layer between servers and clients. This mechanism lets servers send data to their clients whenever it is created or updated, not only when requested. The usage of web sockets has become increasingly popular in web applications. Some developers have even gone as far as converting their entire communication layer to web sockets, creating a complete full-duplex system. Whether this is judicious and practical is a topic for another blog post.

It is important to state, especially for programmers who are experienced with socket programming, that web sockets are not the same as real sockets. They are similar in many ways and implement similar functionality. However, a web socket connection is initiated and negotiated over HTTP: it starts as an ordinary HTTP request and is then ‘upgraded’ to a web socket.
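The upgrade step can be illustrated with the server side of the opening handshake. In the WebSocket protocol (RFC 6455), the server proves it understood the request by hashing the client's `Sec-WebSocket-Key` together with a fixed GUID. The function names below are assumptions for this sketch, and it covers only the header exchange, not the framing that follows.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def upgrade_response_headers(request_headers):
    """Return headers for a 101 Switching Protocols reply, or None if not an upgrade."""
    if request_headers.get("Upgrade", "").lower() != "websocket":
        return None
    key = request_headers.get("Sec-WebSocket-Key")
    if not key:
        return None
    return [
        ("Upgrade", "websocket"),
        ("Connection", "Upgrade"),
        ("Sec-WebSocket-Accept", accept_key(key)),
    ]


if __name__ == "__main__":
    # Sample key taken from the example in RFC 6455.
    request = {"Upgrade": "websocket", "Sec-WebSocket-Key": "dGhlIHNhbXBsZSBub25jZQ=="}
    print(upgrade_response_headers(request))
```

After the server replies with status 101 and these headers, the same TCP connection stops carrying HTTP and both sides may send messages at any time.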

Also, web sockets do not expose a raw byte stream the way regular sockets do; data arrives as discrete messages, which makes web socket programming a lot easier. There are other things that web sockets can do (like authentication), but that's outside the scope of this blog post.

For Python developers who build web applications that rely on WSGI only, bi-directional communication between servers and clients becomes a challenge. Python frameworks that rely on WSGI do not implement web socket communication and must rely on third-party solutions and extra components and resources.

In the next post we will discuss what concurrency is, in the context of web applications, and how using a single-threaded event loop can be a smart way to develop high-performance web applications.