How we use gevent to go fast

Xun Liu | Pinterest engineer

Not too long ago we took on the challenge of modernizing a legacy Python system: converting a two-year-old, single-threaded codebase with hundreds of thousands of lines of code into one that serves many requests concurrently. To save ourselves from rewriting everything from scratch, we went with gevent and made the program greenlet-safe.

With the update, Pinners can spend less time waiting and more time collecting and discovering the things they love on the site.

Here’s a look at how it all went down.

Lessons from the early days

In the first few years, we took the simplest scaling approach: single-threaded web and API servers, with many processes per machine. That let us develop features quickly and keep up with fast-growing traffic, but as traffic and code size continued to grow, running many single-threaded processes started to show its limits:

  • As more and more features were added, the footprint of each server process kept growing.
  • As we added more backend servers to keep up with growth, the odds of one of them having issues or slowing down grew too. Slow requests could tie up a large number of processes in the pool, significantly shrinking the degree of concurrency and causing 500 errors or skyrocketing site latency.
  • As more logic was added to the code, we wanted to parallelize work (e.g. network IO) within a request to reduce latency, but we were stuck with single-threaded servers.

Building high performance servers

The solution called for parallelized servers capable of handling multiple requests at the same time, and gevent was the answer.

Gevent is a library based on non-blocking IO (libevent/libev) and lightweight greenlets (essentially Python coroutines). Non-blocking IO means requests waiting for network IO won’t block other requests; greenlets mean we can keep writing code in the synchronous style that’s natural to Python. Together they can efficiently support a large number of connections without incurring the usual overhead (e.g. call stacks) associated with threads.
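To make that concrete, here’s a minimal sketch (not Pinterest’s code; the hostnames are arbitrary) of the style gevent enables: each greenlet is ordinary synchronous Python, yet the lookups run concurrently because gevent’s socket module yields to the event loop while waiting on the network.

```python
import gevent
from gevent import socket

def resolve(host):
    # gevent's socket yields while waiting on the network, so the lookups
    # below overlap instead of running one after another.
    return host, socket.gethostbyname(host)

jobs = [gevent.spawn(resolve, h) for h in ('python.org', 'pypi.org', 'gevent.org')]
gevent.joinall(jobs, timeout=5)
print([job.value for job in jobs])
```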

It’s easier to make code greenlet-safe than thread-safe because of the cooperative scheduling of greenlets. Unlike threads, greenlets are non-preemptive; unless the running greenlet voluntarily yields, no other greenlets can run. The rule to keep in mind: a critical section that doesn’t yield needs no synchronization, but one that does yield must be synchronized.
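A toy illustration (not from our codebase): the first counter update is safe without a lock because it never yields, while the variant that yields mid-update can lose writes and needs synchronization.

```python
import gevent

counter = 0

def safe_increment():
    global counter
    # No yielding call inside the read-modify-write, so no other greenlet
    # can run in the middle: this critical section needs no lock.
    counter += 1

def unsafe_increment():
    global counter
    value = counter
    gevent.sleep(0)          # yields: another greenlet may update counter here
    counter = value + 1      # ...and this write may clobber that update

gevent.joinall([gevent.spawn(unsafe_increment) for _ in range(100)])
print(counter)  # often far less than 100 because of lost updates
```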

Go time: running the code

Here’s the approach we took:

1. Make blocking operations yield

Greenlets can’t be preempted, so unless the running greenlet yields, no other greenlet can execute. Gevent comes with a monkey-patching utility that patches the common standard-library modules (e.g. socket, threading, time.sleep) to yield before they block. But not all libraries can be monkey-patched. For example, if a library binds to an external C library (e.g. MySQL-python, pylibmc, zc.zk) that does blocking operations, it can’t be monkey-patched. For these cases we needed to replace them with pure-Python implementations (e.g. pymysql, python-memcached, kazoo).
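A sketch of the pattern (this is not our actual bootstrap code): patch the standard library at startup, and swap C-extension drivers for pure-Python ones so their IO goes through the patched socket module.

```python
# Must run before anything else imports socket, ssl, threading, etc.
from gevent import monkey
monkey.patch_all()

# MySQL-python blocks inside C and can't be patched; pymysql is a pure-Python
# driver that can stand in for it.
import pymysql
pymysql.install_as_MySQLdb()   # code that does `import MySQLdb` now gets pymysql
```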

2. Make code greenlet-safe

Because greenlets are non-preemptive, there’s usually no need to synchronize critical sections as long as they don’t yield. If a critical section yields (i.e. if we need data to stay consistent across a yielding operation), we need to make it greenlet-safe, either by synchronizing it or by changing the implementation so it no longer yields.
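For example, a yielding critical section can be guarded with one of gevent’s own locks. A minimal sketch, where fetch_from_backend is a hypothetical helper that does network IO:

```python
from gevent.lock import BoundedSemaphore

_cache = {}
_cache_lock = BoundedSemaphore()

def refresh(key):
    # fetch_from_backend() yields while waiting on the network; without the
    # lock, two greenlets could interleave here and leave the cache stale.
    with _cache_lock:
        _cache[key] = fetch_from_backend(key)  # hypothetical network call
```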

The yielding operations themselves also need to be made greenlet-safe. The most common examples are classes that make network connections (e.g. thrift, memcache, redis, s3, http, etc.). If two greenlets access the same socket at the same time, the conflicting reads and writes cause undefined behavior. The solution is to use connection pooling, create per-request connections, or synchronize access so they are greenlet-safe.
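A simplified connection-pool sketch (make_connection is a placeholder factory; a real pool also needs health checks and timeouts): each connection is checked out by exactly one greenlet at a time, so two greenlets never share a socket.

```python
from gevent.queue import Queue

class ConnectionPool(object):
    def __init__(self, make_connection, size=10):
        self._pool = Queue()
        for _ in range(size):
            self._pool.put(make_connection())

    def get(self):
        # Blocks (yields) until a connection is free, so each connection is
        # used by at most one greenlet at a time.
        return self._pool.get()

    def put(self, conn):
        self._pool.put(conn)
```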

3. Testing, testing, testing

A project like this won’t be successful without comprehensive tests. By leveraging the well-defined interface of our API server, we wrote concurrent tests that not only helped identify some subtle issues early, but also helped us fine-tune concurrency settings in production.
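The tests themselves can be plain gevent code. A sketch of the idea (api_client and the endpoint are hypothetical): fire many requests at once and assert they all complete within a deadline.

```python
import gevent

def test_concurrent_reads():
    # 50 identical requests in flight at once; with gevent on the server side
    # they should overlap and all finish well within the timeout.
    jobs = [gevent.spawn(api_client.get, '/v1/pins/123') for _ in range(50)]
    gevent.joinall(jobs, timeout=10)
    assert all(job.value.status_code == 200 for job in jobs)
```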

4. Deal with unfair scheduling

While the cooperative scheduling of greenlets makes it easy to deal with critical sections, it can introduce other problems, such as unfair scheduling. If a greenlet is doing pure CPU work and doesn’t yield, other greenlets have to wait. The solution is to explicitly yield (by calling gevent.sleep(0)) during heavy processing. Running more processes also helps alleviate the problem, since each process then handles fewer concurrent requests, and scheduling across processes is fair because it’s done by the OS.
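A sketch of the explicit-yield pattern (score_pin is a hypothetical CPU-bound function): yielding every so often keeps one heavy request from starving the rest.

```python
import gevent

def score_pins(pins):
    results = []
    for i, pin in enumerate(pins):
        results.append(score_pin(pin))   # hypothetical pure-CPU work
        if i % 100 == 0:
            gevent.sleep(0)              # yield so waiting greenlets can run
    return results
```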

A faster Pinterest

As of early 2013, all our Python servers (including web servers, API servers, and thrift servers that power some important services, e.g. the follow service) are running on gevent. With gevent, we were able to significantly reduce the number of processes on each machine while getting lower latency, higher throughput, and much better resilience to spiky traffic and network problems.

Here’s to continuing to make Pinterest more efficient!

To connect with the Pinterest Engineering team, like our Facebook Page!

Xun Liu is a software engineer at Pinterest