10,000 times faster than MySQL?

Keep it simple

Mark Cooper
Brigade Engineering
Nov 5, 2015


At Brigade we are building easy and effective ways for people to declare their beliefs, organize with others, and take action to shape the policies affecting their lives. Our initial tool allows users to express whether they agree or disagree with opinion statements about a range of topics.

Since we released earlier this summer, users have taken millions of such positions on everything from immigration and Internet policy to the environment and the economy.

Our architecture started off like that of many new companies: a basic Rails app backed by a MySQL database. Keep it simple and don’t optimize prematurely: great mantras to live by.

One of our core early offerings is helping users find alignment with other users on the basis of the opinions they’ve shared. We do this to determine whether you and your friends, family, and neighbors agree on, say, campaign reform. We compare your answers to a series of opinion statements (which we call positions) against those of a set of other users. Originally we did this via a somewhat complex database join. A simplified version of that join looks like:
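The query below is only a sketch against a hypothetical positions table with user_id, opinion_id, and stance columns, not our exact schema:

```sql
-- Hypothetical schema: positions(user_id, opinion_id, stance)
-- Count how many positions the current user shares with each of their friends.
SELECT other.user_id AS other_user_id,
       COUNT(*)      AS shared_positions
FROM positions AS mine
JOIN positions AS other
  ON other.opinion_id = mine.opinion_id
 AND other.stance     = mine.stance
 AND other.user_id   <> mine.user_id
WHERE mine.user_id = :user_id
  AND other.user_id IN (:friend_ids)
GROUP BY other.user_id;
```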

This query worked fine when we started out, but as we grew, it didn’t scale. While the median latency for this query was reasonable, the 99th percentile could take 10 seconds or more. The Brigade Engineering team strives to build fast, responsive apps. This was clearly too slow.

Instead of doing this in the database, we decided to move the computation into a separate service that stores a continuously updated snapshot of the user data in RAM. For the actual computation we decided to keep the data in memory as bitsets. If we were to use Scala BitSets we might represent our data as follows:
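The snippet below is a made-up example, assuming each opinion is assigned a small integer ID and agree/disagree positions are kept in separate sets:

```scala
import scala.collection.immutable.BitSet

// Hypothetical example: each opinion has an integer ID, and each user has one
// BitSet of opinions they agree with and one of opinions they disagree with.
case class UserPositions(agrees: BitSet, disagrees: BitSet)

val alice = UserPositions(agrees = BitSet(1, 5, 9), disagrees = BitSet(2, 7))
val bob   = UserPositions(agrees = BitSet(1, 7, 9), disagrees = BitSet(2, 5))

// Positions in common = opinions both agree on plus opinions both disagree on.
val shared = (alice.agrees & bob.agrees).size + (alice.disagrees & bob.disagrees).size
// shared == 3 (both agree on opinions 1 and 9; both disagree on opinion 2)
```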

While this gives us a fast way to find out how many positions a pair of users have in common, BitSets have one major problem — they are very inefficient for our use because of the number of distinct opinions. As the Scala docs explain:

Bitsets are sets of non-negative integers which are represented as variable-size arrays of bits packed into 64-bit words. The memory footprint of a bitset is determined by the largest number stored in it.

Fortunately clever people have already solved this problem through the creation of compressed bitsets. We came across this fantastic implementation from Daniel Lemire called a Roaring Bitmap:

Roaring bitmaps are compressed bitmaps which tend to outperform conventional compressed bitmaps such as WAH, EWAH or Concise. In some instances, roaring bitmaps can be hundreds of times faster and they often offer significantly better compression. They can even be faster than uncompressed bitmaps.

After loading our production dataset into these structures, we were pleasantly surprised to find that storing data in a Roaring Bitmap used only around 2.5 bytes of memory per entry. Nice! Storing a few million records only requires around 10 megabytes of RAM.
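As a rough way to check that kind of footprint yourself, you can compare a bitmap’s estimated size to its cardinality. The IDs below are made up, and the exact ratio depends on how your IDs are distributed:

```scala
import org.roaringbitmap.RoaringBitmap

// Build a bitmap of hypothetical opinion IDs and estimate bytes per entry.
val bitmap = new RoaringBitmap()
(0 until 10000).foreach(i => bitmap.add(i * 7)) // sparse-ish IDs

val bytesPerEntry = bitmap.getSizeInBytes.toDouble / bitmap.getCardinality
println(f"$bytesPerEntry%.2f bytes per entry")
```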

The core operation

Here is a simplified version of our code that computes overlaps for a set of users:
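The version below is a minimal reconstruction on top of RoaringBitmap; the case class and map layout are assumptions rather than our production code:

```scala
import org.roaringbitmap.RoaringBitmap

// Hypothetical layout: one bitmap of agreed-opinion IDs and one of
// disagreed-opinion IDs per user, held in an in-memory map keyed by user ID.
case class UserBitmaps(agrees: RoaringBitmap, disagrees: RoaringBitmap)

def computeOverlaps(me: UserBitmaps,
                    others: Map[Long, UserBitmaps]): Map[Long, Int] =
  others.map { case (userId, other) =>
    // andCardinality counts the intersection without building a new bitmap.
    val shared =
      RoaringBitmap.andCardinality(me.agrees, other.agrees) +
      RoaringBitmap.andCardinality(me.disagrees, other.disagrees)
    userId -> shared
  }
```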

Performance Gains

But how fast are they? We found that with a single thread on a reasonably fast machine we can overlap around 1M bitsets / second each containing a few thousand items. This brought our 99th-percentile latency down to around 2ms!
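If you want to reproduce that kind of measurement on your own hardware, a rough single-threaded timing loop like the one below gets you in the ballpark; numbers will vary, and a proper harness such as JMH is better for anything rigorous:

```scala
import org.roaringbitmap.RoaringBitmap
import scala.util.Random

// Rough timing sketch over synthetic data, not a claim about Brigade's numbers.
val rng = new Random(42)
val bitmaps = Array.fill(10000) {
  RoaringBitmap.bitmapOf(Array.fill(2000)(rng.nextInt(1000000)): _*)
}
val mine = bitmaps.head

val start = System.nanoTime()
var checksum = 0L
bitmaps.foreach(b => checksum += RoaringBitmap.andCardinality(mine, b))
val elapsedSec = (System.nanoTime() - start) / 1e9
println(f"${bitmaps.length / elapsedSec}%.0f overlaps/sec (checksum $checksum)")
```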

To see the performance improvement, take a look at the graph below, which shows the latency for this operation before and after rolling out the new service:

Higher Up the Stack

We were already using Spark, Scala, and Kafka at Brigade for analytics work, and we’ve found these three technologies help us innovate quickly. When deciding how to roll our own service, we chose to leverage our Scala expertise by using Twitter’s amazing Finagle framework, and we exposed the service’s API via Thrift. Thrift’s code generator gives us a set of language-independent client libraries that are easy to use. As our service is written in Scala, we use the Scrooge compiler to produce the server-side bindings. The client for this service is currently a Rails application, so we use the standard Thrift compiler to generate a Ruby library. A simplified version of the Thrift definition for our service looks like:
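The sketch below is illustrative; the struct, service, and method names are assumptions rather than Brigade’s actual interface:

```thrift
// Hypothetical service definition: compute how many positions the given
// user shares with each user in otherUserIds.
struct OverlapResult {
  1: i64 otherUserId
  2: i32 sharedPositions
}

service PositionOverlapService {
  list<OverlapResult> computeOverlaps(1: i64 userId, 2: list<i64> otherUserIds)
}
```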

To push data into our new service we used Kafka as the backbone. Kafka is a high-throughput distributed messaging queue. All updates to user records flow through Kafka and are read by a consumer in our service. Kafka is fault-tolerant, and most messages arrive in our service within a second of being produced.
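For illustration, a consumer loop in this spirit might look something like the sketch below; the topic name, message format, and in-memory map are assumptions, not Brigade’s actual pipeline:

```scala
import java.time.Duration
import java.util.{Collections, Properties}
import scala.jdk.CollectionConverters._
import scala.collection.concurrent.TrieMap
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.roaringbitmap.RoaringBitmap

// In-memory snapshot: user ID -> bitmap of opinion IDs the user agrees with.
// (A real service would track disagreements too; this is an illustrative sketch.)
val agreesByUser = TrieMap.empty[Long, RoaringBitmap]

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
props.put("group.id", "position-overlap-service")
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

val consumer = new KafkaConsumer[String, String](props)
consumer.subscribe(Collections.singletonList("position-updates"))

// Each message value is assumed to be "userId:opinionId" for an "agree" event.
while (true) {
  consumer.poll(Duration.ofMillis(500)).asScala.foreach { record =>
    val Array(userId, opinionId) = record.value().split(":")
    agreesByUser
      .getOrElseUpdate(userId.toLong, new RoaringBitmap())
      .add(opinionId.toInt)
  }
}
```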

And that’s it!

In summary:

  • Start with simple solutions and don’t optimize prematurely.
  • Leverage the rich ecosystem of available tools and technologies; because we used so many great open-source libraries, our final service is only around a thousand lines of code.
  • Don’t be afraid to stray from conventional databases.
