How we moved our Food Feed from Redis to Cassandra
The Food Feed is an integral part of our social graph, as it keeps our users engaged and up-to-date with their friends’ restaurant reviews and pictures, and even offers and discounts from bookmarked restaurants. We chose Redis to power our Feed, because given the use case and support, it was our best option at the time. We’d have continued to use Redis if it could also support distributed instances for high availability and load distribution; we even tried creating our own wrappers around Redis, but we couldn’t be 100% sure that the system was fail-proof. We could have avoided outages by incurring some data loss, but that was something we simply didn’t want to do. To ensure that our system was highly available, we had to move on from Redis, and we chose Cassandra as the replacement.
Cassandra seemed perfect for this use case: it offers fault tolerance and replication out of the box, and it has proven to be a good fit for time series data, which is essentially what our Feed is. We did have a little experience working with Cassandra prior to making this big change, but certainly not enough for something as important as our Feed. We had to figure out how to make the transition to Cassandra, keep the Feed running as efficiently as it had on Redis, and do it all with zero downtime.
We started spending time on Cassandra, diving deep into its configuration for the first two weeks and tuning it to suit our requirements (we also bugged people on IRC, and they were really helpful!). Next, we decided on a couple of things before finalising the schema for our Feed:
- Serving the Feed would need to be a single read. With Redis, we could read hundreds of keys per request without worrying about latency, but with Cassandra, connection latency can be a significant part of the app server's response time.
- The schema would need to be flexible enough to allow for new types of items in the Feed in the future. Given that we are constantly iterating and working on enriching the product experience, adding new elements and features to the Feed is almost inevitable (a rough schema sketch follows this list).
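To make those two decisions concrete, here is a rough sketch in Python (using the DataStax cassandra-driver) of the kind of schema this leads to: one wide partition per user, clustered newest-first so a page of the Feed is a single read, and a free-form payload column so new item types don't need schema changes. The keyspace, table, column and datacenter names are illustrative assumptions, not our actual schema.

```python
from cassandra.cluster import Cluster

# Illustrative only: names and replication settings are assumptions.
cluster = Cluster(["127.0.0.1"])
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS feed
    WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3, 'dc2': 3}
""")

# One wide partition per user; newest items first, so one query serves a page of the Feed.
session.execute("""
    CREATE TABLE IF NOT EXISTS feed.user_feed (
        user_id    bigint,
        created_at timeuuid,
        item_type  text,
        payload    text,      -- free-form JSON, so new item types need no ALTER
        PRIMARY KEY (user_id, created_at)
    ) WITH CLUSTERING ORDER BY (created_at DESC)
""")
```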
We iterated on our schema structure and the various use cases for a couple of days, and then got to work with two datacenters of three nodes each. We had to move about 60 million keys from Redis, a number that would at least double once the data was stored in structured form on Cassandra. We rolled out two versions of our code, both writing to Cassandra and Redis, with one version reading from Cassandra and the other from Redis (sketched below).
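The dual-write rollout looked roughly like this sketch (the hostnames, Redis key format and the READ_FROM_CASSANDRA flag are made up for illustration): every write lands in both stores, and the deployed version decides which store serves reads.

```python
import json

import redis
from cassandra.cluster import Cluster

READ_FROM_CASSANDRA = False  # one version shipped with True, the other with False

r = redis.Redis(host="redis.internal")
session = Cluster(["cassandra.internal"]).connect("feed")

def write_feed_item(user_id, item):
    # Write path: keep both stores in sync for the duration of the migration.
    r.lpush("feed:%d" % user_id, json.dumps(item))
    session.execute(
        "INSERT INTO user_feed (user_id, created_at, item_type, payload) "
        "VALUES (%s, now(), %s, %s)",
        (user_id, item["type"], json.dumps(item)),
    )

def read_feed(user_id, limit=30):
    # Read path: a single query per request, as decided while designing the schema.
    if READ_FROM_CASSANDRA:
        rows = session.execute(
            "SELECT payload FROM user_feed WHERE user_id = %s LIMIT %s",
            (user_id, limit),
        )
        return [json.loads(row.payload) for row in rows]
    return [json.loads(raw) for raw in r.lrange("feed:%d" % user_id, 0, limit - 1)]
```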
We monitored our system for latency and other issues, and surprisingly, we hit a bottleneck in write throughput. Cassandra's strength is supposed to be its write path, yet we just couldn't reach the write throughput we'd read about across various blog posts and articles. We knew something was wrong, but we didn't know what. We tried batching our writes and running raw queries, but still had no luck. The best we could get out of the three nodes combined was 1,500 writes per second. We had to roll back within a couple of hours and reassess.
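For reference, a throughput check along these lines is how you can measure what a cluster actually sustains; the table is the sketch from above, and the row count and payload are made up. It uses the driver's execute_concurrent_with_args helper to keep many writes in flight at once.

```python
import time

from cassandra.cluster import Cluster
from cassandra.concurrent import execute_concurrent_with_args

# Hypothetical throughput check against the sketched schema.
session = Cluster(["cassandra.internal"]).connect("feed")
insert = session.prepare(
    "INSERT INTO user_feed (user_id, created_at, item_type, payload) "
    "VALUES (?, now(), ?, ?)"
)

params = [(i % 100000, "review", '{"dummy": true}') for i in range(50000)]

start = time.time()
execute_concurrent_with_args(session, insert, params, concurrency=200)
elapsed = time.time() - start
print("writes/sec: %.0f" % (len(params) / elapsed))
```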
After digging for a couple of days, we realised that the problem wasn't with Cassandra, but with Elastic Block Store (EBS). EBS is a network-attached drive mounted on each EC2 instance, and it shares up to 10 Gigabit of bandwidth with the instance's regular network traffic. That shared bandwidth became a bottleneck for us. To get around it, we moved our data from network-based EBS volumes to the local instance storage on the same EC2 instances. We then deployed the new Food Feed, powered by Cassandra, one server at a time so we could control throughput. Voilà: it worked, and everything was ready to go.
We then started migrating data from our production Redis server (it took us 14 hours to move everything), and we did it without a glitch or any additional load. That's how awesomely powerful both Redis and Cassandra are. Today, our Food Feed runs entirely on Cassandra, and we made the switch with zero downtime.
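Conceptually, that kind of migration looks something like the sketch below (again with made-up hostnames and key format): walk the Redis keyspace with SCAN so the production server stays responsive, and bulk-insert each user's feed into Cassandra.

```python
import json

import redis
from cassandra.cluster import Cluster
from cassandra.concurrent import execute_concurrent_with_args

# Hypothetical one-off migration sketch; key format and hostnames are assumptions.
r = redis.Redis(host="redis.internal")
session = Cluster(["cassandra.internal"]).connect("feed")
insert = session.prepare(
    "INSERT INTO user_feed (user_id, created_at, item_type, payload) "
    "VALUES (?, now(), ?, ?)"
)

# SCAN is non-blocking, unlike KEYS, so production reads keep working.
for key in r.scan_iter(match="feed:*", count=1000):
    user_id = int(key.decode().split(":")[1])
    items = [json.loads(raw) for raw in r.lrange(key, 0, -1)]
    params = [
        # now() is a simplification; a real migration would carry over the
        # item's original timestamp instead.
        (user_id, item.get("type", "unknown"), json.dumps(item))
        for item in items
    ]
    if params:
        execute_concurrent_with_args(session, insert, params, concurrency=100)
```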
To sum up, here are a few things we learned while doing this:
- Avoid reads during heavy writes: read throughput stays roughly the same as you add nodes, while write throughput scales with the number of nodes.
- Avoid deletes. Deletes mean compaction, and while compaction runs, the node is effectively useless.
- Latency is an issue. Cassandra has high connection latency compared to Redis, roughly 10x-15x.
- Network-based EBS volumes are a bottleneck for disk IO throughput.
- Watch the JVM's garbage collection using `jstat -gc <pid> 250ms 0` (sample garbage collection every 250 ms for the Cassandra pid) when you expect more than 5,000 operations/sec.
- Also, keep a close watch on disk IO using `iostat`.
- Increasing the number of nodes in a cluster definitely makes it more stable.
- The DataStax documentation covers many best practices.