Redis caching — speed up your API
A few months ago, when eMAG started the eMAG TechTalks presentations, I was given the opportunity to show the way we’re using Redis as a caching system to speed up our API response.
I have to admit, Redis is a trending piece of software. Besides having an above average cool factor, it is fast. Real fast.
It is open source (BSD licensed) and can be used for data storage, for caching purposes or as a message broker. It also supports multiple data types such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs and geospatial indexes. Although not quite out of the box, Redis provides high availability via Redis Sentinel and data partitioning with Redis Cluster.
If you’re wondering (and you should) what’s so special about yet another in-memory storage solution, you’ll find the persistence layer provided by Redis quite handy. So, yeah, when you pull the plug on a Redis machine, you’ll be amazed to see it recover its past memories when you turn it back on.
We chose Redis because it:
- supports many data types
- has a great queuing mechanism
- has a built-in Pub/Sub (publisher / subscriber) system
For more details, check the official documentation here: http://redis.io/documentation.
We have issues
Who doesn’t? But our kind was somehow foreseen by Phil Karlton (Netscape), when he stated that “There are only two hard things in Computer Science: cache invalidation and naming things”. I’ll leave the naming things issue aside and talk more about cache invalidation, because that’s our main problem.
Everybody’s using APIs these days. The need for fast, reliable and scalable APIs is growing exponentially at eMAG, in perfect sync with the growing load, traffic and data exchange we’re coping with these days. This article is about one of those APIs, the one dealing with product documentation. You know, the product name, description, features and (sometimes missing) images.
The cache works as follows: on a new API request, the response is cached (as a string), so the next request for the same thing is served directly from memory, without querying the SQL database. This is a big API optimization, leading to responses up to 100 times faster.
Our API always has to return fresh data, because if something's stale in the response, the client (yeah, you!) ordering a black t-shirt could end up with a pink sweater. Has the color changed in the mighty SQL database? No problem, refresh the cache! Unfortunately, color is not the only feature stored in our JSON API responses; other objects are also included in there. So, if we change one object in the DB, how could we possibly know which cached response to refresh? We can't get all the responses from memory, decode each one and search for a specific object to update. That might be the cruelest thing we could subject a machine to.
So, we came up with the idea of parsing every response before saving it in cache, and maintaining a relationship between each response and the objects that it contains. This is the moment when one of Redis’ main features comes to the rescue: all sorts of data types. We used the string data type for storing JSON responses, one set for each response and one sorted set for each object in the response.
Basically, each request will map to a cached response like this:
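Sketched out (angle brackets mark placeholders; the exact key names are illustrative):

```
<request_hash>            ->  latest version number for this request
<request_hash>:<version>  ->  cached JSON response (string, with a TTL)
```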
Given that we like to keep our systems DRY, our goal is to build a caching mechanism suitable for all kinds of APIs. That's why the caching system must be independent, without being aware of the API logic. The API must provide the caching system with the JSON key of each object that can appear in a response (these will hereinafter be referred to as caching keys).
Let’s say we have the following API HTTP request parameters (POST):
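An illustrative payload (the field names are made up for this walkthrough):

```json
{
  "template_id": 1
}
```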
And the following response:
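Again illustrative, with one template and four characteristics (the field names and values are invented for the example):

```json
{
  "template_id": 1,
  "name": "Black T-Shirt",
  "characteristics": [
    { "characteristic_id": 1, "name": "Brand",    "value": "Acme" },
    { "characteristic_id": 2, "name": "Size",     "value": "M" },
    { "characteristic_id": 3, "name": "Material", "value": "Cotton" },
    { "characteristic_id": 4, "name": "Color",    "value": "Black" }
  ]
}
```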
As you can see, our response contains 5 objects: 1 x template_id and 4 x characteristic_id
One other issue we're facing at eMAG is concurrency handling, which is why we must version the requests. If two requests are concurrent, each will create a new version of the response, atomically, via the Redis INCR command.
First, create a request hash:
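One way to compute such a hash, sketched in Python (canonical JSON plus MD5 is an assumption; any stable digest over the normalized request works):

```python
import hashlib
import json

def request_hash(params: dict) -> str:
    """Hash the normalized request parameters into a stable cache key."""
    # Sort the keys so that logically identical requests hash identically.
    normalized = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

print(request_hash({"template_id": 1}))  # a 32-character hex digest
```

The digest plays the role of c0fee2aab4914a5d1d0d31115addc9b7 in the examples below.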
Now, we’ll increment the request hash version in Redis:
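In redis-cli terms (INCR on a missing key creates it with value 1):

```
redis> INCR c0fee2aab4914a5d1d0d31115addc9b7
(integer) 1
```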
Our first response UID is c0fee2aab4914a5d1d0d31115addc9b7:1. The number after the colon represents the latest version number for this request. You need to know that the INCR command is atomic, so two concurrent requests will generate different versions.
We want this response UID to expire after 1h, so we'll set the time to live (TTL) to 3600 seconds:
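Along these lines:

```
redis> SETEX c0fee2aab4914a5d1d0d31115addc9b7:1 3600 'JSON_RESPONSE'
OK
```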
The SETEX command is also atomic, and is the equivalent of a SET, followed by an EXPIRE command. In this case, the JSON_RESPONSE value is shown in the above HTTP response example.
Nothing fancy so far, but we'll get our hands dirty in a bit. As I've said, we need to link the response with the included objects. You know, just like in an RDBMS, we'll create many-to-many relationships (one response UID contains many objects, and each object appears in many response UIDs) between the response and the cached objects. For the link between an object and its response UIDs we'll use a Redis sorted set (ZADD command), because this data type handles loads of updates very well (each time the response version increases, we'll also update the object link). The scoring feature of the sorted set is not used, which is why every member gets a score of 1:
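With the illustrative key names from our example (one sorted set per object):

```
redis> ZADD template_id:1 1 c0fee2aab4914a5d1d0d31115addc9b7:1
(integer) 1
redis> ZADD characteristic_id:1 1 c0fee2aab4914a5d1d0d31115addc9b7:1
(integer) 1
(repeat for characteristic_id:2, 3 and 4)
```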
We’ll also build a reverse link between the response UID and the objects, you’ll see how useful it is as soon as we invalidate the response. For this relationship we’ll use regular sets, because they’re faster and don’t require updating. The only operations performed on these sets are: create, get members and delete.
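Something like this, with the set key name (the ":objects" suffix) being an illustrative choice:

```
redis> SADD c0fee2aab4914a5d1d0d31115addc9b7:1:objects template_id:1 characteristic_id:1 characteristic_id:2 characteristic_id:3 characteristic_id:4
(integer) 5
```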
This is how we keep the many-to-many relationship between objects and responses:
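A simplified sketch with generic names:

```
# sorted sets: object -> response UIDs that contain it
obj_1 -> { resp_uid_1 }
obj_2 -> { resp_uid_1, resp_uid_2 }
obj_3 -> { resp_uid_2 }

# sets: response UID -> objects it contains
resp_uid_1 -> { obj_1, obj_2 }
resp_uid_2 -> { obj_2, obj_3 }
```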
* note that obj_2 is linked to both response 1 & 2
Recap time! Let’s see which are the keys stored within Redis:
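For our example (key names illustrative, as above):

```
c0fee2aab4914a5d1d0d31115addc9b7             (string)        latest version, "1"
c0fee2aab4914a5d1d0d31115addc9b7:1           (string + TTL)  the cached JSON response
template_id:1                                (sorted set)    response UIDs containing this object
characteristic_id:1 ... characteristic_id:4  (sorted sets)   response UIDs containing these objects
c0fee2aab4914a5d1d0d31115addc9b7:1:objects   (set)           objects contained in this response
```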
Where’s my response?
To get it from the cache, we must follow these simple steps:
- Compute the request hash (in our example c0fee2aab4914a5d1d0d31115addc9b7)
- Get current response version from Redis (get c0fee2aab4914a5d1d0d31115addc9b7)
- Compute response hash (c0fee2aab4914a5d1d0d31115addc9b7:1)
- Get the cached JSON response from cache (get c0fee2aab4914a5d1d0d31115addc9b7:1)
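In redis-cli terms, the two GET steps look like this:

```
redis> GET c0fee2aab4914a5d1d0d31115addc9b7
"1"
redis> GET c0fee2aab4914a5d1d0d31115addc9b7:1
"JSON_RESPONSE"
```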
Cache invalidation: the right way
Two major events will trigger response invalidation: object changed in the storage (SQL, whatever) and response TTL has expired. We’ll handle them separately up to a point, then we’ll use a common method to perform the actual cache removal.
Manual invalidation a.k.a. changed object
Let’s say characteristic_id with id 4 has changed in storage. We’ll get all response UIDs linked with this object and perform a DEL on each one.
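With our illustrative keys, that means reading the object's sorted set and deleting each member:

```
redis> ZRANGE characteristic_id:4 0 -1
1) "c0fee2aab4914a5d1d0d31115addc9b7:1"
redis> DEL c0fee2aab4914a5d1d0d31115addc9b7:1
(integer) 1
```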
We have removed all response UIDs linked with characteristic_id 4 (in our case, just one). All well and good, but wait! What happens to all the other objects linked with this response UID? Poor characteristic_id 1, 2 and 3 (not to mention template_id 1) now have orphan links to a non-existing response UID! With all the orphan links left behind, the Redis memory is doomed to explode! This is where the inverse relationship comes under the spotlight: we need to pick each object from the response UID we just deleted and also remove their response link.
Since we’re dealing with millions of records at eMAG, this operation is very intensive and must be properly handled with a queuing / distributed system. We’ll use the flexible Redis queue-like implementation of lists, which supplies the consumer with removal requests. Right after deleting the response UID, we run:
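Like so:

```
redis> RPUSH queue_invalid_responses c0fee2aab4914a5d1d0d31115addc9b7:1
(integer) 1
```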
queue_invalid_responses is a Redis list and we rpush (right push) elements to the end of the list. The consumer needs to get the earliest element on this list and do the caching clean-up for the related objects. The consumer will get the first removal request with:
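BLPOP (the blocking variant of LPOP; the 0 timeout means "wait forever") returns the list name and the popped element:

```
redis> BLPOP queue_invalid_responses 0
1) "queue_invalid_responses"
2) "c0fee2aab4914a5d1d0d31115addc9b7:1"
```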
As you can see, line 2 contains the exact response UID that we deleted from the cache earlier. Now it’s time to get all the related objects:
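Using the reverse-link set we built earlier:

```
redis> SMEMBERS c0fee2aab4914a5d1d0d31115addc9b7:1:objects
1) "template_id:1"
2) "characteristic_id:1"
3) "characteristic_id:2"
4) "characteristic_id:3"
5) "characteristic_id:4"
```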
And delete their relation with c0fee2aab4914a5d1d0d31115addc9b7:1:
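ZREM removes the response UID from each object's sorted set, and finally the reverse-link set itself can go:

```
redis> ZREM template_id:1 c0fee2aab4914a5d1d0d31115addc9b7:1
(integer) 1
redis> ZREM characteristic_id:1 c0fee2aab4914a5d1d0d31115addc9b7:1
(integer) 1
(repeat for characteristic_id:2, 3 and 4)
redis> DEL c0fee2aab4914a5d1d0d31115addc9b7:1:objects
(integer) 1
```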
We did a good job with the manual invalidation, but what happens with the expired response UIDs, the ones set with the SETEX command? Redis is not just a cache, and it has a neat feature we can use here: the PUB / SUB pattern, along with its channels, subscribers and publishers. Channels are broadcasters, and all subscribers connected to a channel will receive all its messages. Basically, when the TTL expires for a key, Redis PUBlishes a message on a key-related channel:
- __keyspace@*__:<key> which broadcasts the event affecting a key
- __keyevent@*__:<event> which broadcasts the key affected by an event
If you want to know whenever a specific event is triggered on any key, subscribe to the __keyevent channel; if you want all the events affecting a given key, subscribe to the __keyspace channel. Check http://redis.io/topics/notifications for more information about setting up Redis notifications.
In order to receive a message when a key expires, we’re going to subscribe to the __keyevent channel:
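Keyspace notifications are off by default, so they must be enabled first (the "Ex" flags mean keyevent notifications for expired keys; database 0 is assumed here):

```
redis> CONFIG SET notify-keyspace-events Ex
OK
redis> SUBSCRIBE __keyevent@0__:expired
1) "subscribe"
2) "__keyevent@0__:expired"
3) (integer) 1
1) "message"
2) "__keyevent@0__:expired"
3) "c0fee2aab4914a5d1d0d31115addc9b7:1"
```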
The message payload contains our beloved, recently expired response UID. We can use the same list queue and consumer to perform the cache clean-up, the same way we did with the manual invalidation:
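The subscriber simply re-queues the expired UID:

```
redis> RPUSH queue_invalid_responses c0fee2aab4914a5d1d0d31115addc9b7:1
(integer) 1
```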
Although Redis is best known for its excellent caching application, it holds more power and speed than you can imagine. If you’re not quite familiar with Redis, building a caching system is a good way to get comfortable with it. That’s why I hope this article gave you another reason to use it and will prove helpful in your Redis endeavor.