Azure Cosmos Integrated Cache and how it benefits ASOS Promotions
Our Promotions team at ASOS deals with hundreds of requests per second, so we need to ensure our services respond as quickly as possible for our customers (no one likes to be stuck when checking out an order).
In the Promotions team, we have found the following:
- Our data changes infrequently
- Our services consistently read from this persisted data
- This results in high database/storage costs
So, how do we provide lightning-fast speeds for our customers while still keeping our costs down? Our solution is the Azure Cosmos DB integrated cache.
The Azure Cosmos DB integrated cache is a recently released feature that can lower costs, optimise reads and reduce latency, all whilst handling the cache management for you. Before we dive deep into the technical bits, let's take a step back and go over the basics.
Okay, let’s set the scene
Azure Cosmos DB is a fully managed NoSQL database designed for modern apps. It provides high availability and throughput with impressive SLAs, transparent replication and a range of consistency options. For us here in the Promotions platform, it is our preferred product to use for storing our promotion and discount data.
This is great… but how much does it cost?
You pay for the throughput you provision and the storage you consume on an hourly basis, and throughput can be scaled up or down based on what your services require. The cost of every database operation in Azure Cosmos DB is expressed in Request Units (RUs). For now, all you need to know is that write operations require more RUs (cost more) than read operations, which has the potential to get very expensive depending on your workload.
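To make the read/write asymmetry concrete, here is a back-of-the-envelope sketch. The RU figures are illustrative assumptions for a small item (a point read costing around 1 RU and a write around 5 RUs), not a quote from the Azure pricing page, and `workload_rus` is a hypothetical helper:

```python
# Illustrative RU arithmetic: assumed per-operation charges for a
# small item. Real charges depend on item size, indexing and query shape.
READ_RU = 1
WRITE_RU = 5

def workload_rus(reads: int, writes: int) -> int:
    """Total RUs consumed by a simple read/write workload."""
    return reads * READ_RU + writes * WRITE_RU

# A read-heavy hour: 360,000 reads (100 requests/sec) and 100 writes.
total = workload_rus(reads=360_000, writes=100)
print(total)  # 360500 - dominated almost entirely by reads
```

For a read-heavy workload like ours, the read side of this sum is where almost all of the RU spend goes, which is exactly what a cache can eliminate.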
What if the data rarely changes?
So, we have established that read operations are less expensive than other, more compute-heavy database queries, but every operation consumes some RUs. An approach we've used before at ASOS is to build our own cache or use another product (for example, Redis). While this reduces the need to call Cosmos for data, you take on the overhead of managing that cache yourself. This is where the Cosmos integrated cache can give you a quick win.
Caching is a mechanism to reduce the number of requests made to the database: it is the process of storing data so it can be accessed quickly later. Whenever a new request arrives, the requested data is searched for first in the cache; a cache hit occurs when the requested data is found there.
In Azure Cosmos DB, there are two connection modes: Direct and Gateway. Direct mode is usually the quicker of the two because you connect straight to the Cosmos backend partitions. The integrated cache introduces a new dedicated gateway, a compute space reserved for you, and this is where the cache lives. The cache works in a simple way: as requests go through the gateway, it caches the responses. This is also known as the read-through cache pattern: if the data isn't in the cache, it's retrieved from the data store and added to the cache. The real advantage is that the next time the same request comes in, the cache responds and no RUs are consumed for that request. Updates to data in Cosmos are also reflected in the cache, so all of that cache management is handled for you.
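The read-through pattern described above can be sketched in a few lines. This is a minimal in-memory illustration, not the integrated cache's actual implementation; `ReadThroughCache` and `read_promotion` are hypothetical names, and the `max_staleness` window loosely mirrors the idea of bounding how old a cached entry may be:

```python
import time
from typing import Any, Callable

class ReadThroughCache:
    """Minimal read-through cache: on a miss, fetch from the backing
    store and remember the result; serve subsequent reads from memory
    until the entry is older than max_staleness (seconds)."""

    def __init__(self, fetch: Callable[[str], Any], max_staleness: float = 300.0):
        self._fetch = fetch            # e.g. a database point read
        self._max_staleness = max_staleness
        self._store: dict[str, tuple[Any, float]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Any:
        entry = self._store.get(key)
        if entry is not None and time.monotonic() - entry[1] < self._max_staleness:
            self.hits += 1             # cache hit: no database call needed
            return entry[0]
        self.misses += 1               # cache miss: read through to the store
        value = self._fetch(key)
        self._store[key] = (value, time.monotonic())
        return value

# Hypothetical backing read standing in for a database point read.
def read_promotion(promo_id: str) -> dict:
    return {"id": promo_id, "discount": "10%"}

cache = ReadThroughCache(read_promotion)
cache.get("SUMMER10")   # miss: goes to the database
cache.get("SUMMER10")   # hit: served from the cache
print(cache.hits, cache.misses)  # 1 1
```

The second identical request never touches the backing store, which is precisely why repeated reads through the dedicated gateway stop consuming RUs.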
In the Promotions team at ASOS, we have an API which has a huge number of reads but very few writes, so that was a good candidate to run a comparison using the integrated cache. Below is a snippet of the performance test results.
We ran two performance tests, one using the integrated cache and one without, with the following parameters:
- Average RPS: 100
- Duration: 30 mins
Based on the performance test, we can see that even though the integrated cache is a little slower than direct mode, the differences aren't major and are well within our team's SLAs. The main difference is in the RU usage: for our small performance test we saw a massive 75% reduction in RUs, which is a notable cost saving. This kind of saving allows ASOS to save money and distribute resources elsewhere, all in an effort to push the quality of service we provide to our customers.
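To put the 75% figure in provisioned-throughput terms, here is a quick illustration. The baseline number below is made up for the example and is not our production provisioning:

```python
# Back-of-the-envelope view of a 75% RU reduction
# (baseline_rus is illustrative, not our real figure).
baseline_rus = 10_000          # RU/s provisioned without the cache
reduction = 0.75               # RU reduction measured with the cache

cached_rus = baseline_rus * (1 - reduction)
print(cached_rus)  # 2500.0 - a quarter of the original provisioning
```

Since Cosmos throughput is billed per provisioned RU/s per hour, shrinking the RU footprint by three quarters feeds directly into the bill.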
The setup for the integrated cache was super easy (literally a checkbox), and not having to manage cache population and expiration really helps the team. For us, it's a no-brainer — which is why you should cache your chips in and give it a go.
Daniel Agbodza — Software Engineer
Technology, sport and creative enthusiast