Synchronous Communication — Queries & Cache (Part 2/3)

This write-up explores the need for cache, challenges and strategies to keep it consistent with the data source.

Abhinav Kapoor
CodeX
4 min read · Nov 6, 2022



Why cache data?

The idea of a cache is to keep a copy of data either at a location closer to where it is needed, or in a faster data store that can serve queries quicker than the database. A cache can also store the result of an expensive computation.

Apart from performance, a cache can also help with availability, by becoming an alternate source when the origin is unavailable (which means favouring availability over consistency), and with scalability, by taking pressure off the origin and reducing contention. Depending on where in the infrastructure it is implemented, it may also reduce overall costs (more on that in the next part).

Challenge in Caching Data

Cached data is a snapshot of the original data; as updates happen at the origin, the cache drifts out of consistency with it. Caching is therefore effective and straightforward for data that is read often and updated seldom. For a data model that is bound to be updated, there are some considerations.

The strategy has to consider how stale the data is allowed to become, which leads to deciding on a cache expiration or invalidation policy. If cached data expires too soon, every read request may be sent to the origin, defeating the purpose of the cache. If it expires too late, the cache may serve stale data. The right balance is specific to the application and its domain.

"The rate of change of the source data, as well as the cache policy for refreshing data, will determine how inconsistent the data tends to be." — AWS Builders' Library, Caching challenges and strategies: https://aws.amazon.com/builders-library/caching-challenges-and-strategies/?did=ba_card&trk=ba_card
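As a concrete illustration of the time-based expiration discussed above, here is a minimal sketch of a TTL cache. The class and parameter names (`TtlCache`, `ttl_seconds`) are illustrative, not from any specific library:

```python
import time

# Minimal sketch of time-based expiration: each entry stores the moment it
# was cached, and a lookup treats entries older than ttl_seconds as misses.
class TtlCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, cached_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None  # never cached
        value, cached_at = entry
        if time.monotonic() - cached_at > self.ttl:
            del self._store[key]  # expired: invalidate and report a miss
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, time.monotonic())
```

A shorter TTL keeps the cache closer to the origin at the cost of more origin reads; a longer TTL does the opposite, which is exactly the trade-off described above.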

Strategies to Keep Cache Consistent with Origin (as much as possible)

  1. Partial Caching — Instead of caching everything, only some fields are cached. These fields should be relatively stable and seldom updated.
  2. Side Cache (or Cache-Aside) — The application keeps the cache up to date. Suitable when data in the cache has to be loaded on demand, or when Inline Cache (read-through or write-through) operations are not supported. For more on this pattern: https://learn.microsoft.com/en-us/azure/architecture/patterns/cache-aside
Cache Aside pattern — Image credit https://learn.microsoft.com/en-us/azure/architecture/patterns/cache-aside
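The cache-aside flow can be sketched in a few lines. This is a hypothetical illustration (the function and parameter names are mine, not from the linked article); the key point is that the *application*, not the cache, handles misses:

```python
# Cache-Aside sketch: the application checks the cache first, and on a miss
# it reads the origin and populates the cache itself.
def get_product(product_id, cache, load_from_db):
    value = cache.get(product_id)
    if value is not None:
        return value                  # cache hit
    value = load_from_db(product_id)  # cache miss: go to the origin
    cache[product_id] = value         # application keeps the cache up to date
    return value
```

Because the application owns the miss handling, any store with get/put semantics (a dict, Redis, Memcached) can serve as the cache.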

3. Inline Cache — The cache forms part of the Data Access API and is transparent to consumers of that API. Inline caches can be further classified into the following categories:

3.1 Read-through Cache — Every read request is looked up first in the cache; if the value is not there (or has been invalidated), the cache reads it from the data source, stores it for future lookups, and returns the result.

Read Through Cache
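A minimal read-through sketch follows. Unlike cache-aside, the cache itself owns the loading logic, so callers only ever talk to the cache; the `loader` callable stands in for the data source, and all names are illustrative:

```python
# Read-through sketch: the cache, not the application, fetches from the
# origin on a miss and remembers the result for future lookups.
class ReadThroughCache:
    def __init__(self, loader):
        self._loader = loader  # called on a miss to fetch from the origin
        self._store = {}

    def get(self, key):
        if key not in self._store:
            self._store[key] = self._loader(key)  # miss: read through to origin
        return self._store[key]
```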

3.2 Refresh-Ahead Caching — To reduce the latency of a read-through cache and keep hot objects fresh, some caches support loading data in advance. If an object is accessed after expiration, the read is a synchronous call to the origin (like read-through). But if the object is accessed within a configured window before expiration, the cached value is returned immediately while an asynchronous read from the origin replenishes the cache with the latest value.

Refresh Ahead Caching
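The refresh-ahead behaviour described above can be sketched with a background thread. This is a simplified, assumed design (real caches coalesce refreshes and handle failures); `ttl` and `refresh_ahead` are illustrative parameters:

```python
import threading
import time

# Refresh-ahead sketch: entries expire after `ttl` seconds; a read that lands
# within `refresh_ahead` seconds of expiry still returns the cached value
# immediately, but kicks off a background reload for the next read.
class RefreshAheadCache:
    def __init__(self, loader, ttl, refresh_ahead):
        self._loader = loader
        self._ttl = ttl
        self._refresh_ahead = refresh_ahead
        self._store = {}  # key -> (value, cached_at)
        self._lock = threading.Lock()

    def _load(self, key):
        value = self._loader(key)
        with self._lock:
            self._store[key] = (value, time.monotonic())
        return value

    def get(self, key):
        with self._lock:
            entry = self._store.get(key)
        if entry is None:
            return self._load(key)  # cold miss: synchronous read
        value, cached_at = entry
        age = time.monotonic() - cached_at
        if age > self._ttl:
            return self._load(key)  # expired: synchronous, like read-through
        if age > self._ttl - self._refresh_ahead:
            # inside the refresh window: reload asynchronously
            threading.Thread(target=self._load, args=(key,)).start()
        return value  # serve the current value without waiting
```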

3.3 Write-through Cache — Every write goes to the cache, which updates both itself and the source. This is straightforward for server-side caching, but it slows down writes: a write operation completes only when the data has been written to both the cache and the origin.

Write-through Cache
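A write-through sketch, where `write_to_origin` stands in for the database call (names are illustrative):

```python
# Write-through sketch: a write updates the cache and synchronously writes to
# the origin before returning, keeping both consistent at the cost of latency.
class WriteThroughCache:
    def __init__(self, write_to_origin):
        self._write_to_origin = write_to_origin
        self._store = {}

    def put(self, key, value):
        self._store[key] = value
        self._write_to_origin(key, value)  # blocks until the origin is updated

    def get(self, key):
        return self._store.get(key)
```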

3.4 Write-behind Cache — It differs from the write-through cache in that it speeds up writes by writing to the origin asynchronously. The trade-off is durability: writes that have not yet been flushed to the origin can be lost if the cache fails.

Write-behind Cache
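A write-behind sketch using a queue and a background worker. This is a simplified, assumed design (real implementations batch writes and persist the queue); names are illustrative:

```python
import queue
import threading

# Write-behind sketch: put() returns as soon as the cache is updated, while a
# background worker drains the queue and writes to the origin asynchronously.
class WriteBehindCache:
    def __init__(self, write_to_origin):
        self._store = {}
        self._pending = queue.Queue()
        self._write_to_origin = write_to_origin
        threading.Thread(target=self._flush_loop, daemon=True).start()

    def put(self, key, value):
        self._store[key] = value         # fast path: cache updated immediately
        self._pending.put((key, value))  # origin write deferred to the worker

    def _flush_loop(self):
        while True:
            key, value = self._pending.get()
            self._write_to_origin(key, value)  # asynchronous origin write
            self._pending.task_done()

    def flush(self):
        self._pending.join()  # wait until all deferred writes have landed
```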

Whether a caching strategy is working can be determined by looking at cache-miss (an object is not found in the cache and is read from the source) versus cache-hit (an object is found in the cache) metrics.
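The usual summary metric is the hit ratio, hits divided by total lookups — a tiny sketch:

```python
# Hit ratio: hits / (hits + misses). A ratio close to 1.0 suggests the
# strategy is working; a low ratio means most reads still hit the origin.
def hit_ratio(hits, misses):
    total = hits + misses
    return hits / total if total else 0.0

# e.g. 90 hits and 10 misses give a ratio of 0.9
```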

In the next part, I’ll cover where caching can be applied and what benefits or implications it can have.

Link to the previous part covering — Queries, CQRS & application of CQRS in different contexts.



Technical Architect | AWS Certified Solutions Architect Professional | Google Cloud Certified - Professional Cloud Architect | Principal Engineer | .NET