Exploring Caching in Distributed Systems: Concepts and Practical Demonstration

Truong Bui
6 min read · Nov 17, 2023

Part of the Domain Knowledge Series: If you haven’t read my other articles yet, please refer to the following links:
1. Scaling SQL Read Operations: Effective Strategies and Detailed Examples
2. Load Balancing in Distributed Systems: Exploring Concepts and Practical Demonstration

Recently, I’ve been thinking a lot about improving my Domain Knowledge. In the software engineering world, good Domain Knowledge is essential for becoming a better software engineer. Today, I will explore caching in distributed systems and walk through a practical demonstration as part of my self-directed learning.

Table of Contents

  1. Concepts
  2. Practical Demonstration of Caching
    GitHub link provided at the end of this article for comprehensive details
  3. Time to test what we did

Concepts

What is Caching?

Caching is a technique that stores frequently accessed application data in a dedicated, faster memory layer. This strategy aims to optimize data retrieval times, increase throughput, and lower compute costs.

How does Caching work?

Let’s see how caching operates in real scenarios when a user asks for data:

  1. The system checks the cache memory for the requested data. If found (cache hit), it directly returns the data from the cache, which is much faster than accessing the main database.
  2. If the data is not in the cache (cache miss), the system retrieves it from the original database, which is a slower process.
  3. After fetching data from the database, the system stores it in the cache. This way, future requests for the same data can be served more swiftly.

The approach described above is known as cache-aside (or lazy loading), one of the most commonly used caching strategies. The cache and the database operate independently, and the application code is responsible for managing operations on both. This approach works well for read-heavy systems.
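The three steps above can be sketched in a few lines of Java. This is only an illustration, not the code from the demo project: `CacheAsideReader` is a hypothetical name, a plain `HashMap` stands in for Redis, and a `Function` stands in for the slower database. It is also not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Cache-aside sketch: check the cache first, fall back to the database on a miss. */
class CacheAsideReader<K, V> {
    private final Map<K, V> cache = new HashMap<>();   // stand-in for Redis (assumption)
    private final Function<K, Optional<V>> database;   // stand-in for the slow store (assumption)
    private int hits;
    private int misses;

    CacheAsideReader(Function<K, Optional<V>> database) {
        this.database = database;
    }

    Optional<V> get(K key) {
        // Step 1: look in the cache; a hit returns immediately.
        V cached = cache.get(key);
        if (cached != null) {
            hits++;
            return Optional.of(cached);
        }
        // Step 2: cache miss, so read from the database.
        misses++;
        Optional<V> loaded = database.apply(key);
        // Step 3: store the value so the next request for this key is a hit.
        loaded.ifPresent(value -> cache.put(key, value));
        return loaded;
    }

    int hits() { return hits; }

    int misses() { return misses; }
}
```

Calling `get` twice with the same existing key should record one miss followed by one hit. A key that the database does not know stays uncached, so every lookup for it goes to the database.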

Other caching strategies include read-through, write-around, write-back, and write-through; the references below cover them in more detail.
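As a rough illustration of how the write-side strategies differ, here is a minimal sketch. `WriteStrategies` is a hypothetical class, both stores are plain in-memory maps, and real systems add error handling, batching, and eviction that are left out here.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of three write strategies over an in-memory "cache" and "database". */
class WriteStrategies {
    final Map<String, String> cache = new HashMap<>();            // stand-in for the cache
    final Map<String, String> database = new HashMap<>();         // stand-in for the database
    private final Map<String, String> dirty = new LinkedHashMap<>(); // writes not yet persisted

    /** Write-through: update the cache and the database in the same operation. */
    void writeThrough(String key, String value) {
        cache.put(key, value);
        database.put(key, value);
    }

    /** Write-around: write only to the database and drop any stale cached copy. */
    void writeAround(String key, String value) {
        database.put(key, value);
        cache.remove(key);
    }

    /** Write-back: update the cache now and persist to the database later. */
    void writeBack(String key, String value) {
        cache.put(key, value);
        dirty.put(key, value);
    }

    /** Persists all pending write-back entries to the database. */
    void flush() {
        database.putAll(dirty);
        dirty.clear();
    }
}
```

The trade-off is visible in the sketch: write-through keeps both stores consistent at the cost of a slower write, write-around avoids filling the cache with data that may never be read, and write-back gives the fastest writes but can lose data if the cache fails before `flush()` runs.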

References

Here, I’ve provided a concise overview of Caching in Distributed Systems.

Many excellent blogs already cover the topic extensively, so I won’t repeat them here. Instead, I’ll share the references I’ve gathered on this subject. Feel free to delve deeper into the world of caching through these resources.

  1. https://igotanoffer.com/blogs/tech/caching-system-design-interview
  2. https://www.enjoyalgorithms.com/blog/caching-system-design-concept
  3. https://www.educative.io/courses/grokking-modern-system-design-interview-for-engineers-managers/system-design-the-distributed-cache

Practical Demonstration of Caching

Scenario

We utilize Redis as a cache, storing frequently requested user items to minimize direct access to the slower underlying storage layer.

Prerequisites

  • Java 17
  • Maven Wrapper
  • Spring Boot 3+
  • Swagger (for testing purposes)
  • Docker runtime installed in advance (Docker Install)

Implementations

Caching Layer Architecture

We set up Lettuce as the Redis client to talk to the Redis Cluster. Then, we create two simple APIs to test this setup locally.

Setting up and testing the Redis Cluster locally follows a similar process to what I explained in a previous article. While I won’t go into all the details again, I’ll cover the new parts and provide clear explanations with reference links.

Previous article: Setting Up a Local Redis Cluster Using Testcontainers in Spring Boot

The new part in this practical demonstration is a custom serializer. It lets us compress the already serialized bytes with a chosen algorithm before saving them to Redis.

Compression/Decompression Algorithms
Several compression/decompression algorithms are available, including Snappy, Gzip, Lz4, and more. I’ve chosen Snappy, but feel free to explore and add other algorithms as well. I’ll leave that as homework for you.

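// Imports are assumed from the libraries the identifiers suggest:
// Vavr, Lombok, Apache Commons Compress, and snappy-java.
import io.vavr.CheckedFunction1;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import lombok.AccessLevel;
import lombok.AllArgsConstructor;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
import org.apache.commons.compress.utils.IOUtils;
import org.xerial.snappy.Snappy;

@AllArgsConstructor(access = AccessLevel.PRIVATE)
public enum CompressionAlgorithm {
    GZIP(
        streamCompressor(GzipCompressorOutputStream::new),
        streamDecompressor(GzipCompressorInputStream::new)),
    SNAPPY(Snappy::compress, Snappy::uncompress);

    private final CheckedFunction1<byte[], byte[]> compressor;
    private final CheckedFunction1<byte[], byte[]> decompressor;

    public byte[] compress(byte[] data) {
        try {
            return compressor.apply(data);
        } catch (Throwable e) {
            throw new RuntimeException("Couldn't compress using " + name(), e);
        }
    }

    public byte[] decompress(byte[] data) {
        try {
            return decompressor.apply(data);
        } catch (Throwable e) {
            throw new RuntimeException("Couldn't decompress using " + name(), e);
        }
    }

    // Adapts a stream-based compressor to a byte[] -> byte[] function.
    private static CheckedFunction1<byte[], byte[]> streamCompressor(
            CheckedFunction1<OutputStream, OutputStream> newStream) {
        return data -> {
            try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
                    OutputStream compressedStream = newStream.apply(outputStream)) {
                compressedStream.write(data);
                // Close explicitly so the trailer is flushed before reading the bytes.
                compressedStream.close();
                return outputStream.toByteArray();
            }
        };
    }

    // Adapts a stream-based decompressor to a byte[] -> byte[] function.
    private static CheckedFunction1<byte[], byte[]> streamDecompressor(
            CheckedFunction1<InputStream, InputStream> newStream) {
        return compressedData -> {
            try (ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
                    InputStream compressedStream = newStream
                            .apply(new ByteArrayInputStream(compressedData))) {
                IOUtils.copy(compressedStream, outputStream);
                return outputStream.toByteArray();
            }
        };
    }
}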
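To try the compress/decompress round trip without any external libraries, here is a minimal sketch that uses the JDK’s built-in `java.util.zip` GZIP streams instead of Commons Compress or Snappy. `GzipRoundTrip` is a hypothetical helper and not part of the demo project.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** GZIP round trip using only the JDK (illustrative stand-in for the enum above). */
class GzipRoundTrip {
    static byte[] compress(byte[] data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The buffer is read only after the GZIP stream is closed,
        // so the GZIP trailer is guaranteed to be written.
        return buffer.toByteArray();
    }

    static byte[] decompress(byte[] compressed) {
        try (GZIPInputStream gzip =
                new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gzip.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Repetitive payloads such as JSON with many similar records tend to shrink considerably, while very small values can end up larger once the GZIP header and trailer are added, so it is worth measuring with real cache data before enabling compression.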

Custom Serializer

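import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.data.redis.serializer.RedisSerializer;
import org.springframework.data.redis.serializer.SerializationException;
import org.springframework.lang.Nullable;

@Slf4j
@RequiredArgsConstructor
public class CompressedRedisSerializer implements RedisSerializer<byte[]> {
    private final CompressionAlgorithm compressionAlgorithm;

    @Override
    public @Nullable byte[] serialize(@Nullable byte[] data) throws SerializationException {
        if (data == null) {
            return null;
        }
        // Length of the payload before compression.
        log.debug("Serialized Data Length: {} ", data.length);
        return compressData(data);
    }

    @Override
    public @Nullable byte[] deserialize(@Nullable byte[] data) throws SerializationException {
        if (data == null) {
            return null;
        }
        // At this point the data is still compressed, so this is the compressed length.
        log.debug("Compressed Data Length: {} ", data.length);
        return decompressData(data);
    }

    private byte[] compressData(byte[] data) {
        return compressionAlgorithm.compress(data);
    }

    private byte[] decompressData(byte[] compressedData) {
        return compressionAlgorithm.decompress(compressedData);
    }
}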

Redis Template Configuration

@Bean
public RedisTemplate<String, byte[]> redisTemplate(
        RedisConnectionFactory lettuceConnectionFactory, RedisProperties redisProperties) {
    RedisTemplate<String, byte[]> redisTemplate = new RedisTemplate<>();
    ......
    redisTemplate.setDefaultSerializer(
            new CompressedRedisSerializer(redisProperties.getCompressionAlgorithm()));
    ......
    return redisTemplate;
}

Usually, we could use new GenericJackson2JsonRedisSerializer(objectMapper) as the default serializer. However, it's not recommended: to deserialize a value back into the right class, the serializer needs type information stored alongside the JSON, and a standard objectMapper doesn’t write this information. We should either use new GenericJackson2JsonRedisSerializer() or manually serialize objects before saving to Redis.

For the rest of the implementation, refer to the GitHub link provided at the end of this article.

Time to test what we did

Now, everything is ready! 😎

To launch the application, run the CacheAppRunner.main() method. It should start successfully on port 8080.

  • Try out POST: “/api/v1/items”. The message should be pushed into Redis.
  • Try out GET: “/api/v1/items/{itemId}” with itemId = 11111. Expected result:
  • Try out GET: “/api/v1/items/{itemId}” with an itemId that does not exist in the cache. Expected result:
  • Cache records have a 1-minute Time To Live. Once they have expired, try out GET: “/api/v1/items/{itemId}” with itemId = 11111 again. Expected result:
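The expiry behavior in the last step can be reproduced in isolation with a small in-memory sketch. This is not how Redis implements TTL, and `TtlCache` is a hypothetical class. The clock is passed in as a `LongSupplier` (an assumption made for testability) so that expiry can be simulated without actually waiting a minute.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.LongSupplier;

/** In-memory cache whose entries expire a fixed time after they are written. */
class TtlCache<K, V> {
    private record Entry<T>(T value, long expiresAtMillis) {}

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // returns the current time in milliseconds

    TtlCache(Duration ttl, LongSupplier clock) {
        this.ttlMillis = ttl.toMillis();
        this.clock = clock;
    }

    void put(K key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    Optional<V> get(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return Optional.empty();
        }
        if (clock.getAsLong() >= entry.expiresAtMillis()) {
            // Expired: evict lazily on read and report a miss.
            entries.remove(key);
            return Optional.empty();
        }
        return Optional.of(entry.value());
    }
}
```

With a TTL of `Duration.ofMinutes(1)`, a value written at time 0 is still returned just before the 60-second mark and is reported as missing from then on. In the cache-aside flow, that is the point where the next read goes back to the underlying storage.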

We have just explored the concept of distributed caching in system design and conducted a brief demonstration to observe its behavior.

Hope you can find something useful!

The completed source code can be found in this GitHub repository: https://github.com/buingoctruong/distributed-cache

Happy learning!

Bye!
