Caching in distributed systems

Noob Blogger
7 min read · Dec 21, 2022


Explain caching like I’m 10!

Caching is the practice of storing frequently accessed data in a temporary storage location (the “cache”) in order to speed up subsequent access to the data. In a distributed system, caching can help to improve the performance and scalability of the system by reducing the number of requests made to the backend services.

There are several ways to achieve caching in a distributed system:

  1. Client-side caching: This involves storing data on the client side, such as in a web browser’s cache or in a mobile app’s local storage. This can reduce the number of requests made to the server, but it can also lead to stale data if the client cache is not properly invalidated when the data changes on the server.
  2. Server-side caching: This involves storing data on the server side, such as in a cache associated with a web server or application server. This can reduce the load on the backend services and improve the performance of the system.
  3. Distributed caching: This involves storing data across multiple servers or machines in a distributed manner, using a distributed cache system. This can help to scale the cache as the system grows, but it also adds complexity and requires additional infrastructure.
  4. In-memory caching: This involves storing data in the memory of a server or machine, rather than on disk. This gives very fast access to the data, but the cache is volatile: its contents are lost when the process restarts unless they are persisted somewhere (a minimal sketch of an in-memory cache follows this list).
  5. Database caching: This involves storing data in a database or other persistent storage system and using a cache to store frequently accessed data in memory. This can provide a balance between fast access and durability, but it can also be more complex to set up and manage.
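
To make the in-memory option above concrete, here is a minimal sketch of a process-local LRU cache built on Java's LinkedHashMap. It is an illustration only: it is not thread-safe, lives inside a single JVM, and loses its contents whenever the process restarts, which is exactly the volatility trade-off mentioned above.

import java.util.LinkedHashMap;
import java.util.Map;

// A tiny in-memory LRU cache: keeps at most maxEntries entries,
// evicting the least recently used one when the limit is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public LruCache(int maxEntries) {
        // accessOrder = true makes iteration order follow recent access,
        // which lets LinkedHashMap identify the eldest (least used) entry.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

Using it is just new LruCache<String, String>(1000) followed by ordinary put and get calls; the cache silently drops the least recently used entry once it holds more than 1000 items.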

Some common caching techniques include:

  1. Cache-aside: The application checks the cache first; on a miss it loads the data from the backend store and puts it into the cache itself (a sketch of this pattern follows this list).
  2. Write-through: Every write goes to the cache and to the backend store at the same time, so the two stay consistent at the cost of slower writes.
  3. Write-back (write-behind): Writes go to the cache first and are flushed to the backend store asynchronously, which speeds up writes but risks losing data if the cache fails before the flush.
  4. Read-through: The cache sits in front of the backend store and loads missing data from it automatically on a cache miss, so the application only ever talks to the cache.
  5. Refresh-ahead: The cache proactively refreshes entries before they expire, in anticipation of future requests.
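
Here is what cache-aside might look like in Java, using a plain in-process map as the cache. ProductDatabase is a hypothetical backend interface introduced only for this sketch; in a real system the cache would typically be Redis or Memcached and the store would be your database.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Cache-aside: the application owns the "check cache, then load, then populate" logic.
public class ProductService {

    // Minimal stand-in for the real backend store (hypothetical, for illustration only).
    public interface ProductDatabase {
        String findNameById(String productId);
        void saveName(String productId, String name);
    }

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final ProductDatabase database;

    public ProductService(ProductDatabase database) {
        this.database = database;
    }

    public String getProductName(String productId) {
        // 1. Look in the cache first.
        String cached = cache.get(productId);
        if (cached != null) {
            return cached;
        }
        // 2. On a miss, load from the backend store...
        String fromDb = database.findNameById(productId);
        // 3. ...and populate the cache for the next caller.
        if (fromDb != null) {
            cache.put(productId, fromDb);
        }
        return fromDb;
    }

    public void updateProductName(String productId, String name) {
        database.saveName(productId, name);
        // Invalidate the cached entry so readers do not see stale data.
        cache.remove(productId);
    }
}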

Here are some common caching software options:

  1. Memcached: https://memcached.org/

Memcached is an open-source, in-memory key-value cache for keeping frequently accessed data in RAM. It is designed to be fast and lightweight, and its multi-threaded architecture makes it well suited to high-concurrency environments. Its feature set is deliberately small: simple get/set/delete operations, LRU eviction, and client libraries for most popular languages.
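
A minimal sketch of talking to Memcached from Java, assuming the spymemcached client library and a Memcached server on localhost (any client that exposes get and set operations would look similar):

import java.net.InetSocketAddress;
import net.spy.memcached.MemcachedClient;

public class MemcachedExample {
    public static void main(String[] args) throws Exception {
        // Connect to a Memcached server on the default port 11211.
        MemcachedClient client = new MemcachedClient(new InetSocketAddress("localhost", 11211));

        // set() is asynchronous; calling get() on the returned future waits for it to finish.
        client.set("greeting", 60, "hello from memcached").get();

        // Read the value back (returns null once the 60-second expiry passes).
        Object value = client.get("greeting");
        System.out.println(value);

        client.shutdown();
    }
}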

2. Redis: https://redis.io/

Redis is an open-source, in-memory data store that can be used as a cache, a message broker, or a database. It supports a wide range of data structures, including strings, hashes, lists, sets, and sorted sets, making it a flexible choice for many types of applications. Redis is known for its fast performance and ability to scale horizontally.
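
A short sketch of those data types from Java, assuming the Jedis client used later in this post and a Redis server on localhost:6379:

import redis.clients.jedis.Jedis;

public class RedisDataTypesExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // Plain string value
            jedis.set("page:home:title", "Welcome");

            // Hash: field/value pairs stored under a single key
            jedis.hset("user:42", "name", "Alice");
            jedis.hset("user:42", "plan", "pro");

            // List and set
            jedis.lpush("recent:searches", "caching", "redis");
            jedis.sadd("tags", "java", "cache");

            // Sorted set, e.g. a simple leaderboard ordered by score
            jedis.zadd("leaderboard", 1500, "alice");
            jedis.zadd("leaderboard", 1200, "bob");

            System.out.println(jedis.hget("user:42", "name"));        // Alice
            System.out.println(jedis.zrevrange("leaderboard", 0, 1)); // alice, bob
        }
    }
}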

3. Apache Cassandra: https://cassandra.apache.org/

Apache Cassandra is an open-source distributed database that is sometimes used as a persistent, always-available data store in cache-heavy architectures rather than as a pure cache. It is designed to handle large amounts of data across multiple servers, with a focus on high availability and scalability. Cassandra offers a number of features, including tunable consistency, support for multiple data centers, and drivers for a wide range of programming languages.

4. Amazon ElastiCache: https://aws.amazon.com/elasticache/

Amazon ElastiCache is a managed cache service offered by Amazon Web Services (AWS). It is available in a number of flavors, including Memcached and Redis, and is designed to be easy to set up and scale. ElastiCache offers a number of features, including automatic failover, backup and restore, and integration with other AWS services.

5. Google Cloud Memorystore: https://cloud.google.com/memorystore/

Google Cloud Memorystore is a managed cache service offered by Google Cloud. It is available in a number of flavors, including Memcached and Redis, and is designed to be easy to set up and scale. Memorystore offers a number of features, including automatic failover, backup and restore, and integration with other Google Cloud services.

6. Hazelcast: https://hazelcast.com/

Hazelcast is an open-source, in-memory data grid that can be used as a cache. It is designed to support distributed computing and is known for its fast performance and ability to scale horizontally. Hazelcast offers a number of features, including distributed data structures, support for multiple languages, and integration with a number of platforms and frameworks.
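
A minimal sketch of using Hazelcast as a distributed cache from Java, assuming a recent Hazelcast (5.x) on the classpath:

import java.util.Map;
import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;

public class HazelcastCacheExample {
    public static void main(String[] args) {
        // Start (or join) a Hazelcast cluster member inside this JVM.
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // A distributed map: entries are partitioned across the cluster members.
        Map<String, String> cache = hz.getMap("session-cache");
        cache.put("session:abc", "user-42");
        System.out.println(cache.get("session:abc"));

        hz.shutdown();
    }
}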

7. Coherence: https://www.oracle.com/java/technologies/coherence.html

Coherence is a commercial in-memory data grid offered by Oracle. It is designed to support distributed computing and is known for its fast performance and ability to scale horizontally. Coherence offers a number of features, including distributed data structures, support for multiple languages, and integration with a number of platforms and frameworks.

It’s important to carefully consider which caching software is best suited for your specific system and use case, taking into account factors such as performance, scalability, features, and cost.


Performance comparison: Redis & Memcached!

Redis and Memcached are both popular in-memory caching systems that can be used to store frequently accessed data in memory in order to improve the performance of a system. Both systems are known for their fast performance and ability to scale horizontally.

Here are some general points of comparison between Redis and Memcached:

  1. Data types: Redis supports a wider range of data types, including strings, hashes, lists, sets, and sorted sets, while Memcached only supports simple key-value pairs.
  2. Persistence: Redis supports data persistence, allowing you to store data to disk and recover it in the event of a crash or restart, while Memcached does not support data persistence.
  3. Transactions: Redis supports transactions (MULTI/EXEC), allowing you to queue a series of commands and execute them as a single atomic unit, while Memcached does not support transactions (see the sketch after this list).
  4. Pub/sub: Redis supports a publish/subscribe messaging pattern, allowing you to send messages between processes, while Memcached does not support pub/sub.
  5. Multi-threading: Redis executes commands on a single thread 😟 (recent versions can offload network I/O to extra threads), so one instance cannot spread command processing across many cores, while Memcached is multi-threaded and can serve requests on multiple cores concurrently.
  6. Performance: Both deliver sub-millisecond operations, and for plain key-value get/set workloads the difference is rarely decisive; Memcached's multi-threading can give it higher raw throughput on a single node, while Redis offers far richer operations at comparable speed.
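
To illustrate the transactions point, here is a small Jedis sketch (assuming a local Redis server): commands queued between multi() and exec() are applied atomically.

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Transaction;

public class RedisTransactionExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // Commands issued on the Transaction object are queued, not run immediately.
            Transaction tx = jedis.multi();
            tx.set("order:1001:status", "paid");
            tx.incr("orders:paid:count");

            // exec() runs the queued commands atomically: no other client's
            // commands are interleaved between them.
            tx.exec();

            System.out.println(jedis.get("order:1001:status"));
        }
    }
}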

Ultimately, the choice between Redis and Memcached will depend on the specific requirements and needs of your system. If you need support for data persistence, transactions, or pub/sub, Redis may be a better choice. If you only need simple key-value caching and don’t require these additional features, Memcached may be a good option.


Implementing a basic inventory system using Java and Redis!

To implement caching using Java and Redis to store inventory data, you will need to do the following:

  1. Install and set up Redis: Install and set up Redis on your system. This will typically involve downloading the Redis software, configuring the Redis configuration file, and starting the Redis server.
  2. Install a Java Redis client library: Install a Java client library that allows you to connect to and interact with a Redis server from a Java program. Some popular options include Jedis and Lettuce.
  3. Define an entity class for your inventory data: Define an entity class that represents your inventory data. This class should have fields for each piece of data that you want to store, such as item name, quantity, and price. You may also want to define any necessary getters and setters for the fields.
  4. Write a cache manager class: Write a cache manager class that handles the interaction with the Redis server. This class should have methods for storing and retrieving data from the cache, as well as for deleting and updating data as needed.

Here is an example of what the entity class and cache manager class might look like:

Entity class (InventoryItem.java):

public class InventoryItem {
    private String itemName;
    private int quantity;
    private double price;

    public InventoryItem(String itemName, int quantity, double price) {
        this.itemName = itemName;
        this.quantity = quantity;
        this.price = price;
    }

    public String getItemName() {
        return itemName;
    }

    public void setItemName(String itemName) {
        this.itemName = itemName;
    }

    public int getQuantity() {
        return quantity;
    }

    public void setQuantity(int quantity) {
        this.quantity = quantity;
    }

    public double getPrice() {
        return price;
    }

    public void setPrice(double price) {
        this.price = price;
    }

    // Serialize as "itemName,quantity,price" so the cache manager can store the
    // item as a single Redis hash value and rebuild it with a simple split(",").
    @Override
    public String toString() {
        return itemName + "," + quantity + "," + price;
    }
}

Cache manager class (InventoryCacheManager.java):

import redis.clients.jedis.Jedis;

public class InventoryCacheManager {
    private final Jedis jedis;

    public InventoryCacheManager(Jedis jedis) {
        this.jedis = jedis;
    }

    // Store the item as a field of the "inventory" hash, keyed by item name.
    public void storeInventoryItem(InventoryItem item) {
        jedis.hset("inventory", item.getItemName(), item.toString());
    }

    // Look an item up by name; returns null if it is not in the cache.
    public InventoryItem getInventoryItem(String itemName) {
        String itemString = jedis.hget("inventory", itemName);
        if (itemString == null) {
            return null;
        }
        // The value was stored as "itemName,quantity,price" (see InventoryItem.toString()).
        String[] parts = itemString.split(",");
        return new InventoryItem(parts[0], Integer.parseInt(parts[1]), Double.parseDouble(parts[2]));
    }

    public void deleteInventoryItem(String itemName) {
        jedis.hdel("inventory", itemName);
    }

    // Overwrites any existing entry for the same item name.
    public void updateInventoryItem(InventoryItem item) {
        storeInventoryItem(item);
    }
}
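
Finally, here is a minimal way to wire these two classes together, assuming a Redis server running locally on the default port and Jedis on the classpath:

import redis.clients.jedis.Jedis;

public class InventoryDemo {
    public static void main(String[] args) {
        // Assumes a Redis server running locally on the default port 6379.
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            InventoryCacheManager cacheManager = new InventoryCacheManager(jedis);

            // Store an item, read it back, then update and delete it.
            cacheManager.storeInventoryItem(new InventoryItem("widget", 25, 9.99));

            InventoryItem item = cacheManager.getInventoryItem("widget");
            System.out.println(item.getItemName() + ": " + item.getQuantity() + " @ " + item.getPrice());

            cacheManager.updateInventoryItem(new InventoryItem("widget", 24, 9.99));
            cacheManager.deleteInventoryItem("widget");
        }
    }
}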

Noob Blogger

Hello! I am a blogger who is just starting out to share my thoughts and ideas. Please like, follow and comment for improvements. Add requests for new topics!