Cache invalidation using MQTT

Marcos Abel · Published in Trabe · Jul 9, 2018
Paw print “invalidation” using waves. Photo by me.

We started using MQTT a while back, in a “classic” scenario for MQTT: an IoT project. The main use case was as paradigmatic as it gets: small machines sending chunks of data to the MQTT server and big machines collecting and processing that data.

The publish/subscribe pattern used in MQTT allows us to decouple publishers and subscribers, and it can be useful in a variety of scenarios beyond IoT.

Asynchronous interaction between the distributed components of a system or platform is one interesting scenario where it can be used.

The problem we have

We have a web application written in Java. Our application is a heavy user of external services (mainly REST APIs), and those services are not particularly fast. We use Ehcache to improve the performance of our system, but we cannot be very aggressive with the caches because it is important to have the latest data most of the time.

The data is owned by the external services, and we aren’t notified of modifications. In this scenario we are limited to short-lived defensive caches, and almost every request ends up in a call to the external service, as we can see in the following diagram.

Using just a defensive cache

MQTT to the rescue

What if we were to be notified of changes in data? The idea is simple: it is easy for the owner of the data to send a message when a piece of data changes…and it is also easy for our application to listen to those messages and invalidate caches when an invalidation is due.

We already have an MQTT server available in our environment that scales well. We can take advantage of it to make components talk to each other.

In this new scenario our application will be notified any time the cached data is modified, so we can be aggressive with caching (TTLs of hours, say) and hit the slow backend service only every once in a while.

Cache invalidation using MQTT messages

Implementation

MQTT based “protocol”

We need to define the MQTT-based interaction first. Let’s say that our piece of cacheable data is an invoice. We define the topic for invoice eviction as caches/invoices/{invoiceId}/evict.

The owner of the data will publish a message to that topic whenever an eviction is needed. All the applications that have invoices cached must subscribe to that topic and evict the caches when an eviction message is published.
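For illustration only, publishing one of those eviction messages from the data owner’s side with the Eclipse Paho client could look roughly like this (the broker URL, client id and invoice id are placeholders):

```java
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttException;
import org.eclipse.paho.client.mqttv3.MqttMessage;

public class InvoiceEvictionPublisher {

  public static void main(String[] args) throws MqttException {
    // Placeholder broker URL and client id
    MqttClient client = new MqttClient("tcp://mqtt.example.com:1883", "invoice-owner");
    client.connect();

    // Publish an (empty) eviction message to the per-invoice topic
    String invoiceId = "42";
    client.publish("caches/invoices/" + invoiceId + "/evict", new MqttMessage(new byte[0]));

    client.disconnect();
  }
}
```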

Cache configuration

We use Spring’s @Cacheable and @CacheEvict annotations in our InvoiceService to manage caching and eviction:
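The service itself is plain Spring. A minimal sketch could look like the following, where InvoiceClient and Invoice are hypothetical types standing in for the client that calls the slow external REST API:

```java
import org.springframework.cache.annotation.CacheEvict;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Service;

@Service
public class InvoiceService {

  // Hypothetical client for the external REST API
  private final InvoiceClient invoiceClient;

  public InvoiceService(InvoiceClient invoiceClient) {
    this.invoiceClient = invoiceClient;
  }

  // Cache every invoice in the "invoices" cache, keyed by its id
  @Cacheable(cacheNames = "invoices", key = "#invoiceId")
  public Invoice getInvoice(String invoiceId) {
    return invoiceClient.getInvoice(invoiceId); // slow call to the external service
  }

  // Remove a single invoice from the cache when an eviction is due
  @CacheEvict(cacheNames = "invoices", key = "#invoiceId")
  public void evictInvoice(String invoiceId) {
    // Spring removes the entry; nothing else to do here
  }
}
```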

The caching backend we use is Ehcache 2. If you need some help configuring Ehcache 2 in Spring, you can read this post. Once you have your global Ehcache configuration in place, you just need to add the configuration for the specific caches you need.

We are evicting “on demand”, so we want our caches to have long TTLs. How long really depends on the specific scenario…let’s assume that 1h is long enough for us:
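In an Ehcache 2 ehcache.xml that could translate into a cache declaration along these lines (the heap size is just an example value):

```xml
<!-- One entry per cached invoice, kept for up to one hour -->
<cache name="invoices"
       maxEntriesLocalHeap="1000"
       eternal="false"
       timeToLiveSeconds="3600"
       memoryStoreEvictionPolicy="LRU"/>
```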

Listening to the topic

We need to listen to the topic (caches/invoices/{invoiceId}/evict) and evict our cache when an eviction message arrives.

For this example we will use a simple MqttComponent (a pure Paho implementation):
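A minimal sketch of such a component, assuming the broker URL and client id come from configuration properties, could be:

```java
import org.eclipse.paho.client.mqttv3.IMqttMessageListener;
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttConnectOptions;
import org.eclipse.paho.client.mqttv3.MqttException;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

@Component
public class MqttComponent {

  private final MqttClient client;

  // Placeholder property names for the broker URL and client id
  public MqttComponent(@Value("${mqtt.url}") String url,
                       @Value("${mqtt.clientId}") String clientId) throws MqttException {
    this.client = new MqttClient(url, clientId);
    MqttConnectOptions options = new MqttConnectOptions();
    options.setAutomaticReconnect(true);
    options.setCleanSession(true);
    this.client.connect(options);
  }

  // Subscribe to a topic filter and hand every incoming message to the given listener
  public void subscribe(String topicFilter, IMqttMessageListener listener) throws MqttException {
    client.subscribe(topicFilter, listener);
  }
}
```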

Using this component, we can subscribe to our topic of interest and evict the cache when a message arrives:
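A sketch of that wiring could be the following listener. It uses the MQTT + wildcard to cover every invoice id and evicts directly through Spring’s CacheManager (calling the @CacheEvict-annotated service method through its proxy would work just as well):

```java
import javax.annotation.PostConstruct;
import org.eclipse.paho.client.mqttv3.MqttException;
import org.springframework.cache.Cache;
import org.springframework.cache.CacheManager;
import org.springframework.stereotype.Component;

@Component
public class InvoiceEvictionListener {

  private final MqttComponent mqtt;
  private final CacheManager cacheManager;

  public InvoiceEvictionListener(MqttComponent mqtt, CacheManager cacheManager) {
    this.mqtt = mqtt;
    this.cacheManager = cacheManager;
  }

  @PostConstruct
  public void subscribe() throws MqttException {
    // The + wildcard matches any invoice id in caches/invoices/{invoiceId}/evict
    mqtt.subscribe("caches/invoices/+/evict", (topic, message) -> {
      String invoiceId = topic.split("/")[2];
      Cache invoices = cacheManager.getCache("invoices");
      if (invoices != null) {
        invoices.evict(invoiceId);
      }
    });
  }
}
```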

This approach scales well: every node will subscribe to the eviction topic and evict its caches when the eviction messages arrive.

Wrapping up

Ehcache + MQTT is not the best solution for distributed cache management and it’s not a replacement for a dedicated distributed cache system. But if you don’t have access to such a system and you have an MQTT server available, it can be a viable solution for managing on-demand invalidations.
