PostgreSQL as a Message Broker to Apply a Reactive Caching Strategy

Fachrin Aulia · Published in Geek Culture · Jul 8, 2021

Introduction

We’ve been using database engines like PostgreSQL and others as building blocks to develop software that solves our daily friction points. It could be a simple to-do list app or a spending-tracker app that uses those database engines to remember and store things.

Through this article, I’d like to share another use case for PostgreSQL, one that goes beyond the basic things we usually do with it and that some of us may never have explored.

PostgreSQL as a Message Broker

A message broker is software that enables asynchronous communication between two or more parties. The parties here could be microservice-based applications that listen to certain topics to share and fetch the information they need and process it under their own business logic.

There are plenty of technologies out there that can act as a message broker, such as:

  • Redis Pub/Sub
  • RabbitMQ, and
  • Apache Kafka

I guess those three technologies are the ones that pop into everybody’s mind whenever they think of message broker software. Conversely, hardly anyone would think of PostgreSQL as a fit for the job in the first place.

Listen and Notify

PostgreSQL has an API that resembles a message broker’s pub-sub pattern through the LISTEN and NOTIFY commands.

We can use the NOTIFY command to publish a notification telling subscribers that an event has occurred on a channel. Meanwhile, the LISTEN command lets us subscribe to a channel’s events and use the information in them to do something.

It’s just like the pub-sub pattern that is common in message broker software. For a clearer picture, we can see how it looks in the figure below.

It’s pretty straightforward to use the commands, right? 👀

    -- to subscribe to a channel (topic)
    LISTEN channel;

    -- to publish an event to a channel (topic), the payload is optional
    NOTIFY channel, 'payload';

You can learn more about this here: LISTEN and NOTIFY.
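To see it end to end from application code, here’s a minimal sketch of a listener process. It assumes Python with the psycopg2 driver, a database called mydb, and a channel called counter_changed; those names are just examples of mine, not something this article prescribes.

    import select
    import psycopg2

    # assumed local connection settings; adjust to your own setup
    conn = psycopg2.connect("dbname=mydb user=postgres")
    conn.autocommit = True  # so LISTEN takes effect immediately

    cur = conn.cursor()
    cur.execute("LISTEN counter_changed;")  # subscribe to the channel

    print("Waiting for notifications on 'counter_changed'...")
    while True:
        # wait (up to 5 seconds) until the connection has something to read
        if select.select([conn], [], [], 5) == ([], [], []):
            continue  # timed out, keep waiting
        conn.poll()  # fetch any pending notifications from the server
        while conn.notifies:
            notify = conn.notifies.pop(0)
            print(f"Got event on {notify.channel}: {notify.payload}")

Another session can then run NOTIFY counter_changed, 'hello'; and the payload shows up in this loop right away.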

Reactive Caching Strategy using PostgreSQL

This is the interesting part: we now know that PostgreSQL can do what other message broker software does. The question is: what does that have to do with caching strategies?

So before going into the main topic, it’s better to align on what a caching strategy actually means here, in a nutshell.

Caching Strategy

A caching strategy is a technique for keeping data consistent between the primary data source and the caching system; it also defines how we interact with that data later on.

There are several common caching strategy variants out there, but here we’re only going to discuss some of the write variants. Let’s get to them.

Write Through Variant

The write-through variant lets us sync data between the caching system and the data source in one go.

  1. Application writes data to the cache, then
  2. Application writes the same data to the database after it’s done writing to the cache

It is a pretty simple approach, yet the drawback is very obvious: the operation is costly because we have to make two network calls, one to the cache and one to the database.

The write to the database can be sluggish and is a comparatively heavy operation, so it increases the overall latency even more.
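As a rough illustration, a write-through update could look like the sketch below. It assumes a local Redis instance, a local PostgreSQL database, and a hypothetical profiles table with a followers counter; none of these names come from the article.

    import json
    import redis
    import psycopg2

    cache = redis.Redis()                               # assumed local Redis
    db = psycopg2.connect("dbname=mydb user=postgres")  # assumed local Postgres

    def write_through(username: str, followers: int) -> None:
        # 1. write the data to the cache first
        cache.set(f"profile:{username}", json.dumps({"followers": followers}))
        # 2. then make a second network call to persist it in the database
        with db, db.cursor() as cur:
            cur.execute(
                "UPDATE profiles SET followers = %s WHERE username = %s",
                (followers, username),
            )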

Write Behind Variant

The write-behind variant is actually similar to the write-through one. The only difference between them is that this variant requires a message broker or some middleware to sync the data between the cache and the database asynchronously.

  1. Application writes data to the cache
  2. Application publishes a message that notifies the data has changed
  3. The consumer application consumes the message, then
  4. It writes data back to the database

This variant requires more steps to get done, but we gain something from it: with a message broker or middleware in between to sync the data, we can reduce the latency the application sees.

Writing a message to a message broker incurs less latency than writing the data to the database directly.
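Sketched in the same style (and again with made-up names), a write-behind flow could use a plain Redis list as a stand-in queue between the producer and the consumer:

    import json
    import redis
    import psycopg2

    cache = redis.Redis()
    db = psycopg2.connect("dbname=mydb user=postgres")

    def write_behind(username: str, followers: int) -> None:
        # 1. write to the cache, then 2. publish a message instead of hitting the database
        cache.set(f"profile:{username}", json.dumps({"followers": followers}))
        cache.rpush("profile_updates", json.dumps(
            {"username": username, "followers": followers}))

    def consume_forever() -> None:
        # 3. the consumer pops messages, then 4. writes the data back to the database
        while True:
            _, raw = cache.blpop("profile_updates")
            msg = json.loads(raw)
            with db, db.cursor() as cur:
                cur.execute(
                    "UPDATE profiles SET followers = %s WHERE username = %s",
                    (msg["followers"], msg["username"]),
                )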

👻 Reactive Cache Variant 👻

This variant is actually my own term. The word reactive in its name reflects how the cache gets updated: the idea is that the cache is only updated whenever the respective data in the database changes.

Here’s what happens in this variant:

  1. Application only writes data to the database once
  2. The database publishes a message to tell consumers that the data has changed
  3. The consumer application consumes the message, then
  4. It writes the data back to the cache

By using this approach, we can reduce the network latency in a very significant way.

Notice that we only do a single network call to the database, and then let it do its job to produce a message orchestrating the consumers to write the data back to the cache asynchronously. Fire and forget.

Not only that, we also don’t need any additional message broker software to get it done. PostgreSQL comes into play through its Listen/Notify API.
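To make the idea concrete, here’s a sketch of how the pieces could be wired together. The trigger, channel, and table names are illustrative choices of mine, not necessarily what the demo app uses: a trigger publishes a notification whenever a profile row changes, and a long-running consumer writes the fresh row into Redis.

    import json
    import select
    import redis
    import psycopg2

    cache = redis.Redis()
    db = psycopg2.connect("dbname=mydb user=postgres")
    db.autocommit = True

    # One-time setup: whenever the profiles table changes, NOTIFY the
    # 'profile_changed' channel with the updated row as a JSON payload.
    SETUP_SQL = """
    CREATE OR REPLACE FUNCTION notify_profile_changed() RETURNS trigger AS $$
    BEGIN
      PERFORM pg_notify('profile_changed', row_to_json(NEW)::text);
      RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    DROP TRIGGER IF EXISTS profile_changed_trg ON profiles;
    CREATE TRIGGER profile_changed_trg
      AFTER INSERT OR UPDATE ON profiles
      FOR EACH ROW EXECUTE FUNCTION notify_profile_changed();
    """

    def run_consumer() -> None:
        cur = db.cursor()
        cur.execute(SETUP_SQL)
        cur.execute("LISTEN profile_changed;")
        while True:
            if select.select([db], [], [], 5) == ([], [], []):
                continue  # nothing yet, keep waiting
            db.poll()
            while db.notifies:
                row = json.loads(db.notifies.pop(0).payload)
                # write the changed row back to the cache
                cache.set(f"profile:{row['username']}", json.dumps(row))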

Pretty cool, huh? 😎

Twitter Profile App as a Use Case

Let’s get our hands dirty. All the things above sound promising, right? So, I’ve built a “semi” working application that implements that reactive caching strategy under the hood.

I’ve designed a Twitter-like application architecture to show a user profile page that contains information about the number of followings, followers, and likes counter.

Here’s what the architecture looks like.

  1. Updating the counters takes advantage of the reactive cache strategy. The client can hit the /v1/profile/<username> endpoint to do that, and
  2. To fetch the counters, the client can stream the /stream endpoint. It connects to that endpoint through the Server-Sent Events protocol to get realtime data from the server side; the server pushes data to the client whenever a change lands in the cache (a rough sketch of both endpoints follows the list).
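For flavor, here’s roughly what the two endpoints could look like in Flask. This is not the repository’s code, just a simplified sketch under the same assumptions as the previous snippets: the profile endpoint makes the single write to PostgreSQL (the trigger and consumer keep the cache fresh), and the stream endpoint pushes cache values to the browser over Server-Sent Events.

    import time
    import redis
    import psycopg2
    from flask import Flask, Response, request

    app = Flask(__name__)
    cache = redis.Redis()
    db = psycopg2.connect("dbname=mydb user=postgres")
    db.autocommit = True

    @app.route("/v1/profile/<username>", methods=["POST"])
    def update_profile(username):
        # the single write: the database-side trigger takes care of the cache
        with db.cursor() as cur:
            cur.execute(
                "UPDATE profiles SET followers = followers + 1 WHERE username = %s",
                (username,),
            )
        return {"status": "ok"}

    @app.route("/stream")
    def stream():
        username = request.args.get("username", "fachrin")  # hypothetical query param

        def events():
            last = None
            while True:
                raw = cache.get(f"profile:{username}")
                if raw and raw != last:
                    last = raw
                    yield f"data: {raw.decode()}\n\n"  # SSE message framing
                time.sleep(0.5)

        return Response(events(), mimetype="text/event-stream")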

So, there you go, you can see the demonstration below or go to this link for a hands-on interaction.

We can see two Chrome windows open here, each rendering our Twitter Profile app. Whenever a counter changes in one window, the other window gets updated and reacts to the change right away. Pretty fast.

Notice that it’s not just the typical JavaScript DOM manipulation. It’s calling an API to update the counter and to stream the counter, so there is network latency involved, yet it’s done before the blink of an eye 😜

Still skeptical about it? We can speed up the update-counter operation with a browser hack by executing this function in the browser’s console.

This is the result. Expect a frame-rate drop, because it’s a GIF anyway; the original recording showed the counter changing much faster.

The application serves the requests without breaking a sweat; it works seamlessly 🐛

Conclusion

Hey, we’ve come to the conclusion section already. I hope that by completing this read we’re more cognizant of the existence of Postgres’ Listen/Notify API, and start thinking about what other use cases we could apply this technology to.

In terms of scalability and performance, I unfortunately didn’t explore that far. But if you want to build a simple application and don’t want to over-engineer it, you can start by considering a familiar tech like PostgreSQL instead of more advanced message broker software like Apache Kafka.

It’s super possible to use PostgreSQL as message broker software, as an alternative to the others.

Source Code

You can access all the source code demonstrated in this article through my Github repository here: https://github.com/parinpan/reactive-cache

Tech stack used: Python3, Flask Framework, PostgreSQL, and Redis.

Further Discussion

If you have any questions or would like to discuss anything, I’m very open to it. Please reach out to me through my social media accounts:
