A simple architecture for cache or web-socket layers
This is how I designed stateful services using a very simple architecture to keep them in sync.
Recently I've been playing with cache services and found it hard to keep them in sync with the persistence layer, i.e. the database. On top of that, a cache, and thus state, makes scaling problematic, as different instances will receive different requests.
Another problem I faced was delivering messages to users connected via web-sockets: you need to know which instance a given user is connected to, and make that instance send the message. Luckily, the same solution worked for both cases. I'll show you how I did it.
Model using MongoDB
The model using MongoDB deserves special attention because of its ChangeStreams feature: ChangeStreams let you stream operations from the DB in real time. In other words, it is essentially an oplog on steroids. You can read the docs [1,2] to get better insight. What I like about using MongoDB for this is:
- Low latency
- Query flexibility for event selection
So the clients of ChangeStreams connect to MongoDB and listen to events. Using this feature you can center your source of truth on the database and update the instances of the cache service. This setup is great for cloud functions, lambdas, and containers in general: on start they connect to the database and initiate the stream to stay updated, and when they die they are simply disconnected. Most cloud providers will let you create as many instances as you need.
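To make the update loop concrete, here is a minimal Go sketch of how a cache instance applies change events to its in-memory state. The `ChangeEvent` struct is an illustrative stand-in for a decoded change stream event (the real service would read these from the Go driver's `Watch`/`Next`/`Decode` loop, as in the repository below); all names here are my own, not the driver's types.

```go
package main

import "fmt"

// ChangeEvent is a simplified stand-in for a decoded MongoDB change
// stream event (operationType, documentKey, fullDocument).
type ChangeEvent struct {
	Op  string            // "insert", "update", "replace" or "delete"
	ID  string            // document key
	Doc map[string]string // full document (nil for deletes)
}

// Cache is the in-memory state each service instance keeps in sync
// with the database through the stream.
type Cache struct {
	data map[string]map[string]string
}

func NewCache() *Cache {
	return &Cache{data: make(map[string]map[string]string)}
}

// Apply updates the cache from a single change stream event.
func (c *Cache) Apply(ev ChangeEvent) {
	switch ev.Op {
	case "insert", "update", "replace":
		c.data[ev.ID] = ev.Doc
	case "delete":
		delete(c.data, ev.ID)
	}
}

// Get looks a document up in the local cache.
func (c *Cache) Get(id string) (map[string]string, bool) {
	doc, ok := c.data[id]
	return doc, ok
}

func main() {
	// In the real service these events arrive from the change stream.
	events := []ChangeEvent{
		{Op: "insert", ID: "u1", Doc: map[string]string{"name": "Ana"}},
		{Op: "update", ID: "u1", Doc: map[string]string{"name": "Ana", "city": "SP"}},
		{Op: "delete", ID: "u1"},
	}
	cache := NewCache()
	for _, ev := range events {
		cache.Apply(ev)
	}
	_, ok := cache.Get("u1")
	fmt.Println("u1 still cached:", ok) // prints "u1 still cached: false"
}
```

Because every instance replays the same event sequence, all caches converge to the same state without ever talking to each other.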
On the left side you have your consumers: they need the cache service, each request may not reach the same instance of the cache service, and of course there may be many consumers. I wrote an implementation of this using Go as the programming language, Google Cloud Run for container orchestration, and Cloud Build for CI/CD. Everything you need to get started lives in this repository: cache-service-go.
Variation: readers and writers
Another way to organize your services is to have separate readers and writers.
Here a writer writes once to the database and the streams fan out the new data to all connected instances, so the writer service does not need to care about updating caches or making requests to them. There is also the case of using a cache inside other services, where the same service does other things as well, but take care not to write back to the database from the stream handler, or you will create a loop. Always keep separation of concerns in mind when designing your services.
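The fan-out can be sketched like this: the writer persists once, and every reader instance receives the event and updates only its local state, never writing back. This is a toy model (the `Event` and `Instance` names are illustrative); in reality each instance holds its own change stream cursor rather than sharing a broadcast function.

```go
package main

import "fmt"

// Event is a simplified change stream event fanned out to instances.
type Event struct {
	ID    string
	Value string
}

// Instance models one running copy of a reader service with its own
// local cache.
type Instance struct {
	Name  string
	Cache map[string]string
}

// Broadcast mimics the change stream fan-out: every connected instance
// receives every event and updates its local cache. No instance writes
// back to the database, which avoids the feedback loop mentioned above.
func Broadcast(ev Event, instances []*Instance) {
	for _, inst := range instances {
		inst.Cache[ev.ID] = ev.Value
	}
}

func main() {
	a := &Instance{Name: "reader-a", Cache: map[string]string{}}
	b := &Instance{Name: "reader-b", Cache: map[string]string{}}

	// The writer persisted the document once; the stream delivers it
	// to every reader.
	Broadcast(Event{ID: "cfg", Value: "v2"}, []*Instance{a, b})

	fmt.Println(a.Cache["cfg"], b.Cache["cfg"]) // prints "v2 v2"
}
```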
Message delivery solution
This is already very close to the solution for the message delivery problem. In that scenario the writer service is the API used to send a message to a user or a group. The message is simply saved to the database, which is great: that service cares only about receiving users' requests and storing them. The message, saved in a specific collection, is then distributed to all connected instances of the message delivery service, and each instance delivers it to those of its connected users who are addressed by the message.
This is a common scenario where users in the same chat group are connected (via web-sockets) to different instances of the delivery service. To explain how the sent message is correctly delivered to Users A and C, I will only detail steps 3 and 4, as the rest is pretty easy to understand.
3. The received message is distributed to all instances listening to that event
4. As each instance of the message delivery service is only connected to some users, it will check whether a connected user is a recipient, and if so it will deliver the message. For users that are not recipients, nothing is done. So each instance only cares about its connected users.
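Steps 3 and 4 can be sketched as follows: every instance receives every message event, but delivers only to recipients connected to it. The `Message` and `DeliveryInstance` types are illustrative (a real instance would hold web-socket connections instead of string slices), and the field names are assumptions of mine, not the project's actual schema.

```go
package main

import "fmt"

// Message is what the writer service saved to the messages collection;
// the change stream delivers it to every delivery instance.
type Message struct {
	Text       string
	Recipients []string // user IDs addressed by the message
}

// DeliveryInstance holds only the connections opened against this
// particular instance. The []string stands in for a web-socket
// connection and records what was delivered to each user.
type DeliveryInstance struct {
	Connected map[string][]string
}

// OnMessage implements step 4: deliver only to recipients that happen
// to be connected to this instance, and silently ignore everyone else.
func (d *DeliveryInstance) OnMessage(msg Message) {
	for _, user := range msg.Recipients {
		if _, ok := d.Connected[user]; ok {
			d.Connected[user] = append(d.Connected[user], msg.Text)
		}
	}
}

func main() {
	// Users A and C are connected to this instance; B is on another one.
	inst := &DeliveryInstance{Connected: map[string][]string{"A": {}, "C": {}}}
	inst.OnMessage(Message{Text: "hello group", Recipients: []string{"A", "B", "C"}})
	fmt.Println(inst.Connected["A"], inst.Connected["C"]) // prints "[hello group] [hello group]"
}
```

User B is simply ignored by this instance; the instance B is connected to runs the same check and delivers to B there.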
Of course the whole thing is much more complex, but with this organization it is easy to know where your features should be implemented, and you can grow in features without adding too much complexity to your services. That, for me, is an important part of a scalable system. I have a social network project designed on this architecture; it is functional but not yet production-ready, as it needs some polish. I'll keep you posted.
You may be wondering: OK, but I don't use MongoDB. No problem! Just use Debezium, a platform that streams changes from a huge set of sources. As they say themselves:
Debezium is an open source distributed platform for change data capture. Start it up, point it at your databases, and your apps can start responding to all of the inserts, updates, and deletes that other apps commit to your databases. Debezium is durable and fast, so your apps can respond quickly and never miss an event, even when things go wrong. Source: debezium.io
There is a nice article explaining how to set up and run Debezium; take a look at it if you want to use this technology, and you can have this same architecture with other databases. Well, that's it. I hope you find this useful, and if you have ideas, doubts, suggestions, or just want to comment, I'm all ears.