Synchronisation scenarios with RabbitMQ and Magnolia CMS

In the previous post I went through the need for an asynchronous, message-based publication or activation. In this post I will cover the other benefits this system brings in terms of backup/restore and re-synchronisation. A final post covering the Maven artifact, installation, and compliance with workflow and versioning will follow.

Basic activation with automatic re-synchronisation

The most obvious way is to configure a fanout exchange and bind queues to it. A fanout exchange simply duplicates each message to all bound queues. For instance, with three queues bound to one exchange, three public instances can be activated with it.

When the author publishes a message, it is duplicated across all bound queues, and the consumers on the public instances consume the activation messages.
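A minimal setup sketch using the RabbitMQ Java client; the exchange name magnolia.activation and the pub_* queue names are my assumptions, not something fixed by the module:

```java
import java.nio.charset.StandardCharsets;

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;

public class FanoutSetup {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // broker host is an assumption

        try (Connection connection = factory.newConnection();
             Channel channel = connection.createChannel()) {

            // Durable fanout exchange: every published message is copied
            // to all queues bound to it.
            channel.exchangeDeclare("magnolia.activation", "fanout", true);

            // One durable queue per public instance.
            for (String queue : new String[] {"pub_1", "pub_2", "pub_3"}) {
                channel.queueDeclare(queue, true, false, false, null);
                channel.queueBind(queue, "magnolia.activation", "");
            }

            // Publishing once delivers the activation message to all three queues.
            byte[] body = "<serialized JCR node>".getBytes(StandardCharsets.UTF_8);
            channel.basicPublish("magnolia.activation", "", null, body);
        }
    }
}
```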

Re-synchronisation after server crash

When acknowledgements are enabled for the consumer, any unacknowledged messages are re-queued when the consumer dies. This is especially useful for keeping the different instances in sync.
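A sketch of what the consumer on a public instance could look like, assuming a hypothetical importJcrNode method standing in for the actual repository import:

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;

public class PublicInstanceConsumer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost"); // assumption

        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();

        // Process one message at a time so redeliveries keep their order.
        channel.basicQos(1);

        DeliverCallback onDeliver = (consumerTag, delivery) -> {
            try {
                importJcrNode(delivery.getBody());
                // Ack only after a successful save; if the instance crashes
                // before this line, the broker re-queues the message.
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            } catch (Exception e) {
                // Negative-ack with requeue so the message is retried.
                channel.basicNack(delivery.getEnvelope().getDeliveryTag(), false, true);
            }
        };

        // autoAck = false: a delivery only counts as handled after basicAck.
        channel.basicConsume("pub_1", false, onDeliver, consumerTag -> {});
    }

    private static void importJcrNode(byte[] body) {
        // Placeholder for the actual JCR import logic.
    }
}
```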

Three nodes are sent from the author to the exchange:

[Figure: the little colored dots are messages containing JCR node info]

Pub1 consumes the first message but crashes before sending the acknowledgement. The message and the node it contains could not be saved.

[Figure: public 1 crashes before being able to consume the message]

The other public instances continue consuming and activating. When pub1 starts up again, the unacknowledged message is re-delivered in order. This ordering is very important, since the message could contain the parent node of the ones that follow; it is also why the consumer sketch above sets a prefetch count of one.

[Figure: public 1 restarts and gets all the messages it missed]

Adding new instances

Imagine we want to scale up and add new public instances. The following setup could be used. The basic idea is to always keep a spare queue, without a consumer, bound to the exchange.

The spare queue without a consumer will store all activation messages until a new consumer connects. The new instance will be created with the same initial data as pub1.
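Declaring such a spare queue is no different from declaring the others; a minimal sketch, reusing the exchange name assumed earlier:

```java
import java.io.IOException;

import com.rabbitmq.client.Channel;

public class SpareQueue {
    // Declares a durable queue bound to the fanout exchange. No consumer is
    // ever attached to it, so it simply buffers every activation message
    // until a future instance starts consuming from it.
    public static void declareSpare(Channel channel, String name) throws IOException {
        channel.queueDeclare(name, true, false, false, null);
        channel.queueBind(name, "magnolia.activation", "");
    }
}
```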

Pub2 is created with the same dataset as pub1. Before pub2 starts consuming, a new spare queue is created.

Once the remaining messages are consumed, a backup is created. This backup will serve as the initial state for new instances.

This is also the moment the public instance can be added to the load balancer's pool. Activation can continue, and the spare queue pub_3 fills with new messages representing the diff between pub2's state and the newest content. A sketch of this provisioning sequence follows.
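A hedged sketch of the provisioning order described above; the restoreBackup, startConsumer, backup and addToLoadBalancer helpers are hypothetical stand-ins for whatever tooling actually performs those steps:

```java
import com.rabbitmq.client.Channel;

public class ProvisionInstance {
    public static void provision(Channel channel, String spareQueue, String newSpareQueue) throws Exception {
        restoreBackup();                                  // 1. new instance starts from the last backup
        SpareQueue.declareSpare(channel, newSpareQueue);  // 2. create the next spare queue first,
                                                          //    so no message is missed in between
        startConsumer(spareQueue);                        // 3. drain the buffered diff from the old spare queue
        waitUntilDrained(channel, spareQueue);            // 4. once caught up...
        backup();                                         // 5. ...snapshot the new state for future instances
        addToLoadBalancer();                              // 6. and only now take live traffic
    }

    private static void restoreBackup() { /* restore the Magnolia repository from the backup */ }
    private static void startConsumer(String queue) { /* attach the ack-based consumer shown earlier */ }
    private static void waitUntilDrained(Channel channel, String queue) throws Exception {
        // messageCount returns the number of ready messages in the queue.
        while (channel.messageCount(queue) > 0) Thread.sleep(500);
    }
    private static void backup() { /* dump the repository */ }
    private static void addToLoadBalancer() { /* register with the LB pool */ }
}
```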

Standard backup’s

Regular backups are needed, since otherwise the spare queue becomes too big. Whenever you take a backup of the instances, the spare queue can be emptied, since new instances will be created from this dataset.
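Emptying the spare queue is a one-liner with the Java client; a minimal sketch:

```java
import java.io.IOException;

import com.rabbitmq.client.Channel;

public class BackupHousekeeping {
    // Once a fresh backup exists, the diff buffered in the spare queue is
    // obsolete: new instances will start from the backup instead.
    public static void purgeSpare(Channel channel, String spareQueue) throws IOException {
        channel.queuePurge(spareQueue); // drops all ready messages in the queue
    }
}
```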

Guaranteeing simultaneous delivery

This is probably the most interesting aspect of this kind of setup, since we need the instances to contain the same content at the same time. We need some way to guarantee, or at least to know, the state of each instance.

To be able to know the state of each instance, I added a sequence number and a send date to each message. When a public instance consumes the message, it increments its own sequence number and stores it, together with an insert date, in a place in the repository. The sequence number on the author instance and on each public instance should be the same. This mechanism, combined with RabbitMQ's acknowledgement protocol, should provide a fast way of guaranteeing state synchronicity.

[Figure: for each public instance we know the latest node and the timestamp at which it was inserted]
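A minimal sketch of the author-side publishing, assuming a "sequence" header and the standard AMQP timestamp property carry the counter and send date (the header name is my choice, not a fixed protocol):

```java
import java.io.IOException;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

import com.rabbitmq.client.AMQP;
import com.rabbitmq.client.Channel;

public class SequencedPublisher {
    private long sequence = 0; // monotonically increasing per author instance

    public void publish(Channel channel, byte[] nodePayload) throws IOException {
        Map<String, Object> headers = new HashMap<>();
        headers.put("sequence", ++sequence);

        AMQP.BasicProperties props = new AMQP.BasicProperties.Builder()
                .headers(headers)
                .timestamp(new Date()) // send date
                .build();

        channel.basicPublish("magnolia.activation", "", props, nodePayload);
    }
}
```

On the public side, the consumer would bump its own counter after a successful import and persist it with an insert date in the repository, so it can be compared against the author's value.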

What can I do with this info?

I created a REST service that provides the state of each instance. This allows you to monitor the whole stack and take an instance out of the load balancer pool if it lags too far behind the average of all sequence numbers. Alternatively, you could use it to dynamically update the weight of a server in the load-balancing algorithm: taking load away from a lagging instance lets it catch up faster, while the others, receiving more load, update more slowly.
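As an illustration, the decision behind "take it out of the pool" could be as simple as comparing each instance's reported sequence number against the average; a sketch with a hypothetical shouldEject helper:

```java
import java.util.Map;

public class LagMonitor {
    // Given the latest sequence number reported by each public instance,
    // decide whether one has fallen too far behind the average and should
    // be removed from the load balancer pool.
    public static boolean shouldEject(Map<String, Long> sequences, String instance, long maxLag) {
        double average = sequences.values().stream()
                .mapToLong(Long::longValue)
                .average()
                .orElse(0);
        return average - sequences.get(instance) > maxLag;
    }
}
```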

This mechanism would automatically balance the whole stack in terms of data synchronicity. But the least-connections algorithm of any good load balancer will eventually do much the same job…