BigData & Web3. RabbitMQ High Availability and High Load. Part 4

Dmytro Nasyrov
Pharos Production
4 min read · Aug 3, 2024

Part 1 of the article is here

Part 2 of the article is here

Part 3 of the article is here

Policy-HA

Let’s go over the policy arguments that affect how queues behave in the cluster. First, ha-mode specifies how queue replicas are assigned to nodes:

exactly: replicate to a fixed number of nodes, given as a number in the ha-params argument. The master also counts as a replica, so 1 means zero replicas, 2 means one replica, and so on. This is the most commonly recommended value;

all: replicate to every node in the cluster;

nodes: replicate to an explicit list of nodes (the node names are given in the ha-params argument).

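ha-sync-mode controls how a newly added replica is synchronized with the master:

manual: a new replica receives only new messages. It becomes fully synchronized once the older messages have been consumed (that is, once the old backlog of the main queue has drained) or when synchronization is started explicitly. This is the default value.

automatic: a new replica is synchronized automatically as soon as it joins. There are cases when this can shoot you in the foot (the queue is unresponsive while synchronization is running, which hurts with large queues), but in most cases it is the better choice.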
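To make the relationship between the three arguments concrete, here is a minimal Python sketch that assembles and sanity-checks a policy definition. The helper name build_ha_policy and its validation rules are assumptions made for illustration, not part of RabbitMQ; only the keys ha-mode, ha-params and ha-sync-mode and their values come from the text above.

```python
import json

VALID_SYNC_MODES = {"manual", "automatic"}


def build_ha_policy(ha_mode, ha_params=None, ha_sync_mode="manual"):
    """Return a mirrored-queue policy definition as a plain dict.

    Hypothetical helper: it only mirrors the rules described in the article.
    """
    if ha_sync_mode not in VALID_SYNC_MODES:
        raise ValueError(f"unknown ha-sync-mode: {ha_sync_mode!r}")

    definition = {"ha-mode": ha_mode, "ha-sync-mode": ha_sync_mode}

    if ha_mode == "exactly":
        # ha-params is a count that includes the master itself.
        if not isinstance(ha_params, int) or ha_params < 1:
            raise ValueError("ha-mode 'exactly' needs a positive integer ha-params")
        definition["ha-params"] = ha_params
    elif ha_mode == "nodes":
        # ha-params is a non-empty list of node names.
        if not ha_params or not all(isinstance(n, str) for n in ha_params):
            raise ValueError("ha-mode 'nodes' needs a list of node names in ha-params")
        definition["ha-params"] = list(ha_params)
    elif ha_mode == "all":
        # ha-params is not used when mirroring to every node.
        if ha_params is not None:
            raise ValueError("ha-mode 'all' does not take ha-params")
    else:
        raise ValueError(f"unknown ha-mode: {ha_mode!r}")

    return definition


# One master plus one replica, synchronized automatically.
policy = build_ha_policy("exactly", 2, "automatic")
policy_json = json.dumps(policy)
print(policy_json)
```

The resulting JSON string is the kind of definition a policy expects, whether it is entered in the web interface or passed on the command line.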

Logical diagram

Here is the cluster again, this time viewed from the side of the queues. The master queue is created on the node the channel was connected to at the time of declaration (which node that is depends on the balancing settings). If you want to create a master on a specific node explicitly, you can connect to the web interface of that node and create the queue from there. For example, we created 4 queues: queues 1 and 4 on the first node, queue 2 on the second node, and queue 3 on the third node. Until we apply a policy, there is no replication and no fault tolerance.

Add policy:

ha-mode: exactly

ha-sync-mode: automatic

ha-params will be different for different queues:

  • for the first and second queues — ha-params: 3 — as we can see, two replicas have appeared (three nodes are busy servicing this queue);
  • for the 3rd queue ha-params: 2 — this means that only one replica is added;
  • for the 4th ha-params: 1, which means that there will be no replication.
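The arithmetic behind these three settings can be sketched in a few lines of Python. The queue names (queue1 to queue4), the policy names and the name patterns are hypothetical and chosen only for this example; the command line follows the documented rabbitmqctl set_policy form (name, pattern, definition), and the script only prints the commands, it does not run them.

```python
import json
import shlex

# ha-params per queue, as in the example above (queue names are made up).
HA_PARAMS = {"queue1": 3, "queue2": 3, "queue3": 2, "queue4": 1}


def replicas_for(ha_params):
    """With ha-mode 'exactly' the master is counted, so replicas = ha-params - 1."""
    return ha_params - 1


def set_policy_command(queue_name, ha_params):
    """Build (but do not execute) a rabbitmqctl set_policy command."""
    definition = {
        "ha-mode": "exactly",
        "ha-params": ha_params,
        "ha-sync-mode": "automatic",
    }
    return [
        "rabbitmqctl", "set_policy", "--apply-to", "queues",
        f"ha-{queue_name}",      # policy name (hypothetical)
        f"^{queue_name}$",       # pattern matching exactly this queue
        json.dumps(definition),
    ]


replica_counts = {name: replicas_for(n) for name, n in HA_PARAMS.items()}
commands = {name: set_policy_command(name, n) for name, n in HA_PARAMS.items()}

for name in HA_PARAMS:
    print(f"{name}: {replica_counts[name]} replica(s)")
    print("  " + shlex.join(commands[name]))
```

In practice a single policy with a broader pattern usually covers a whole group of queues; one policy per queue is used here only to mirror the example.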

And then let’s break everything. What will happen if, for example, the 1st node fails?

For the first queue, the election of a new master is initiated; let's say node 2 is chosen. The 4th queue, predictably, is out of luck: it has no replicas, so it becomes unavailable:

Then the 3rd node fails

For the third queue, node 2 is likewise elected as the new master. All three remaining queues now work on the second node. Let’s say that we are lucky and node 1 comes back to life by itself

Queues 1, 2 and 3 will be replicated from their masters on node 2, so node 1 now holds replicas of them. Queue 4 will come back to life and remain the master on node 1

After synchronization, the first two nodes return to normal operation. Now suppose we also manage to bring the 3rd node back up, but its state was (for example) completely lost

Replication will start only for queues 1 and 2, since the third queue has already reached its required number of replicas

And here is the result of two node shutdowns: the masters of queues 1, 2 and 3 are now all running on node 2. This will not change by itself unless you explicitly move a master to another node or node 2 dies (sometimes the easiest way to rebalance is to reboot the node that survived and collected all the masters).
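The whole failure scenario can be replayed with a small toy model in Python. This is a sketch under loud assumptions, not RabbitMQ's actual algorithm: the promotion rule (lowest-numbered surviving replica) and the top-up order are invented for the example, and the initial replica placement is taken from the walkthrough above.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Queue:
    name: str
    target: int                       # ha-params for ha-mode "exactly" (master included)
    master: Optional[int]             # node hosting the master, None while unavailable
    replicas: Set[int] = field(default_factory=set)
    home: Optional[int] = None        # node that held the master when the queue went down


def top_up(queue, live):
    """Add replicas on free live nodes until `target` copies exist (toy rule)."""
    if queue.master is None:
        return
    for node in sorted(live):
        if 1 + len(queue.replicas) >= queue.target:
            break
        if node != queue.master and node not in queue.replicas:
            queue.replicas.add(node)


def fail(node, queues, live):
    live.discard(node)
    for q in queues:
        q.replicas.discard(node)
        if q.master == node:
            if q.replicas:
                # Assumption: promote the lowest-numbered surviving replica.
                q.master = min(q.replicas)
                q.replicas.discard(q.master)
            else:
                # No replica left: the queue is unavailable until its node returns.
                q.master, q.home = None, node
        top_up(q, live)


def recover(node, queues, live):
    live.add(node)
    for q in queues:
        if q.master is None and q.home == node:
            # The queue comes back on the node it lived on (its state survived).
            q.master, q.home = node, None
        top_up(q, live)


live = {1, 2, 3}
queues = [
    Queue("queue1", 3, 1, {2, 3}),
    Queue("queue2", 3, 2, {1, 3}),
    Queue("queue3", 2, 3, {2}),
    Queue("queue4", 1, 1),
]

fail(1, queues, live)      # node 1 fails
fail(3, queues, live)      # node 3 fails
recover(1, queues, live)   # node 1 comes back by itself
recover(3, queues, live)   # node 3 is relaunched with empty state

masters = {q.name: q.master for q in queues}
print(masters)
```

After the four steps the model ends with the masters of queue1, queue2 and queue3 on node 2 and queue4 on node 1, which is the skew described above.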

You can say Hi to us at Pharos Production — a software development company

https://pharosproduction.com

Follow our product Ludo — the reputational system of the Web3 world

https://ludo.com


Dmytro Nasyrov
Pharos Production

We build high-load software. Pharos Production founder and CTO.