RabbitMQ: Message Ordering with Multiple Consumers
In RabbitMQ, we can have multiple consumers listening to one queue. This is good for fast processing, but not so good for message ordering.
For the sake of simplicity, let's say we have a queue Q for invoices, with messages arriving in this order: A1..B1..A2..B2..A3..B3..etc, where A and B are invoice numbers, and 1/2/3 is the approval level.
In this case, the messages for each key (A / B) must be processed in order, e.g. A1 is processed before A2, and A2 before A3. Our application has two consumers listening to Q, say X1 and X2. We set the RabbitMQ prefetch count to 1, so each consumer can hold only one unprocessed message at a time. Suppose one message is published every second.
Now here comes the problem. X1 processes A1 in 10 seconds (that is, from second 1 to second 11). X2 is fast and needs only 0.1 second per message, which means X2 finishes A2 at second 3.1 and A3 at second 5.1. See the problem? When A3 is done, A1 is still in progress on X1. That work finishes at second 11, which means the invoice that was already updated to approval level 3 (A3) is now reverted to level 1 (A1).
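To make the timeline concrete, here is a minimal simulation sketch of the scenario above. It does not talk to RabbitMQ at all; the `simulate` function and its "give the message to whichever consumer is free first" rule are my own simplification of what prefetch count 1 does, and the timings are the ones from the example.

```python
def simulate(messages, durations):
    """Simulate consumers that each hold at most one message at a time.

    messages:  list of (arrival_second, invoice, level), in publish order
    durations: dict of consumer name -> seconds needed per message
    """
    free_at = {name: 0.0 for name in durations}
    completions = []
    for arrival, invoice, level in messages:
        # Hand the message to the consumer that can start it first
        # (ties are broken by consumer name).
        name = min(free_at, key=lambda n: (max(free_at[n], arrival), n))
        start = max(free_at[name], arrival)
        finish = start + durations[name]
        free_at[name] = finish
        completions.append((finish, name, invoice, level))

    # Apply the updates in the order they finish; the last write wins.
    completions.sort()
    state = {}
    for _, _, invoice, level in completions:
        state[invoice] = level
    return completions, state


messages = [(1, "A", 1), (2, "B", 1), (3, "A", 2),
            (4, "B", 2), (5, "A", 3), (6, "B", 3)]
completions, state = simulate(messages, {"X1": 10, "X2": 0.1})

for finish, name, invoice, level in completions:
    print(f"{name} finished {invoice}{level} at second {finish}")
print(state)  # invoice A ends at level 1, invoice B at level 3
```

The last completion is A1 on X1, so invoice A ends up at level 1 even though A3 was processed long before.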
Some solutions for this:
- Use Kafka. By default it guarantees order within each topic partition, and within a consumer group only one consumer reads from a given partition, so Kafka is born for this. Unfortunately, switching from RabbitMQ to Kafka is not that easy, especially if the code already runs on RabbitMQ. So use this option only if you are starting fresh and most of your messages must be processed in order.
- Use RabbitMQ with a consistent hash exchange. Also, the good news: since version 3.8, RabbitMQ has a Single Active Consumer option. When it is set on a queue, only one consumer at a time consumes from that queue, and RabbitMQ fails over to another registered consumer if the active one is cancelled or dies.
So, with option 2 and Single Active Consumer enabled, X1 and X2 can no longer consume from queue Q at the same time. Only one of them is active at any moment. This means we have guaranteed order.
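The Single Active Consumer behaviour can be pictured with a small in-memory model. This is only a sketch of the idea, not RabbitMQ code: the `SingleActiveQueue` class and its method names are hypothetical. On a real broker the feature is switched on by declaring the queue with the `x-single-active-consumer` argument set to true.

```python
from collections import deque


class SingleActiveQueue:
    """Toy model: many consumers may register, only the first one receives."""

    def __init__(self):
        self.messages = deque()
        self.consumers = []  # registration order; index 0 is the active one

    def register(self, name):
        self.consumers.append(name)

    def cancel(self, name):
        # If the active consumer goes away, the next registered one takes over.
        self.consumers.remove(name)

    def publish(self, message):
        self.messages.append(message)

    def deliver_next(self):
        """Return (active_consumer, message), or None if nothing can be delivered."""
        if not self.messages or not self.consumers:
            return None
        return self.consumers[0], self.messages.popleft()


q = SingleActiveQueue()
q.register("X1")
q.register("X2")
for message in ["A1", "A2", "A3"]:
    q.publish(message)

first = q.deliver_next()   # X1 is active, X2 only waits
q.cancel("X1")             # X1 dies or is cancelled
second = q.deliver_next()  # X2 takes over from where X1 stopped
print(first, second)
```

Messages still leave the queue strictly in publish order, because at any moment exactly one consumer is pulling from it.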
Happy? I suppose not (yet).
If we have only one consumer and it is slower than the publish rate, unprocessed messages will pile up. So we need another way to make sure of two things:
- Only one consumer per queue (use Single Active Consumer)
- Multiple consumers overall, without breaking point 1.
Well, meet the answer. We can achieve this by using a consistent hash exchange. A hash function is a mathematical function that converts an input value into a fixed-size value that is always the same for the same input (two different inputs can occasionally collide, but that does not matter here). So if we input some string, for example “A good day”, and run a hash function on it, it will consistently return this string (with MD5): 24CD6496C4660D1E62561C2BCF2030C2
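The "same input, same output" property is easy to check yourself. Below is a small sketch using Python's standard `hashlib`; the helper name `md5_hex` is mine, and the second input string is just an arbitrary different value for comparison.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as an uppercase hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest().upper()


first = md5_hex("A good day")
second = md5_hex("A good day")
other = md5_hex("A bad day")

print(first == second)  # same input, same hash, every time
print(first == other)   # a different input gives a different hash here
print(len(first))       # MD5 is 128 bits, i.e. 32 hex characters
```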
RabbitMQ has a plugin for a consistent hash exchange. Using that exchange, with one consumer per queue, we can keep message order with multiple consumers. The hash exchange distributes routing keys among queues, instead of distributing individual messages among queues. This means all messages with the same routing key go to the same queue. With a consistent hash exchange, we pick a unique identifier for the thing that must stay in order, and put that identifier in the routing key when publishing the message.
So, in the example above, the candidates are the invoice numbers A and B. When we send a message to the consistent hash exchange, RabbitMQ calculates the hash value of this identifier and sends the message to a fixed queue, based on that hash value.
For example, based on the hash value, all messages with identifier “A” will go to one queue, say Q.A, while all messages with identifier “B” will go to another, say Q.B.
So now consumer X1 (and only consumer X1) listens to Q.A, where it always processes invoice number “A”, and consumer X2 processes invoice number “B”, with guaranteed order, on the condition that only one consumer listens to each queue.
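To see how routing by hash keeps each identifier on one queue, here is a minimal hash ring sketch. It is an illustration of the general consistent hashing idea, not the plugin's actual implementation: the `HashRing` class, the use of MD5, the 100 points per queue and the queue names Q.1 / Q.2 are all assumptions of mine.

```python
import bisect
import hashlib


def _point(value: str) -> int:
    """Map a string to a position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, queues, points_per_queue=100):
        # Each queue owns several points on the ring, to spread keys more evenly.
        self._ring = sorted(
            (_point(f"{queue}#{i}"), queue)
            for queue in queues
            for i in range(points_per_queue)
        )
        self._points = [point for point, _ in self._ring]

    def route(self, routing_key: str) -> str:
        """Return the queue that owns the first point after the key's hash."""
        index = bisect.bisect_right(self._points, _point(routing_key))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["Q.1", "Q.2"])
queues = {"Q.1": [], "Q.2": []}
for message in ["A1", "B1", "A2", "B2", "A3", "B3"]:
    invoice = message[0]  # the invoice number is the routing key
    queues[ring.route(invoice)].append(message)

print(queues)
```

Whatever queue "A" hashes to, A1, A2 and A3 all land there, in publish order. Note that nothing forces "A" and "B" onto different queues; with only a few keys they may share one, which still keeps the order per invoice.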
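Going back to the slow and fast consumers from the start, here is a sketch of the same timings with one queue per invoice and one consumer per queue. Again this is a plain simulation, not broker code; `process_queue` is a hypothetical helper, and the split of messages into two queues is assumed to be what the hash exchange produced.

```python
def process_queue(messages, seconds_per_message):
    """One consumer handles its queue strictly one message after another.

    messages: list of (arrival_second, invoice, level), in queue order
    """
    free_at = 0.0
    completions = []
    for arrival, invoice, level in messages:
        start = max(free_at, arrival)
        free_at = start + seconds_per_message
        completions.append((free_at, invoice, level))
    return completions


queue_a = [(1, "A", 1), (3, "A", 2), (5, "A", 3)]  # consumed only by slow X1
queue_b = [(2, "B", 1), (4, "B", 2), (6, "B", 3)]  # consumed only by fast X2

done = sorted(process_queue(queue_a, 10) + process_queue(queue_b, 0.1))
state = {}
for _, invoice, level in done:
    state[invoice] = level  # last write wins

a_order = [level for _, invoice, level in done if invoice == "A"]
print(a_order)  # [1, 2, 3]
print(state)    # both invoices end at level 3
```

X1 is still slow, so invoice A takes a long time to reach level 3, but it is never reverted, and X2 keeps working on invoice B in the meantime.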
For more detail (including complete source code in Java Spring), see here.