Message Brokers Short Note

Suraiya

4 min read · Oct 10, 2016

Why message brokers?

To get loose coupling and asynchronous, non-blocking communication between producer and consumer. The producer can, for example, be written in one language and the consumer in a different one.
Also, if the consumer cannot process messages as fast as the producer is producing them, the messages can be buffered in a queue.
By increasing the number of worker processes we can speed up processing.
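The buffering and worker points above can be sketched in-process with Python's standard library. This is only an illustration: a `queue.Queue` stands in for the broker, threads stand in for worker processes, and the `run` function and its parameters are names I made up for the sketch.

```python
import queue
import threading
import time


def run(num_workers, num_messages=20, work_seconds=0.01):
    """Process num_messages with num_workers threads; return (results, elapsed)."""
    q = queue.Queue()              # unbounded buffer between producer and workers
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            msg = q.get()
            if msg is None:        # sentinel: no more work for this worker
                break
            time.sleep(work_seconds)   # stand-in for slow message processing
            with lock:
                results.append(msg)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for i in range(num_messages):  # the producer never waits for the consumers
        q.put(i)
    for _ in threads:              # one sentinel per worker
        q.put(None)
    for t in threads:
        t.join()
    return sorted(results), time.perf_counter() - start
```

Calling `run(1)` and then `run(4)` processes the same messages, and the second call should finish noticeably sooner because four workers drain the queue in parallel.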

So messages would be dropped when the queue(s) is/are full (assuming there is no issue in the communication network).
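A minimal sketch of that overflow case, assuming a bounded in-memory queue and a producer that does not wait for space. The `publish_all` helper is hypothetical; a real broker may block, reject or page to disk instead of dropping.

```python
import queue


def publish_all(messages, capacity):
    """Try to enqueue every message without blocking; return (accepted, dropped)."""
    buffer = queue.Queue(maxsize=capacity)   # capacity must be > 0 to be bounded
    accepted, dropped = [], []
    for msg in messages:
        try:
            buffer.put_nowait(msg)           # raises queue.Full when no slot is free
            accepted.append(msg)
        except queue.Full:
            dropped.append(msg)              # nobody is consuming, so this one is lost
    return accepted, dropped
```

With no consumer draining the buffer, `publish_all(range(5), 3)` accepts the first three messages and drops the remaining two.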

By using message brokers we are basically trying to make the communication layer scalable.

Message bodies can be in different formats (binary, XML, JSON etc.).

Notes on some of the different message brokers:

Kafka

Guarantees at-least-once delivery, so in some cases the same message may appear more than once.
Uses disk for persistent storage, which slows the system down a bit, but it can still handle 100k+ requests/sec.
Performance is good, but the code that ensures cluster-level high availability is not stable enough [2]. When one virtual machine (VM) in a cluster (of VMs) goes down, a replacement VM can be brought up, but the state management portion of the code is not robust enough to take full advantage of the replacement VM seamlessly.
Has the concept of topic.
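Because at-least-once delivery can hand the same message to a consumer twice, the consumer usually has to be idempotent. Here is a minimal sketch, assuming every message carries a unique id chosen by the producer. The class and field names are hypothetical and this is not Kafka's API.

```python
class IdempotentConsumer:
    """Applies each message once, even if the broker delivers it again."""

    def __init__(self):
        self.seen = set()     # ids already processed (would need to be persistent in practice)
        self.total = 0        # example of state changed by processing

    def handle(self, message_id, amount):
        """Return True if the message was applied, False if it was a duplicate."""
        if message_id in self.seen:
            return False      # redelivery: ignore it
        self.seen.add(message_id)
        self.total += amount
        return True
```

Feeding it `("m1", 10)`, `("m2", 5)` and then `("m1", 10)` again leaves `total` at 15, because the repeated `m1` is ignored.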

ZooKeeper is used with Kafka for various kinds of state management.

I have not used Kafka.

RabbitMQ

Guarantees delivery of messages, but they may arrive out of order.
Can ensure high availability, but with lower throughput per broker, around 20,000 messages/sec.
Can be durable or transient (queues and messages).
Consumers acknowledge received messages (depending on the settings, though); if there is no acknowledgement then the messages are re-queued.
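The acknowledge-or-requeue behaviour can be pictured with a toy in-memory queue. This is a sketch of the idea only, not RabbitMQ's client API; `AckQueue` and its methods are names invented for the illustration.

```python
from collections import deque


class AckQueue:
    """Toy queue: a delivered message stays 'unacked' until the consumer confirms it."""

    def __init__(self):
        self.ready = deque()   # messages waiting for delivery
        self.unacked = {}      # delivery tag -> message handed out but not confirmed
        self.next_tag = 1

    def publish(self, body):
        self.ready.append(body)

    def get(self):
        """Deliver the next message as (tag, body), or None if nothing is ready."""
        if not self.ready:
            return None
        body = self.ready.popleft()
        tag = self.next_tag
        self.next_tag += 1
        self.unacked[tag] = body
        return tag, body

    def ack(self, tag):
        del self.unacked[tag]  # consumer finished; the message can be forgotten

    def requeue_unacked(self):
        """Simulate a consumer disappearing: put unconfirmed messages back in front."""
        for tag in sorted(self.unacked, reverse=True):
            self.ready.appendleft(self.unacked.pop(tag))
```

If a consumer takes "a" and "b", acknowledges only "a" and then goes away, `requeue_unacked()` makes "b" available again for the next consumer.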

It is based on the AMQP protocol.

Any broker server that implements AMQP should be able to interoperate with the others (which is lacking across the wide variety of message broker servers available). IBM MQ, ZeroMQ, MSMQ and ActiveMQ are each built around their own native protocol (without relying on a common standardized one) and cannot interoperate.

AMQP works on top of TCP. Using AMQP we create lightweight connections, called channels, on top of a heavyweight TCP connection. Many channels can share a single TCP connection; the upper bound comes from the channel number field (16 bits in AMQP 0-9-1) and from whatever maximum the client and broker negotiate. The RabbitMQ channel concept sits at the communication level, but it is very similar to the lightweight thread (goroutine) model in Go: the same idea at a different layer, one dealing with I/O and the other with CPU.
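To picture channels sharing one connection, here is a sketch with a deliberately simplified frame layout: a 16-bit channel id, a 32-bit payload length, then the payload. This is not the actual AMQP wire format, and the function names are made up; it only shows how frames from several channels can be interleaved on one byte stream and separated again on the other side.

```python
import struct
from collections import defaultdict

# Simplified header: channel id (unsigned 16-bit), payload size (unsigned 32-bit)
HEADER = struct.Struct(">HI")


def encode_frame(channel_id, payload):
    """Prefix the payload with its channel id and length."""
    return HEADER.pack(channel_id, len(payload)) + payload


def demultiplex(stream):
    """Split one byte stream back into per-channel lists of payloads."""
    channels = defaultdict(list)
    offset = 0
    while offset < len(stream):
        channel_id, size = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        channels[channel_id].append(stream[offset:offset + size])
        offset += size
    return dict(channels)
```

Two logical channels can write frames in any interleaving; the receiver only needs the channel id in each frame to keep the conversations apart.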

Performance can be 10 times slower for durable/persistent messaging as compared to non-persistent messaging. [3] proposes using separate RabbitMQ clusters, for example one cluster for non-persistent messages and another cluster (with load balancing) for persistent messages, to make sure that persistent messaging does not affect the performance of non-persistent messaging.

I used RabbitMQ as part of a system performance improvement experiment and for (loosely) integrating different applications at UsedVictoria. To understand the concept and the benefit of using it, at that time I relied on the content/documentation from the RabbitMQ and Celery sites (original).

In future I shall elaborate on the following:

Exchange: message entry point.

Different modes of routing: direct, fanout, topic, headers.

Queue: to hold messages.

Binding: rule-driven connection between exchange and queue.

Routing Keys versus Binding Keys.

Publisher confirms (asynchronous) versus transactions (synchronous/blocking I/O).

Erlang Cookie.

vhost.

Message body/payload.
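Until I write those up, here is a rough sketch of how exchange, binding and routing key fit together for three of the routing modes. It is a simulation in plain Python, not RabbitMQ's API: `route` and `topic_matches` are hypothetical helpers, and headers routing is left out. The topic rules assumed here are that `*` matches exactly one dot-separated word and `#` matches zero or more words.

```python
def topic_matches(binding_key, routing_key):
    """Return True if routing_key matches the topic pattern in binding_key."""
    def match(pattern, words):
        if not pattern:
            return not words                 # both exhausted -> match
        head, rest = pattern[0], pattern[1:]
        if head == "#":                      # zero or more words
            return any(match(rest, words[i:]) for i in range(len(words) + 1))
        if not words:
            return False
        return (head == "*" or head == words[0]) and match(rest, words[1:])

    return match(binding_key.split("."), routing_key.split("."))


def route(exchange_type, bindings, routing_key):
    """bindings is a list of (queue_name, binding_key); return the target queues."""
    if exchange_type == "fanout":            # ignores keys, copies to every bound queue
        return [q for q, _ in bindings]
    if exchange_type == "direct":            # exact match of routing key and binding key
        return [q for q, key in bindings if key == routing_key]
    if exchange_type == "topic":             # pattern match
        return [q for q, key in bindings if topic_matches(key, routing_key)]
    raise ValueError("unsupported exchange type: " + exchange_type)
```

The routing key travels with the message, while the binding key belongs to the binding between exchange and queue; the exchange type decides how the two are compared.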

ZeroMQ

No guaranteed delivery of messages.
Can ensure high availability.
Does not have the concept of topic. The prefix of the otherwise opaque messages can be checked, and a worker can filter messages based on that.
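Prefix filtering of opaque messages amounts to a byte comparison at the start of each message. This is a sketch in plain Python rather than the ZeroMQ library itself; `filter_by_prefix` and the sample messages are made up.

```python
def filter_by_prefix(messages, prefixes):
    """Keep only the messages (bytes) that start with one of the given prefixes."""
    wanted = tuple(prefixes)                 # bytes.startswith accepts a tuple
    return [m for m in messages if m.startswith(wanted)]
```

For example, a worker interested only in `b"weather."` would keep `b"weather.london 12C"` and skip `b"sports.nba 101-99"`; the broker-less library never needs to understand the rest of the message.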

I have never used it.

MSMQ

Message queue from Microsoft. By default this service is off; you have to turn it on before you can use it.

We used this one at Isolation Network in the “Rights” ingestion system for digital content as well as for notifications. (From my memory) it has one serious limitation: the messages are saved in kernel space, and because of this the message size limit is much less flexible than the limits of other queues available in the market. Microsoft has (as per the relevant discussion groups) no plan to change it any time soon. Using memory-mapped files may be a solution when dealing with big messages, but it may be even better to use another message queue system.

Part of our email notification (on application error) at Isolation Network used the message queue to accumulate those messages and send them later. Due to the message size limit we faced problems, but the good thing is that the effect was confined to the email handler or the application producing the messages. Other apps were unaffected, that is, isolated.

ActiveMQ — (yet to add note on features)

Used it a few years ago at Eightfold Logic. It is written in Java. As usual, we can use any language to write the client. We used a STOMP client in Perl.

[1] When we are using polyglot persistence (using different databases depending on what best suits the type of tasks/queries we have at hand), we have to copy the same data in different ways and in different models (in other words, we have denormalization). To sync the different databases we can use RabbitMQ. How? When an action in our system triggers a change in the persistent data, we can generate a message for RabbitMQ such that the message is routed to multiple queues; the worker processes/consumers subscribing to those queues can pick those messages up, handle them, and save the data in the manner needed for the corresponding database.
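The sync idea above can be sketched as follows. Everything here is a stand-in: two `deque` objects play the queues, a dict plays a relational table, another dict plays a search index, and the function names and the sample record are invented for the illustration. With a real broker the copying to both queues would be done by the exchange.

```python
import json
from collections import deque


def publish_change(queues, change):
    """Fan the same change message out to every queue (one per database)."""
    body = json.dumps(change)
    for q in queues.values():
        q.append(body)


def drain(q, save):
    """Worker loop: take each message and store it in the shape its database needs."""
    while q:
        save(json.loads(q.popleft()))


def demo():
    queues = {"relational": deque(), "search": deque()}
    rows = {}     # stand-in for a table keyed by id
    index = {}    # stand-in for a search index: word -> set of ids

    def save_row(change):
        rows[change["id"]] = change["title"]

    def save_index(change):
        for word in change["title"].lower().split():
            index.setdefault(word, set()).add(change["id"])

    publish_change(queues, {"id": 1, "title": "Red Bike"})
    drain(queues["relational"], save_row)
    drain(queues["search"], save_index)
    return rows, index
```

Each worker sees the same change but writes it in its own model, so the two stores stay in step without the producer knowing about either of them.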

Further Reading:

[1] https://blogs.vmware.com/vfabric/2013/01/messaging-architecture-using-rabbitmq-at-the-worlds-8th-largest-retailer.html

[2] https://tomasz.janczuk.org/2015/09/from-kafka-to-zeromq-for-log-aggregation.html

[3] The book “RabbitMQ in Action”

[4] The book “Instant RabbitMQ Messaging Application Development How-to” for anyone trying to get a quick overview. It does not cover areas like clustering and high availability.
