Overview of a Messaging System

Gaurav Ingalkar
Apr 21, 2015

This blog is not about a specific message broker; it is a generic take on messaging systems. It explains the common terminology in an understandable way, irrespective of the vendor-specific message broker you use.

Broadly, there are three components involved in any messaging system, and they are referred to by a hundred different terms on the internet.

Message: A message can be represented as a data structure such as a string, array, number or object, or it can be binary data.
During transmission, the message is wrapped in a header-body structure. The body contains the actual data to be sent to the receiver, and the header contains metadata such as what type of message it is. A message can also carry commands, so that one application can invoke functionality in another.
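To make the header-body idea concrete, here is a minimal sketch in Python. The `Message` class, its header keys (`message_id`, `type`, `created_at`) and the JSON wire format are all assumptions made for illustration; real brokers define their own headers and encodings.

```python
from dataclasses import dataclass, field
import json
import time
import uuid


@dataclass
class Message:
    """A message split into a metadata header and a payload body (hypothetical layout)."""
    body: object
    headers: dict = field(default_factory=dict)

    def __post_init__(self):
        # Fill in common metadata only if the caller did not provide it.
        self.headers.setdefault("message_id", str(uuid.uuid4()))
        self.headers.setdefault("type", type(self.body).__name__)
        self.headers.setdefault("created_at", time.time())

    def to_wire(self) -> bytes:
        """Serialize header and body together for transmission."""
        return json.dumps({"headers": self.headers, "body": self.body}).encode("utf-8")

    @classmethod
    def from_wire(cls, raw: bytes) -> "Message":
        """Rebuild a message on the receiving side."""
        data = json.loads(raw.decode("utf-8"))
        return cls(body=data["body"], headers=data["headers"])


order = Message(body={"order_id": 42, "item": "book"}, headers={"type": "order.created"})
restored = Message.from_wire(order.to_wire())
```

The receiver can look at `restored.headers["type"]` to decide how to handle the body without having to inspect the body itself.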

Endpoints: Endpoints are known by various terms such as producer-consumer, publisher-subscriber or simply sender-receiver. They all mean roughly the same thing; different communities just name them differently (for example, the JMS publish/subscribe domain uses the publisher-subscriber terms for its endpoints).

Endpoints connect to the message channels. They are the real bridges between a channel and an application, and they are normally written by a programmer who knows both the application and the message broker. The producer is responsible for generating and sending the message; consumers receive it via the channel and process it.

Channel: This is the place where all your messages reside. Consumers pick up messages from the channels they are registered with. Channels are normally called queues or topics, depending on the design of your messaging system, but they all act as a mediator between the producers and the consumers of a message.
According to the JMS specification there are two types of message channels:

  1. point to point channel
  2. publish subscribe channel

Point-to-point

In the point-to-point pattern the message channels are referred to as QUEUEs. Most of the time there is one consumer registered for each queue, and a message placed on the queue is received and processed by only that one consumer. Even if multiple consumers are attached to the queue, the channel ensures that only one of them gets and processes any given message.

The following diagram depicts a single queue being used by one consumer.

Queues
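The "only one consumer gets each message" behaviour can be sketched with Python's standard `queue.Queue` standing in for the broker's queue. This is an in-process illustration under that assumption, not how any particular broker implements it; `run_point_to_point` is a hypothetical helper name.

```python
import queue
import threading


def run_point_to_point(messages, consumer_count=3):
    """Deliver each message to exactly one of several competing consumers."""
    channel = queue.Queue()
    received = {n: [] for n in range(consumer_count)}

    def consumer(consumer_id):
        while True:
            message = channel.get()
            if message is None:  # sentinel: no more work for this consumer
                channel.task_done()
                break
            received[consumer_id].append(message)
            channel.task_done()

    workers = [threading.Thread(target=consumer, args=(n,)) for n in range(consumer_count)]
    for worker in workers:
        worker.start()
    for message in messages:
        channel.put(message)
    for _ in workers:
        channel.put(None)  # one sentinel per consumer
    channel.join()
    for worker in workers:
        worker.join()
    return received


deliveries = run_point_to_point([f"msg-{i}" for i in range(10)])
```

Which consumer handles which message is not predictable, but every message appears in exactly one consumer's list.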

Publish-subscribe

In the publish-subscribe pattern the message channels are known as TOPICs. A topic is likely to have many observers waiting for a message to arrive. Once a message is placed on the topic, it is transmitted to all the subscribed consumers for further processing. The message is not considered successfully consumed until every subscribed consumer has received it. This lets you attach multiple consumers to a single topic and have each of them process the same message in its own way.

The following diagram depicts a topic being used by multiple consumers.

publish subscribe channel
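In contrast to a queue, a topic hands a copy of every message to every subscriber. A minimal in-process sketch, where the `Topic` class and the two example subscribers are hypothetical and delivery is a plain function call rather than a network hop:

```python
class Topic:
    """A toy topic: every published message goes to every subscriber."""

    def __init__(self, name):
        self.name = name
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        # Copy the list so a subscriber added during delivery does not affect this publish.
        for callback in list(self._subscribers):
            callback(message)
        return len(self._subscribers)


audit_log, emails = [], []
orders = Topic("orders")
orders.subscribe(audit_log.append)                             # one consumer records the event
orders.subscribe(lambda m: emails.append(f"email about {m}"))  # another sends a notification
delivered_to = orders.publish("order-42 created")
```

Both subscribers see the same message and each does something different with it.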

Messaging is two computers or programs communicating with each other in a reliable fashion.

A messaging system ensures that the message you want to send is not lost in the middle of transmission. A system that provides the resources needed to make messaging successful, such as data persistence, a transmission medium and recovery mechanisms, is referred to as a messaging system.

A messaging system integrates components such as the broker, message store and channels, and provides monitoring and management capabilities to the user. Many people consider the broker itself to be the messaging system, but it is just one part of it, though it is true that it plays a vital role.

When you try to connect to another computer, you expect both computers, their applications and the network to be up and running at the time a message is delivered. The real world is not always so reliable. When the sender wants to send a message, the receiver may simply not be in a state to receive it, or the network may be down while the message is in transit. (The sender and receiver applications can be web services, standalone programs or apps running on the computers.)

Not everything is under your control, and not everything will work the way you want it to. This is where the messaging system comes into the picture: it gives you an asynchronous way of communicating between two different systems. This is one of the most important reasons to use a messaging system for your applications.

general messaging

The above diagram depicts the steps that messaging performs to make a message transmission successful:

  1. Create : The producer creates the message and wraps its body into frames (a header-body structure).
  2. Transport : The producer puts the message onto the channel. The messaging system moves the message from the sender’s computer to the receiver’s computer.
  3. Receive : The receiver reads the message from the channel.
  4. Process : The receiver processes the message.
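The four steps can be sketched as four small functions. Everything here is an assumption for illustration: an in-process `queue.Queue` plays the channel, JSON plays the frame format, and the function names simply mirror the step names.

```python
import json
import queue

channel = queue.Queue()  # stands in for the broker's channel


def create(payload, message_type):
    """Step 1: wrap the body in a header-body structure."""
    return json.dumps({"header": {"type": message_type}, "body": payload})


def transport(message):
    """Step 2: the producer puts the message onto the channel."""
    channel.put(message)


def receive():
    """Step 3: the receiver reads the message from the channel."""
    return json.loads(channel.get(timeout=1))


def process(message):
    """Step 4: the receiver acts on the message."""
    return f"handled {message['header']['type']}: {message['body']}"


transport(create({"seat": "12A"}, "booking.request"))
result = process(receive())
```

Because the producer only talks to the channel, `transport` can return before anyone has called `receive`, which is what makes the exchange asynchronous.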

Messaging systems usually try to resend failed messages, which makes message transmission reliable. As per the above diagram, the producer writes the message to the application’s database and then puts it on the queue.

Networks are not reliable. Producers can store a message in a database before transmitting it over the network to the QUEUEs. Similarly, consumers can store the message in a database before any further processing or transmission happens. If the transmission fails, the messaging system simply retries it. This is called the store-and-forward pattern: the message is stored before it is forwarded to the next destination, so that it is available for later retries.

How does a messaging system know that a message has failed to transmit?
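A rough sketch of store and forward on the producer side follows. The `StoreAndForwardSender` class, the `outbox` table and the `flaky_send` function are hypothetical, and SQLite is used only as a convenient stand-in for the application's database. A real sender would also wait between attempts rather than retrying immediately.

```python
import sqlite3


class StoreAndForwardSender:
    """Store each message first, then forward it, retrying on failure."""

    def __init__(self, send, max_attempts=5):
        self._send = send
        self._max_attempts = max_attempts
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE outbox (id INTEGER PRIMARY KEY, body TEXT NOT NULL, "
            "sent INTEGER NOT NULL DEFAULT 0)"
        )

    def submit(self, body):
        # Store: the message is recorded before any network call is attempted.
        with self._db:
            cursor = self._db.execute("INSERT INTO outbox (body) VALUES (?)", (body,))
        return cursor.lastrowid

    def flush(self):
        # Forward: try every unsent message, retrying a bounded number of times.
        pending = self._db.execute(
            "SELECT id, body FROM outbox WHERE sent = 0 ORDER BY id"
        ).fetchall()
        delivered = 0
        for message_id, body in pending:
            for _attempt in range(self._max_attempts):
                try:
                    self._send(body)
                except ConnectionError:
                    continue  # transmission failed, try again
                with self._db:
                    self._db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (message_id,))
                delivered += 1
                break
        return delivered

    def pending_count(self):
        return self._db.execute("SELECT COUNT(*) FROM outbox WHERE sent = 0").fetchone()[0]


received = []
failures_left = [2]  # the simulated network fails twice, then recovers


def flaky_send(body):
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise ConnectionError("network down")
    received.append(body)


sender = StoreAndForwardSender(flaky_send)
sender.submit("booking-1")
delivered = sender.flush()
```

A message that exhausts its attempts stays in the outbox with `sent = 0`, so a later `flush` can pick it up again.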

Reliable message delivery is one of the guarantees that messaging systems provide. They do not guarantee when the message will be delivered, though. Expiry checks and acknowledgement mechanisms are how the system determines whether a message was delivered successfully. Every message sent across the network has to be acknowledged by the receiver, and the messaging system keeps resending the message until it has reached its destination. Absolute expiry timestamps can cause issues if the sender and receiver are in different time zones, so it is usually a good idea to set a relative time during which the message is considered useful, maybe something like ‘30 minutes after the creation of the message’. These expiry settings are usually configurable, but the right values depend entirely on the domain in which the messaging is used.

Imagine you are booking a flight ticket. Imagine getting an acknowledgement a week later because one of the booking servers was down.

In this case I probably want the status of my ticket in seconds, or minutes at most, which means the messaging system should not take more than a few minutes to reply with the booking status. Message expiry is like the expiry date on your day-to-day grocery items. Nobody should sell expired groceries, and if they expire in transit the buyer should not accept them. Similarly, if a producer finds that a message has expired, the message should not be put on the queue. If the message expires after transmission but before the consumer receives it, it should not be processed. Anyone who finds the message expired during transmission should mark it as invalid or expired. The messaging system can be configured to redirect expired or invalid messages to a dead letter queue.
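The expiry rule can be sketched with a relative time-to-live. The helper names (`make_message`, `is_expired`, `route`) and the message layout are hypothetical, and plain lists stand in for the work queue and the dead letter queue. Timestamps are seconds since the epoch, which do not depend on the local time zone, although clock differences between machines can still matter.

```python
import time


def make_message(body, ttl_seconds, now=None):
    """Create a message that is useful for ttl_seconds after its creation."""
    created = time.time() if now is None else now
    return {"body": body, "created_at": created, "ttl_seconds": ttl_seconds}


def is_expired(message, now=None):
    now = time.time() if now is None else now
    return now - message["created_at"] > message["ttl_seconds"]


def route(message, work_queue, dead_letter_queue, now=None):
    """Put live messages on the work queue and expired ones on the dead letter queue."""
    if is_expired(message, now):
        dead_letter_queue.append(message)
        return "dead-letter"
    work_queue.append(message)
    return "queued"


work_queue, dead_letter_queue = [], []
booking = make_message("book seat 12A", ttl_seconds=30 * 60, now=1_000.0)

# One minute after creation the message is still useful.
first = route(booking, work_queue, dead_letter_queue, now=1_000.0 + 60)
# Thirty-one minutes after creation it has expired.
second = route(booking, work_queue, dead_letter_queue, now=1_000.0 + 31 * 60)
```

The `now` parameter exists only so the example is deterministic; in normal use it would be left out and the current time used.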

With retry mechanisms in place, it is possible for a consumer to receive the same message multiple times, and there is no point in processing the same message again and again. Imagine the consumer received the message successfully but failed to acknowledge it. The producer will assume that the transmission failed, since no acknowledgement arrived, and will retransmit the message, so the consumer receives a duplicate. To avoid this, the consumer should be able to identify messages that have already been processed, so that duplicates are skipped rather than processed again. It’s good practice to make your message handling idempotent, so that processing a duplicate has no additional effect.
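One common way to recognise duplicates is to remember the IDs of messages that have already been handled. The sketch below assumes every message carries a unique ID; the `IdempotentConsumer` class is hypothetical, and it keeps the IDs in memory, whereas a real consumer would likely persist them.

```python
class IdempotentConsumer:
    """Process each message ID at most once, skipping redelivered duplicates."""

    def __init__(self, handler):
        self._handler = handler
        self._seen_ids = set()

    def on_message(self, message_id, body):
        if message_id in self._seen_ids:
            return False  # duplicate: already processed, nothing to do
        self._handler(body)
        # Record the ID only after the handler succeeded, so a failed
        # attempt can still be retried later.
        self._seen_ids.add(message_id)
        return True


charges = []
consumer = IdempotentConsumer(charges.append)
results = [
    consumer.on_message("m-1", "charge $100"),
    consumer.on_message("m-1", "charge $100"),  # retransmission of the same message
    consumer.on_message("m-2", "charge $5"),
]
```

The customer is charged once for `m-1` even though the message arrived twice.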

But how do your producers and consumers connect to a message channel? The message broker provides connectors for connecting to queues and topics. These connectors use a reliable protocol (such as TCP) internally to establish communication. Using the connector API you can create a connection and send your messages to the message channels. Similarly, consumers listen on such a connection to pick up messages from the channels.

When we say all the above steps are decoupled, we mean that if any step fails, for reasons such as network failures or unavailable applications, the message is not lost: it is delivered once the applications recover from their failures. But what if the messaging system itself fails? What happens to the messages on the channel then? Read on ☺

Message persistence

For better performance, a messaging system may keep your messages in an in-memory store by default. But if the messaging system itself shuts down, will the messages in this store be recoverable? The answer is no, unless you have configured your system to use a persistent store. Most of the time these are built-in file-based databases provided by the vendor; ActiveMQ, for example, offers KahaDB and LevelDB as file-based stores for message persistence. If reliability is the first item on your checklist, you have no choice but to use them. As shown in the above diagram, the system can also be configured with a relational database to store your messages, which makes them accessible through SQL, but remember that the more reliability you ask for, the more performance you give up. Message persistence is very vendor-specific and depends on the design decisions each vendor has made. Some messaging tools have little or no built-in persistence and are used mainly for fast delivery of messages, for example ZeroMQ.
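The difference a file-based store makes can be sketched with SQLite, used here purely as a stand-in for a broker's own store such as KahaDB. The `PersistentQueue` class and the `broker.db` file name are hypothetical. Closing the first queue object and opening a second one on the same file simulates the messaging system shutting down and restarting.

```python
import os
import sqlite3
import tempfile


class PersistentQueue:
    """A toy FIFO queue whose messages live in a file and survive a restart."""

    def __init__(self, path):
        self._db = sqlite3.connect(path)
        with self._db:
            self._db.execute(
                "CREATE TABLE IF NOT EXISTS messages "
                "(id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
            )

    def put(self, body):
        with self._db:  # commits the insert to disk
            self._db.execute("INSERT INTO messages (body) VALUES (?)", (body,))

    def get(self):
        """Remove and return the oldest message, or None if the queue is empty."""
        with self._db:
            row = self._db.execute(
                "SELECT id, body FROM messages ORDER BY id LIMIT 1"
            ).fetchone()
            if row is None:
                return None
            self._db.execute("DELETE FROM messages WHERE id = ?", (row[0],))
            return row[1]

    def close(self):
        self._db.close()


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "broker.db")

    before = PersistentQueue(path)
    before.put("booking-1")
    before.close()  # simulate the messaging system shutting down

    after = PersistentQueue(path)  # simulate a restart against the same file
    recovered = after.get()
    after.close()
```

Had the queue been created with `":memory:"` instead of a file path, the second object would have started empty. Every `put` here pays for a disk write, which is the performance cost of the added reliability.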

References

http://stackoverflow.com/questions/3015178/what-is-java-message-service-jms-for
http://docs.oracle.com/cd/E19340-01/820-6424/aerar/index.html
http://www.ibm.com/developerworks/java/tutorials/j-jms/j-jms-updated.html
