Design Messaging Queue

Sep 12, 2020

What is a Messaging Queue?

A message queue is a form of asynchronous service-to-service communication used in serverless and microservices architectures. Messages are stored on the queue until they are processed and deleted.

Messaging Queue is a combination of two words, Message and Queue, where:

Message is the data to be sent from the producer to the consumer.

Queue contains a sequence of messages, sent between applications, awaiting their turn to be processed. Messages placed onto the queue are stored until consumers retrieve them.

Let’s try to design a messaging queue from scratch and see how we can scale up a small messaging queue problem to handle thousands of requests per second.

Consider the small example of a barber shop.

Assumptions:

Barber is a Consumer.
Entry gate of a shop is a Producer.
People are Messages/Records/Packets.
The size of the queue is the size of the waiting list in the barber shop.

LEVEL 1

There is a person who owns a Barber shop and he himself is a barber.

Configurations:

There is one Shop.
There is one Barber, i.e., one Consumer.
There is one Entry Gate, i.e., one Producer.
There are 4 waiting chairs, i.e., the queue size is 4.

Implementation:

To implement the above solution, we can use an Array data structure of fixed size, i.e., the size of the waiting list.
The consumer runs in a while loop, consuming the messages one by one.
The producer produces the messages one by one. In other words, people enter the shop through the entry gate.
If the waiting list is full, we can throw an exception to the producer to signal that no one can enter right now.
If the waiting list is empty, we can throw an exception to the consumer to signal that there is nothing left to consume.
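The steps above can be sketched in Python as a fixed-size array used as a circular buffer. This is only a minimal, single-threaded illustration; the names `FixedSizeQueue`, `QueueFullError` and `QueueEmptyError` are made up for this example and do not come from any library.

```python
class QueueFullError(Exception):
    """Raised to the producer when every waiting chair is taken."""


class QueueEmptyError(Exception):
    """Raised to the consumer when nobody is waiting."""


class FixedSizeQueue:
    """A fixed-size array used as a circular FIFO buffer (hypothetical name)."""

    def __init__(self, capacity):
        self._slots = [None] * capacity  # the fixed-size array
        self._capacity = capacity
        self._head = 0                   # index of the next message to consume
        self._size = 0                   # number of messages currently waiting

    def produce(self, message):
        if self._size == self._capacity:
            raise QueueFullError("waiting list is full")
        tail = (self._head + self._size) % self._capacity
        self._slots[tail] = message
        self._size += 1

    def consume(self):
        if self._size == 0:
            raise QueueEmptyError("waiting list is empty")
        message = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % self._capacity
        self._size -= 1
        return message

    def __len__(self):
        return self._size


# One entry gate lets three people in; one barber serves them in order.
shop = FixedSizeQueue(capacity=4)
for person in ["person-1", "person-2", "person-3"]:
    shop.produce(person)

while True:
    try:
        print("serving", shop.consume())
    except QueueEmptyError:
        break  # nobody left in the waiting area
```

A real consumer would usually wait and retry instead of stopping when the queue is empty; the loop ends here only to keep the example finite.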

LEVEL 2

The owner of the shop wants to expand. He hires more barbers, for a total of five, and opens one more entry gate.

Configurations:

There is one Barber Shop.
There are five Barbers, i.e., five consumers.
There are two entry gates, i.e., two producers.
Waiting list has increased to 20, i.e., queue size is 20.

Implementation:

As there are multiple consumers and we want barbers to work in parallel, we will have to introduce multi-threading here.
Multiple threads will be running for Consumers to consume.
Two threads will be running for Producers to produce.
We cannot directly use the Array data structure, because:

We do not want the same person to go to multiple barbers.
We do not want multiple barbers to pick the same person.

We need something synchronised here, i.e., something thread-safe.
We can either make the implementation thread-safe by using locks (mutexes, monitors, semaphores, etc.) or use a synchronised data structure for storage.
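As a rough sketch of the second option, Python's standard library `queue.Queue` is a synchronised data structure that handles the locking internally. The function `run_shop` and its parameters are names invented for this example. Note one difference from Level 1: `put` here blocks the producer while the waiting list is full, instead of throwing an exception.

```python
import queue
import threading


def run_shop(people_per_gate=10, gates=2, barbers=5, waiting_chairs=20):
    """Run producers (gates) and consumers (barbers) against one shared queue."""
    waiting = queue.Queue(maxsize=waiting_chairs)  # thread-safe, bounded
    served = []                                    # (barber_id, person) pairs
    served_lock = threading.Lock()                 # protects the shared list
    stop = object()                                # sentinel: "shop is closed"

    def gate(gate_id):
        for n in range(people_per_gate):
            # put() blocks while all waiting chairs are taken
            waiting.put(f"gate{gate_id}-person{n}")

    def barber(barber_id):
        while True:
            person = waiting.get()  # each message is handed to exactly one barber
            if person is stop:
                break
            with served_lock:
                served.append((barber_id, person))

    barber_threads = [threading.Thread(target=barber, args=(i,)) for i in range(barbers)]
    gate_threads = [threading.Thread(target=gate, args=(i,)) for i in range(gates)]
    for t in barber_threads + gate_threads:
        t.start()
    for t in gate_threads:
        t.join()
    for _ in barber_threads:
        waiting.put(stop)  # one sentinel per barber so every thread exits
    for t in barber_threads:
        t.join()
    return served


served = run_shop()
people = [person for _, person in served]
print(len(people), "people served,", len(set(people)), "distinct")
```

Because `get` removes a message under the queue's own lock, no person is ever picked by two barbers, which is exactly the guarantee the plain array could not give.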

Learning:

Here we see that either the person can go to the Barber or a Barber can pick the person.

In a messaging queue, there are two types of delivery mechanism:

Push-based mechanism: the queue pushes each message to a consumer.
Poll-based mechanism: a consumer polls the queue for messages.
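The difference between the two mechanisms can be illustrated with a small single-threaded sketch. `PollQueue` and `PushQueue` are hypothetical names, and handing messages to subscribers in round-robin order is just one assumption made for the example; real brokers offer other dispatch policies.

```python
from collections import deque


class PollQueue:
    """Poll-based: messages wait in the queue until a consumer asks for one."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def poll(self):
        # Returns the oldest message, or None when nothing is waiting.
        return self._messages.popleft() if self._messages else None


class PushQueue:
    """Push-based: the queue calls a consumer as soon as a message arrives."""

    def __init__(self):
        self._subscribers = []
        self._next = 0

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        if not self._subscribers:
            raise RuntimeError("no consumer is subscribed")
        # Assumption for this sketch: rotate through consumers (round-robin).
        callback = self._subscribers[self._next % len(self._subscribers)]
        self._next += 1
        callback(message)


# Poll: the barber picks the next person.
poll_queue = PollQueue()
poll_queue.publish("person-1")
print("barber polled:", poll_queue.poll())

# Push: the person is sent to a barber.
push_queue = PushQueue()
push_queue.subscribe(lambda person: print("barber A got", person))
push_queue.subscribe(lambda person: print("barber B got", person))
push_queue.publish("person-2")
push_queue.publish("person-3")
```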

As we scale up, multithreading is required for fast processing.
A synchronised data structure is required so that the same message is not processed more than once.

LEVEL 3

The owner wants to expand further. He wants to open multiple branches within the city.

Configurations:

There are four barber shops.
Each shop has multiple barbers, i.e., multiple consumers.
Each shop has one or more producers.
Waiting list size is in hundreds.

Implementation:

As there are multiple shops, multiple servers are required to handle the traffic.
We need something to divide the traffic among the shops. We need to introduce a Load Balancer here.

Load Balancer: It is used to divide the traffic, i.e., to balance the load on the basis of some algorithm or a strategy defined by us.

For example, we can define the strategy that each person goes to the shop nearest to where they live.
If one of the shops shuts down for a day because of an issue such as a power cut, we do not want to lose its customers. To handle this situation, we need to introduce data replication.
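The nearest-shop strategy, together with skipping a shop that is shut, can be sketched as follows. `NearestShopBalancer`, the shop names and the flat (x, y) coordinates are all assumptions made for illustration; a production load balancer would rely on health checks and real network or geographic distance.

```python
import math


class NearestShopBalancer:
    """Routes each customer to the nearest shop that is currently open."""

    def __init__(self, shops):
        self._shops = dict(shops)  # shop name -> (x, y) location
        self._down = set()         # shops that are currently shut

    def mark_down(self, name):
        self._down.add(name)

    def mark_up(self, name):
        self._down.discard(name)

    def route(self, customer_location):
        open_shops = [name for name in self._shops if name not in self._down]
        if not open_shops:
            raise RuntimeError("no shop is open")
        # Pick the open shop with the smallest straight-line distance.
        return min(
            open_shops,
            key=lambda name: math.dist(self._shops[name], customer_location),
        )


balancer = NearestShopBalancer(
    {"north": (0, 10), "south": (0, -10), "east": (10, 0), "west": (-10, 0)}
)
print(balancer.route((1, 8)))   # north
balancer.mark_down("north")     # power cut at the north shop
print(balancer.route((1, 8)))   # east
```

Rerouting alone only moves the traffic; the customer's information still has to be available at the other shop.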

Data Replication: Data is replicated to multiple places so that our system can handle failures.

We can store each customer’s contact information in multiple shops to handle any kind of failure. If we store every customer’s information in two more shops, three copies exist in total, so we can handle up to two failures at a time.
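A minimal sketch of this idea keeps each record in the customer's home shop plus two others. The class `ReplicatedStore`, the rule of choosing the next shops in list order as replicas, and the sample contact value are assumptions for illustration only. Writes to a shop that is already down are not handled here.

```python
class ReplicatedStore:
    """Keeps every record in a home shop plus `extra_copies` other shops."""

    def __init__(self, shop_names, extra_copies=2):
        if extra_copies >= len(shop_names):
            raise ValueError("need more shops than extra copies")
        self._names = list(shop_names)
        self._data = {name: {} for name in self._names}  # per-shop storage
        self._down = set()
        self._extra = extra_copies

    def _replica_shops(self, home_shop):
        # Assumption: replicas live in the next shops in list order.
        start = self._names.index(home_shop)
        return [
            self._names[(start + i) % len(self._names)]
            for i in range(self._extra + 1)
        ]

    def save(self, home_shop, customer, contact):
        for shop in self._replica_shops(home_shop):
            self._data[shop][customer] = contact

    def fail(self, shop):
        self._down.add(shop)

    def lookup(self, customer):
        # Any shop that is still up and holds a copy can answer.
        for shop in self._names:
            if shop not in self._down and customer in self._data[shop]:
                return self._data[shop][customer]
        raise KeyError(customer)


store = ReplicatedStore(["north", "east", "south", "west"], extra_copies=2)
store.save("north", "asha", "555-0101")
store.fail("north")
store.fail("east")
print(store.lookup("asha"))  # still served from the copy in "south"
```

With three copies, the record survives any two shop failures; a third failure that hits the last copy loses access to it.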

Learning:

As we scale up, multiple servers are required.
Load Balancer is required to divide traffic as per our need.
Data replication is required to handle failures.

LEVEL 4

The owner wants to expand further. He wants to open multiple branches across the world.

Configuration:

There are multiple branches across the world.
Each country can have branches in multiple cities.
Each city can have multiple branches.
Traffic would be in the thousands of requests per second across the world.

Implementation:

As there are multiple countries, multiple load balancers are required to divide the traffic.
We need to introduce multiple Availability Zones to handle the failures.

Availability Zones: Availability Zones are multiple, isolated locations within each Region.
Local Zones provide you the ability to place resources, such as compute and storage, in multiple locations closer to your end users.

We need to introduce multiple data regions so that each request is served from a region close to the user. This keeps network latency low, and requests can reach the respective server within milliseconds.

Each Amazon EC2 Region is designed to be isolated from the other Amazon EC2 Regions. This achieves the greatest possible fault tolerance and stability.

We need to introduce cloud metrics to monitor server metrics such as CPU usage, disk space usage, etc.
We need to introduce something for security so that only a person with a badge can enter a shop, e.g., Kerberos, IAM, etc.
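How regions and availability zones work together can be sketched as a simple selection rule: prefer the region with the lowest latency to the user, and within it use any zone that is still healthy. The function `pick_zone`, the region and zone names, and the latency figures below are invented for illustration; they are not real AWS identifiers or measurements.

```python
def pick_zone(latency_ms, zones_by_region, unhealthy_zones=frozenset()):
    """Return (region, zone) for the closest region that has a healthy zone.

    latency_ms:      region name -> measured latency to the user, in ms
    zones_by_region: region name -> list of availability zones in that region
    unhealthy_zones: zones that are currently down
    """
    # Try regions from lowest to highest latency.
    for region in sorted(latency_ms, key=latency_ms.get):
        for zone in zones_by_region.get(region, []):
            if zone not in unhealthy_zones:
                return region, zone
    raise RuntimeError("no healthy zone in any region")


# Hypothetical regions and zones.
latency = {"region-a": 20, "region-b": 120}
zones = {"region-a": ["a-1", "a-2"], "region-b": ["b-1"]}

print(pick_zone(latency, zones))                  # ('region-a', 'a-1')
print(pick_zone(latency, zones, {"a-1"}))         # ('region-a', 'a-2')
print(pick_zone(latency, zones, {"a-1", "a-2"}))  # ('region-b', 'b-1')
```

A zone failure is absorbed inside the same region, so latency stays low; only when every zone in the nearest region is down does traffic move to a more distant region.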

Challenges we faced as we scale:

Multithreading is required to process messages in parallel.
A synchronised data structure is required to avoid processing the same message twice.
Distributed data storage is required to handle failures.
Multiple servers are required to handle the traffic.
Load balancers are required to divide the traffic.
Data replication is required to handle multiple shops shutting down at once.
Availability Zones are required to handle the shutdown of a complete area or city.
Data regions are required to decrease the network latency.

Conclusion:

We have gone through how we can approach the design of a messaging queue from scratch. Starting from only 3 requests per second, we worked our way up to thousands of requests per second and identified the challenges we faced as we scaled up the problem.

We do not want to create our own messaging queue for every use case. The reason is:

There are many solutions available in the market that have been tested by thousands of customers and that promise millions of requests per second with high performance. Building our own would mean recreating what already exists.

Some of the most used messaging queues are: Apache Kafka, Amazon SQS, RabbitMQ, IBM MQ, etc.

Our problem is reduced from “How can we implement our messaging queue?” to “What is the best solution available in the market for our service/project?”. To answer this, we should know the functionalities each messaging queue provides and which one fits our use case.

Hope you all enjoyed reading it. Thanks for reading, folks!

UPDATE: I have published a story on “Why Messaging Queue is required?”
Please give it a read here:

Why Messaging Queue?


Written by mayank bansal