Async REST services with Messaging

Sahan Vithanage
5 min read · Jun 24, 2022


Why a Messaging System?

A messaging system transfers data between applications so that the applications don’t need to be concerned with how the data is shared and can focus on the data itself. Most of the time, microservices are implemented on top of REST, and REST is usually implemented over HTTP. HTTP, however, is a synchronous request/response protocol: the client sends a request and waits for the response. With asynchronous REST, the caller doesn’t have to wait for the response before it continues working, so even if one microservice experiences a lag, the other services can carry on without trouble. Messaging is one of the best ways to implement REST asynchronously. There are many popular message brokers, but here we focus mainly on RabbitMQ and Apache Kafka.

RabbitMQ is a general-purpose message broker that supports several message queuing protocols. The Advanced Message Queuing Protocol (AMQP) is the most widely used of them.

It is a “smart broker”: it is able to perform routing inside the broker.

Kafka is considered a stream processing system. Unlike RabbitMQ, it is built to process high-volume streams of messages efficiently.

Which one is better? It depends on the use-case.

Use-Case

We create a service in which several purchase orders are created, and these orders need to be approved. The approval process may take several minutes. If we implement this synchronously, the user has to wait until the approval process is completed, which is a bad design. The process takes place within this service only; no other services are involved in approving the purchase orders. As this is an ordinary message queue requirement, RabbitMQ’s queue management is ideal for such a system.
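The "accept now, approve later" idea can be sketched with nothing but the Python standard library. This is only a toy model under stated assumptions: `submit_order`, `approval_worker`, and the `status` dictionary are hypothetical names, an in-process `queue.Queue` stands in for the broker, and the approval itself is a placeholder assignment rather than a process that takes minutes.

```python
import queue
import threading

pending = queue.Queue()   # stands in for the message queue (assumption)
status = {}               # order_id -> "PENDING" or "APPROVED"


def submit_order(order_id):
    """Record the order and return immediately; approval happens later."""
    status[order_id] = "PENDING"
    pending.put(order_id)
    return {"order_id": order_id, "status": "PENDING"}


def approval_worker():
    """Consume orders from the queue and approve them one by one."""
    while True:
        order_id = pending.get()
        if order_id is None:              # sentinel: stop the worker
            pending.task_done()
            break
        status[order_id] = "APPROVED"     # placeholder for the slow approval
        pending.task_done()


worker = threading.Thread(target=approval_worker, daemon=True)
worker.start()

response = submit_order(42)   # the caller gets an answer right away
pending.put(None)             # ask the worker to stop after the backlog
pending.join()                # wait here only so the example is deterministic
```

The caller receives a `PENDING` response without waiting for the worker; the final state is read later from `status`.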

Let’s assume the same use case, but if a purchase is more than 100_000, a different approval process is required. In RabbitMQ, a producer publishes the message, and inside the broker an exchange receives it and routes it to storage buffers called queues.

RabbitMQ

There are different types of exchange available, but let’s focus on the fanout and direct exchanges. A fanout exchange broadcasts every message it receives to all the queues it knows. A direct exchange sends a message only to the queues whose binding key matches the message’s routing key. In other words, in RabbitMQ the broker decides which message goes to which queue, based on the exchange configuration and the message properties.
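The difference between the two exchange types can be sketched in plain Python, without a broker. This is a toy model, not the RabbitMQ client API: the `ToyExchange` class, the queue names (`standard_approval`, `special_approval`, `audit`), and the routing keys are all assumptions made for illustration.

```python
class ToyExchange:
    """In-memory stand-in for an exchange; not the RabbitMQ API."""

    def __init__(self, kind):
        if kind not in ("fanout", "direct"):
            raise ValueError("kind must be 'fanout' or 'direct'")
        self.kind = kind
        self.bindings = []   # list of (queue_name, binding_key)
        self.queues = {}     # queue_name -> list of messages

    def bind(self, queue_name, binding_key=""):
        self.bindings.append((queue_name, binding_key))
        self.queues.setdefault(queue_name, [])

    def publish(self, message, routing_key=""):
        for queue_name, binding_key in self.bindings:
            # fanout ignores the routing key; direct needs an exact match
            if self.kind == "fanout" or binding_key == routing_key:
                self.queues[queue_name].append(message)


def routing_key_for(order):
    # Rule from the use case: purchases above 100_000 need another approval.
    return "special" if order["amount"] > 100_000 else "standard"


direct = ToyExchange("direct")
direct.bind("standard_approval", "standard")
direct.bind("special_approval", "special")

fanout = ToyExchange("fanout")
fanout.bind("standard_approval")
fanout.bind("audit")

orders = [{"id": 1, "amount": 2_500}, {"id": 2, "amount": 250_000}]
for order in orders:
    direct.publish(order, routing_key_for(order))
    fanout.publish(order)
```

With the direct exchange each order lands in exactly one approval queue, while the fanout exchange copies both orders into every bound queue.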

On the other hand, in Kafka events (messages) are stored in topics. Multiple consumers can subscribe to the same topic, and each consumer decides what to do with a message (whether to deal with it or just discard it). In Kafka, once a message is published, all subscribers receive it. Of course, topics can be divided into partitions to split the data across different brokers, but for a use case like this RabbitMQ is the most suitable.
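For contrast, here is a minimal sketch of the Kafka-style model, where the filtering happens in the consumers rather than in the broker. It is plain Python and does not use a Kafka client; the `topic` list, `publish`, and `consume` are hypothetical stand-ins.

```python
topic = []   # append-only list standing in for a Kafka topic (assumption)


def publish(event):
    topic.append(event)


def consume(wanted):
    # Every subscriber reads the whole topic and keeps only what it wants.
    return [event for event in topic if wanted(event)]


publish({"order_id": 1, "amount": 2_500})
publish({"order_id": 2, "amount": 250_000})
publish({"order_id": 3, "amount": 90_000})

standard = consume(lambda e: e["amount"] <= 100_000)
special = consume(lambda e: e["amount"] > 100_000)
```

Both subscribers see all three events; each one simply discards the events it does not care about, and the topic itself is left unchanged.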

Kafka

Now let’s assume the same service makes REST calls to several other services, including one legacy system that doesn’t support API-based integrations. When purchase order records in the database change, CDC (Change Data Capture) is triggered. CDC turns the database into a stream of data, and each new transaction is delivered to Kafka in real time. Other services that are interested in these data modifications can listen to Kafka and act accordingly. Here the legacy system can be integrated through Kafka connectors instead of an API. As this is a multi-system integration, Kafka is ideal for this use case.

Let’s take another use case where we have to process 10 orders per minute. In RabbitMQ, the exchange receives the messages and sends them to queues. Let’s assume there’s only one consumer and it takes 20 seconds to process each message, so it can process only 3 messages per minute. The queue then keeps growing, and the messages that enter it last wait longer and longer to be processed. As a solution, more consumers can be introduced. By default, RabbitMQ uses round-robin dispatching to hand messages to the consumers in turn, so on average each consumer gets the same number of messages. We can add more consumers to speed up the process.
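The arithmetic behind this scenario, and the round-robin hand-out, can be checked with a few lines of Python. The sketch assumes a constant arrival rate and a fixed 20-second processing time; `backlog_after` and `round_robin` are hypothetical helper names, not broker functions.

```python
import math
from itertools import cycle

ARRIVALS_PER_MIN = 10
SECONDS_PER_MESSAGE = 20

# One consumer handles 60 / 20 = 3 messages per minute.
per_consumer_per_min = 60 // SECONDS_PER_MESSAGE


def backlog_after(minutes, consumers):
    """Messages still waiting after `minutes`, assuming steady arrivals."""
    capacity = consumers * per_consumer_per_min
    return max(0, ARRIVALS_PER_MIN - capacity) * minutes


# Smallest number of consumers that keeps up with the arrival rate.
consumers_needed = math.ceil(ARRIVALS_PER_MIN / per_consumer_per_min)


def round_robin(messages, consumer_names):
    """Hand messages to the consumers in turn, one after another."""
    assignment = {name: [] for name in consumer_names}
    for message, name in zip(messages, cycle(consumer_names)):
        assignment[name].append(message)
    return assignment


one_minute = round_robin(list(range(1, 11)), ["c1", "c2", "c3", "c4"])
```

With a single consumer the backlog grows by 7 messages every minute; with 4 consumers (capacity 12 per minute) it stops growing.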

Now let’s assume that each message takes a different amount of time to process. One consumer may end up with the messages that take a long time to process while another gets the ones that take little time. RabbitMQ doesn’t know anything about this and dispatches messages blindly, so some consumers sit idle while others are constantly busy. To avoid this, RabbitMQ supports fair dispatch. With fair dispatch, RabbitMQ doesn’t dispatch a new message to a consumer until that consumer has processed and acknowledged the previous one (the prefetch_count property sets how many unacknowledged messages a consumer may hold at a time; a value of 1 gives this behavior). The new message is dispatched instead to the next consumer that is not busy.
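A small simulation makes the difference visible. It is a simplified model, not broker code: processing times are made-up numbers in seconds, all messages are assumed to be in the queue from the start, and fair dispatch is modelled as "give the next message to whichever consumer becomes free first", which is what a prefetch count of 1 amounts to.

```python
import heapq


def round_robin_finish(durations, n_consumers):
    """Time at which each consumer finishes when messages are dealt in turn."""
    busy_until = [0] * n_consumers
    for i, duration in enumerate(durations):
        busy_until[i % n_consumers] += duration
    return busy_until


def fair_dispatch_finish(durations, n_consumers):
    """Same, but each message goes to the consumer that is free earliest."""
    # Heap of (time the consumer becomes free, consumer index).
    free_at = [(0, c) for c in range(n_consumers)]
    heapq.heapify(free_at)
    busy_until = [0] * n_consumers
    for duration in durations:
        free_time, consumer = heapq.heappop(free_at)
        busy_until[consumer] = free_time + duration
        heapq.heappush(free_at, (busy_until[consumer], consumer))
    return busy_until


# Slow and fast messages alternate, the worst case for round-robin.
durations = [30, 1, 30, 1, 30, 1]

blind = round_robin_finish(durations, 2)
fair = fair_dispatch_finish(durations, 2)
```

Under round-robin one consumer gets every slow message and works for 90 seconds while the other is done after 3; with fair dispatch the work is finished after 61 seconds.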

In RabbitMQ, when acknowledgments are not used, a message is marked for deletion as soon as it is delivered to the consumer. Suppose a consumer gets killed while processing some messages: as these had not been handled yet, we lose all the messages that were dispatched to that consumer. To avoid this, RabbitMQ supports acknowledgments (acks). After a consumer has received and processed a message, it sends an ack to RabbitMQ to confirm that the message is safe to delete. If the consumer dies without sending the ack, RabbitMQ re-queues the message and delivers it to another consumer.
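The acknowledgment mechanism can be modelled with a queue that remembers which deliveries are still unacknowledged. This is a toy sketch, not the RabbitMQ API: `ToyQueue`, its methods, and the consumer names `c1` and `c2` are assumptions made for illustration.

```python
from collections import deque


class ToyQueue:
    """Queue that keeps a message until the consumer acknowledges it."""

    def __init__(self, messages):
        self.ready = deque(messages)
        self.unacked = {}        # delivery tag -> (consumer, message)
        self._next_tag = 1

    def deliver(self, consumer):
        message = self.ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self.unacked[tag] = (consumer, message)
        return tag, message

    def ack(self, tag):
        # Only now is the message really gone.
        del self.unacked[tag]

    def consumer_died(self, consumer):
        # Put back everything this consumer never acknowledged,
        # keeping the original order at the front of the queue.
        lost = [t for t, (owner, _) in self.unacked.items() if owner == consumer]
        for tag in sorted(lost, reverse=True):
            _, message = self.unacked.pop(tag)
            self.ready.appendleft(message)


q = ToyQueue(["order-1", "order-2", "order-3"])

tag, _ = q.deliver("c1")
q.ack(tag)                 # order-1 processed and acknowledged

q.deliver("c1")            # order-2 delivered, but c1 never acks it
q.consumer_died("c1")      # order-2 goes back to the queue

_, redelivered = q.deliver("c2")
```

Because `order-2` was never acknowledged, it is not lost when `c1` dies; it is delivered again, this time to `c2`.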

Now let’s see how to implement the same scenario using Kafka. Unlike RabbitMQ, if we simply introduce more independent consumers, all of them will get the same messages. The solution is to use a consumer group. For each topic, the Kafka cluster can maintain several partitions. Messages are dispatched to a partition based on their key (e.g. orderId), and Kafka guarantees that messages with the same key go to the same partition, as long as the number of partitions doesn’t change. In Kafka, order (sequence) is guaranteed within a partition but not across the topic.

Now we can assign consumers to listen to the partitions. This is called a consumer group configuration: we give the same group_id to each consumer, and Kafka assigns each partition to exactly one consumer in the group. When messages are spread over the partitions round-robin, each consumer receives, on average, an equal share. But unlike RabbitMQ, we can’t keep speeding up the process by adding more consumers. With three partitions, if we add 4 or more consumers, the extra consumers will be idle (no messages are dispatched to them). So the limitation in Kafka is that the number of consumers in a consumer group should be equal to or less than the number of partitions in the topic.

Kafka also retains all messages, whether or not they have been consumed, for a configurable period of time. Each consumer controls a property called the offset, its position in the partition, to keep track of the messages it has read. Normally a consumer advances its offset linearly as it reads messages, but because the consumer controls the offset, it can also decide the order in which it reads them.
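The three ideas in this part (key-based partitioning, one owner per partition inside a group, and consumer-controlled offsets) can be sketched in plain Python. This is a simplified model and not a Kafka client: `partition_for`, `assign`, and `ToyPartitionReader` are hypothetical names, CRC32 is used as a stand-in hash (Kafka's default partitioner uses murmur2), and the assignment is a simple round-robin over the group members.

```python
import zlib

NUM_PARTITIONS = 3


def partition_for(key, num_partitions=NUM_PARTITIONS):
    # Same key -> same hash -> same partition (while the count is unchanged).
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign(partitions, consumers):
    """Give every partition exactly one owner within the group."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


class ToyPartitionReader:
    """Reads a retained log; the reader, not the broker, owns the offset."""

    def __init__(self, log):
        self.log = log       # reading never deletes anything
        self.offset = 0

    def poll(self):
        if self.offset >= len(self.log):
            return None
        message = self.log[self.offset]
        self.offset += 1
        return message

    def seek(self, offset):
        self.offset = offset


group = assign(list(range(NUM_PARTITIONS)), ["c1", "c2", "c3", "c4"])

reader = ToyPartitionReader(["m0", "m1", "m2"])
first_pass = [reader.poll(), reader.poll()]
reader.seek(0)               # rewind and read the same message again
replayed = reader.poll()
```

With three partitions and four consumers, `c4` is left without a partition and stays idle, which is the limitation described above.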

After comparing Kafka and RabbitMQ, we can conclude that RabbitMQ is the best solution for our scenario.

Image Source: https://kafka.apache.org/081/documentation.html

References

Kafka Demo: https://medium.com/@sahanmvs/kafka-manual-acknowledgments-24250e383f47

https://youtu.be/wP-FMNuO3D0
