RabbitMQ and Kafka: What Are They?

Aparna Chinnaiah · Geek Culture · Sep 29, 2021

Software applications can be connected and scaled using message queues. These queues make asynchronous communication possible between two systems and decouple the applications, which makes scaling easier.

Before diving deeper into the topic, let’s understand what a message queue is and how it is used in a real project to scale an application.

A message queue is made up of a producer, a broker, and a consumer. The producer is the client application that generates messages and publishes them to the broker. The broker receives the messages, stores them in a queue, and waits for a consumer to connect and consume them.

To illustrate this, let’s assume a web application that lets a user submit information; the system processes the information, generates a PDF, and sends it back to the user by email. The web application acts as the producer and submits the information to the broker, which places it into the queue. The consumer retrieves and processes the information, generates the PDF, and emails it to the user. While messages wait in the queue and while a PDF and email are being processed, the producer continues to queue up new messages. By using a message broker in this scenario, we can scale up PDF generation and email delivery simply by connecting more consumers.
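As a rough sketch of that flow in Python with the pika client (assuming a RabbitMQ broker on localhost and a hypothetical pdf_jobs queue; generate_pdf and send_email are placeholders):

```python
import json
import pika

# --- Producer side: the web application queues up a PDF job ---
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="pdf_jobs", durable=True)

job = {"email": "user@example.com", "fields": {"name": "Alice"}}
channel.basic_publish(exchange="", routing_key="pdf_jobs",
                      body=json.dumps(job).encode())
connection.close()

# --- Consumer side: a worker generates the PDF and emails it ---
def handle_job(ch, method, properties, body):
    job = json.loads(body)
    # pdf = generate_pdf(job["fields"]); send_email(job["email"], pdf)
    print("processed job for", job["email"])

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="pdf_jobs", durable=True)
channel.basic_consume(queue="pdf_jobs", on_message_callback=handle_job, auto_ack=True)
channel.start_consuming()
```

Scaling out is then just a matter of running more copies of the worker script against the same queue.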

RabbitMQ:

RabbitMQ is the most widely deployed open source message broker.

RabbitMQ as a message broker (diagram)

Messages are not published directly to a queue. Instead, the producer sends messages to an exchange. Exchanges are message-routing agents responsible for routing messages to different queues with the help of header attributes, bindings, and routing keys.

A binding is a “bridge” that you set up to bind a queue to an exchange.

The routing key is a message attribute the exchange looks at when deciding how to route the message to queues (depending on exchange type).

In RabbitMQ, there are four types of exchange (direct, fanout, topic, and headers), each routing messages differently using different parameters and binding setups. Clients can create their own exchanges or use the predefined exchanges, which are created when the server starts for the first time.
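A minimal pika sketch of a direct exchange with a binding and a routing key (the orders exchange and invoice_queue names are made up for illustration):

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# Declare a direct exchange and a queue, then bind them with a routing key.
channel.exchange_declare(exchange="orders", exchange_type="direct")
channel.queue_declare(queue="invoice_queue")
channel.queue_bind(queue="invoice_queue", exchange="orders", routing_key="invoice")

# The exchange compares the message's routing key against the binding keys
# and delivers the message to every queue whose binding matches.
channel.basic_publish(exchange="orders", routing_key="invoice", body=b"order #42")
connection.close()
```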

RabbitMQ is a smart broker: it takes care of delivering messages to consumers rather than having consumers fetch them. Messages are generally pushed in batches, so several messages are delivered at once. A prefetch limit can be set on the number of unacknowledged messages a consumer receives at a time, so that consumers are not overwhelmed.
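In pika that limit is the prefetch count, set on the channel before consuming (the value here is arbitrary):

```python
# At most 10 unacknowledged messages are pushed to this consumer at a time,
# so one slow worker is not flooded while others sit idle.
channel.basic_qos(prefetch_count=10)
```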

In RabbitMQ, messages are stored until a consumer connects and retrieves a message off the queue. The client can either acknowledge the message when it receives it or when the client has completely processed the message. In either case, once the message is acknowledged, it’s removed from the queue.

A RabbitMQ client can also negatively acknowledge (nack) a message when it fails to handle it; in that case, the message is returned to the queue and redelivered as if it were a new message.
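A sketch of a pika consumer callback that acknowledges on success and negatively acknowledges (and requeues) on failure; process() stands in for the real handler:

```python
def on_message(ch, method, properties, body):
    try:
        process(body)
        # Positive ack: the broker removes the message from the queue.
        ch.basic_ack(delivery_tag=method.delivery_tag)
    except Exception:
        # Negative ack with requeue=True: the message goes back to the queue
        # and will be delivered again.
        ch.basic_nack(delivery_tag=method.delivery_tag, requeue=True)

channel.basic_consume(queue="pdf_jobs", on_message_callback=on_message)
channel.start_consuming()
```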

RabbitMQ provides the capability to assign a priority to messages sent in by the producer. In such cases, a priority queue is maintained, and each message is enqueued according to its priority.
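Priorities have to be enabled on the queue with the x-max-priority argument, and the producer then sets a per-message priority (queue name and values are illustrative):

```python
import pika

# A queue that supports priorities 0-10.
channel.queue_declare(queue="priority_jobs", durable=True,
                      arguments={"x-max-priority": 10})

# Higher-priority messages are dequeued before lower-priority ones.
channel.basic_publish(
    exchange="",
    routing_key="priority_jobs",
    body=b"urgent job",
    properties=pika.BasicProperties(priority=9),
)
```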

Kafka:

Event streaming is the practice of capturing data in real-time from event sources like databases, sensors, mobile devices, cloud services, and software applications in the form of streams of events. Kafka is very useful for stream processing.

What RabbitMQ calls a queue is, in Kafka, an append-only log. A message in Kafka is usually called a record. A Kafka topic can be thought of as a category of records, similar to a named queue. Kafka topics are divided into partitions, which hold records in an immutable, ordered sequence.

Kafka as a message broker (diagram)
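A minimal producer sketch with the kafka-python client, assuming a broker on localhost:9092 and a hypothetical pdf-jobs topic:

```python
from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")

# Records with the same key always land in the same partition,
# so per-key ordering is preserved within that partition.
producer.send("pdf-jobs", key=b"user-42", value=b'{"email": "user@example.com"}')
producer.flush()
```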

Kafka does not support routing the way RabbitMQ exchanges do. As a substitute, you can use consumer groups and persistent topics: send all messages to one topic and let each consumer group subscribe and read from its own offsets. Kafka maintains an offset for each record in a partition.
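A sketch of the consumer-group side with kafka-python: each group tracks its own offsets, so different groups can read the same topic independently (group and topic names are made up):

```python
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "pdf-jobs",
    bootstrap_servers="localhost:9092",
    group_id="pdf-workers",        # another group, e.g. "audit-log", keeps its own offsets
    auto_offset_reset="earliest",  # start from the beginning if no offset is stored yet
)

for record in consumer:
    print(record.partition, record.offset, record.value)
```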

Kafka uses a pull model: consumers pull records from the broker in batches. Each consumer keeps an offset that tracks its current position in every partition it reads. After reading a record, the consumer advances its offset, so the next fetch starts from the right place.
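The same pull model made explicit: poll a batch of records, process them, and commit the new offsets. A sketch with kafka-python, with auto-commit disabled so offsets only advance after processing:

```python
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "pdf-jobs",
    bootstrap_servers="localhost:9092",
    group_id="pdf-workers",
    enable_auto_commit=False,  # we commit offsets manually below
)

while True:
    # Pull up to 100 records from the broker in one call.
    batch = consumer.poll(timeout_ms=1000, max_records=100)
    for partition, records in batch.items():
        for record in records:
            print("processing", record.offset, record.value)
    # Commit the offsets of everything consumed so far.
    consumer.commit()
```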

You can build dynamic routing yourself with the help of Kafka Streams, dynamically routing events to topics, but it is not a built-in feature.

A message cannot be sent with a priority level, nor be delivered in priority order, in Kafka. Within a partition, records are stored and delivered in the order in which they are received.

The message log in Kafka is persistent. Records are stored until a configured retention limit is exceeded, which can be a period of time, a size limit, or both. A record is not removed once it is consumed; it stays until the retention limit is reached and can therefore be replayed or consumed multiple times.
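Retention is configured per topic; a sketch of creating a topic with a seven-day / ~1 GB retention limit using kafka-python's admin client (names and limits are illustrative):

```python
from kafka.admin import KafkaAdminClient, NewTopic

admin = KafkaAdminClient(bootstrap_servers="localhost:9092")

admin.create_topics([
    NewTopic(
        name="pdf-jobs",
        num_partitions=3,
        replication_factor=1,
        topic_configs={
            "retention.ms": "604800000",      # keep records for 7 days...
            "retention.bytes": "1073741824",  # ...or until a partition reaches ~1 GB
        },
    )
])
```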

In Kafka, you can scale by adding more nodes to the cluster or by adding more partitions to a topic. This horizontal scaling is often easier than adding CPU or memory to an existing machine, as you have to do with RabbitMQ.
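Adding partitions to an existing topic can be done with the same admin client (a sketch; note that the partition count can only be increased, never decreased):

```python
from kafka.admin import KafkaAdminClient, NewPartitions

admin = KafkaAdminClient(bootstrap_servers="localhost:9092")

# Grow "pdf-jobs" to 6 partitions so that up to 6 consumers
# in one group can read it in parallel.
admin.create_partitions({"pdf-jobs": NewPartitions(total_count=6)})
```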

Usage:

If you want a simple, traditional pub-sub message broker, where systems communicate through channels/queues and where retention and streaming are not requirements, the obvious choice is RabbitMQ; it will most probably scale more than you will ever need it to.

If there is a requirement to analyze or stream data (tracking, ingestion, logging, security, etc.), Kafka could be the better choice. Kafka is used in event-driven applications where data must flow between multiple components of the application.

Summary:

Both are popular message brokers, and the choice ultimately depends on your requirements. Both can handle millions of messages, though each has its own architecture.
