Kafka Visualization and Work Progress

Hakan GÜVEZ
Huawei Developers
Published in
5 min read · Aug 24, 2023

Introduction

Hello all, I’m going to introduce “Kafka Visualization and Work Progress on Huawei Cloud”. I have divided this article into a few parts; after introducing each concept, we will work through some study cases.

1- How does Apache Kafka work?

2- What is a Kafka producer?

3- What is a Kafka broker?

4- What is a Kafka consumer?

5- 3 Study Cases of Kafka

Apache Kafka

1- How does Apache Kafka work?

Apache Kafka is a distributed streaming platform for building real-time data pipelines. It enables durable, ordered messaging between applications.

To process data with Kafka, you need to be familiar with a few core concepts.

The Kafka system consists of servers and clients that communicate over a high-performance TCP network protocol. Communication works as follows: messages are grouped into topics, a fundamental Kafka abstraction. A sender (producer) publishes messages to a specific topic, and a recipient (consumer) receives all messages that any producer sends to that topic. Producers and consumers are completely decoupled in Kafka, which is key to the high scalability Kafka is known for.
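The decoupling described above can be sketched in a few lines of plain Python. This is not real Kafka client code, just a toy in-memory model of a topic as an append-only log: the producer appends, and each consumer reads from its own offset without affecting anyone else.

```python
class Topic:
    """Toy model of a Kafka topic: an append-only log of messages."""

    def __init__(self, name):
        self.name = name
        self.log = []  # append-only message log

    def produce(self, message):
        # Producer side: append the message and return its offset.
        self.log.append(message)
        return len(self.log) - 1

    def consume(self, offset):
        # Consumer side: read everything from a given offset onward.
        return self.log[offset:]


orders = Topic("orders")
orders.produce("order-1")
orders.produce("order-2")

# Two consumers track separate offsets and never interfere with each other.
print(orders.consume(0))  # ['order-1', 'order-2']
print(orders.consume(1))  # ['order-2']
```

Because consumers only track an offset into the log, the producer never needs to know who is reading, which is the essence of the decoupling.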

2- What is a Kafka producer?

Client applications that send messages to Kafka are known as producers. A producer always publishes to a specific topic, and every message it sends to that topic reaches every consumer subscribed to it.

A topic can be divided into partitions, both logically and physically. The producer’s partitioner assigns each message to a topic partition, and the producer then sends a request to that partition’s leader. All writes to a partition must go through its leader. If the leader fails, a replica can step in, complete the write, and acknowledge the message. The number of replicas depends on the configuration.
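The key idea behind the partitioner is that messages with the same key always land in the same partition, preserving per-key ordering. A minimal sketch of that rule follows; note that real Kafka hashes the key bytes with murmur2, so `hashlib.md5` here is just a deterministic stand-in.

```python
import hashlib


def pick_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition deterministically (sketch)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


# The same key always maps to the same partition, on every call.
p1 = pick_partition("user-42", 6)
p2 = pick_partition("user-42", 6)
assert p1 == p2

# Different keys spread across the available partitions.
print({k: pick_partition(k, 6) for k in ["user-1", "user-2", "user-3"]})
```

This is why choosing a good key matters: all messages for one key form a single ordered stream inside one partition.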

TL;DR

Since no group coordination is required, the Kafka producer is conceptually much simpler than the consumer; producer settings affect overall throughput, durability, and delivery guarantees.

3- What is a Kafka broker?

A Kafka server that runs as part of a Kafka cluster is known as a Kafka broker, sometimes called a Kafka node. A broker receives messages from producers, stores them on disk, and lets consumers fetch them by topic, partition, and offset. A Kafka cluster typically consists of three or more brokers. Why? Keeping three copies of your data is best practice: if one broker fails, two brokers still hold replicas of the data, so you can usually avoid losing any of it.
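The three-broker argument can be illustrated with a toy simulation. This is only a sketch of the replication idea, not broker internals: a partition replicated to 3 brokers survives the loss of any one of them with two full copies intact.

```python
# Three toy brokers, each holding a dict of partition -> messages.
brokers = {f"broker-{i}": {} for i in range(3)}


def replicate(partition, messages, replication_factor=3):
    """Copy a partition's messages onto `replication_factor` brokers."""
    for name in list(brokers)[:replication_factor]:
        brokers[name][partition] = list(messages)


replicate("orders-0", ["m1", "m2"])

# Simulate one broker failing.
del brokers["broker-0"]

# Two replicas of the partition survive on the remaining brokers.
survivors = [b for b in brokers.values() if "orders-0" in b]
print(len(survivors))  # 2
```

With only two brokers and replication factor 2, a single failure would leave just one copy, so any further failure would mean data loss; three brokers keep a safety margin.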

4- What is a Kafka consumer?

When you integrate Apache Kafka into your system design, there are many scenarios in which you want to use the data written to Kafka topics. This is where the Kafka consumer concept comes into play. Consumers are client applications that subscribe to one or more topics. You cannot rely on a single consumer to read and process all of the data; Kafka consumers let your application keep up with the rate of incoming messages. Kafka consumers typically belong to a “consumer group”, in which each consumer receives messages from a distinct subset of the topic’s partitions. The number of consumers can be scaled up to the total number of partitions in a given topic.

TL;DR

To scale data consumption from a Kafka topic, new consumers can be added to an existing consumer group; each consumer in the group then receives only a portion of the messages. For any application that needs all the messages from one or more Kafka topics, a new consumer group can be created.
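The partition-sharing rule inside a group can be sketched with a simple round-robin-style assignment (real Kafka offers several assignor strategies; this is just an illustration of the invariant that each partition goes to exactly one consumer in the group):

```python
def assign(partitions, consumers):
    """Distribute partitions round-robin over the consumers in one group."""
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


partitions = ["p0", "p1", "p2", "p3"]

# Two consumers split four partitions evenly.
print(assign(partitions, ["c1", "c2"]))
# {'c1': ['p0', 'p2'], 'c2': ['p1', 'p3']}

# With more consumers than partitions, the extras sit idle.
print(assign(partitions, ["c1", "c2", "c3", "c4", "c5"]))
```

This is why scaling a group beyond the partition count buys nothing: consumer `c5` above receives no partitions at all.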

5- 3 Study Cases of Kafka

Let’s start 👇

Case Study for Apache Kafka

We will work through 3 examples, paying attention to the following details:

- Choose the number of partitions so that data is distributed evenly.
- Try different broker counts, turning brokers on and off, and observe how the system reacts.
- Always store data in replicas to prevent loss.
- Simulate load by lengthening the consume interval.
- Finally, check how offsets are committed, and consider how adding or removing consumers or brokers affects redelivery.

Study Case — 1:

Parameter of Configuration

Partitions: 2

Brokers: 2

Replication Factor: 2

Parameter of Producer

Producing Interval: 1 (tick)

Parameter of Consumer

Consumer-1:

Consume Interval (tick): 3

Commit Offset Interval (messages): 1

Consumer Group: A

✅ With the specified configuration, Kafka runs and continues successfully, as shown below:

Example-1

Study Case — 2:

Parameter of Configuration

Partitions: 2

Brokers: 2

Replication Factor: 3

Parameter of Producer

Producing Interval: 1 (tick)

Parameter of Consumer

Consume Interval (tick): 3

Commit Offset Interval (messages): 1

Consumer Group: A

❌ With the specified configuration, Kafka cannot run successfully, because the replication factor cannot be larger than the number of brokers.
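The failure in Study Case 2 follows from a simple invariant: each replica of a partition must live on a different broker, so the replication factor can never exceed the broker count. A minimal validation sketch:

```python
def validate(brokers: int, replication_factor: int) -> bool:
    """Each replica needs its own broker, so RF must not exceed brokers."""
    return replication_factor <= brokers


print(validate(brokers=2, replication_factor=2))  # True  (Study Case 1 runs)
print(validate(brokers=2, replication_factor=3))  # False (Study Case 2 is rejected)
```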

Study Case — 3:

Parameter of Configuration

Partitions: 1

Brokers: 5

Replication Factor: 1

Parameter of Producer

Producing Interval: 1 (tick)

Parameter of Consumer

Consumer-1:

Consume Interval (tick): 1

Commit Offset Interval (messages): 10

Consumer Group: A

Consumer-2:

Consume Interval (tick): 2

Commit Offset Interval (messages): 10

Consumer Group: A

Consumer-3:

Consume Interval (tick): 3

Commit Offset Interval (messages): 10

Consumer Group: A

✅ With the specified configuration, Kafka runs successfully. However, since there is only 1 partition with a replication factor of 1, only 1 broker does any work and sends messages to the consumer group.
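Study Case 3 can be reproduced in miniature with the same round-robin assignment idea (again a sketch, not Kafka’s actual assignor): with 1 partition and 3 consumers in group A, only one consumer is assigned the partition, and the other two stay idle.

```python
def assign(partitions, consumers):
    """Distribute partitions round-robin over the consumers in one group."""
    assignment = {c: [] for c in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


result = assign(["p0"], ["consumer-1", "consumer-2", "consumer-3"])
print(result)  # {'consumer-1': ['p0'], 'consumer-2': [], 'consumer-3': []}

idle = [c for c, parts in result.items() if not parts]
print(len(idle))  # 2
```

The idle consumers are not wasted, though: if consumer-1 fails, the group rebalances and one of them takes over the partition.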

Example-3

Conclusion

Several use cases show why we actually use Apache Kafka.

Kafka serves as a suitable replacement for a more conventional message broker. Thanks to its higher throughput and built-in partitioning, replication, and fault tolerance, Kafka is a good option for large-scale message-processing applications.

Kafka is also a promising fit for operational monitoring data: it aggregates statistics from distributed applications into centralized feeds of operational data.

Kafka can store very large amounts of log data, which makes it an excellent backend for event-sourcing applications.

If you want to experiment with Kafka visually and see how it works, you can use this website.

If you have any thoughts or suggestions please feel free to comment or if you want, you can reach me at guvezhakan@gmail.com, I will try to get back to you as soon as I can.

You can reach me through LinkedIn too.

Hit the clap button 👏👏👏 or share it ✍ if you like the post.
