Distributed event streaming — Kafka

Amit Singh Rathore · Geek Culture · Oct 9, 2022

A framework for storing, reading, and analyzing streaming data

Kafka provides us with the three core capabilities of a streaming platform:

  1. Publishing (writing) and subscribing to (reading) streams of events
  2. Storing streams of events durably and reliably
  3. Processing streams of events as they occur

Features

  • Distributed
  • Resilient & Fault Tolerant
  • Horizontally scalable
  • High performance & low latency
  • Schema agnostic storage
  • Polyglot (Java, Python, Go, and even REST Proxy)
  • Asynchronous processing & decoupling between producer & consumers
  • Persistent message store, buffer, bursty load, and consumer failures
  • Production tested, used by LinkedIn, Netflix, Airbnb, Walmart…

Use cases of Kafka

  • Command Query Responsibility Segregation (CQRS)
  • Complex Event Processing (CEP)
  • Staged Event-Driven Architecture (SEDA)
  • Log Shipping & Aggregation
  • Event Sourcing
  • Change Data Capture
  • Scalable message store
  • Request-reply Integration pattern
  • Materialized View Pattern

[Diagram] Kafka with ZooKeeper
[Diagram] Simplified setup without ZooKeeper (KRaft)

Producers

Producers are services/applications that generate messages/data streams. They connect to a Kafka cluster using a broker (or, in older setups, a ZooKeeper) URL. All events with the same key are published to the same partition, which guarantees ordering when consuming those events; the partition is picked by hashing the key with the murmur2 algorithm. A message produced by a producer has the following attributes (a short producer sketch follows the list):

Key — binary | Optional
Value — binary | Optional
Compression — none | gzip | snappy | zstd | lz4
Headers — Metadata of message | Key: Value | Optional
Partition & Offset
Timestamp
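
A minimal producer sketch, assuming the confluent-kafka Python client and a local broker at localhost:9092 (topic and key names are illustrative):

```python
from confluent_kafka import Producer

# Compression is a producer-level setting; the partition is chosen by
# murmur2-hashing the key when no partition is given explicitly.
producer = Producer({
    "bootstrap.servers": "localhost:9092",   # assumed local broker
    "compression.type": "snappy",            # none | gzip | snappy | zstd | lz4
})

def on_delivery(err, msg):
    # The broker assigns partition & offset; the delivery callback reports them.
    if err is None:
        print(f"delivered to {msg.topic()}[{msg.partition()}] @ offset {msg.offset()}")
    else:
        print(f"delivery failed: {err}")

producer.produce(
    topic="orders",                  # hypothetical topic
    key=b"customer-42",              # same key -> same partition -> ordering
    value=b'{"amount": 99.5}',
    headers={"source": "web"},       # optional metadata headers
    on_delivery=on_delivery,
)
producer.flush()                     # wait for outstanding deliveries
```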

Consumers

Consumers are applications that read messages/streams from the partitions of the topics they subscribe to.

Consumer Groups

A consumer group is a set of consumers in which each member reads from an exclusive subset of the topic's partitions. If there are more consumers in a group than partitions, the extra consumers stay idle.

Rebalance — Moving partition ownership from one consumer to another is called a rebalance.

A consumer subscribes to one or more topics.
A consumer can be part of only one consumer group.
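
A minimal consumer-group sketch under the same assumptions; running several copies of this script with the same group.id spreads the topic's partitions across the instances:

```python
from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "orders-billing",      # members of this group share the partitions
    "auto.offset.reset": "earliest",   # where to start if no committed offset exists
})
consumer.subscribe(["orders"])         # a consumer may subscribe to one or more topics

try:
    while True:
        msg = consumer.poll(1.0)       # also drives heartbeats and rebalances
        if msg is None:
            continue
        if msg.error():
            print(f"consumer error: {msg.error()}")
            continue
        print(f"{msg.topic()}[{msg.partition()}] @ {msg.offset()}: {msg.value()}")
finally:
    consumer.close()                   # commits offsets and leaves the group cleanly
```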

Topic

Topics are logical separations for storing messages in a Kafka cluster. Kafka retains message records as logs, and offsets give each record its position within a partition's log. Consumers' committed offsets are stored in the internal topic __consumer_offsets. A topic can have zero or more consumers, as well as zero or more producers.

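A sketch of creating a topic with explicit partition and replication settings, assuming confluent-kafka's AdminClient (names are illustrative):

```python
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

# 6 partitions for consumer parallelism, 3 replicas for fault tolerance
futures = admin.create_topics([NewTopic("orders", num_partitions=6, replication_factor=3)])

for topic, future in futures.items():
    try:
        future.result()                # raises if creation failed
        print(f"created topic {topic}")
    except Exception as exc:
        print(f"failed to create {topic}: {exc}")
```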

Broker

Kafka clusters consist of multiple servers, known as brokers, to meet high-availability needs. These are the actual machines or instances running the Kafka process. Each broker hosts a set of partitions and handles requests to write and read events to and from those partitions. Brokers also replicate partition data among themselves for fault tolerance.

Partition

Topics are subdivided into partitions, the smallest storage unit in the Kafka hierarchy. Partition is an ordered and immutable sequence of records. Inside each partition, each published event receives an identifier, the offset. The offset is used by the consumer to control which events have been already consumed and which event is the next one to consume. But the consuming order is only guaranteed in the same partition.

Partitions are a unit of parallelism for Kafka consumers.

Each partition is stored as log files replicated across multiple brokers for fault tolerance. These log files are split into segments; a segment is simply a collection of messages of a partition. Segment N contains the most recent records and segment 1 contains the oldest retained records. A segment is closed and a new one created once it holds 1 GB of data (log.segment.bytes) or a week's worth of data (log.roll.ms or log.roll.hours), whichever limit is reached first.

A partition is consumed by only one consumer per consumer group.
At any time a single broker is the leader for a partition; that broker is responsible for receiving and serving the data of that partition.

Note: A Kafka cluster should have fewer than 200k partitions when running with ZooKeeper.
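
To make partitions and offsets concrete, the sketch below (same client assumptions) reads a single partition from an explicit offset instead of joining a consumer group:

```python
from confluent_kafka import Consumer, TopicPartition

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "offset-inspector",   # required by the client, but assign() bypasses group balancing
    "enable.auto.commit": False,
})

# Pin the consumer to partition 0 of the topic and start reading at offset 100.
consumer.assign([TopicPartition("orders", 0, 100)])

msg = consumer.poll(5.0)
if msg is not None and msg.error() is None:
    print(f"offset {msg.offset()}: {msg.value()}")
consumer.close()
```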

Zookeeper

ZooKeeper handles cluster-management activities for Kafka, such as coordinating the brokers, electing the leader for each partition's replicas, etc.

Note: From Kafka 3.3.0, the Apache ZooKeeper dependency can be dropped; Kafka can instead rely on an internal Raft quorum (KRaft), and from 3.3 onwards this mode is production ready.

Stream Processors

Stream processors consume an input stream from one or more topics and produce an output stream to one or more output topics, effectively transforming input streams into output streams. This is mainly used to transform data in ETL/ELT pipelines. Kafka provides an integrated Streams API library, similar in spirit to the consumer API.
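
Kafka Streams itself is a Java library; the same consume-transform-produce pattern can be sketched with the plain clients (assumptions as above, topic names illustrative). The real Streams API adds state stores, windowing, and exactly-once processing on top of this loop.

```python
from confluent_kafka import Consumer, Producer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "uppercase-transformer",
    "auto.offset.reset": "earliest",
})
producer = Producer({"bootstrap.servers": "localhost:9092"})

consumer.subscribe(["raw-events"])          # input stream

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    transformed = msg.value().upper()       # the "processing" step
    producer.produce("clean-events", key=msg.key(), value=transformed)  # output stream
    producer.poll(0)                        # serve delivery callbacks
```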

Connectors

Connectors are reusable producers or consumers that connect Kafka topics to existing applications or data systems; for example, a connector to a relational database might capture every change to a table. Kafka Connect can be treated as a data-integration framework for Kafka. Since connectors are generally developed by experts in the respective systems, they encapsulate the operational aspects as well. Good reasons to use existing connectors include, but are not limited to, the following (a registration sketch follows the list):

Reduced development effort
Automatic Offset management & Recovering lost messages
Pluggable & Highly available and scalable
Standardization
Retrofitting of source/sink
Reduced maintenance
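
Connectors are usually managed through the Kafka Connect REST API (default port 8083). A sketch registering the stock FileStreamSourceConnector via the requests library (paths and names are illustrative):

```python
import requests  # third-party HTTP client, assumed installed

connector = {
    "name": "file-source-demo",
    "config": {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
        "tasks.max": "1",
        "file": "/var/log/app.log",      # file to tail
        "topic": "app-logs",             # destination Kafka topic
    },
}

resp = requests.post("http://localhost:8083/connectors", json=connector)
resp.raise_for_status()
print(resp.json())                        # Connect echoes back the created connector
```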

Schema Registry

In Kafka, producers and consumers do not communicate with each other directly, instead, they transfer data via Kafka topics. Producers must know how to serialize data and consumers must know how to deserialize it.

A Schema Registry is an independent process that runs separately from the Kafka brokers. It stores and retrieves Avro, JSON Schema, Protobuf schemas, etc., and those schemas are used to verify data during serialization and deserialization.
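
A serialization sketch, assuming confluent-kafka is installed with its Avro extras and a registry runs at localhost:8081 (schema and topic are illustrative):

```python
from confluent_kafka.schema_registry import SchemaRegistryClient
from confluent_kafka.schema_registry.avro import AvroSerializer
from confluent_kafka.serialization import SerializationContext, MessageField

schema_str = """
{"type": "record", "name": "Order",
 "fields": [{"name": "amount", "type": "double"}]}
"""

registry = SchemaRegistryClient({"url": "http://localhost:8081"})
serializer = AvroSerializer(registry, schema_str)

# The serializer registers/looks up the schema and embeds its ID in the payload,
# so consumers can fetch the same schema to deserialize.
payload = serializer({"amount": 99.5}, SerializationContext("orders", MessageField.VALUE))
print(len(payload), "bytes ready to be produced to the topic")
```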

Writing strategy

acks — Denotes the number of brokers that must acknowledge the record before the write is considered successful. Valid values are 0, 1, and all.

0 → producer won’t wait for a response from the broker

1 → producer will consider write successful when the leader receives the record

all (-1) → producer will consider the write successful when all of the in-sync replicas receive the record

Semantics

Exactly once → Kafka producer APIs support idempotence (enable.idempotence=true); combined with transactions, a write happens exactly once.
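
A producer configured for the strongest delivery guarantees (sketch, same client assumptions): acks=all waits for all in-sync replicas, and idempotence prevents duplicates caused by retries.

```python
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "localhost:9092",
    "acks": "all",                  # wait for all in-sync replicas
    "enable.idempotence": True,     # broker de-duplicates retried sends
    "retries": 5,                   # safe to retry once idempotence is on
})

producer.produce("orders", key=b"customer-42", value=b"payload")
producer.flush()
```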

Storage

In most cases, data is consumed in a streaming fashion using tail reads. Tail reads leverage the OS's page cache to serve data instead of disk reads. Older data is typically read from disk for backfill or failure-recovery purposes, which is infrequent. In the tiered storage approach, the Kafka cluster is configured with two tiers of storage: local and remote.

Local
Remote (Not Production ready yet)

Configurations

Cluster/Broker

max.message.bytes
default.deserialization.exception.handler
errors.deadletterqueue.topic.name
min.insync.replicas
auto.create.topics.enable (set to false in production)
num.replica.fetchers
default.replication.factor

Producer

bootstrap.servers
client.id
acks
retries
max.in.flight.requests.per.connection
batch.size
linger.ms
compression.type
enable.idempotence

Consumer

bootstrap.servers
client.id
group.id
session.timeout.ms
heartbeat.interval.ms
enable.auto.commit
auto.commit.interval.ms
auto.offset.reset
max.poll.interval.ms
max.poll.records
message.max.bytes

Ports Used

Broker: 9092
Zookeeper: 2181
REST Proxy: 8082
Schema Registry: 8081
KSQL: 8088

Cross Cluster Replication (XDCR)

Kafka uses MirrorMaker 2 (MM2) for replication. MM2 is built on the Kafka Connect framework and is composed of the following connectors:

Mirror Source Connector
Mirror CheckPoint Connector
Mirror Heartbeat Connector

MM2 automatically detects new topics and partitions, while also ensuring the topic configurations are synced between clusters.

Once all consumers are migrated, switch the producers to the new cluster.

Other options are uReplicator (from Uber), Mirus (from Salesforce), Brooklin (from LinkedIn), and Confluent Replicator (from Confluent, not open sourced).

Note: Kafka is not a messaging service. Although it can provide a constrained queueing solution, it is wiser to go for NATS or RabbitMQ as a specialized solution.

Observability & Monitoring (Console / UI)

There are many third-party, open-source, or commercial graphical tools that offer additional administrative and monitoring features.

Kafdrop
Lenses
AKHQ
Kafka UI
Console (Kowl)
Cruise Control UI
Confluent Control Center
Conduktor

Metrics to monitor

Consumer/Producer Log
Consumer/Producer Throughput
Consumer/producer Error Rate
End to End Latency
Broker CPU / Memory

AWS MSK (Managed Kafka)

Control Plane (ZooKeeper, Broker)

Broker and ZooKeeper nodes are created in an Amazon-managed VPC in the service account. In our AWS account, elastic network interfaces (ENIs) are created that allow us to access the broker and ZooKeeper nodes. MSK does not expose any public endpoint, so clients need to reach MSK over a private connection (from the same VPC where the ENIs are created, a peered VPC, via Transit Gateway, and so on).

Data Plane (Producer & Consumer)

Kafka clients have to just focus on the Producers and Consumers and leave the setup and management details to the AWS platform!

Pricing

Compute hours → $0.75 per hour
Partition hours → $0.0015 per partition per hour
Storage → $0.10 per GiB-month
Data In → $0.10 per GiB
Data Out → $0.05 per GiB
Connectors → $0.11 per hour
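
A rough back-of-the-envelope monthly estimate using the rates listed above (workload numbers are purely illustrative):

```python
hours = 730                          # approximate hours in a month
partitions = 100
storage_gib = 500
data_in_gib, data_out_gib = 1000, 2000

cost = (
    hours * 0.75                     # compute hours
    + hours * partitions * 0.0015    # partition hours
    + storage_gib * 0.10             # storage (GiB-month)
    + data_in_gib * 0.10             # data in
    + data_out_gib * 0.05            # data out
)
print(f"~${cost:,.2f} per month")    # ~$907 for this example workload
```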

Note: AWS offers Schema Registry using AWS Glue.

Redpanda (Alternative to Kafka)

  • C++ rewrite of Kafka on the Seastar framework
  • No JVM
  • Thread per core strategy
  • No Zk (Uses Raft algorithm)
  • Parallel commits
  • Inline WebAssembly transformation (no need to use a separate processor like Flink). It uses the V8 engine.
  • Shadow indexing
  • Async scheduling
  • Production-ready tiered storage (Memory → Disk →Object Storage)
  • Purpose-built RPC for message exchange
  • Fully compatible with Kafka API
  • Support for ARM

Other alternatives are Apache Pulsar, StreamSets, Flow, and Confluent.

Happy streaming !!
