Kafka producer delivery semantics

Sylvester John
5 min read · Jun 30, 2019


This article continues the series after part 1 (Kafka technical overview) and part 2 (Kafka producer overview). Let’s look at the different delivery semantics and how to achieve them using producer and broker properties.

Delivery semantics

Depending on the broker and producer configuration, Kafka supports all three delivery semantics: “at most once”, “at least once” and “exactly once”.

[Figure: Different delivery semantics]

At most once

In at-most-once delivery, a message is delivered once at most. Losing a message is acceptable; delivering it twice is not. Typical use cases include metrics collection and log collection. Because the producer neither waits for acknowledgments nor retries, applications adopting at-most-once semantics can achieve higher throughput and lower latency.

[Figure: At-most-once delivery semantic]
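The behavior can be sketched in a few lines of Python. This is a toy model, not Kafka client code: the function `send_at_most_once`, the `is_lost` callback and the "every third message is dropped" network are all assumptions made for illustration.

```python
def send_at_most_once(messages, is_lost):
    """Fire-and-forget: every message is sent one time and never retried."""
    delivered = []
    for msg in messages:
        if not is_lost(msg):
            delivered.append(msg)   # the message reached the broker
        # otherwise it is silently dropped and the producer never finds out
    return delivered


# Hypothetical lossy network that drops every third message.
messages = [f"metric-{i}" for i in range(1, 7)]
delivered = send_at_most_once(
    messages, is_lost=lambda m: int(m.split("-")[1]) % 3 == 0
)
print(delivered)  # ['metric-1', 'metric-2', 'metric-4', 'metric-5']
```

Two messages are lost, but nothing is ever delivered twice, which is the trade-off this semantic accepts.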

At least once

In at-least-once delivery, a message may be delivered more than once, but no message should be lost. The producer keeps retrying until delivery is confirmed, even though this may produce duplicates. This is the most commonly preferred semantic of the three. Applications adopting at-least-once semantics typically see moderate throughput and moderate latency.

[Figure: At-least-once semantic with retry]
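A small simulation shows where the duplicates come from. It is a sketch under stated assumptions: `send_at_least_once` and the `ack_lost_on_first_try` set are hypothetical names, and the "broker" is just a Python list.

```python
def send_at_least_once(messages, ack_lost_on_first_try, max_retries=3):
    """Resend each message until an acknowledgment comes back."""
    broker_log = []
    for msg in messages:
        for attempt in range(max_retries + 1):
            broker_log.append(msg)  # the broker stores the record
            ack_received = not (attempt == 0 and msg in ack_lost_on_first_try)
            if ack_received:
                break               # confirmed, move on to the next message
            # No ack: the producer cannot tell a lost ack from a lost
            # message, so it sends the same record again.
    return broker_log


log = send_at_least_once(["a", "b", "c"], ack_lost_on_first_try={"b"})
print(log)  # ['a', 'b', 'b', 'c']
```

Nothing is lost, but "b" is stored twice because its first acknowledgment never reached the producer.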

Exactly once

In exactly-once delivery, a message must be delivered once and only once: no message is lost and none is duplicated. This is the most difficult delivery semantic to achieve. Applications adopting exactly-once semantics may have lower throughput and higher latency compared to the other two semantics.

[Figure: Exactly-once delivery semantics]
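One common way to get there is to keep retrying, as in at-least-once, and let the receiving side discard retries it has already stored. Kafka's idempotent producer rests on a similar idea (a producer ID plus sequence numbers); the `DedupBroker` class below is only a toy illustration of that idea, with hypothetical names, and not how the broker is actually implemented.

```python
class DedupBroker:
    """Toy broker that ignores records it has already stored."""

    def __init__(self):
        self.log = []
        self.last_seq = {}  # producer_id -> highest sequence number stored

    def append(self, producer_id, seq, value):
        if seq <= self.last_seq.get(producer_id, -1):
            return "duplicate"  # a retry of a stored record: not stored again
        self.log.append(value)
        self.last_seq[producer_id] = seq
        return "stored"


broker = DedupBroker()
results = [
    broker.append("p1", 0, "a"),
    broker.append("p1", 1, "b"),
    broker.append("p1", 1, "b"),  # retry after a lost acknowledgment
    broker.append("p1", 2, "c"),
]
print(results)     # ['stored', 'stored', 'duplicate', 'stored']
print(broker.log)  # ['a', 'b', 'c']
```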

Delivery semantics summary

The table below summarizes the behavior of all delivery semantics.

[Table: Delivery semantics summary]

Producer delivery semantics

Different delivery semantics can be achieved in Kafka using the acks property of the producer and the min.insync.replicas property of the broker or topic (considered only when acks = all).

Acks = 0

When the acks property is set to 0, you get at-most-once delivery semantics. The Kafka producer sends the record to the broker and doesn’t wait for any response. Once sent, messages are not retried in this setting, because the producer never learns about a failure. With acks = 0 the producer uses a “send and forget” approach.

[Figure: Kafka producer acks = 0]
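As a configuration fragment this might look like the sketch below. The keys are the standard Kafka producer property names, written as a plain Python dictionary of the kind a client such as confluent-kafka accepts; the broker address is a placeholder and `waits_for_broker` is a hypothetical helper added only to make the effect explicit.

```python
fire_and_forget_config = {
    "bootstrap.servers": "localhost:9092",  # assumption: placeholder address
    "acks": "0",                            # do not wait for any broker response
}


def waits_for_broker(config):
    """True if the producer waits for an acknowledgment after sending."""
    return str(config["acks"]) != "0"


print(waits_for_broker(fire_and_forget_config))  # False
```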

Data loss

In this mode the chance of data loss is high, because the producer never confirms that the broker received the message. The message may never have reached the broker, or a broker failure soon after delivery may have lost it.

[Figure: Kafka producer acks = 0, data loss]

Acks = 1

[Figure: Kafka producer acks = 1]

When acks is set to 1, you can achieve at-least-once delivery semantics. The Kafka producer sends the record to the broker and waits for a response. If no acknowledgment is received, the producer retries sending the message according to its retry configuration. In Kafka clients older than 2.1 the retries property defaults to 0, so make sure it is set to the desired number or to Integer.MAX_VALUE; from 2.1 onwards the default is already Integer.MAX_VALUE.

[Figure: Kafka producer acks = 1, retry]
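A minimal configuration sketch for this mode, again as a plain dictionary of standard Kafka producer property names. The broker address is a placeholder, and `MAX_INT` is a local constant standing in for Java's Integer.MAX_VALUE.

```python
MAX_INT = 2**31 - 1  # same value as Java's Integer.MAX_VALUE

at_least_once_config = {
    "bootstrap.servers": "localhost:9092",  # assumption: placeholder address
    "acks": "1",                            # wait for the leader's acknowledgment
    "retries": MAX_INT,                     # set explicitly instead of relying on the default
}

for key, value in sorted(at_least_once_config.items()):
    print(f"{key}={value}")
```

Setting retries explicitly keeps the behavior the same across client versions with different defaults.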

Data loss

In this mode the chance of data loss is moderate, because the producer confirms that the message was received by the broker (the leader partition). Replication to the follower partitions happens after the acknowledgment, so data can still be lost. For example, if the broker goes down after sending the acknowledgment but before the followers have replicated the record, the message is lost: the producer already has its acknowledgment and will not resend.

[Figure: Kafka producer acks = 1, data loss]
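The failure window can be walked through with a toy model. `Replica` and `produce_acks_1` are hypothetical names, and leader election is reduced to "the first follower takes over"; this is a sketch of the timing problem, not of Kafka's replication protocol.

```python
class Replica:
    def __init__(self):
        self.log = []


def produce_acks_1(leader, followers, record, leader_crashes_before_replication):
    """The leader acknowledges as soon as it has written the record locally."""
    leader.log.append(record)
    ack = True  # sent before any follower has the record
    if leader_crashes_before_replication:
        return ack, followers[0]  # a follower without the record takes over
    for follower in followers:
        follower.log.append(record)  # normal case: followers catch up afterwards
    return ack, leader


leader, followers = Replica(), [Replica(), Replica()]
ack, new_leader = produce_acks_1(
    leader, followers, "order-42", leader_crashes_before_replication=True
)
print(ack, new_leader.log)  # True []
```

The producer holds an acknowledgment for a record that the new leader has never seen.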

Acks = All

[Figure: Kafka producer acks = all]

Setting the acks property to all is the basis for exactly-once delivery semantics; on its own it gives at-least-once delivery with the strongest durability. The Kafka producer sends the record to the broker and waits for a response. If no acknowledgment is received, the producer retries sending the message up to the configured number of retries. The broker sends the acknowledgment only after the record has been replicated to the in-sync replicas, subject to the min.insync.replicas property.

For example, a topic may have a replication factor of 3 and min.insync.replicas of 2. In this case the acknowledgment is sent only after every in-sync replica has the record, and the write is accepted only while at least two replicas (the leader included) are in sync. To achieve exactly-once delivery semantics the producer also has to be idempotent (enable.idempotence = true), so that the broker can discard duplicates created by retries. acks = all should be used in conjunction with min.insync.replicas.
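The acknowledgment rule for this example can be written as a small predicate. It is a sketch that assumes the leader waits for every replica currently in the in-sync set; `can_acknowledge` and the broker ids 1 to 3 are hypothetical.

```python
def can_acknowledge(in_sync_replicas, replicas_with_record, min_insync_replicas):
    """acks = all: enough replicas are in sync and all of them have the record."""
    if len(in_sync_replicas) < min_insync_replicas:
        return False  # too few in-sync replicas: the write is rejected
    return set(in_sync_replicas) <= set(replicas_with_record)


# Replication factor 3, min.insync.replicas = 2.
print(can_acknowledge({1, 2, 3}, {1}, 2))        # False: followers not caught up
print(can_acknowledge({1, 2, 3}, {1, 2, 3}, 2))  # True
print(can_acknowledge({1, 2}, {1, 2}, 2))        # True: broker 3 is out of sync
print(can_acknowledge({1}, {1}, 2))              # False: only the leader is in sync
```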

Data loss

[Figure: Kafka producer acks = all, data loss]

In this mode the chance of data loss is low, because the broker acknowledges the message only after it has been replicated to the follower partitions. For example, if the broker goes down before replicating the record and sending the acknowledgment, the producer does not receive the acknowledgment and sends the message again to the newly elected leader partition.

Exception

When fewer replicas are in sync than the min.insync.replicas property requires, the broker returns an exception (NotEnoughReplicas or NotEnoughReplicasAfterAppend) instead of an acknowledgment.

[Figure: Kafka producer acks = all, exception]
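Seen from the producer, the rejection is an error it can retry until enough replicas are back in sync. The sketch below is a toy model: the `NotEnoughReplicas` class is a local stand-in rather than the real client exception, and `broker_append` and `send` are hypothetical helpers.

```python
class NotEnoughReplicas(Exception):
    """Local stand-in for the broker error; not the real client class."""


def broker_append(isr_size, min_insync_replicas):
    if isr_size < min_insync_replicas:
        raise NotEnoughReplicas(
            f"in-sync replicas: {isr_size}, required: {min_insync_replicas}"
        )
    return "ack"


def send(isr_size_per_attempt, min_insync_replicas=2):
    """Try once per entry in the list; return (result, attempts used)."""
    for attempt, isr_size in enumerate(isr_size_per_attempt, start=1):
        try:
            return broker_append(isr_size, min_insync_replicas), attempt
        except NotEnoughReplicas:
            continue  # rejected before the write in this model, so try again
    return "failed", len(isr_size_per_attempt)


print(send([1, 1, 2]))  # ('ack', 3)
print(send([1, 1]))     # ('failed', 2)
```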

Safe producer

To create a safe producer that minimizes data loss, use the producer and broker properties below.

Producer properties

  • acks = all (default 1 before Kafka 3.0, all since): ensures replication before acknowledgment
  • retries = MAX_INT (default 0 before Kafka 2.1, MAX_INT since): retry in case of exceptions
  • max.in.flight.requests.per.connection = 5 (default): maximum number of unacknowledged requests per connection to the broker

Broker properties

  • min.insync.replicas = 2 (at least 2): ensures a minimum number of in-sync replicas (ISR)
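Put together, the settings above could be captured and sanity-checked as in the sketch below. The dictionaries use the standard Kafka property names; the broker address is a placeholder, and `find_problems` is a hypothetical helper that treats a missing key as unset rather than applying any version-specific default.

```python
MAX_INT = 2**31 - 1  # same value as Java's Integer.MAX_VALUE

safe_producer_config = {
    "bootstrap.servers": "localhost:9092",  # assumption: placeholder address
    "acks": "all",
    "retries": MAX_INT,
    "max.in.flight.requests.per.connection": 5,
}
safe_topic_config = {"min.insync.replicas": 2}


def find_problems(producer, topic, replication_factor):
    """List the settings that weaken the durability guarantees above."""
    problems = []
    if str(producer.get("acks")) not in ("all", "-1"):  # -1 is an alias for all
        problems.append("acks should be all")
    if int(producer.get("retries", 0)) < 1:
        problems.append("retries should be greater than 0")
    min_isr = int(topic.get("min.insync.replicas", 1))
    if min_isr < 2:
        problems.append("min.insync.replicas should be at least 2")
    if min_isr > replication_factor:
        problems.append("min.insync.replicas exceeds the replication factor")
    return problems


print(find_problems(safe_producer_config, safe_topic_config, replication_factor=3))  # []
print(find_problems({"acks": "1"}, {}, replication_factor=3))
```

The second call reports three problems: acks, retries and min.insync.replicas.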

Acks impact

The table below summarizes the impact of the acks property on latency, throughput, and durability.

[Table: Kafka producer acks property impact]

Summary

Configure the Kafka producer and broker to achieve the desired delivery semantics using the following properties.

  • acks
  • retries
  • max.in.flight.requests.per.connection
  • min.insync.replicas

In part 4 of the series, we’ll look at the Kafka consumer, consumer groups, and how to achieve the different Kafka consumer delivery semantics.
