Processing guarantees in Kafka

How do we guarantee all messages are processed?

How do we avoid or handle duplicate messages?

  • No guarantee — No explicit guarantee is provided, so consumers may process messages once, multiple times or never at all.
  • At most once — This is “best effort” delivery semantics. Consumers will receive and process messages exactly once or not at all.
  • At least once — Consumers will receive and process every message, but they may process the same message more than once.
  • Effectively once — Also contentiously known as exactly once, this promises that every message is processed and that its effects are applied only once, even if the message is delivered more than once.

  • Producer failure
  • Consumer publish remote call failure
  • Messaging system failure
  • Consumer processing failure
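
The difference between the at most once and at least once guarantees listed above often comes down to whether a consumer records its position before or after it processes a message. The following is a minimal sketch of that ordering using an in-memory list in place of a Kafka topic; `run`, `replay`, and the `crash_at` parameter are hypothetical names for illustration, not part of any Kafka client API.

```python
from typing import List, Optional


def run(log: List[str], start: int, processed: List[str],
        commit_first: bool, crash_at: Optional[int] = None) -> int:
    """Consume `log` from offset `start` and return the last committed offset.

    commit_first=True  -> save the position, then process (at most once)
    commit_first=False -> process, then save the position (at least once)
    `crash_at` simulates a crash between the two steps for that offset.
    """
    committed = start
    for offset in range(start, len(log)):
        if commit_first:
            committed = offset + 1           # position saved before the work
            if offset == crash_at:
                return committed             # crash: message is never processed
            processed.append(log[offset])
        else:
            processed.append(log[offset])    # work done before the position is saved
            if offset == crash_at:
                return committed             # crash: message will be redelivered
            committed = offset + 1
    return committed


def replay(commit_first: bool) -> List[str]:
    """Run once with a crash at offset 1, then restart from the committed offset."""
    log = ["a", "b", "c"]
    processed: List[str] = []
    committed = run(log, 0, processed, commit_first, crash_at=1)
    run(log, committed, processed, commit_first)
    return processed


print(replay(commit_first=True))
print(replay(commit_first=False))
```

With the position saved first, the message at the crashed offset is skipped after the restart. With the position saved last, the same message is processed twice.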

Kafka Consumer API

No guarantee

At most once

At least once

Effectively once

Idempotent writes


Effectively once production of data to Kafka


Effectively once consuming of data from Kafka

Effectively once in Kafka Streams

Side effects


Effectively once for the win?

Composing guarantees


Using a key to identify duplicates

Using a sequence number for deduplication
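
One way this could look, as a minimal sketch: the consumer remembers the highest sequence number it has accepted from each producer and drops anything at or below it. This assumes each producer assigns strictly increasing sequence numbers and that messages from one producer arrive in order; `SequenceDeduplicator`, `accept`, and `dedupe` are hypothetical names used only for illustration.

```python
from typing import Dict, List, Tuple

Message = Tuple[str, int, str]  # (producer_id, sequence_number, payload)


class SequenceDeduplicator:
    """Tracks the highest sequence number accepted per producer."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, int] = {}

    def accept(self, producer_id: str, seq: int) -> bool:
        """Return True for a new message, False for a duplicate or replay."""
        last = self._last_seen.get(producer_id, -1)
        if seq <= last:
            return False                     # already seen this or a later message
        self._last_seen[producer_id] = seq
        return True


def dedupe(messages: List[Message]) -> List[Message]:
    """Keep only the first occurrence of each (producer, sequence) pair."""
    dedup = SequenceDeduplicator()
    return [m for m in messages if dedup.accept(m[0], m[1])]


incoming = [
    ("p1", 0, "a"),
    ("p1", 1, "b"),
    ("p1", 1, "b"),   # redelivered
    ("p2", 0, "x"),
    ("p1", 2, "c"),
    ("p1", 0, "a"),   # late replay
]
print(dedupe(incoming))
```

The state kept per producer is a single integer, which is the main appeal of this approach. In this sketch that state lives in memory only, so it would need to be persisted alongside the processing results to survive a consumer restart.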

Using a datastore for deduplication
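
As a rough sketch of this idea, the consumer can record each message ID in the same transaction that applies the message's effect, so a redelivered message fails the uniqueness check and changes nothing. The example uses an in-memory SQLite database from Python's standard library as a stand-in for the datastore; the table names, `open_store`, and `apply_once` are assumptions made for illustration.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory store with a dedup table and one running total."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE balance (id INTEGER PRIMARY KEY, amount INTEGER NOT NULL)")
    conn.execute("INSERT INTO balance (id, amount) VALUES (1, 0)")
    conn.commit()
    return conn


def apply_once(conn: sqlite3.Connection, message_id: str, delta: int) -> bool:
    """Apply `delta` unless `message_id` was already processed.

    The ID insert and the balance update share one transaction, so either
    both are stored or neither is.
    """
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO processed (message_id) VALUES (?)", (message_id,))
            conn.execute("UPDATE balance SET amount = amount + ? WHERE id = 1", (delta,))
        return True
    except sqlite3.IntegrityError:
        return False  # primary key violation: this message is a duplicate


conn = open_store()
apply_once(conn, "m1", 10)
apply_once(conn, "m2", 5)
apply_once(conn, "m1", 10)  # redelivery of m1, ignored
print(conn.execute("SELECT amount FROM balance WHERE id = 1").fetchone()[0])
```

This only works when the effect and the dedup record live in the same transactional store. An effect applied to a different system would need its own idempotency mechanism.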





Spark Streaming