The Kafka API Battle: Producer vs Consumer vs Kafka Connect vs Kafka Streams vs KSQL!

Stéphane Maarek
Oct 29, 2018 · 6 min read

It’s actually really simple


Kafka is a beast to learn. Although the core of Kafka remains fairly stable over time, the frameworks around Kafka move at the speed of light.

A few years ago, Kafka was really simple to reason about: Producers & Consumers. Now we also have Kafka Connect, Kafka Streams and KSQL in the mix. Do they replace the Consumer and Producer APIs, or complement them?

Let’s make sense of it all!


One simple diagram

Using the right Kafka API

  • Kafka Producer API: Applications directly producing data (e.g. clickstream, logs, IoT).
  • Kafka Connect Source API: Applications bridging between a datastore we don’t control and Kafka (e.g. CDC, Postgres, MongoDB, Twitter, REST API).
  • Kafka Streams API / KSQL: Applications that consume from Kafka and produce back into Kafka, also called stream processing. Use KSQL if you think your real-time job can be written as SQL-like queries; use the Kafka Streams API if you think you’re going to need complex logic for your job.
  • Kafka Consumer API: Read a stream and perform real-time actions on it (e.g. send email…)
  • Kafka Connect Sink API: Read a stream and store it in a target store (e.g. Kafka to S3, Kafka to HDFS, Kafka to PostgreSQL, Kafka to MongoDB, etc.)
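To make the guidelines above concrete, here is a minimal sketch that encodes them as a lookup on where the data comes from and where it goes. The names `KafkaApi` and `choose_api`, and the string labels "app", "datastore", "kafka" and "action", are made up for this illustration; they are not part of any Kafka library.

```python
from enum import Enum


class KafkaApi(Enum):
    PRODUCER = "Kafka Producer API"
    CONNECT_SOURCE = "Kafka Connect Source API"
    STREAMS = "Kafka Streams API"
    KSQL = "KSQL"
    CONSUMER = "Kafka Consumer API"
    CONNECT_SINK = "Kafka Connect Sink API"


def choose_api(reads_from: str, writes_to: str, sql_like: bool = False) -> KafkaApi:
    """Suggest an API from the data's origin and destination.

    reads_from: "app" (your own code), "datastore" (a system you don't
                control) or "kafka"
    writes_to:  "kafka", "datastore" or "action" (e.g. sending an email)
    sql_like:   only used for Kafka-to-Kafka jobs; True if the job can be
                expressed as SQL-like queries
    """
    # Kafka in, Kafka out: stream processing.
    if reads_from == "kafka" and writes_to == "kafka":
        return KafkaApi.KSQL if sql_like else KafkaApi.STREAMS
    # Getting data into Kafka.
    if writes_to == "kafka":
        if reads_from == "app":
            return KafkaApi.PRODUCER
        if reads_from == "datastore":
            return KafkaApi.CONNECT_SOURCE
    # Getting data out of Kafka.
    if reads_from == "kafka":
        if writes_to == "action":
            return KafkaApi.CONSUMER
        if writes_to == "datastore":
            return KafkaApi.CONNECT_SINK
    raise ValueError(f"no guideline for {reads_from!r} -> {writes_to!r}")


if __name__ == "__main__":
    # Example: a job that reads a topic and archives it to an object store.
    print(choose_api("kafka", "datastore").value)
```

This is only a reading aid for the list; real projects will have cases that don’t fit one row cleanly.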

You may want to do things differently, and it’s possible you will make it work. For example, the Kafka Consumer API and the Kafka Connect Sink API are largely interchangeable if you’re willing to write a lot of custom code for your needs.

Overall, the guidelines above should help you achieve the most efficient workflows with the least amount of code and frustration.


Kafka Producer API

Advantages

It is very common to use this kind of API in combination with a Proxy.

Limitations

  • How to track the source offsets? (i.e. how to properly resume your producer if it was stopped)
  • How to distribute the load for your ETL across many producers?

For this, we’re much better off using the Kafka Connect Source API.
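To give a feel for the first limitation, here is a rough sketch of the bookkeeping you end up writing yourself when a hand-rolled producer has to resume where it stopped. It uses no Kafka client at all: `send` is a stand-in for whatever producer call you would use, `rows` stands in for the source table, and the checkpoint file and every function name here are assumptions made up for this example.

```python
import json
import os
import tempfile
from typing import Callable, Optional, Sequence


def load_offset(path: str) -> int:
    """Return the checkpointed position of the next row to send (0 on first run)."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["offset"]


def save_offset(path: str, offset: int) -> None:
    """Persist the position atomically: write a temp file, then rename over the old one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"offset": offset}, f)
    os.replace(tmp, path)


def run_source(
    rows: Sequence[str],
    send: Callable[[str], None],
    checkpoint_path: str,
    max_rows: Optional[int] = None,
) -> int:
    """Send rows starting at the checkpoint; return how many were sent.

    max_rows lets the caller stop early, which simulates the process
    being stopped partway through the source.
    """
    start = load_offset(checkpoint_path)
    sent = 0
    for position in range(start, len(rows)):
        if max_rows is not None and sent >= max_rows:
            break
        send(rows[position])
        # Checkpoint after sending: a crash between these two lines means
        # the row is sent again on restart (at-least-once, not exactly-once).
        save_offset(checkpoint_path, position + 1)
        sent += 1
    return sent


if __name__ == "__main__":
    checkpoint = os.path.join(tempfile.mkdtemp(), "source-offset.json")
    table = ["row-1", "row-2", "row-3", "row-4"]
    run_source(table, print, checkpoint, max_rows=2)  # stopped after two rows
    run_source(table, print, checkpoint)              # resumes with the rest
```

Even this toy version has to make decisions about atomic writes and duplicates, and it does nothing for the second limitation: it assumes a single producer and offers no way to split the work across several.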


Kafka Connect Source API

Advantages

Limitations

Kafka Consumer API

Advantages

Limitations

Kafka Connect Sink API

Advantages

Limitations

Kafka Streams API

Advantages

Limitations

KSQL

Advantages

Limitations


Wrapping up

Happy learning!

If you liked this article, don’t forget to clap and share!


Written by Stéphane Maarek, Kafka Evangelist, Udemy Instructor, New Tech Hunter