Guide to Confluent Certified Developer for Apache Kafka exam

Nilay Sundarkar
Published in Geek Culture
Nov 8, 2021

I recently passed the Confluent Certified Developer for Apache Kafka exam. In this post, I will share my journey and provide a guide for those who are interested in pursuing this certification.

Why should you pursue this certification?

Apache Kafka is one of the most widely used messaging systems in the data engineering field. It was developed at LinkedIn and enables designing massively scalable messaging systems. It is used by thousands of companies today for high-performance data pipelines, analytics and data integration.

Confluent was founded by the people who created Kafka at LinkedIn. At the time of writing, there is no other certification for Kafka that is as credible as Confluent's.

Learning the internals of a distributed event streaming platform like Kafka not only adds value to one's profile but also provides design insights into the complex yet incredibly powerful and solid foundations of the platform.

Which areas does the exam test you on?

Kafka Internals

  • Broker
  • Consumer
  • Producer
  • Monitoring
  • Delivery Guarantees, Cluster internals, Replication concepts.

Kafka Streams

KSQL and ksqlDB

Confluent Schema Registry

Confluent Kafka Connectors

Do you need previous experience?

I have been working with Apache Kafka for well over three years now. In that time, I have worked closely on complex scenarios while setting up consumers and producers. I have a deep understanding of libraries such as Spring Kafka, Spring Cloud Stream and Kafka Streams, and have solved some really complex design challenges using them. I have seen firsthand how enterprise-level security is set up on Kafka clusters, and I have troubleshot some of the trickiest problems that some of our teams were facing. I have also tried out POCs on Confluent Schema Registry and KSQL. Almost half of the questions, if not more, were a walk in the park for me because of this work experience. Though the certificate will never match the knowledge you get from working on real-world designs and issues, you can definitely acquire essential knowledge and potentially unlock great opportunities by starting your Kafka journey with the certification.

How hard or easy are the exam questions?

The exam tests your knowledge through both concept questions and scenario-based questions. The scenario-based questions are intentionally deceptive. For example, I got the same scenario (a set of brokers, a set of consumers, a set of consumer groups, a set of producers, etc.) for 10 different questions, each asking about a different issue or design consideration. Without hands-on experience, such questions could be extremely hard to answer.

How much hands-on practice do you need? Can you pass with just theoretical knowledge?

Personally, I would recommend, at a minimum, getting some hands-on practice with the following:

  • Understand how the Kafka Consumer works with respect to offset commits. Try out auto vs manual commits. When trying out manual commits, understand the differences between sync and async commits.
  • Understand how the consumer group rebalance works in theory.
  • Understand consumer properties deeply, especially those that dictate timeout settings such as session timeout and heartbeat interval.
  • Understand how a Kafka Producer works. Try out the async vs sync mode of producing to a topic.
  • Understand how different producer properties such as linger.ms, request timeout, max byte size, etc. affect the async producer's behavior under low vs heavy load.
  • Try out blocking (sync) producers. Try out callbacks with async producers.
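
The effect of committing offsets after processing can be illustrated without a broker. The following is a minimal Python sketch, not Kafka client code: run_consumer, commit_every and crash_after are hypothetical names made up for this illustration, and a plain list stands in for a single partition. It assumes the consumer commits manually after processing, and shows why a crash between processing and committing leads to reprocessing (at-least-once delivery).

```python
from typing import List, Optional, Tuple


def run_consumer(
    log: List[str],
    committed: int,
    commit_every: int,
    crash_after: Optional[int] = None,
) -> Tuple[List[str], int]:
    """Simulate one run of a consumer that commits manually after processing.

    log          -- stand-in for a single partition (index == offset)
    committed    -- offset to resume from (the next record to read)
    commit_every -- commit after this many records processed in this run
    crash_after  -- if set, the run dies after processing this many records
    Returns (records processed in this run, committed offset afterwards).
    """
    processed: List[str] = []
    for offset in range(committed, len(log)):
        processed.append(log[offset])  # "process" the record first
        if crash_after is not None and len(processed) == crash_after:
            # Crash before the next commit: uncommitted progress is lost.
            return processed, committed
        if len(processed) % commit_every == 0:
            committed = offset + 1  # a commit stores the NEXT offset to read
    # Clean shutdown: commit the final position.
    return processed, len(log)


log = ["a", "b", "c", "d", "e"]

# First run crashes after processing three records; only two were committed.
first_run, committed = run_consumer(log, committed=0, commit_every=2, crash_after=3)

# The restarted consumer resumes from the committed offset and sees "c" again.
second_run, committed_after_restart = run_consumer(log, committed, commit_every=2)

print(first_run, committed)
print(second_run, committed_after_restart)
```

With the real Java consumer, the corresponding things to experiment with are the enable.auto.commit setting and the commitSync() and commitAsync() calls; the sketch only models the ordering of "process" and "commit", not their timing or failure modes.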

In addition to the above, understand the following concepts very thoroughly:

  • Understand stateful vs stateless stream processing. The exam throws sample business scenarios and asks you to identify stateless vs stateful cases.
  • Understand the differences between hopping, sliding, tumbling and session windows in Kafka Streams. There is one basic question around these for sure.
  • Understand the different joins one can do on KStreams, KTables and GlobalKTables and what the resulting output is.
  • Understand the broker internals such as leader election, failover, etc. very well.
  • Have a good understanding of Kafka Connectors, Confluent Schema Registry and KSQL. There were fewer questions on these topics and most of them can be answered based on theoretical knowledge.
  • Understand how Kafka can be tuned to achieve exactly once semantics.
  • Understand ordering guarantees with respect to increase and decrease in the number of partitions of a topic.
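
For the windowing item in the list above, it can help to see how a record timestamp maps onto time-based windows. Below is a small Python sketch, assuming epoch-aligned windows with an inclusive start and an exclusive end; windows_for is a hypothetical helper written for this illustration, not a Kafka Streams API. A tumbling window is treated as a hopping window whose advance equals its size. Session and sliding windows depend on the records themselves and are not modeled here.

```python
from typing import List, Tuple


def windows_for(timestamp_ms: int, size_ms: int, advance_ms: int) -> List[Tuple[int, int]]:
    """Return every epoch-aligned [start, end) window containing the timestamp.

    size_ms    -- window length
    advance_ms -- distance between consecutive window starts
                  (advance == size gives tumbling windows,
                   advance <  size gives overlapping hopping windows)
    """
    if not 0 < advance_ms <= size_ms:
        raise ValueError("advance must be positive and no larger than size")

    # Earliest aligned window start that can still contain the timestamp.
    start = max(0, timestamp_ms - size_ms + advance_ms) // advance_ms * advance_ms

    windows: List[Tuple[int, int]] = []
    while start <= timestamp_ms:
        windows.append((start, start + size_ms))
        start += advance_ms
    return windows


# Tumbling: 5 s windows, no overlap, so a record lands in exactly one window.
tumbling = windows_for(7_000, size_ms=5_000, advance_ms=5_000)

# Hopping: 5 s windows advancing every 2 s, so a record can land in several.
hopping = windows_for(7_000, size_ms=5_000, advance_ms=2_000)

print(tumbling)
print(hopping)
```

The point to take away for the exam is the shape of the result: one window per record for tumbling windows, possibly several overlapping windows per record for hopping windows.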

What resources can you use to cover the exam topics?

  • [Reference] Apache Kafka and Confluent official documentation and Confluent blogs. These two websites are an absolute gold mine of information, but can be very daunting to study end to end. I frequently consulted and read through different sections of the official site/blog. For the purpose of the exam, I treated these more as reference material.
  • [Book] Kafka: The Definitive Guide, 2nd edition. This book was released only yesterday. I had early access to it through my O'Reilly account. Probably the best place to learn the internals of Kafka. Spend enough time on this.
  • [Book] Mastering Kafka Streams and ksqlDB. At the time of writing this blog, this book is being given away for free by Confluent. I skimmed through this one, but I liked the coverage given to both Kafka Streams and ksqlDB. I am quite familiar with the Processor API and the Streams DSL of Kafka Streams. But for someone who does not have much experience with these two topics, spending a good amount of time on the concepts discussed in this book will be a good idea.
  • [Courses] Stephane Maarek’s courses on Udemy for Kafka Connect and Confluent Schema Registry. I did not spend a lot of time on these two topics except for understanding the concepts very well. But the Udemy courses cover what is needed for the exam.
  • [Mock Exams] Stephane Maarek’s practice question papers on Udemy are a good resource. The questions he has on the practice exams are very close to what I faced in the real exam.

How to prepare for the exam, an opinion.

I passed the exam in a single attempt. The hands-on experience I had and the hours I put into preparing for the exam using the resources above definitely helped in clearing the exam. But what made the difference was my preparation strategy in the last 10 days before the exam. By then, I was through most of the literature and course material. I started taking the mock tests and spent a lot of time on the questions that I got wrong. I then revised only those areas and repeated the whole cycle. This feedback loop of mock tests, revision and more mock tests greatly helped me answer all the questions very confidently on the exam.

In conclusion, one needs to prepare for the topics covered by the exam and also for how to beat the exam.

Nilay Sundarkar
Geek Culture

Principal Engineer at a multinational Fortune 15 company. Backend developer with over 15 years of industry experience.