Best books on Apache Kafka and Event Stream Processing

Linda Yong-Ju
Published in Microservice Geeks
10 min read · Nov 28, 2021

No blog on microservices is complete without a discussion on service-to-service communication and event-driven messaging. In honour of this, we curated a list of the absolute best Apache Kafka and event stream processing books — based on Amazon user reviews, Goodreads ratings and, more importantly, our own reading experience. That’s right, our geeks have read each one of these books!

There are lots of free resources (articles, blogs and tutorials) available online for Apache Kafka and related technologies. And some of them are even useful! However, we found that the quality, accuracy and coverage of these resources vary widely, which makes relying on them a hit-and-miss affair. Sometimes it is better to do things the proven way, with a good book written by a recognised expert in the field. Few resources can match the in-depth, comprehensive detail of the books we have listed here today.

How do we rate books? We start with a handful of books that have rated well over time on Amazon and Goodreads. We then ask our in-house geeks to do their own comprehensive review of each book and assign it a rating out of five. The geeks have to work through at least 50% of a book before they are allowed to review it. We then average the three ratings (Amazon, Goodreads and our geeks) and rank the books accordingly. Our reviews are entirely independent: we are unaffiliated with the authors and do not profit from the sales of their books. The authors are not involved at any point in this process.
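
For readers who like to check the arithmetic, here is a minimal sketch of the scoring step. It assumes the overall score is a plain unweighted mean of the three ratings, rounded to two decimals, which is consistent with the figures shown in this post. The function names are our own illustration and not part of any tool we use; the ratings are copied from the sections below.

```python
from statistics import mean

# Ratings copied from this post, in the order (Goodreads, Amazon, Geeks).
ratings = {
    "Effective Kafka": (5.0, 4.5, 5.0),
    "Kafka Streams in Action": (4.0, 4.4, 4.0),
    "Kafka: The Definitive Guide": (4.1, 4.4, 3.5),
    "Event Streams in Action": (3.3, 4.7, 3.0),
    "Building Data Streaming Applications with Apache Kafka": (2.9, 4.0, 3.0),
}


def overall(scores):
    """Unweighted mean of the three ratings, rounded to two decimals (assumed)."""
    return round(mean(scores), 2)


def rank(book_ratings):
    """Return (title, overall score) pairs, highest overall score first."""
    scored = [(title, overall(scores)) for title, scores in book_ratings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


for place, (title, score) in enumerate(rank(ratings), start=1):
    print(f"{place}. {title}: {score:.2f}/5")
```

Because every source carries the same weight in this sketch, a single unusually high or low rating moves the overall score by a third of the difference.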

1st Place — Effective Kafka: A Hands-On Guide to Building Robust and Scalable Event-Driven Applications

Author: Emil Koutanov

Publisher’s description:

The software architecture landscape has evolved dramatically over the past decade. Microservices have displaced monoliths. Data and applications are increasingly becoming distributed and decentralised. But composing disparate systems is a hard problem. More recently, software practitioners have been rapidly converging on event-driven architecture as a sustainable way of dealing with complexity — integrating systems without increasing their coupling.

In Effective Kafka, Emil Koutanov explores the fundamentals of Event-Driven Architecture — using Apache Kafka — the world’s most popular and supported open-source event streaming platform. The coverage is progressively delivered and carefully aimed at giving you a journey-like experience into becoming proficient with Apache Kafka and Event-Driven Architecture. The goal is to get you designing and building applications. And by the conclusion of this book, you will be a confident practitioner and a Kafka evangelist within your organisation — wielding the knowledge necessary to teach others.

Ratings:

Goodreads: 5.0/5

Amazon: 4.5/5

Geeks: 5.0/5

Overall: 4.83/5

Top Amazon review:

Effective Kafka is an excellent compendium for engineers and business using Kafka or for beginners who are learning the concepts.

I’ve worked with Kafka heavily in the past, and I was blown away by the sheer amount of knowledge that’s been poured into the book. Since reading, my experience with debugging or understanding Kafka concepts, have switched from Googling various Stack Overflow answers or trawling through Confluent articles, to only needing to look up the details in the book, as Emil can explain these complicated topics in a very detailed but easy to digest way.

Geek review:

Effective Kafka is a thoroughly enlightening read, underscored by its masterful use of language and a well-thought-out structure, with progressive delivery of technical content and an incremental build-up of complexity. The first few chapters address a beginner audience, with gradual inclusion of intermediate and advanced topics as the reader moves through the chapters. To the best of our knowledge, all content areas have been covered in detail, with the author routinely demonstrating remarkable depth and breadth of knowledge. The book contains lots of code and configuration examples and strikes an excellent balance between hands-on practice and theory. In addition to the architectural and developer-centric aspects of Kafka, the book also covers operational areas and security. Overall, Effective Kafka is the highest-rated and easily the best value-for-money book in our review line-up, a yardstick by which other books should be measured.

Get Effective Kafka on the author’s website.

2nd Place — Kafka Streams in Action: Real-time apps and microservices with the Kafka Streams API

Author: Bill Bejeck

Publisher’s description:

Kafka Streams in Action teaches you to implement stream processing within the Kafka platform. In this easy-to-follow book, you’ll explore real-world examples to collect, transform, and aggregate data, work with multiple processors, and handle real-time events. You’ll even dive into streaming SQL with KSQL! Practical to the very end, it finishes with testing and operational aspects, such as monitoring and debugging.

Ratings:

Amazon: 4.4/5

Goodreads: 4.0/5

Geeks: 4.0/5

Overall: 4.13/5

Top Amazon review:

Quite dry, but otherwise nearly flawless — I have nearly zero Kafka Streams prior experience (though I need “core” Kafka & similar CEP engines like Apache Storm), but I was able to easily follow both less & more advanced concepts. Only the chapter about KTables could use more illustrations to grasp the concepts (especially mixing paradigms of streams & tables), but it was still manageable.

Drawbacks? As with many “narrowly-specialized” tech books — it will probably age quite fast, so you better catch it quickly, while it’s still hot. Having a prior knowledge in Kafka and/or EDA is helpful, yet not required (the book covers basics of Kafka). There’s not much on using non-Java clients, but frankly — I don’t find it an issue at all.

Geek review:

Kafka Streams in Action is a niche offering that focuses on event streaming rather than serving as a detailed guide to Apache Kafka. The book opens with a history of data processing and reminisces on the evolution towards stream processing. It then covers Apache Kafka from an introductory perspective to lay the foundation for the rest of the book. The Kafka Streams library is given thorough coverage and the book does a reasonable job of explaining the relationship between streams and materialised tables (a.k.a. the stream-table duality). The quality of examples is spot on and we found the progression of the book to be gentle and comfortable for the reader, with timely introduction of concepts and good use of callouts. We observed some minor issues with code formatting, but this did not stop us from progressing through the examples. The depth of coverage varies depending on the subject, with the book skimping on detail in several areas. Overall, this book represents good value for money and should serve well as an introductory text on event stream processing in the context of Apache Kafka.

Get Kafka Streams in Action on the Manning website.

3rd Place — Kafka: The Definitive Guide: Real-Time Data and Stream Processing at Scale

Authors: Neha Narkhede, Gwen Shapira, Todd Palino

Publisher’s description:

If you’re an application architect, developer, or production engineer new to Apache Kafka, this practical guide shows you how to use this open source streaming platform to handle real-time data feeds. Engineers from Confluent and LinkedIn who are responsible for developing Kafka explain how to deploy production Kafka clusters, write reliable event-driven microservices, and build scalable stream-processing applications with this platform. Through detailed examples, you’ll learn Kafka’s design principles, reliability guarantees, key APIs, and architecture details, including the replication protocol, the controller, and the storage layer.

Ratings:

Goodreads: 4.1/5

Amazon: 4.4/5

Geeks: 3.5/5

Overall: 4.00/5

Top Amazon review:

This book provides an easily digestible breakdown to Kafka for a complete novice as myself. I am an Android app developer with plans of incorporating Kafka both for realtime interactive (feedback-loop-type) user-experience and for offline message processing. And this book fits the bill as far as getting me primed for developing this type of application.

The book covers the very basic stuff like installation to various use cases for full-blown deployment. The instructions provided assume you’re using Linux, which is the recommended Operating System where Kafka is usually run on. However, you can run Kafka on Windows, Mac OS, or any other OS that is able to run the Java environment.

No book is complete in helping every aspect of deployment in the myriad use cases Kafka is employed for, but as a primer and a starting point for a novice, this book is very invaluable in helping that novice get a working model (or maybe a semblance of a working model) up and running. You will do well to also consult the wealth of resource available online for setting up or troubleshooting Kafka (e.g. on the Apache site, StackOverflow, and many others).

Geek review:

Kafka: The Definitive Guide is a well-respected book from the maintainers of the Apache Kafka project. The book explains the core concepts well and progresses from a beginner to an intermediate skill set. The examples in the book are excellent. Our impression is that it is best suited for absolute beginners as well as practitioners who want to rise to an intermediate level. This would not be a book suited for an advanced audience, as most of the detail necessary for this level of comprehension is left to the online documentation. The structure of the book could do with a bit of improvement: it gives the distinct feeling that it was written by different authors (as indeed it was), each contributing a parcel of chapters, at times with little cohesion between them. This book is also somewhat dated now, with several key Kafka features not making it into the manuscript at the time of publication. Overall, we consider it a fair value-for-money proposition for a beginner to intermediate audience, let down by its slightly outdated content and a less-than-perfect flow.

Get Kafka: The Definitive Guide on the O’Reilly website.

4th Place — Event Streams in Action: Real-time event system with Kafka and Kinesis

Authors: Alexander Dean, Valentin Crettaz

Publisher’s description:

Event Streams in Action teaches you techniques for aggregating, storing, and processing event streams using the unified log processing pattern. In this hands-on guide, you’ll discover important application designs like the lambda architecture, stream aggregation, and event reprocessing. You’ll also explore scaling, resiliency, advanced stream patterns, and much more! By the time you’re finished, you’ll be designing large-scale data-driven applications that are easier to build, deploy, and maintain.

Ratings:

Amazon: 4.7/5

Goodreads: 3.3/5

Geeks: 3.0/5

Overall: 3.67/5

Top Amazon review:

This is one of the most comprehensive, understandable introductions to the complex domain of event streaming technologies that I’ve come across. Several small project examples throughout the book that help solidify conceptual knowledge and provide invaluable exposure to streaming systems.

Geek review:

Event Streams in Action occupies the same overall niche as Kafka Streams in Action. The two have many things in common, although Event Streams in Action also covers cloud-based Kafka alternatives, such as AWS Kinesis, as well as related technologies such as AWS Lambda and AWS Redshift. The book starts with a ground-level foundation of events and event streams, and ties these concepts to the unified log. As another upside, the book has an ample number of examples (written in Java) and explains the key concepts well. Whilst the book's extended title mentions Kafka, its coverage of Kafka is modest and the emphasis is heavily skewed towards AWS proprietary offerings. This is not necessarily a deal-breaker, given the popularity of AWS in the microservices space. In saying that, we couldn't help but wonder whether the outlandishly high rating on Amazon (4.7), compared to both the Goodreads rating and that of our own geeks, was somehow due to this. Overall, we think there is still value in the book at this price, although we would be more inclined towards recommending Effective Kafka for all things Kafka and Kafka Streams in Action for stream processing.

Get Event Streams in Action on the Manning website.

5th Place — Building Data Streaming Applications with Apache Kafka

Authors: Manish Kumar, Chanchal Singh

Publisher’s description:

This book is a comprehensive guide to designing and architecting enterprise-grade streaming applications using Apache Kafka and other big data tools. It includes best practices for building such applications, and tackles some common challenges such as how to use Kafka efficiently and handle high data volumes with ease. This book first takes you through understanding the type messaging system and then provides a thorough introduction to Apache Kafka and its internal details. The second part of the book takes you through designing streaming application using various frameworks and tools such as Apache Spark, Apache Storm, and more. Once you grasp the basics, we will take you through more advanced concepts in Apache Kafka such as capacity planning and security.

Ratings:

Amazon: 4.0/5

Goodreads: 2.9/5

Geeks: 3.0/5

Overall: 3.30/5

Geek review:

Building Data Streaming Applications with Apache Kafka is an introductory-level text with a reasonable breadth of coverage across messaging, Apache Kafka, and complementary stream processing frameworks and libraries, such as Apache Spark, Apache Storm and Apache Heron. While the book touches on an impressive number of technologies, we found the depth of coverage to be somewhat lacking, often comparable to the free material available online, with some additional insight offered by the authors in certain areas. The paperback is reasonably priced for its size and might be recommended for a beginner-level audience looking to get acquainted with the event stream processing ecosystem.

Get Building Data Streaming Applications with Apache Kafka on the Packt website.

This concludes our geeky review of the most popular Kafka and stream processing books on the market. Although the experience with each book varied widely, each of our geeks reported learning something new from every one of them. Tune in to this blog for more great news, insights and resources from the exciting world of microservices.


Linda Yong-Ju
Microservice Geeks

A hopeless software developer. A busy mum. Editor-in-chief of the Microservice Geeks blog.