An introduction to Apache Kafka and microservices communication

When we talk about microservices, one of the first things that comes to mind is how all these tiny APIs and workers inside the system architecture interact with each other. For this discussion, we have basically two kinds of inter-service communication: synchronous (HTTP/TCP requests) and asynchronous. This post focuses on the second option, using Apache Kafka as the message broker between the services.

Use case

Let’s suppose we have a very simple scenario: a service responsible for creating new accounts in a banking system needs to communicate with another service, which is responsible for sending a confirmation email to the user after the account is created.

In a monolithic system we would probably have all this logic in the same codebase, running synchronously. But in the shiny world of microservices, we have decoupled these responsibilities into two different projects, and now we need to let the email service know that a confirmation email must be sent.
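
To make "letting the email service know" a bit more concrete, here is a minimal sketch of what such a notification could look like as a message. The class name AccountCreated and its fields (account_id, email) are assumptions made up for this illustration, not something Kafka prescribes. Kafka itself only carries bytes, so the sketch serializes the event to JSON before it would be handed to a producer.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AccountCreated:
    """Hypothetical event emitted by the New Account Service."""

    account_id: str
    email: str

    def to_bytes(self) -> bytes:
        # Encode the event as UTF-8 JSON so it can travel as a message value.
        return json.dumps(asdict(self)).encode("utf-8")

    @classmethod
    def from_bytes(cls, payload: bytes) -> "AccountCreated":
        # Rebuild the event on the consumer side (the Email Service).
        return cls(**json.loads(payload.decode("utf-8")))
```

The producer side would call to_bytes() before publishing, and the email service would call from_bytes() on whatever it receives. JSON is just one reasonable choice here; any format both services agree on would work.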

Solution

This is where our magnificent Apache Kafka comes in. But what is Kafka, really? From their site:

Kafka™ is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.

Kafka is a great fit for many use cases, mostly for website activity tracking, log aggregation, operational metrics, stream processing and, in this post, for messaging.

This platform gives us an elegant way to create a data pipeline connecting producers (the New Account Service), which create new records in the pipeline, and consumers (the Email Service), which listen for new records.
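
The producer/consumer relationship described above can be sketched with a toy, in-memory model. This is not the Kafka client API: Topic, NewAccountService, EmailService and the method names are all hypothetical stand-ins, and a real topic lives on a broker, not in a Python list. The sketch only tries to show the shape of the idea: the producer appends records to a log, and the consumer remembers the offset it has read up to.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Toy stand-in for a topic: an append-only list of records."""

    name: str
    records: list = field(default_factory=list)

    def append(self, record: dict) -> int:
        # Store the record and return its offset (its position in the log).
        self.records.append(record)
        return len(self.records) - 1


class NewAccountService:
    """Producer: writes one record per created account."""

    def __init__(self, topic: Topic) -> None:
        self.topic = topic

    def create_account(self, email: str) -> int:
        return self.topic.append({"event": "account_created", "email": email})


class EmailService:
    """Consumer: tracks its own offset and handles only unseen records."""

    def __init__(self, topic: Topic) -> None:
        self.topic = topic
        self.offset = 0
        self.sent = []

    def consume_new(self) -> int:
        # Read everything after our current offset, then advance the offset.
        new_records = self.topic.records[self.offset:]
        for record in new_records:
            self.sent.append(f"Confirmation email to {record['email']}")
        self.offset += len(new_records)
        return len(new_records)
```

Note that the two services never call each other directly. The only thing they share is the topic, which is the decoupling we were after.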

If you come from an RDBMS world, you can imagine that a topic is a table, a producer is someone inserting new data into this table, and a consumer is an application querying your database to find new entries.

The main difference from the approach above is that with a message queue solution we don’t need to keep polling a database to check whether we have new data; we are always listening to particular topics in order to trigger an action.
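
Here is a small sketch of what "listening" looks like from the application's point of view, using Python's standard queue.Queue as a stand-in for a topic. The names listen, run_demo and STOP are made up for this example. Real Kafka clients fetch records from the broker in a loop of their own, so this only illustrates the programming model: the handler blocks until a record shows up instead of repeatedly querying a table on a timer.

```python
import queue
import threading

STOP = object()  # sentinel telling the listener to shut down


def listen(topic: queue.Queue, handled: list) -> None:
    """Wait for records and react to each one as it arrives."""
    while True:
        record = topic.get()  # blocks until a record is available
        if record is STOP:
            break
        handled.append(f"send confirmation to {record['email']}")


def run_demo() -> list:
    topic = queue.Queue()
    handled = []
    listener = threading.Thread(target=listen, args=(topic, handled))
    listener.start()

    # The "producer" side: publish two records, then ask the listener to stop.
    topic.put({"email": "alice@example.com"})
    topic.put({"email": "bob@example.com"})
    topic.put(STOP)

    listener.join()
    return handled
```

Calling run_demo() returns the actions triggered by the two records, in the order they were published.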

Please check the horrible diagram below that I made in 5 seconds to illustrate what I’m writing:

If you visit the Apache Kafka page you can find a very helpful sample to get your hands dirty and simulate this flow using just your terminal. It’s easy and fun.

Conclusion

There are other platforms working on this kind of solution, such as ActiveMQ and RabbitMQ, but Kafka seems to perform and scale better. Anyway, each project is a different story and you need to evaluate the most reasonable option for your case :-)

This is a high-level introduction to Apache Kafka. If you want to dig a little deeper into this subject and discover what else Kafka can do for you (streams, for example), please check the references below: