Distributing load with Apache Kafka and Spring

Marcos Abel · Published in Trabe · Apr 15, 2019
Photo by Gerrie van der Walt on Unsplash

In recent years we have been involved in a variety of projects composed of distributed components with all sorts of communication needs between them.

We wrote about one specific case a while back. In that case, we used MQTT for cache invalidation. This story is, in some sense, a follow-up to that one, focusing on a completely different scenario and technology but sharing the main subject of “real-time” communication between distributed systems.

The problem we have

In our project, we have components generating events (Producers) and components that need to consume those events (Consumers). Every component is required to be highly available and is, as a result, distributed across several nodes.

We need the Consumers to balance the load associated with consuming events across all the available nodes. We also need them to survive the temporary loss of any number of nodes and to gracefully resume the consumption process after a catastrophic failure (all the nodes of a given consumer being out of service).

Kafka to the rescue

Not Franz Kafka; we mean Apache Kafka. Kafka is a distributed streaming platform. Its capabilities are clearly stated on the project’s home page:

Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system.

Store streams of records in a fault-tolerant durable way.

Process streams of records as they occur.

There is an excellent introduction to Kafka published on the project’s page. We assume some basic knowledge about how Kafka works, but we are going to spend a couple of minutes on the key concepts needed to follow this story:

Topics, partitions, and parallelism

Kafka maintains a partitioned log for every topic. Within each partition, every record is assigned a unique sequential identifier called the offset.

Partitioning the log makes it possible to store huge amounts of data in a topic (more than a single server can handle), and the partition also acts as the unit of parallelism. We will come back to this in a while. For now, we just need to know that the number of partitions we have for a given topic will be a factor in how we can distribute the consumption of that topic.

Retention policy and metadata about consumers

Kafka retains the log for every topic for a configurable amount of time. This means that records remain available to consumers for that period after being produced.

Kafka also stores metadata about consumers: for each consumer group, the platform tracks the last committed offset in every partition.

Consumer groups

Consumers can label themselves with a consumer group name. When they do that, each record published to a topic will be delivered to one and only one consumer in the consumer group.

We can use this feature to achieve load balancing: all the nodes in a given consumer system are labeled with the same group name and, as a result, the records are distributed between them.

We talked earlier about the role the number of partitions plays in how load can be balanced. In Kafka, consumption is handled by dividing the available partitions among the consumer instances so that each instance is the exclusive consumer of a subset of the partitions at any point in time. This means that the number of consumer processes actively consuming at any point in time is bounded by the number of partitions in the topic being consumed.

Implementation

With the building blocks provided by Kafka, we can implement load balancing between consumer nodes without much effort.

Use the proper number of partitions

The first thing to take into consideration is how many partitions we are going to have for our topic of interest. The number of partitions should be equal to or greater than the number of consumer processes in our system if we want to maximize parallelism. Of course, there are more implications than load balancing when selecting the number of partitions for a topic, but we are focusing on maximizing parallelism here.
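As a sketch, using the Spring Kafka support we add below, the partition count can be declared with a NewTopic bean that Spring Boot’s auto-configured KafkaAdmin applies on startup (topic name, partition count, and replication factor here are all illustrative):

import org.apache.kafka.clients.admin.NewTopic;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class TopicConfiguration {

    // Four partitions allow up to four consumer instances in the same
    // group to consume in parallel; replication factor 3 is typical for
    // a three-broker cluster. All values here are illustrative.
    @Bean
    public NewTopic eventsTopic() {
        return new NewTopic("events", 4, (short) 3);
    }
}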

Create our consumer

We are going to use Spring Boot for our consumer component. We need to add the Spring Kafka dependency to our project:
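With Maven, it would look something like this (Gradle works equally well; in a Spring Boot project the version is typically managed by Spring Boot’s dependency management, so none is declared here):

<!-- Spring for Apache Kafka: brings in @KafkaListener and the listener container infrastructure -->
<dependency>
  <groupId>org.springframework.kafka</groupId>
  <artifactId>spring-kafka</artifactId>
</dependency>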

We need to tell Spring where our brokers are. We can do that by adding the following property to our application properties:
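A minimal sketch, with placeholder host names:

# Comma-separated list of brokers used to bootstrap connections to the cluster
spring.kafka.bootstrap-servers=kafka-1:9092,kafka-2:9092,kafka-3:9092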

We want our system to consume every record that has not been consumed yet, so we also need the following in our application properties:
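Something along these lines:

# When the group has no committed offset yet (first start, or offsets expired),
# start from the earliest retained record instead of the latest one
spring.kafka.consumer.auto-offset-reset=earliest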

We need a final property for setting the consumer group:
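For instance (the group name is just a placeholder):

# Every node of this component uses the same group id, so Kafka
# distributes the topic partitions, and thus the records, between them
spring.kafka.consumer.group-id=event-consumers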

With that configuration in place, we can write our consumer, taking advantage of the @KafkaListener annotation:
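A minimal sketch of such a listener (the topic name, the eventType header key, and the handler methods are illustrative; depending on how the producer serializes headers, a custom header may arrive as byte[] rather than String):

import java.util.Map;

import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.messaging.handler.annotation.Header;
import org.springframework.messaging.handler.annotation.Headers;
import org.springframework.messaging.handler.annotation.Payload;
import org.springframework.stereotype.Component;

@Component
public class EventListener {

    // "events" and the "eventType" header are hypothetical names
    @KafkaListener(topics = "events")
    public void onRecord(@Payload String payload,
                         @Header("eventType") String eventType,
                         @Headers Map<String, Object> allHeaders) {
        // Dispatch to a different method depending on the value of a header
        switch (eventType) {
            case "USER_CREATED":
                handleUserCreated(payload);
                break;
            case "USER_DELETED":
                handleUserDeleted(payload);
                break;
            default:
                handleUnknown(payload, allHeaders);
        }
    }

    private void handleUserCreated(String payload) { /* ... */ }

    private void handleUserDeleted(String payload) { /* ... */ }

    private void handleUnknown(String payload, Map<String, Object> headers) { /* ... */ }
}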

As you can see, we can use several annotations on the parameters of a @KafkaListener method:

  • @Payload: the payload (value) of the record. We are using a plain old String here, but with some additional configuration, we can use different libraries to deserialize the payload to an arbitrary type. Spring Kafka supports popular serialization formats like JSON or Avro.
  • @Header: provides access by name to the record headers.
  • @Headers: provides access to all the record headers.

Apart from the use of annotations to access the relevant fields of the record, the implementation is as naive as it gets: we just dispatch to a different method depending on the value of a header.

We can now deploy our application to a number of nodes (let’s say 2) and see how the records are balanced between them.

Wrapping up

Kafka is a platform that can be used in many ways to cover a variety of inter-application communication scenarios. One of those scenarios is providing load balancing and fault tolerance when consuming records generated by external producers. Spring Kafka provides a really easy-to-use abstraction for building Kafka consumers.
