spring-kafka re-try with spring-retry

Shanaka Fernando
3 min read · Nov 24, 2019

Imagine a service that uses a Kafka consumer to read messages from a given topic and publish each message to a downstream system. What happens if the downstream system is unavailable for a moment? Do you still want to keep publishing the rest of the messages when you already know it is unavailable?

In this short article, I would like to discuss how to use the spring-retry template to automatically retry messages in such scenarios. I assume you have a basic knowledge of the spring-retry template.

spring-kafka provides AbstractKafkaListenerContainerFactory, on which you can configure a spring-retry template. Using the spring-retry template, you can set the number of retries and the back-off time (how many milliseconds to wait before the next retry starts).
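
As a rough sketch of such a configuration, the class below wires a RetryTemplate with 3 attempts and a fixed 5000 ms back-off into a listener container factory. It assumes a spring-kafka 2.x release in which setRetryTemplate is still available (to my knowledge, later releases deprecated and then removed it) and a ConsumerFactory<String, String> bean defined elsewhere in the application. The class name KafkaRetryConfig is illustrative and is not taken from the demo project.

```java
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.retry.backoff.FixedBackOffPolicy;
import org.springframework.retry.policy.SimpleRetryPolicy;
import org.springframework.retry.support.RetryTemplate;

// Illustrative configuration; assumes spring-kafka 2.x and a ConsumerFactory bean defined elsewhere.
@Configuration
public class KafkaRetryConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory) {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);
        // Wrap every listener invocation in the retry template.
        factory.setRetryTemplate(retryTemplate());
        return factory;
    }

    private RetryTemplate retryTemplate() {
        SimpleRetryPolicy retryPolicy = new SimpleRetryPolicy();
        // The count includes the first delivery: 3 attempts = 1 initial try + 2 retries.
        retryPolicy.setMaxAttempts(3);

        FixedBackOffPolicy backOffPolicy = new FixedBackOffPolicy();
        // Wait 5000 ms between consecutive attempts.
        backOffPolicy.setBackOffPeriod(5000L);

        RetryTemplate template = new RetryTemplate();
        template.setRetryPolicy(retryPolicy);
        template.setBackOffPolicy(backOffPolicy);
        return template;
    }
}
```

Note that SimpleRetryPolicy counts the initial attempt, so if you want three retries after the first failure you would set the maximum attempts to 4.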

With the retry template configured for 3 retries at a 5000 ms interval, a failed message is retried 3 times, 5000 ms apart. If the downstream system becomes available within that window, the service can process the rest of the messages without any issue. If it is still unavailable after the maximum number of retries, the service should raise an alert.

How do I know if all the retries are over?

For this you need to provide a recovery callback, which must implement RecoveryCallback.

The callback is registered through the setRecoveryCallback method, and the logic you implement in it decides what to do once all the retries are exhausted.
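
A minimal sketch of such a callback is shown below. The class name AlertingRecoveryCallback and the alerting step are hypothetical; only RecoveryCallback, RetryContext, and their methods come from spring-retry.

```java
import org.springframework.retry.RecoveryCallback;
import org.springframework.retry.RetryContext;

// Illustrative recovery callback; the class name and the alerting step are hypothetical.
public class AlertingRecoveryCallback implements RecoveryCallback<Object> {

    @Override
    public Object recover(RetryContext context) {
        // The exception thrown by the last failed attempt.
        Throwable cause = context.getLastThrowable();
        // Placeholder for real alerting: log how many attempts failed and why.
        System.err.println("Retries exhausted after " + context.getRetryCount()
                + " failed attempts: " + cause);
        // Returning normally means the failure has been handled here.
        return null;
    }
}
```

Assuming a container factory variable named factory, the callback would be registered with factory.setRecoveryCallback(new AlertingRecoveryCallback()). If the callback itself throws, the exception propagates to the container instead of being treated as handled.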

How does spring-kafka invoke the re-try?

If the Kafka consumer throws an exception while processing a message, the spring-retry template is invoked. You can control which types of exceptions trigger the spring-retry template. For demonstration purposes, the listener in the demo project throws a RuntimeException if the received message contains the word “Test”. So whenever the consumer receives a message containing the word “Test”, a retry is triggered.
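
The flow described above can be modelled without a broker. The sketch below is plain Java, not spring-retry or spring-kafka code: handle stands in for the listener and fails on any message containing "Test", and deliver is a hypothetical stand-in for what the retry template does (call, back off, call again, then recover). The attempt count and back-off are parameters so that the example runs quickly.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the retry flow; names are illustrative, not spring-retry APIs.
class RetryFlowDemo {

    // Stand-in for the Kafka listener: fails for any message containing "Test".
    static void handle(String message) {
        if (message.contains("Test")) {
            throw new RuntimeException("Simulated failure for: " + message);
        }
    }

    // Tries the listener up to maxAttempts times, pausing backOffMs between attempts,
    // and returns a log of what happened.
    static List<String> deliver(String message, int maxAttempts, long backOffMs) {
        List<String> log = new ArrayList<>();
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                handle(message);
                log.add("processed " + message + " on attempt " + attempt);
                return log;
            } catch (RuntimeException e) {
                log.add("attempt " + attempt + " failed for " + message);
                if (attempt < maxAttempts) {
                    pause(backOffMs); // back off only if another attempt follows
                }
            }
        }
        // All attempts failed: this is where a recovery callback would run.
        log.add("recovered " + message);
        return log;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
    }

    public static void main(String[] args) {
        // Messages are handled one after another, as on a single partition.
        String[] messages = {"HelloTest", "Hello"};
        for (String message : messages) {
            deliver(message, 3, 100L).forEach(System.out::println);
        }
    }
}
```

Running main logs three failed attempts and a recovery for HelloTest, and only then is Hello processed on its first attempt.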

What happens to messages the consumer receives while retrying?

While retrying, the consumer thread is blocked and makes no Consumer.poll() calls for its assigned partitions. As a result, messages arriving on those partitions are not delivered to the consumer until the retry finishes. The consumer group’s committed offset is also not advanced until then.

Beware

One of the ways Kafka determines the health of a consumer is the interval between consecutive poll calls. This limit is configured with max.poll.interval.ms (5 minutes by default); if it is exceeded, the consumer is considered failed, its partitions are revoked, and the group rebalances. You must take this into account when configuring the retry count and back-off time. From version 2.1.3, this problem can be addressed with SeekToCurrentErrorHandler. We may discuss it in a separate article.
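
To make the constraint concrete, here is a back-of-the-envelope check in plain Java. It is only a sketch under stated assumptions: 3 attempts counted including the first delivery, a 5000 ms back-off, negligible processing time per attempt, and a container that finishes every record from one poll before polling again. The class and method names are illustrative; 300000 ms and 500 are Kafka's defaults for max.poll.interval.ms and max.poll.records.

```java
// Back-of-the-envelope check of retry time against max.poll.interval.ms; names are illustrative.
class RetryBudget {

    // Worst-case time a single failing record can block the consumer thread:
    // every attempt runs, with a back-off pause between consecutive attempts.
    static long worstCaseBlockingMs(int maxAttempts, long backOffMs, long perAttemptMs) {
        return (long) maxAttempts * perAttemptMs + (long) (maxAttempts - 1) * backOffMs;
    }

    // True if a whole batch of failing records still finishes before the poll deadline.
    static boolean fitsPollInterval(long maxPollIntervalMs, int recordsPerPoll,
                                    int maxAttempts, long backOffMs, long perAttemptMs) {
        long perRecordMs = worstCaseBlockingMs(maxAttempts, backOffMs, perAttemptMs);
        return recordsPerPoll * perRecordMs < maxPollIntervalMs;
    }

    public static void main(String[] args) {
        long maxPollIntervalMs = 300_000L; // Kafka default: 5 minutes

        // One failing record: 2 back-off pauses of 5000 ms each.
        System.out.println(worstCaseBlockingMs(3, 5_000L, 0L));

        // A single failing record fits comfortably within the poll interval.
        System.out.println(fitsPollInterval(maxPollIntervalMs, 1, 3, 5_000L, 0L));

        // A full default batch of 500 failing records does not.
        System.out.println(fitsPollInterval(maxPollIntervalMs, 500, 3, 5_000L, 0L));
    }
}
```

Under these assumptions one failing record blocks the thread for 10000 ms, so a long outage that fails every record in a batch can exceed the poll interval unless max.poll.records is lowered or max.poll.interval.ms is raised.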

Output

In the demo's output, a message (Hello) that arrived while another message (HelloTest) was being retried is processed once the retries are over.

The source code of this demo can be found at https://github.com/shanakaf/kafka-auto-retry. Use http://localhost:9000/messages?message={message} to send messages through the demo's web service.

Happy Coding :)
