I’ll Do It Again: Using Retries to Ensure System Resilience

Implementing the optimal retry strategy to ensure the resilience of microservices-based systems.

Grzegorz Mika
7 min readSep 12, 2024
Photo by Priscilla Du Preez 🇨🇦 on Unsplash

System resilience

Modern web apps run in complex environments. They rely on many external dependencies, like databases and APIs. They rely solely on network communication. The Fallacies of Distributed Programming note that we can’t assume a perfect, secure network. In web environments, intermittent network failures and potential malicious actions are inherent challenges.

If an external dependency is unavailable, the client might retry the request, assuming the issue is temporary. This pattern is known as the retry pattern. The retry pattern allows automatic retries of failed requests. It has two parameters: the max retries and the delay between them. We will delve deeper into the details of this pattern in the next section.

Retry pattern

The retry pattern involves repeating an operation if a previous attempt fails. This can be implemented in a straightforward manner as follows:

--

--

Grzegorz Mika

Software engineer working with Golang, Python and JavaScript, passionate in making things better and faster.