A common sight in Scala code is to see
Future[T] used as the result of network-based operations (such as a web-request). However, when dealing with a network, failures can be common. It’s likely you’ll want to retry this operation, but should only be done so if coupled with exponential back-offs. This post will show you how we can easily compose futures to achieve this.
For our examples to actually work (in a REPL anyways), let’s make up a mock API to use.
Note: The code-blocks are structured to be pasted into a REPL. Each example depends on functions and imports from a previous example, so keep the REPL running for the entire article. You will need the Akka library, but can run
sbt console using this project file.
Basic Retries (no back-off)
Before we can get into exponential back-off, we first need a basic retry in place. Let’s wrap our
networkRequest with a function that retries infinitely.
recoverWith we can define a “fall-back” future that will be used when the original future fails. However if we make this fall-back future just a call to the same function, we generate an infinite loop on the failure case. If you run this in your REPL, you may see something like:
res24: scala.concurrent.Future[String] = Future(<not completed>)
This tells us that the future eventually succeeded after 4 retries, so success!
Retries are awesome but, with no wait between tries this can waste local resources (think busy wait) and potentially compound issues for an upstream service (for example, a service is unable to respond because of too many open connections).
But how do you make a
Good question! Since
Future is composable, we can simply combine any
Future value with another
Future that completes after some specified period of time. Easy peasy.
However, if you immediately thought of:
Stop it! This is bad, never do this.
Okay, fine… But how do I make a
Futurewait to complete?
The proper way to “sleep” a
Future is with a scheduler that holds the
Promise and only completes it when the specified time-period has passed. This only requires one thread to managing sleeping
Futures instead of blocking multiple threads with
Thread.sleep(). But that’s a lot of work, and Akka already has this feature built-in, so I’m going to take a shortcut.
In this example
x is a
Future that does not complete for at least two seconds. But more importantly, the
Future on line 8 does not start executing until the first “sleeping”
Future completes. With this, we can modify our
networkRequestWithRetries function to use a constant back-off.
We now have a function that retries (infinitely I might add) and includes a 2-second wait between retries. Using composition makes this rather easy, but can be hard to wrap your head around. I find it easier to visualize as a state machine.
Future can only result in success or failure, each state has only two possible transitions. A0 through An are attempts to call our networking API which will either succeed (
res) or fail (
sleep). The failure-transition to
sleep is represented as
recoverWith in our retry code. And the following transition from sleep to A(n+1) is represented by
flatMap. But really, each An step is the same and we could simplify our state-machine to be:
If we already have retries with a constant back-off, why do we need exponential back-off?
A constant back-off is a good start, but it can be hard to pick a good constant. You want retries to be fast (especially in the case where failures are intermittent) but you also don’t want to waste resources. Using a backoff that grows exponentially allows you to meet both goals.
We can see with a simple diagram how the time grows with each retry:
Continuing this pattern, we would get sleep times of 8, 16, 32, etc. Where you start your timing and how fast it grows is all up to you. So let’s see what it looks like to add exponential back-off to our code:
The main difference in this code is that we’ve added some parameters to define the scaling factor (
n^x), the initial value (
n^x), and a “current” value which is the previously calculated wait time. We use the fact that we can capture these values in a partial function (via
recoverWith) to create a recursion of sorts.
Running in the REPL, we can see that it works:
retrying after 1 ms
retrying after 2 ms
retrying after 3 ms
retrying after 5 ms
retrying after 8 ms
retrying after 12 ms
retrying after 18 ms
retrying after 27 ms
retrying after 41 ms
retrying after 62 ms
retrying after 93 ms
retrying after 140 ms
Using composition with
Future, it is easy to add additional behaviors and build complex state-machines that provide more flexible and powerful APIs and abstractions. Exponential back-offs is just one example.
Watch out for future articles covering other