Rails, Sneakers, and Exponential Backoff Retry

Dong Wook Koo
Apr 18, 2017


A Rails project with RabbitMQ as its queue provider can use sneakers to enqueue and process background jobs. Sneakers comes with a Maxretry handler, which by default retries a job every 60 seconds, up to 5 times. But I’d like it to be able to do exponential backoff retries, which sidekiq and delayed_job provide out of the box. The sneakers documentation talks about this but leaves the implementation to us users. This is an exercise in giving sneakers that capability, and I want it to integrate into sneakers the same way the Maxretry handler does.

Let’s review the Maxretry handler and see what it does. The Maxretry handler creates the ‘{queue-name}-retry’ and ‘{queue-name}-error’ queues. It also creates the ‘{queue-name}-retry’, ‘{queue-name}-retry-requeue’, and ‘{queue-name}-error’ exchanges, all with ‘#’ as the routing key. I wrote about it in detail in the previous post. If the queue-name is ‘primary’, the message travels from the ‘primary’ queue to the worker, and if a retry is needed the message is sent to the ‘primary-retry’ exchange. From there, the message goes to the ‘primary-retry’ queue, where it sits for 60 seconds (x-message-ttl), and is then sent to the ‘primary-retry-requeue’ exchange, which queues it back onto the ‘primary’ queue for retry. Because the delay comes from the queue’s x-message-ttl, exponential backoff retries need multiple queues, each with a different x-message-ttl value.

The next step is to come up with time-to-live values for these queues and make them configurable. The formula for exponential backoff is well known, and I’ve modified it to produce a set of periods (in seconds) from a couple of parameters.

(X + 15) * 2 ** (count + 1)
# X = 0, 30, 60, 120, 180, etc defaults to 0
# with count from 1 to 5 where 5 is the retry_max_times
# yields [60, 120, 240, 480, 960] for X = 0
# yields [120, 240, 480, 960, 1920] for X = 15
# yields [180, 360, 720, 1440, 2880] for X = 30
# yields [240, 480, 960, 1920, 3840] for X = 45
# functionality available as Sneakers::Handlers::Expbackoff.backoff_periods(retry_max_times, X)
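
To make the arithmetic concrete, here is a minimal standalone sketch of that calculation. The method name and argument order mirror the backoff_periods call mentioned above, but the body is my own reconstruction from the formula and may differ from what expbackoff.rb actually does.

```ruby
# Standalone sketch of the period calculation (an assumption, not the
# handler's actual source). Returns one period in seconds per retry attempt.
def backoff_periods(retry_max_times = 5, x = 0)
  (1..retry_max_times).map { |count| (x + 15) * 2**(count + 1) }
end

p backoff_periods(5, 0)   # => [60, 120, 240, 480, 960]
p backoff_periods(5, 30)  # => [180, 360, 720, 1440, 2880]
```

Each period is double the previous one, and X only shifts the starting point.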

I’ve named the queues using the format ‘{queue-name}-backoff-{period}’, so the queue names become ‘primary-backoff-60’, ‘primary-backoff-120’, ‘primary-backoff-240’, etc., with x-message-ttl set to 60000, 120000, 240000, etc. respectively (the TTL is in milliseconds). To route messages through these queues, a ‘{queue-name}-backoff’ exchange is created, which in this case is ‘primary-backoff’. This ‘primary-backoff’ exchange is a headers exchange, and each backoff queue is bound to it by a header argument: backoff: 60, backoff: 120, backoff: 240, etc.
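
The naming and TTL scheme can be sketched in plain Ruby, without a broker connection. The helper name backoff_queue_specs and the shape of the returned hashes are hypothetical. The 'x-match' binding argument and the use of 'x-dead-letter-exchange' to hand expired messages to the requeue exchange are my assumptions about how such queues would typically be declared, not something taken from expbackoff.rb.

```ruby
# Sketch only: describes each backoff queue as a plain hash.
# Helper name and hash layout are hypothetical.
def backoff_queue_specs(queue_name, periods, requeue_exchange)
  periods.map do |seconds|
    {
      name: "#{queue_name}-backoff-#{seconds}",
      arguments: {
        'x-message-ttl' => seconds * 1000,             # RabbitMQ TTLs are in milliseconds
        'x-dead-letter-exchange' => requeue_exchange   # assumed: expired messages go here
      },
      # Assumed binding arguments for the headers exchange
      binding_arguments: { 'x-match' => 'all', 'backoff' => seconds }
    }
  end
end

specs = backoff_queue_specs('primary', [60, 120, 240], 'primary-retry-requeue')
specs.each { |spec| puts "#{spec[:name]} ttl=#{spec[:arguments]['x-message-ttl']}" }
```

With a real connection, each of these hashes would map onto one queue declaration and one binding to the ‘primary-backoff’ exchange.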

With these queues and the exchange created, the only thing left to do is to route the message to the ‘primary-backoff’ exchange instead of the ‘primary-retry’ exchange within the handler. So, instead of rejecting the message, which would send it to the ‘primary-retry’ exchange via the x-dead-letter-exchange setting, the handler now publishes the message to the ‘primary-backoff’ exchange. Once the message lands in a backoff queue, it sits there for the prescribed amount of time, then is sent to the ‘primary-retry-requeue’ exchange and routed back to the ‘primary’ queue for processing. This way, the ‘primary-retry-requeue’ and ‘primary-error’ exchanges and the ‘primary-error’ queue can be reused if they were already set up by the Maxretry handler.
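
The routing decision described above can be sketched as a small pure function. Everything here is illustrative: next_hop is a hypothetical helper, the 'count' header tracking prior retries is an assumed convention, and sending exhausted messages to the ‘{queue-name}-error’ exchange is my reading of how the reused Maxretry error setup would fit in.

```ruby
# Hypothetical helper: given how many retries a message has already had,
# decide which exchange it should be published to next, and with what headers.
def next_hop(retry_count, periods, queue_name)
  if retry_count < periods.length
    seconds = periods[retry_count]
    {
      exchange: "#{queue_name}-backoff",
      # 'backoff' selects the backoff queue; 'count' records the attempt number
      headers: { 'backoff' => seconds, 'count' => retry_count + 1 },
      routing_key: queue_name
    }
  else
    # Retries exhausted: assumed to go to the error exchange
    { exchange: "#{queue_name}-error", headers: {}, routing_key: queue_name }
  end
end

periods = [60, 120, 240, 480, 960]
p next_hop(0, periods, 'primary')
p next_hop(5, periods, 'primary')
```

The first call targets ‘primary-backoff’ with a 60 second backoff header, and the second, having used up all five periods, targets ‘primary-error’.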

This implementation is here (expbackoff.rb). It makes a publish call to the ‘retry_backoff_exchange’ with header arguments (:backoff and :count) along with a ‘routing_key’. The header arguments are used to route the message to the proper backoff queue. Then, when the message is sent to the ‘retry_requeue_exchange’ after serving its time, the ‘routing_key’ is used to send it back to the original queue for processing. It can be used with the following configuration for a global retry setup.

# Pass the options as a hash argument; with braces alone Ruby would parse this as a block.
Sneakers.configure(
  retry_backoff_exchange: 'activejob-backoff',
  retry_error_exchange: 'activejob-error',
  retry_requeue_exchange: 'activejob-retry-requeue'
)

and

class PrimaryWorker
  include Sneakers::Worker

  from_queue :primary, {
    handler: Sneakers::Handlers::Expbackoff,
    retry_routing_key: 'primary',
    arguments: { :'x-dead-letter-exchange' => "activejob-retry" }
  }

  def work(msg)
    begin
      job_data = ActiveSupport::JSON.decode(msg)
      ActiveJob::Base.execute job_data
      ack!
    rescue
      # Any StandardError rejects the message, which hands it to the handler for retry
      reject!
    end
  end
end

The ‘x-dead-letter-exchange’ argument is not necessary, but keep it there if the queue was created with the Maxretry handler. By keeping it there and setting ‘retry_error_exchange’ and ‘retry_requeue_exchange’ to the same values as in the Maxretry handler configuration, the two handlers become interchangeable.

UPDATE: Delay feature is posted here.
