High availability Sidekiq with Redis Sentinel

Tom Dooner
Brigade Engineering
2 min read · Aug 30, 2014

Setting up a Sidekiq queue tolerant of Redis failure has never been easier. This post will help you get your Ruby application connecting properly so you can reap the uptime benefits.

At Brigade we strive to be resilient to machine failure, so we use Redis Sentinel to monitor a group of Redis replicas and automatically promote a new master if the current one is taken offline.

Configuring Sidekiq

Assuming you already have a cluster of Redis Sentinels, you will need to configure Sidekiq to use them. If you simply follow the Sidekiq instructions and set a Redis URL, Sidekiq stays pinned to that one host and will not follow a failover, so you will not reap the benefits of Redis Sentinel.

With this configuration, Sidekiq will correctly connect to your Redis cluster:
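A minimal sketch of such a configuration, assuming Sidekiq 6.x or earlier running on redis-rb 4.x, whose built-in Sentinel support is used here in place of the redis-sentinel gem. The master name `mymaster` and the sentinel hostnames are placeholders, and the exact option keys depend on your Sidekiq and Redis client versions:

```ruby
# config/initializers/sidekiq.rb
# Assumption: redis-rb 4.x option names (url / sentinels / role).
# In a Rails initializer Sidekiq is already loaded by Bundler, so no require is needed.

# With Sentinel, the "host" part of the URL is the master *name* that the
# sentinels monitor, not a real hostname. "mymaster" is a placeholder.
sentinel_redis = {
  url: "redis://mymaster",
  sentinels: [
    { host: "sentinel01.example.com", port: 26379 },
    { host: "sentinel02.example.com", port: 26379 },
    { host: "sentinel03.example.com", port: 26379 }
  ],
  role: :master
}

# Used by processes that enqueue jobs (web workers, consoles, scripts).
Sidekiq.configure_client do |config|
  config.redis = sentinel_redis.dup # each side gets its own copy of the options
end

# Used by the Sidekiq worker process itself.
Sidekiq.configure_server do |config|
  config.redis = sentinel_redis.dup
end
```

Sharing one options hash keeps the client and server sides from drifting apart.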

Put this in config/initializers/sidekiq.rb for Rails. An after_fork block in config/unicorn.rb is also a good spot.

Don’t forget to put the same configuration in a Sidekiq.configure_server block!

Verifying Redis Failover

Once you have a Redis cluster, how can you test the actual failover behavior? After installing the redis and redis-sentinel gems, it’s easy enough to watch the sentinels elect a new master.

The script below connects to your Redis cluster (via the Sentinels) and enters an infinite loop doing the following:

  1. Stores a random value in the key ‘foo’
  2. Attempts to retrieve the value from the key ‘foo’
  3. Verifies that the initially set value equals the retrieved value.

While this is running, intentionally stop your master Redis instance and watch as the Sentinels elect a new master.

Behold:

A short script to test failover of a Redis cluster
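A minimal sketch of such a script, following the three steps listed above. It assumes redis-rb 4.x, whose built-in Sentinel support stands in for the redis-sentinel gem; the `SENTINELS` and `MASTER_NAME` environment variables, the helper names, and the `mymaster` default are all hypothetical. On redis-rb 5.x the master name is passed as `name:` rather than in the URL.

```ruby
# Parse "host1:26379,host2:26379" into the array of hashes redis-rb expects.
# A missing port falls back to Sentinel's default, 26379.
def parse_sentinels(spec)
  spec.split(",").map do |pair|
    host, port = pair.strip.split(":")
    { host: host, port: Integer(port || 26_379) }
  end
end

# One iteration of the test: store a random value under `key`, read it
# back, and compare. `redis` only needs to respond to #set and #get.
def round_trip(redis, key = "foo")
  expected = rand(10_000_000).to_s
  redis.set(key, expected)
  actual = redis.get(key)
  [expected == actual, expected, actual]
end

# Only talk to Redis when SENTINELS (a hypothetical variable) is set.
if ENV["SENTINELS"]
  require "redis"

  redis = Redis.new(
    url: "redis://#{ENV.fetch('MASTER_NAME', 'mymaster')}", # master name, not a hostname
    sentinels: parse_sentinels(ENV["SENTINELS"]),
    role: :master
  )

  loop do
    begin
      ok, expected, actual = round_trip(redis)
      status = ok ? "Success" : "Mismatch"
      op = ok ? "==" : "!="
      # redis.id reports the node the client is currently connected to.
      puts "#{status} (#{expected} #{op} #{actual}) from #{redis.id}"
    rescue Redis::BaseConnectionError => e
      # Expected while the sentinels are electing a new master.
      puts "Connection problem (#{e.class}): #{e.message}"
    end
    sleep 1
  end
end
```

Run it as, for example, `SENTINELS=sentinel01.example.com:26379,sentinel03.example.com:26379 ruby failover_test.rb` (the file name is arbitrary). This sketch prints a single line per iteration and omits the client's debug logging; connection errors during the election are reported and the loop keeps going.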

The script is pretty noisy, but the output looks something like:

[Redis] command=SET args="foo" "3745174"
[Redis] call_time=77.90 ms
[Redis] command=GET args="foo"
[Redis] call_time=77.87 ms
Success (3745174 == 3745174) from redis://redis01.example.com:6379/0
[Redis] command=SET args="foo" "5595371"
Trying next sentinel: sentinel01.example.com:26379
[…snip…] // I manually stop the master Redis instance here
[…snip…] // ~11 seconds elapse
Trying next sentinel: sentinel03.example.com:26379
[Redis] call_time=11689.83 ms
[Redis] command=GET args="foo"
[Redis] call_time=76.89 ms
Success (5595371 == 5595371) from redis://redis02.example.com:6379/0
[Redis] command=SET args="foo" "8804528"
[Redis] call_time=77.21 ms

Avoid Critical Sentinel Bugs

Importantly, if you’re using Redis Sentinel, make sure you’re running a recent version of Redis: earlier versions of Redis and Sentinel have had “critical” bugs.

Cover Photo: CC BY-ND 2.0 from https://www.flickr.com/photos/russphotography/5565481576 (thanks, Russ!)
