Redis Cluster

Sharmilan Thavarajah
Mar 22, 2019


This article covers setting up a Redis cluster on your local machine, its distributed storage concept, and how it handles failover while keeping its performance.

What is Redis

Redis, which stands for Remote Dictionary Server, is a fast, open-source, in-memory key-value data store for use as a database, cache, message broker, and queue. Redis can serve millions of requests per second with sub-millisecond response times. This has had a huge impact on game development, financial services, IoT, and healthcare systems.

Much of the impact of Redis has been in gaming, where it enables real-time leaderboards, real-time analytics, and real-time chat and media streaming.

Redis is written in ANSI C and runs on POSIX systems, so it is portable across platforms. It is also an open-source system.

Why Use Redis?

  • It is incredibly fast. It is written in ANSI C and runs on POSIX systems such as Linux, Mac OS X, and Solaris.
  • Redis is often ranked the most popular key/value database and the most popular NoSQL database used with containers.
  • Its caching solution reduces the number of calls to a cloud database backend.
  • It can be accessed by applications through its client API library.
  • Redis is supported by all of the popular programming languages.
  • It is open source and stable.

Redis Use in the Real World

  • Some Facebook online games have a very high number of score updates. Executing these operations is trivial when using a Redis sorted set, even if there are millions of users and millions of new scores per minute.
  • Twitter stores the timeline for all users within a Redis cluster.
  • Pinterest stores the user follower graphs in a Redis cluster where data is sharded across hundreds of instances.
  • GitHub uses Redis as a queue.

What is Redis Cluster?

  • Horizontally scalable: we can keep adding nodes to increase capacity.
  • Automatic data sharding: Redis Cluster partitions the data and splits it among the nodes automatically.
  • Fault tolerant: when a node is lost or a server goes down, the cluster can continue operating and the data survives on the replicas (apart from writes not yet replicated, since replication is asynchronous). That is what makes it highly available.
  • Decentralized cluster management: there is no single coordinating node. It works like an orchestra, where every node participates.

What can we get from Redis Cluster?

  • The ability to automatically split your dataset among multiple nodes.
  • The ability to continue operations when a subset of the nodes are experiencing failures or are unable to communicate with the rest of the cluster.

Distributed Storage of Redis Cluster

Every key that you save into a Redis cluster is associated with a hash slot. There are 16384 slots in a Redis cluster, numbered 0 to 16383. Thus, a Redis cluster can have at most 16384 master nodes (however, the suggested maximum size is about 1000 nodes). Each master node in a cluster handles a subset of the 16384 hash slots.
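
To make "each master handles a subset of the slots" concrete, here is a minimal sketch that divides the slot space into contiguous ranges. The even split and the function name split_slots are assumptions for illustration only; a real cluster may assign slots to masters in any way.

```python
HASH_SLOTS_NUMBER = 16384  # slots are numbered 0..16383


def split_slots(master_count: int) -> list[tuple[int, int]]:
    """Return one inclusive (first_slot, last_slot) range per master.

    Hypothetical helper: it splits the slot space as evenly as possible,
    which is only one of many valid assignments.
    """
    if not 1 <= master_count <= HASH_SLOTS_NUMBER:
        raise ValueError("master_count must be between 1 and 16384")
    base, extra = divmod(HASH_SLOTS_NUMBER, master_count)
    ranges = []
    start = 0
    for index in range(master_count):
        # The first `extra` masters take one additional slot each.
        size = base + (1 if index < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


if __name__ == "__main__":
    for node, (first, last) in enumerate(split_slots(3), start=1):
        print(f"master {node}: slots {first}-{last}")
```

Whatever the assignment looks like, the ranges together always cover all 16384 slots exactly once.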

The algorithm that Redis Cluster uses to map keys to hash slots is:

HASH_SLOT = CRC16(key) mod HASH_SLOTS_NUMBER

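Here HASH_SLOTS_NUMBER is 16384, and CRC stands for Cyclic Redundancy Check.

For example, in a cluster with three master nodes, node A might hold hash slots 0 to 5500, node B slots 5501 to 11000, and node C slots 11001 to 16383.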
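
The key-to-slot mapping can be sketched in a few lines of Python. This is a sketch under stated assumptions, not the Redis source code: it relies on the standard library's binascii.crc_hqx, which computes a CRC-CCITT checksum with polynomial 0x1021 and, when started from 0, should match the CRC16 variant Redis Cluster uses. The function names are made up for illustration, and the hash-tag rule for keys containing {...} is deliberately left out.

```python
import binascii

HASH_SLOTS_NUMBER = 16384


def crc16(data: bytes) -> int:
    # CRC-CCITT (polynomial 0x1021) with an initial value of 0.
    return binascii.crc_hqx(data, 0)


def hash_slot(key: str) -> int:
    # HASH_SLOT = CRC16(key) mod HASH_SLOTS_NUMBER
    return crc16(key.encode("utf-8")) % HASH_SLOTS_NUMBER


if __name__ == "__main__":
    for key in ("foo", "hello", "user:1000"):
        print(f"{key!r} -> slot {hash_slot(key)}")
```

On a running cluster you can compare the result with the CLUSTER KEYSLOT command; for the key foo both should report slot 12182.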

How Redis Handles Failover

First up, you can certainly have more slaves than masters. That is preferable to having a single slave per master, because one can fail and you still have a backup slave for some redundancy post-failover.

If you have 3 masters with one slave each, then you can lose one master. The cluster needs a majority of the masters to be available for a failover to occur. After one master fails, 2 of the 3 are still left, so the failed master's slave will promote itself once cluster-node-timeout has elapsed.

At that point you'll have a fully functioning cluster with 3 masters, though only 2 of them will have a slave (at least until the failed master comes back online as a slave).

However, if you lose 2 of your 3 masters at the same time, no failover will occur because there will not be a majority of masters online. Your queries will get a 'CLUSTERDOWN' error until a majority of masters are back online. In the 3-master case, that means one of the failed masters would need to come back online before a failover could occur to get the cluster out of the failed state.
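
The majority rule described above comes down to simple arithmetic. The sketch below only models that counting; the function names are hypothetical and it ignores everything else a real failover involves (timeouts, voting, replica selection).

```python
def masters_needed_for_majority(total_masters: int) -> int:
    # A strict majority: more than half of all masters.
    return total_masters // 2 + 1


def failover_possible(total_masters: int, failed_masters: int) -> bool:
    """Return True if enough masters are still reachable to allow a failover."""
    reachable = total_masters - failed_masters
    return reachable >= masters_needed_for_majority(total_masters)


if __name__ == "__main__":
    for failed in (1, 2):
        print(f"3 masters, {failed} failed:", failover_possible(3, failed))
```

With 3 masters, losing one leaves 2 reachable, which is a majority, while losing two leaves only 1, which is not.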

I definitely recommend that you spin up a bunch of nodes on your local machine and give it a try. Issue 'DEBUG SEGFAULT' to a master and monitor the other nodes' logs to see how they respond. Then issue that command to two masters and watch the slaves continually try to contact the failed masters without failing over.

How does it achieve its performance?

In Redis Cluster, nodes don't proxy commands to the node in charge of a given key. Instead, they redirect clients to the nodes serving the relevant portion of the key space.

Eventually clients obtain an up-to-date representation of the cluster and which node serves which subset of keys, so during normal operations clients directly contact the right nodes in order to send a given command.
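
The redirect-then-remember behaviour can be illustrated with a toy, in-memory model. Everything here is an assumption made for illustration: FakeNode, FakeClient, build_cluster, the addresses, and the slot ranges are invented, and no networking is involved. The only part borrowed from Redis is the shape of the redirection reply, "MOVED <slot> <host:port>". Real clients usually fetch the whole slot map at once rather than learning it one slot at a time.

```python
import binascii


def hash_slot(key: str) -> int:
    # CRC16 (CCITT, initial value 0) modulo 16384; hash tags are ignored here.
    return binascii.crc_hqx(key.encode("utf-8"), 0) % 16384


class FakeNode:
    """In-memory stand-in for a cluster node (hypothetical)."""

    def __init__(self, address, slots, cluster):
        self.address = address
        self.slots = slots        # range of slots this node serves
        self.cluster = cluster    # shared dict: address -> FakeNode
        self.data = {}

    def set(self, key, value):
        slot = hash_slot(key)
        if slot not in self.slots:
            # Do not proxy the command: tell the client where to go instead.
            owner = next(n for n in self.cluster.values() if slot in n.slots)
            return f"MOVED {slot} {owner.address}"
        self.data[key] = value
        return "OK"


class FakeClient:
    """Client that remembers which node serves each slot it has seen."""

    def __init__(self, cluster, start_address):
        self.cluster = cluster
        self.start_address = start_address
        self.slot_cache = {}      # slot -> address, filled from MOVED replies
        self.redirects = 0

    def set(self, key, value):
        slot = hash_slot(key)
        address = self.slot_cache.get(slot, self.start_address)
        reply = self.cluster[address].set(key, value)
        if reply.startswith("MOVED"):
            _, moved_slot, new_address = reply.split()
            self.slot_cache[int(moved_slot)] = new_address
            self.redirects += 1
            reply = self.cluster[new_address].set(key, value)
        return reply


def build_cluster():
    cluster = {}
    layout = [
        ("127.0.0.1:30001", range(0, 5461)),
        ("127.0.0.1:30002", range(5461, 10923)),
        ("127.0.0.1:30003", range(10923, 16384)),
    ]
    for address, slots in layout:
        cluster[address] = FakeNode(address, slots, cluster)
    return cluster


if __name__ == "__main__":
    client = FakeClient(build_cluster(), "127.0.0.1:30001")
    print(client.set("foo", "bar"), "redirects:", client.redirects)
    print(client.set("foo", "baz"), "redirects:", client.redirects)
```

The first write to a key on another node costs one redirect; after that the client contacts the right node directly.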

Because of the use of asynchronous replication, nodes do not wait for other nodes’ acknowledgment of writes (if not explicitly requested using the WAIT command).

Also, because multi-key commands are limited to near keys (keys that hash to the same slot), data is never moved between nodes except when resharding.
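
One way to make keys "near" each other is the hash-tag rule from the Redis Cluster specification: if a key contains a non-empty {...} section, only the part between the first { and the first following } is hashed. The sketch below is my own reading of that rule rather than the Redis implementation, and key_hash_slot is a made-up name. It again assumes binascii.crc_hqx with an initial value of 0 matches the CRC16 variant Redis uses.

```python
import binascii

HASH_SLOTS_NUMBER = 16384


def key_hash_slot(key: str) -> int:
    """Compute a key's slot, honouring hash tags such as {user1000}."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Use the tag only if a closing brace exists and the tag is not empty.
        if end != -1 and end != start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % HASH_SLOTS_NUMBER


if __name__ == "__main__":
    for key in ("{user1000}.following", "{user1000}.followers", "user2000.following"):
        print(f"{key!r} -> slot {key_hash_slot(key)}")
```

Because {user1000}.following and {user1000}.followers share a tag, they land in the same slot and can be used together in a multi-key command.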

Normal operations are handled exactly as in the case of a single Redis instance. This means that in a Redis Cluster with N master nodes you can expect the same performance as a single Redis instance multiplied by N as the design scales linearly. At the same time the query is usually performed in a single round trip, since clients usually retain persistent connections with the nodes, so latency figures are also the same as the single standalone Redis node case.

Very high performance and scalability while preserving weak but reasonable forms of data safety and availability is the main goal of Redis Cluster.

Setup Redis Cluster on your local machine

Automatic Approach

Redis comes with a tool named create-cluster, located at installdir/scripts/create-cluster (in the Redis source tree it lives under utils/create-cluster). It lets you avoid the manual configuration.

By default, this utility creates 6 nodes with 1 replica per master and starts creating nodes on port 30000. To avoid modifying the utility itself, it is recommended to create a config.sh script in the same folder as create-cluster with the following content:

PORT=STARTING-PORT-NUMBER
TIMEOUT=2000
NODES=6
REPLICAS=1

Start the nodes and create the cluster:

./create-cluster start
./create-cluster create

Stop the cluster:

./create-cluster stop

Clean up the folder:

./create-cluster clean
