A Step-by-Step Guide to Redis Cluster on AWS

Kekayan
10 min read · Mar 22, 2019

Welcome back to another step-by-step guide, this time covering both the theory and, as always, the practical side. Let’s dive in.

  1. What is Redis?
  2. Under the hood of the Cluster
  3. Redis Cluster setup in AWS
  4. Master-Slave Replication
  5. Simulating Fail-over

1. What is Redis? (REmote DIctionary Server)

Redis is an in-memory database, which simply means it keeps its data in RAM. It is also a key-value database and is widely used as a cache. According to db-engines.com (a site that ranks databases), Redis ranks at the top among key-value stores. In Redis Cluster the key space is split into 16384 slots, so the upper limit for the cluster size is 16384 master nodes (the suggested maximum, however, is on the order of ~1000 nodes). With Redis, we can use replication, sharding, or both.

https://db-engines.com/en/ranking/key-value+store

The formal definition from the official Redis site:

Redis is an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs, geospatial indexes with radius queries and streams. Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.

2. Under the hood of the Cluster

simple cluster client communication

Cluster nodes run a heartbeat procedure that lets them auto-discover other nodes, detect failing or unreachable nodes, and promote slave nodes to master when needed so that the cluster keeps operating after a failure. To do this, every node is connected to every other node over a TCP bus with a binary protocol, called the Redis Cluster Bus. Over this bus the nodes use a gossip protocol to propagate information about the cluster so that new nodes are discovered, send pings to make sure all the other nodes are working, and send the cluster messages needed to signal specific conditions.

Note that this PING is not the ICMP (Internet Control Message Protocol) echo used by the network ping utility to measure how the network performs between two hosts. Cluster PING and PONG are heartbeat packets that the nodes exchange over the cluster bus.

What happens between nodes

Communication between two nodes
  • They ping-pong to check each other's health. If we imagine the exchange, it looks like this:
Node 1 will ping
PING: are you ok dude? I’m master for XYZ hash slots.
Node 2 will reply
PONG: Sure I’m ok dude! I’m master for XYZ hash slots.
  • They use gossip to share info. If we imagine the exchange, it looks like this:
Node 1 gossips to Node 2: here is some info about other nodes
I'm in touch with:
Node 3 replies to my ping, I think its state is OK.
Node 4 is idle, I guess it's having problems
Node 2 gossips to Node 1: here is some info about other nodes
I'm in touch with:
Node 3 is fine and replied in time.
Node 4 is idle for me as well! Maybe it is down :(

There are two flags that are used for failure detection that are called PFAIL and FAIL. PFAIL means Possible failure, and is a non-acknowledged failure type. FAIL means that a node is failing and that this condition was confirmed by a majority of masters within a fixed amount of time.
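
The step from PFAIL to FAIL can be pictured as a simple majority count. The sketch below is only an illustration: `node_state` is a made-up helper, not a Redis command, and it ignores the time window within which the reports have to arrive.

```shell
# Hypothetical helper, not part of Redis: classify a node from the number of
# masters that currently report it as unreachable.
# Usage: node_state <masters_reporting_failure> <total_masters>
node_state() {
  reports=$1
  masters=$2
  needed=$(( masters / 2 + 1 ))   # strict majority of masters
  if [ "$reports" -ge "$needed" ]; then
    echo "FAIL"
  elif [ "$reports" -gt 0 ]; then
    echo "PFAIL"
  else
    echo "OK"
  fi
}

node_state 1 3   # one master suspects the node: prints PFAIL
node_state 2 3   # two of three masters agree:   prints FAIL
```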

The quote above is taken from the official Redis site, where you can find more info about failure detection.

Replication

Replication, also known as mirroring, means copying all of the data to another location. This allows the data to be accessed from two or more locations, which ensures high availability. In other words, replication lets multiple reads or sorts run at the same time.

Replication also comes in handy when one or more data locations (or Availability Zones) go down. The data held by a lost master is still retrievable, because one of its slaves takes its place and serves the cluster as the master did, having a copy of all of the master's data.

In Redis, we set up replication with a master-slave(s) arrangement. In simple words, the master is the main node and the slave(s) copy everything from the master, so that a slave can replace its master in case of failure. We will see this clearly in the hands-on part.

Sharding

Sharding, also known as partitioning, involves splitting data up based on key spaces. It is the mechanism behind clustering and scaling in Redis.
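
To make "splitting by key space" concrete: Redis Cluster maps a key to a hash slot by taking the CRC16 of the key modulo 16384. Below is a minimal sketch of that calculation in plain shell. The function names `crc16` and `keyslot` are my own, and the sketch ignores hash tags (the `{...}` part of a key), so treat it as an illustration rather than a replacement for the real thing.

```shell
# CRC16 (XMODEM variant: polynomial 0x1021, initial value 0) of a string.
crc16() {
  crc=0
  # od prints the bytes of the key as unsigned decimal numbers
  for byte in $(printf '%s' "$1" | od -An -v -tu1); do
    crc=$(( crc ^ (byte << 8) ))
    bit=0
    while [ "$bit" -lt 8 ]; do
      if [ $(( crc & 0x8000 )) -ne 0 ]; then
        crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
      else
        crc=$(( (crc << 1) & 0xFFFF ))
      fi
      bit=$(( bit + 1 ))
    done
  done
  echo "$crc"
}

# Hash slot of a key: CRC16(key) mod 16384. Hash tags are ignored here.
keyslot() {
  echo $(( $(crc16 "$1") % 16384 ))
}

keyslot foo   # prints 12182
```

On a running cluster you can compare the result with `redis-cli cluster keyslot foo`.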

Redis Cluster has 16384 conceptual key spaces (hash slots). Sharding divides these slots among multiple nodes instead of keeping them all on a single node, so the cluster as a whole can hold more data than one machine. For example, if you build a cluster with 3 masters, as recommended by Redis.io, the slots could be divided like this:

  • Slot 0 to 5000 → Master 1
  • Slot 5001 to 10500 → Master 2
  • Slot 10501 to 16383 → Master 3
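
The ranges above are only an example. As a rough sketch of an even split, the helper below (`slot_ranges` is a name I made up) divides the 16384 slots among a given number of masters. For three masters it should print the same ranges that redis-cli normally assigns when it creates the cluster.

```shell
# Hypothetical helper: split the 16384 hash slots evenly among N masters.
# Usage: slot_ranges <number_of_masters>
slot_ranges() {
  masters=$1
  total=16384
  first=0
  i=1
  while [ "$i" -le "$masters" ]; do
    if [ "$i" -eq "$masters" ]; then
      last=$(( total - 1 ))        # the last master takes whatever is left
    else
      # round(i * total / masters) - 1, using integer arithmetic only
      last=$(( (2 * i * total + masters) / (2 * masters) - 1 ))
    fi
    echo "Master $i: slots $first-$last"
    first=$(( last + 1 ))
    i=$(( i + 1 ))
  done
}

slot_ranges 3
# Master 1: slots 0-5460
# Master 2: slots 5461-10922
# Master 3: slots 10923-16383
```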

Assume that you have machines with 16GB of RAM. If you use 3 such machines for a Redis cluster, you end up with 48GB of RAM in total, so your single key space can hold that much more data.

This increases performance because it reduces the hit on each of the individual resources, allowing them to share the load rather than having it all in one place.

3. Redis Cluster setup in AWS

  1. Let’s create 4 EC2 instances

Log in to your AWS console and select EC2 under Services. Then click Launch Instance.

For this tutorial, I am creating t2.micro instances, which have 1GB of RAM.

Launching ec2

In the next step, I specified 4 as the Number of instances to launch. Then create a key pair for ssh access and launch. You will see all 4 instances launched.

ec2 dashboard after 4 instances created

  2. Install Redis

We’ll install Redis on the Redis1, Redis2, and Redis3 machines and keep Redis4 for the master-slave replication configuration later.

Steps to follow on each of the 3 servers:

  • Download and Extract Redis
  • Install build-essential, make
  • Make Redis libraries
  • Install tk8.5 tcl8.5 (Tcl/Tk, needed to run the Redis test suite)
  • Update configuration (redis.conf)
  • Start Redis Server

First, we log in to each of the 3 servers using ssh.

After logging in to all 3 servers

Download and Extract Redis

You can also use the apt packages. In this guide, we will build from source.

You can put the commands into a bash script and run it, or run them one by one.

Let’s create a directory called build and, inside it, download the latest version of Redis using wget. Finally, extract it and cd into the folder for the next step.
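
A minimal sketch of this step is below. The version and the download URL are assumptions on my side: 5.0.4 is the release whose folder (build/redis-5.0.4) we cd into later in this guide, and `download_redis` is just an illustrative function name.

```shell
# Assumed version and URL; adjust both if you want a different release.
REDIS_VERSION=5.0.4
REDIS_URL="http://download.redis.io/releases/redis-${REDIS_VERSION}.tar.gz"

# Create build/, download the tarball into it, extract it, and cd into the source folder.
download_redis() {
  mkdir -p build && cd build || return 1
  wget "$REDIS_URL" || return 1
  tar xzf "redis-${REDIS_VERSION}.tar.gz" || return 1
  cd "redis-${REDIS_VERSION}" || return 1
}

# Run on each of the 3 servers:
# download_redis
```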

Install build-essential, make

Since we are going to build from source, we need gcc and make, so let’s install them first.
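
A minimal sketch of this step, assuming Ubuntu/Debian servers (the guide uses apt-get elsewhere). The function name `install_build_tools` is only for illustration.

```shell
# Install the compiler toolchain (gcc comes with build-essential) and make.
install_build_tools() {
  sudo apt-get update
  sudo apt-get install -y build-essential make
}

# Run on each of the 3 servers:
# install_build_tools
```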

Now run the following command to build Redis. We set the memory allocator to libc:

make MALLOC=libc

Let’s run some tests to verify the build:

sudo apt-get install -y tk8.5 tcl8.5 
make test

It will take a few minutes to finish all 50 tests. You should see a lot of green [ok] messages like the ones below.

If you reached this step without issues :), you have built Redis. Now let’s configure and run it.

Let’s update the configurations

Use your favourite editor, like nano or vim, to open the config file.

sudo nano redis.conf

Modify these lines in the redis.conf file on each server as follows. You can use Ctrl+W in nano to search.

Change <privateIp> to the server's own private IP address.

bind <privateIp> 127.0.0.1 ::1
cluster-enabled yes
cluster-config-file nodes-6379.conf
cluster-node-timeout 5000
config file
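
If you would rather not hand-edit the file on every server, here is a small sketch that writes just these four settings to a separate file and relies on the Redis defaults for everything else. `write_cluster_conf`, the example IP and the output path are all made up for illustration.

```shell
# Hypothetical helper: write the four cluster settings from this guide to a file.
# Usage: write_cluster_conf <private_ip> <output_file>
write_cluster_conf() {
  cat > "$2" <<EOF
bind $1 127.0.0.1 ::1
cluster-enabled yes
cluster-config-file nodes-6379.conf
cluster-node-timeout 5000
EOF
}

# Example with a made-up private IP:
write_cluster_conf 172.31.0.10 /tmp/redis-cluster-example.conf
cat /tmp/redis-cluster-example.conf
```

You could then start the server with that file instead of the edited redis.conf.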

Next, start the Redis server on all 3 nodes:

src/redis-server ./redis.conf
All 3 servers started

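Now we need to connect them as a cluster.

To allow that, enable TCP on ports 6379 and 16379 in the AWS EC2 security group: 6379 is the client port and 16379 (the client port plus 10000) is the cluster bus port.

Let’s cd to the Redis directory: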
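
The second port is not arbitrary: the cluster bus listens on the client port plus 10000. If you prefer the AWS CLI to the console, the sketch below shows one way to open both ports. The security group ID and the CIDR range are placeholders you have to replace, and `open_cluster_ports` is a helper name I made up.

```shell
# The cluster bus port is the client port plus 10000.
bus_port() {
  echo $(( $1 + 10000 ))
}

# Hypothetical helper: allow TCP on the client port and the bus port.
# Usage: open_cluster_ports <security_group_id> <allowed_cidr> <client_port>
open_cluster_ports() {
  sg_id=$1
  cidr=$2
  for port in "$3" "$(bus_port "$3")"; do
    aws ec2 authorize-security-group-ingress \
      --group-id "$sg_id" --protocol tcp --port "$port" --cidr "$cidr"
  done
}

bus_port 6379   # prints 16379
# open_cluster_ports sg-xxxxxxxx 203.0.113.0/24 6379
```

Keep the allowed range as narrow as you can, since these Redis instances have no password set.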

cd build/redis-5.0.4/

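Now we need a Redis CLI that can create the cluster. As of Redis 5, cluster management is built into redis-cli (the --cluster option used below), which replaces the older Ruby-based redis-trib.rb script, so the Ruby steps are only needed if you work with that older script. If you run into any issues, skip the next 3 steps and install the CLI from the apt repo instead.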

Install GPG keys

sudo apt install gnupg2
gpg2 --recv-keys 409B6B1796C275462A1703113804BB82D39DC0E3 7D2BAF1CF37B13E2069D6956105BD0E739499BDB

Install RVM

\curl -sSL https://get.rvm.io | bash -s stable

Install Ruby & the Redis gem

source ~/.rvm/scripts/rvm
rvm requirements
rvm install ruby
gem install redis

If the above 3 steps failed for you, try this:

sudo apt install ruby
sudo gem install redis

Install Redis CLI

sudo apt install redis-tools

Create Connection

Replace the <publicip> placeholders with the correct values and run the command below to create the cluster:

redis-cli --cluster create <publicip1>:6379 <publicip2>:6379 <publicip3>:6379
Cluster Created and slots allocated to each node

Finally, we test our cluster from a different node by running the following command (replace the IP):

src/redis-cli -h <privateip> cluster nodes
cluster nodes
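
Besides `cluster nodes`, the `cluster info` command gives a quick health summary. Here is a small sketch that checks two of its fields; `cluster_is_ok` is my own helper name, and it only looks at `cluster_state` and `cluster_slots_assigned`.

```shell
# Reads `cluster info` output on stdin; succeeds only if the cluster reports
# state ok and all 16384 slots are assigned.
cluster_is_ok() {
  info=$(tr -d '\r')   # redis-cli output may carry CRLF line endings
  echo "$info" | grep -q '^cluster_state:ok$' &&
    echo "$info" | grep -q '^cluster_slots_assigned:16384$'
}

# Usage against a live node (replace the IP):
# src/redis-cli -h <privateip> cluster info | cluster_is_ok && echo "cluster looks healthy"
```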

4. Master-Slave asynchronous Replication

Master-slave replication copies data written to the master instance to a slave instance, so that slave Redis servers are exact copies of their masters. Replication is asynchronous for performance reasons. On the other hand, this means that Redis does not guarantee strong consistency: it is not possible to ensure that the slave actually received a given write, so there is always a window for data loss.

There is another caching database called Scaled Out State Server, which guarantees no window for data loss. Check my write-up on Medium about it.

Here I am configuring a slave for only one server. You can repeat the steps for the other servers so that each of the 3 masters has a slave.

We could reuse the same servers for both masters and slaves, as shown below, which is known as cross replication. For ease of understanding, though, I'll configure a new server as the slave, so you can visualize it easily.

  1. Install Redis on the new server

I am using the apt packages here, since I have already shown how to build from source.

sudo apt-get install redis-server 
sudo apt-get install redis-tools

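  2. Configure the slave

The configuration file is located in /etc/redis/. Let’s edit it:

sudo nano /etc/redis/redis.conf

Edit the config as follows (redis.conf does not allow a comment on the same line as a setting, so the comments go on their own lines):

# Private IP of this (slave) server
bind 172.31.42.94
# Redis Cluster Node2 (the master)
slaveof 172.31.42.93 6379

Or, if the new server runs with cluster-enabled yes (in which case Redis does not accept a slaveof line in the config file), start it first and then run this command on a master:

redis-cli --cluster add-node <slaveip>:<port> <masterip>:<port> --cluster-slave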

Finally, start the slave server

sudo redis-server /etc/redis/redis.conf

If you want cross replication, copy redis.conf under another name and start a new Redis server with:

sudo redis-server /etc/redis/new-redis.conf

Then repeat the master-slave setup above for it. Since it runs on a different port, don’t forget to change the port and the cluster config file name in the slave's configuration file.

Testing the cluster

We can add a key-value pair on one server and retrieve it on another.

Run the following command on two servers:

redis-cli -c -p 6379

On one server:

set key value
example: set kekayan medium-blog

On the other server, let's get it:

get key
example: get kekayan
add & retrieve key value pairs

5. Simulating Fail-over

While I was putting data into the cluster, I stopped one Redis instance that was a master and restarted it after some time. Now it runs as a slave.

# to get all nodes
redis-cli -p 6379 cluster nodes
# to get only master nodes
redis-cli -p 6379 cluster nodes | grep master
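
The raw `cluster nodes` output is a bit dense. Each line starts with the node ID, followed by the address and a comma-separated list of flags such as master, slave or fail. The sketch below boils that down to one short line per node; `summarize_nodes` is a name I made up, and it only looks at those two fields.

```shell
# Reads `cluster nodes` output on stdin and prints "<address> <role> <state>".
# Field 2 is the node address and field 3 the comma-separated flags.
summarize_nodes() {
  awk '{
    role  = ($3 ~ /master/) ? "master" : "slave"
    state = ($3 ~ /fail/)   ? "failing" : "ok"
    print $2, role, state
  }'
}

# Usage against a live node:
# redis-cli -p 6379 cluster nodes | summarize_nodes
```

Running it before and after you stop a master makes the role change easy to spot.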

So how did that happen? Let me explain:

When a master fails, the other nodes in the cluster learn about it through gossip on the cluster bus. The masters then vote on promoting one of the failed master’s slaves to be the new master. The slave that wins the vote becomes the new master and manages the key spaces that the failed master had managed.

When the failed master recovers and tries to join the cluster again, its spot as a master is already occupied by the new master. Therefore, it has to become a slave in the cluster.

This is why my stopped node became a slave when I restarted it.

That’s it for now.

If you want further info or get stuck on a step, feel free to ask in the comments :)

Happy Hacking :)
