Creating Highly Available Nodes on ICON — Stage 1: Active/Passive Failover with Pacemaker and Corosync

The result of our testnet phase 2 stress testing simulation was devastating.


100% of the nodes were down. This was somewhat expected, as all nodes ran on a single bare-minimum Docker container with testnet specs. The instances were nowhere near the recommended mainnet specs, and most P-Reps had not yet created an HA architecture with any failover or service recovery mechanism. With testnet phase 3 coming up, and full decentralization in less than a month, we should be ready to face extreme conditions and secure our network by keeping nodes alive.

Create P-Rep Node EC2 Instances

P-Rep nodes communicate through port 7100 for gRPC (peer-to-peer communication between nodes) and port 9000 for the JSON-RPC API server. Under Security Groups, create two custom rules for these ports and allow all IPv4 and IPv6 sources. Add port 22 as well so we can SSH into the servers and work directly.

We’ll be installing Corosync as our heartbeat and internal communication layer among cluster resources. Corosync uses UDP transport on ports 5404 to 5406. Let’s enable these ports as well.


We’re aiming to build an Active/Passive configuration, with the passive node acting as a redundant failover node. For this basic setup we’ll need to create two instances and an elastic IP address. Also, we’re going to purposely spin up weaker instances: our ideal test is for the primary peer to fail and the secondary peer to recover the service during our testnet stress testing. At the recommended specs, our primary node could possibly handle all requests without any downtime. We will scale up as soon as we’re done testing the setup.

Once the EC2 instances are created, go to EC2 dashboard -> Elastic IPs. An elastic IP is a static IP address that points to one of your EC2 instances. It allows you to redirect network traffic to any of your instances when needed. This is the address we use when we configure a domain or IP for our node peers to whitelist and make requests against. Now if one of our nodes goes down, we’ll dynamically point the IP to a different node, making the service available again before the faulty node is restored. Let’s assign an elastic IP to one of the P-Rep nodes.


You can also set up a DNS record for this IP; this can be our peer endpoint to whitelist, which will be used to exchange requests.


Install Corosync and Pacemaker

Next we’ll install Corosync (on both servers) as our messaging layer between the cluster servers, and Pacemaker as our cluster resource manager. Corosync is a dependency of Pacemaker, so installing Pacemaker pulls it in automatically.
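On Ubuntu, for example, the install might look like this (package names assumed from the standard Ubuntu repositories):

```shell
# Run on both servers; installing pacemaker pulls in corosync as a dependency
sudo apt-get update
sudo apt-get install -y pacemaker
```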

We’ll also need to install a management shell; some prefer crm and some prefer pcs. Either one will work.
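On Ubuntu, for instance, either shell can be installed from the repositories (package names assumed):

```shell
# crmsh provides the `crm` command used in the examples below
sudo apt-get install -y crmsh
# or, if you prefer pcs:
# sudo apt-get install -y pcs
```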

Verify we have everything installed.

$ corosync -v
Corosync Cluster Engine, version '2.4.3'
Copyright (c) 2006-2009 Red Hat, Inc.
$ crm --version
crm 3.0.1
$ pcs --version

Configure Corosync

Next we’ll need to create an auth key for the cluster, install haveged on either one of the servers, and generate a key.
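On one of the servers, something like the following should do it (haveged keeps the entropy pool filled so key generation doesn’t block):

```shell
# Install haveged to feed the kernel entropy pool
sudo apt-get install -y haveged
# Generates the cluster auth key at /etc/corosync/authkey
sudo corosync-keygen
```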

Copy the same key to PRep-02,

then on PRep-02 window, move the file to the corosync folder
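The copy-and-move might look like this (the ubuntu user and home path are placeholders for your own setup):

```shell
# On PRep-01: copy the key to PRep-02 (user/host are placeholders)
sudo scp /etc/corosync/authkey ubuntu@PRep-02_private_ip:/home/ubuntu/

# On PRep-02: move it into the corosync folder and lock down permissions
sudo mv /home/ubuntu/authkey /etc/corosync/
sudo chown root:root /etc/corosync/authkey
sudo chmod 400 /etc/corosync/authkey
```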

Next we’ll define the corosync.conf file. To make configuration a bit more convenient, let’s jot down the various instance IP addresses that we’ll need, namely the public, private, and elastic IPs.

Now edit /etc/corosync/corosync.conf on both servers. The files are identical except that the bindnetaddr parameter should be each server’s own private IP.

Your config should look something like this

totem {
    version: 2
    # unicast transport; AWS does not support multicast
    transport: udpu
    interface {
        ringnumber: 0
        # this server's private IP (differs on each server)
        bindnetaddr: server_private_ip
        broadcast: yes
        mcastport: 5405
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

nodelist {
    node {
        ring0_addr: PRep-01_private_ip
        name: PRep-01
        nodeid: 1
    }
    node {
        ring0_addr: PRep-02_private_ip
        name: PRep-02
        nodeid: 2
    }
}

logging {
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    to_syslog: yes
    timestamp: on
}

service {
    name: pacemaker
    ver: 1
}

then start corosync on both servers
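Assuming a systemd-based distribution, starting the service looks like:

```shell
# Run on both servers
sudo service corosync start
```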

Verify that our nodes have joined as a cluster,
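One way to check membership is to query corosync's runtime database; both nodes' private IPs should be listed:

```shell
# Lists the current cluster members and their addresses
sudo corosync-cmapctl | grep members
```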


then start pacemaker
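Again assuming a systemd-based distribution:

```shell
# Run on both servers
sudo service pacemaker start
# Check cluster status; both nodes should show as online
sudo crm status
```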


Our nodes should be online. Since we’re running a two-node setup, both STONITH (a fencing mechanism that shuts down faulty nodes) and the quorum policy should be disabled.
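With the crm shell, disabling both is done through cluster properties:

```shell
# STONITH requires a fencing device we don't have in this two-node setup
sudo crm configure property stonith-enabled=false
# A two-node cluster can never have quorum after one node fails
sudo crm configure property no-quorum-policy=ignore
```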

Verify the configuration.
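```shell
# Prints the full cluster configuration, including the properties set above
sudo crm configure show
```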


Configure AWS CLI

We will be using the AWS CLI for the elastic IP reallocation. For this we first need to install the CLI executables and configure a few settings; you will need an AWS Access Key ID and AWS Secret Access Key.

  1. Log in to your AWS Management Console.
  2. Click on your user name at the top right of the page.
  3. Click on the Security Credentials link from the drop-down menu.
  4. Find the Access Credentials section, and copy the latest Access Key ID.
  5. Click on the Show link in the same row, and copy the Secret Access Key.
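With the keys in hand, the install-and-configure step might look like this on Ubuntu (installing via pip here; AWS also ships a bundled installer):

```shell
# Install pip and the AWS CLI
sudo apt-get install -y python3-pip
pip3 install awscli

# Interactive prompt for Access Key ID, Secret Access Key,
# default region, and output format
aws configure
```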

Next we’ll need a resource agent to manage the elastic IP. We can use AWS’s EIP resource agent awseip, which is found in /usr/lib/ocf/resource.d/heartbeat/awseip.

Also add AWS_DEFAULT_REGION=<AWS-Default-Region> at the end of /etc/systemd/system/

Then we’ll create a primitive resource for the agent to manage. A primitive resource is a singular resource that can be managed by the cluster; that is, the resource can be started only once. An IP address, for example, can be a primitive, and that IP address should be running once and only once in the cluster.

your_elastic_ip is the elastic IP we allocated and associated to PRep-01 earlier, its allocation ID can be found under EC2 Dashboard -> Elastic IPs -> Allocation ID. Check the status again,
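A sketch of the primitive definition, using the awseip agent's elastic_ip and allocation_id parameters (the resource name elastic-ip matches the status output below; the timeout values are assumptions you may want to tune):

```shell
sudo crm configure primitive elastic-ip ocf:heartbeat:awseip \
    params elastic_ip="your_elastic_ip" allocation_id="your_allocation_id" \
    op start timeout=180s \
    op stop timeout=180s \
    op monitor interval=30s timeout=60s
```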


The elastic-ip resource should be started on our first peer node. At this moment, we have an active node (PRep-01), a passive node (PRep-02) and an elastic IP pointing to the active node. Whenever our node becomes inaccessible, the resource agent should automatically point the floating IP to the backup node. Let’s test this.

On a 3rd instance (I am using my local computer here), curl the elastic IP address repeatedly.
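A simple polling loop works for this (the IP is a placeholder for your elastic IP):

```shell
# Fetch the index page from the elastic IP once per second
while true; do curl -s http://your_elastic_ip; sleep 1; done
```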

This should continuously pull the content of the index page. For testing purposes, modify the index page (for nginx the default index page is /var/www/html/index.nginx-debian.html) to show the instance number and IP. Now let’s simulate active node downtime.
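One way to take the active node down, assuming the cluster stack is what we want to kill:

```shell
# On PRep-01: stop corosync so the cluster sees the node as failed
sudo service corosync stop
```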

You may or may not see a service interruption message like this on the 3rd terminal.

But either way, you should see the curl to the elastic IP now showing contents from the backup instance within a few seconds.


Also if you go back to the EC2 dashboard, you’ll notice the elastic IP has been automatically reassigned to the 2nd instance. You can bring the node back up via
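Restarting the stack we stopped earlier should do it:

```shell
# On PRep-01: bring the cluster stack back up
sudo service corosync start
sudo service pacemaker start
```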

At this point we have completed a basic failover setup in active/passive configuration, with a floating IP automatically reassigned when the active node goes down. Stay tuned for stage 2, where we’ll be configuring ICON’s citizen nodes, P-Rep nodes and nginx reverse proxy configurations!
