Creating Highly Available Nodes on ICON — Stage 1: Active/Passive Failover with Pacemaker and Corosync

2infiniti (Justin Hsiao)
Sep 4, 2019 · 7 min read


The result of our testnet phase 2 stress testing simulation was devastating.

100% of the nodes were down. This was somewhat expected: all nodes were running on a single bare-minimum Docker container with testnet specs, the instances were nowhere near the recommended mainnet specs, and most P-Reps had not yet built an HA architecture with any failover or service-recovery mechanism. With testnet phase 3 coming up, and full decentralization less than a month away, we should be ready to face extreme conditions and secure our network by keeping nodes alive.

Create P-Rep Node EC2 Instances

P-Rep nodes communicate over port 7100 for gRPC (peer-to-peer communication between nodes) and port 9000 for the JSON-RPC API server. Under Security Groups, create two custom rules for these ports and allow all IPv4 and IPv6 sources. Add port 22 as well so we can SSH into the servers and work directly.

We’ll be installing Corosync as the heartbeat and internal communication layer among cluster resources. Corosync uses UDP transport on ports 5404 to 5406, so let’s open these ports as well.
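If you prefer the AWS CLI over the console, a rough sketch of these rules (the security group ID is a placeholder; equivalent IPv6 rules can be added with --ip-permissions):

# P2P (gRPC) and JSON-RPC API ports
$ aws ec2 authorize-security-group-ingress --group-id <security-group-id> --protocol tcp --port 7100 --cidr 0.0.0.0/0
$ aws ec2 authorize-security-group-ingress --group-id <security-group-id> --protocol tcp --port 9000 --cidr 0.0.0.0/0
# SSH access
$ aws ec2 authorize-security-group-ingress --group-id <security-group-id> --protocol tcp --port 22 --cidr 0.0.0.0/0
# Corosync UDP transport
$ aws ec2 authorize-security-group-ingress --group-id <security-group-id> --protocol udp --port 5404-5406 --cidr 0.0.0.0/0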

We’re aiming to build an Active/Passive configuration, with the passive node acting as a redundant failover node. For this basic setup we’ll need to create two instances and an elastic IP address. We’re also going to purposely spin up weaker instances: the ideal test is for the primary peer to fail and the secondary peer to recover the service during our testnet stress testing. At the recommended specs, our primary node could possibly handle all requests without any downtime. We will scale up as soon as we’re done testing the setup.
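If you’d rather launch the instances from the CLI, something like this works (the AMI, key pair, security group and subnet IDs are placeholders, and the instance type is deliberately small for the failover test):

$ aws ec2 run-instances --image-id <ubuntu-ami-id> --instance-type t2.small --count 2 --key-name <key-pair-name> --security-group-ids <security-group-id> --subnet-id <subnet-id>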

Once the EC2 instances are created, go to EC2 dashboard -> Elastic IPs. An elastic IP is a static IP address that points to one of your EC2 instances and lets you redirect network traffic to any of your instances when needed. This is the address (or the domain in front of it) that our node peers whitelist and exchange requests with. If one of our nodes goes down, we dynamically point the IP to a different node, making the service available again before the faulty node is restored. Let’s assign an elastic IP to one of the P-Rep nodes.
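A quick sketch of the same allocation and association from the CLI (the instance and allocation IDs are placeholders):

# allocate a new elastic IP in the VPC
$ aws ec2 allocate-address --domain vpc
# associate it with PRep-01
$ aws ec2 associate-address --instance-id <PRep-01-instance-id> --allocation-id <eipalloc-id>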

Optional

You can also set up a DNS record for this IP; this can serve as the peer endpoint to whitelist, which will be used to exchange requests.
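If the domain is hosted in Route 53, one way to create the record from the CLI (the hosted zone ID, record name, TTL and IP below are placeholders):

$ aws route53 change-resource-record-sets --hosted-zone-id <hosted-zone-id> --change-batch '{"Changes":[{"Action":"UPSERT","ResourceRecordSet":{"Name":"node.example.com","Type":"A","TTL":300,"ResourceRecords":[{"Value":"<elastic-ip>"}]}}]}'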

Install Corosync and Pacemaker

Next we’ll install Corosync (on both servers) as our messaging layer between the cluster nodes, and Pacemaker as our cluster resource manager. Corosync is a dependency of Pacemaker, so installing Pacemaker pulls it in automatically.

$ sudo apt-get -y update
$ sudo apt-get install pacemaker

We’ll also need to install a management shell; some prefer crm and some prefer pcs. Either one will work, so install whichever you like (or both):

$ sudo apt install crmsh
$ sudo apt install pcs

Verify we have everything installed.

$ pacemakerd --version
Pacemaker 1.1.18
Written by Andrew Beekhof
$ corosync -v
Corosync Cluster Engine, version '2.4.3'
Copyright (c) 2006-2009 Red Hat, Inc.
$ crm --version
crm 3.0.1
$ pcs --version
0.9.164

Configure Corosync

Next we’ll need to create an auth key for the cluster: install haveged (an entropy daemon that speeds up key generation) on one of the servers, then generate the key.

# install package
$ sudo apt-get install haveged
# generate key
$ sudo corosync-keygen

Copy the same key to PRep-02,

$ sudo scp /etc/corosync/authkey username@PRep-02_ip:/tmp

then, on PRep-02, move the file into the corosync directory

$ sudo mv /tmp/authkey /etc/corosync/
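Corosync expects the key to be readable by root only; if the ownership or mode changed during the copy, restore the defaults that corosync-keygen uses:

$ sudo chown root:root /etc/corosync/authkey
$ sudo chmod 400 /etc/corosync/authkey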

Next we’ll define the corosync.conf file. To make configuration a bit more convenient, jot down the various instance IP addresses that we’ll need, namely the public, private and elastic IPs.

Now edit the file /etc/corosync/corosync.conf on both servers. The files are identical except that the bindnetaddr parameter is the private IP of the server you are editing on.

$ sudo nano /etc/corosync/corosync.conf

Your config should look something like this

totem {
    version: 2
    cluster_name: nodecluster
    transport: udpu
    interface {
        ringnumber: 0
        bindnetaddr: current_instance_private_ip
        broadcast: yes
        mcastport: 5405
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

nodelist {
    node {
        ring0_addr: PRep-01_private_ip
        name: PRep-01
        nodeid: 1
    }
    node {
        ring0_addr: PRep-02_private_ip
        name: PRep-02
        nodeid: 2
    }
}

logging {
    to_logfile: yes
    logfile: /var/log/corosync/corosync.log
    to_syslog: yes
    timestamp: on
}

service {
    name: pacemaker
    ver: 1
}

then start corosync on both servers

$ sudo service corosync start

Verify that our nodes have joined as a cluster,

$ sudo corosync-cmapctl | grep members

then start pacemaker

$ sudo service pacemaker start

Our nodes should now be online. Since we’re running a two-node setup, both STONITH (a fencing mechanism that powers off or isolates faulty nodes) and the quorum policy should be disabled.

$ crm configure property stonith-enabled=false
$ crm configure property no-quorum-policy=ignore

verify the configuration,

$ crm configure show

Configure AWS CLI

We will be using the AWS CLI for elastic IP reallocation. First we need to install the CLI executables and configure a few settings; you will need your AWS Access Key ID and AWS Secret Access Key.

  1. Log in to your AWS Management Console.
  2. Click on your user name at the top right of the page.
  3. Click on the Security Credentials link from the drop-down menu.
  4. Find the Access Credentials section, and copy the latest Access Key ID.
  5. Click on the Show link in the same row, and copy the Secret Access Key.

$ sudo apt update
$ sudo apt install awscli
# change to root, this is necessary
$ sudo su -
$ aws configure
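aws configure will prompt for the credentials copied above, plus a default region and output format. It looks roughly like this (the region is only an example; use the region your instances run in):

AWS Access Key ID [None]: <your-access-key-id>
AWS Secret Access Key [None]: <your-secret-access-key>
Default region name [None]: us-east-1
Default output format [None]: json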

Next we’ll need a resource agent to manage the elastic IP. We can use AWS’s EIP resource agent awseip, which is found in /usr/lib/ocf/resource.d/heartbeat/awseip.

Also add AWS_DEFAULT_REGION=<AWS-Default-Region> to /etc/systemd/system/multi-user.target.wants/pacemaker.service so the resource agent knows which region to operate in.
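A minimal sketch of what that edit can look like, assuming the variable is passed through an Environment= directive in the unit’s [Service] section (reload systemd afterwards so the change takes effect):

[Service]
# ...existing directives stay as they are...
Environment=AWS_DEFAULT_REGION=<AWS-Default-Region>

$ sudo systemctl daemon-reload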

Then we’ll create a primitive resource for the agent to manage. A primitive resource is a singular resource that can be managed by the cluster, meaning it can be started only once. An IP address, for example, can be a primitive: that IP address should be running once, and only once, in the cluster.

$ sudo crm configure primitive elastic-ip ocf:heartbeat:awseip \
    params elastic_ip="your_elastic_ip" awscli="$(which aws)" allocation_id="your_elastic_ip_allocation_id" \
    op start timeout="60s" interval="0s" on-fail="restart" \
    op monitor timeout="60s" interval="10s" on-fail="restart" \
    op stop timeout="60s" interval="0s" on-fail="block" \
    meta migration-threshold="2" failure-timeout="60s" resource-stickiness="100"

your_elastic_ip is the elastic IP we allocated and associated with PRep-01 earlier; its allocation ID can be found under EC2 Dashboard -> Elastic IPs -> Allocation ID. Check the cluster status:

$ sudo crm status

The elastic-ip resource should be started on our first peer node. At this moment, we have an active node (PRep-01), a passive node (PRep-02) and an elastic IP pointing to the active node. Whenever our node becomes inaccessible, the resource agent should automatically point the floating IP to the backup node. Let’s test this.

On a third machine (I am using my local computer here), curl the elastic IP address:

localhost$ while true; do curl elastic_ip; sleep 2; done

This should repeatedly fetch the content of the index page. For testing purposes, modify the index page on each instance (for nginx the default index page is /var/www/html/index.nginx-debian.html) to include the instance number and IP, so you can tell the two nodes apart.
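One quick way to do that (a sketch, assuming nginx is installed and serving the default site on both nodes; repeat on PRep-02 with its own name):

$ echo "PRep-01 $(hostname -I)" | sudo tee /var/www/html/index.nginx-debian.html

Now let’s simulate downtime on the active node: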

$ sudo crm node standby PRep-01

You may or may not see a brief service interruption, such as this message on the third terminal:

curl: (7) Failed to connect to node.icxstation.com port 80: Connection refused

But either way, you should see the curl to the elastic IP now showing contents from the backup instance within a few seconds.

Also if you go back to the EC2 dashboard, you’ll notice the elastic IP has been automatically reassigned to the 2nd instance. You can bring the node back up via

$ sudo crm node online PRep-01

At this point we have completed a basic failover setup in active/passive configuration, with a floating IP automatically reassigned when the active node goes down. Stay tuned for stage 2, where we’ll be configuring ICON’s citizen nodes, P-Rep nodes and nginx reverse proxy configurations!
