Running Ethereum nodes in a high-availability cluster on AWS
Many modern systems use the Ethereum blockchain for mission-critical functionality.
In such systems, components like services, databases and queues are usually configured for high availability. The Ethereum node must not become the single point of failure.
This article guides you through setting up a highly available Ethereum node cluster using AWS EC2 and an Application Load Balancer.
I assume you are familiar with the Amazon Web Services EC2 and Load Balancer products.
Architecture overview
Solution highlights
- Loosely coupled: the AWS Elastic Load Balancer serves as a single endpoint that abstracts away failover.
- Highly available: EC2 instances can be located in separate Availability Zones.
- Secure: all proposed software packages are open source.
- Scalable: the load balancer distributes requests across nodes.
- Easy to implement: minimal and simple configuration, minimal or no changes to the existing architecture.
How it works
- Ethereum node software is installed on the EC2 instances.
- The AWS Application Load Balancer forwards JSON-RPC requests from the client application to one of the EC2 instances.
- In addition to the Ethereum node software, every EC2 instance also runs an HTTP service, eth-node-lb-healthcheck.
- The load balancer connects to eth-node-lb-healthcheck over HTTP to perform periodic health checks.
This little HTTP service checks the health and synchronization state of the Ethereum node software. To check synchronization, eth-node-lb-healthcheck obtains the latest block from the third-party service infura.io and compares it with the node's latest block.
If the Ethereum node software is unavailable or out of sync, the service returns HTTP status code 500 and the load balancer marks the EC2 instance as unhealthy, routing subsequent requests to the other node(s).
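The health decision described above can be sketched as follows. This is a minimal illustration in Python, not the actual eth-node-lb-healthcheck source: the function names are hypothetical, and treating a lag strictly greater than the threshold as "out of sync" is an assumption on my part.

```python
from typing import Optional


def parse_block_number(hex_value: str) -> int:
    """Convert a JSON-RPC hex quantity such as "0x10d4f" to an integer."""
    return int(hex_value, 16)


def health_status(
    local_block: Optional[int],
    reference_block: int,
    max_block_difference: int = 3,
) -> int:
    """Return the HTTP status code a health endpoint could answer with.

    local_block is None when the local node could not be reached.
    reference_block is the latest block reported by the third-party service.
    """
    if local_block is None:
        return 500  # node software is not available
    lag = reference_block - local_block
    if lag > max_block_difference:  # assumption: strictly greater means out of sync
        return 500
    return 200


if __name__ == "__main__":
    local = parse_block_number("0x64")      # 100
    reference = parse_block_number("0x66")  # 102
    print(health_status(local, reference))
```

The load balancer only needs the status code: anything other than a success code takes the instance out of rotation until the check passes again.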
Integration notes
When a node goes offline, several requests might fail before the load balancer marks it as unhealthy; subsequent requests will be forwarded to healthy nodes.
Make sure to implement retries in every consumer of the Ethereum JSON-RPC service.
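A retry wrapper for a JSON-RPC consumer might look like the sketch below. It is a generic illustration with exponential backoff; the function name, the number of attempts and the delays are assumptions chosen for the example, not values recommended by AWS or by any Ethereum client library.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call() and retry on network-level errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Exception = RuntimeError("unreachable")
    for attempt in range(attempts):
        try:
            return call()
        except OSError as exc:  # covers ConnectionError, TimeoutError, URLError
            last_error = exc
            if attempt < attempts - 1:
                # Wait 0.5 s, 1 s, 2 s, ... before the next attempt.
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

Wrap each RPC call in a zero-argument function (for example a lambda) and pass it to call_with_retry. The total retry window should be longer than the time the load balancer needs to detect an unhealthy node, which depends on the health check interval and threshold you configure.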
Setting up EC2 instances
To further improve cluster availability, I recommend placing the EC2 instances running Ethereum nodes in two separate Availability Zones.
Tip: You can configure one EC2 instance and then clone it into another Availability Zone.
Install Ethereum node software
If you don’t have nodes running already, install them using a guide such as
https://github.com/ethereum/go-ethereum/wiki/Installation-Instructions-for-Ubuntu
Tip: You will be using an AWS-provided URI to connect to your cluster, so make sure to add --rpcvhosts=* (--http.vhosts=* in newer geth releases) to the geth start command.
Install node.js
~$ sudo apt update
~$ sudo apt install nodejs
Install npm
~$ sudo apt install npm
Install and configure eth-node-lb-healthcheck
~$ sudo npm install -g eth-node-lb-healthcheck
If you have installed eth-node-lb-healthcheck globally, its configuration file (.env) should be located in this directory:
/usr/local/lib/node_modules/eth-node-lb-healthcheck
Edit configuration file
~$ sudo vim /usr/local/lib/node_modules/eth-node-lb-healthcheck/.env
ETH_RPC_HOST: URI where Ethereum node software is running. Default: 127.0.0.1
ETH_RPC_PORT: RPC Port of the Ethereum node software. Default: 8545
ETH_NETWORK: Ethereum network. Possible values: homestead, rinkeby, ropsten, kovan, goerli. Default: homestead
ETH_MONITOR_PORT: Port on which the HTTP health check service listens. Default: 50000
Make sure this port is open in the EC2 security groups.
MAX_BLOCK_DIFFERENCE: Maximum number of blocks the node may lag behind and still be considered synchronized. Default: 3
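To make the interplay of defaults and overrides concrete, here is a small Python sketch that resolves these settings from an environment mapping. It only mirrors the defaults listed above; load_config is a hypothetical helper and says nothing about how eth-node-lb-healthcheck itself reads its .env file.

```python
from typing import Any, Dict, Mapping

# Defaults as documented in the list above.
DEFAULTS = {
    "ETH_RPC_HOST": "127.0.0.1",
    "ETH_RPC_PORT": "8545",
    "ETH_NETWORK": "homestead",
    "ETH_MONITOR_PORT": "50000",
    "MAX_BLOCK_DIFFERENCE": "3",
}
NETWORKS = {"homestead", "rinkeby", "ropsten", "kovan", "goerli"}
INTEGER_KEYS = ("ETH_RPC_PORT", "ETH_MONITOR_PORT", "MAX_BLOCK_DIFFERENCE")


def load_config(env: Mapping[str, str]) -> Dict[str, Any]:
    """Merge overrides from env with the documented defaults and validate them."""
    config: Dict[str, Any] = {
        key: env.get(key, default) for key, default in DEFAULTS.items()
    }
    if config["ETH_NETWORK"] not in NETWORKS:
        raise ValueError(f"unknown ETH_NETWORK: {config['ETH_NETWORK']}")
    for key in INTEGER_KEYS:
        config[key] = int(config[key])  # raises ValueError on non-numeric input
    return config
```

Calling load_config(os.environ) after importing os would pick up any variables exported in the shell and fall back to the defaults for the rest.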
Install forever-service
~$ sudo npm install -g forever
~$ sudo npm install -g forever-service
Setup and launch
~$ sudo forever-service install eth-node-lb-healthcheck -s /usr/local/lib/node_modules/eth-node-lb-healthcheck/index.js --start
To restart the service automatically after a server reboot, create the following script:
~$ sudo cat /home/ubuntu/forever.sh
#!/bin/bash
sudo service eth-node-lb-healthcheck restart
Create a cron job
~$ crontab -e
@reboot sleep 60 && sh /home/ubuntu/forever.sh > /dev/null
Configuring load balancer
1. Create an Application Load Balancer
- Set the load balancer port to 8545
- Select the Availability Zones where the EC2 instances running the node software are located
2. Configure Security Groups
Important: Always control traffic to port 8545, never leave it wide open
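3. Configure a new Target Group
Route traffic to the node's RPC port (8545 by default) and point the health check at the eth-node-lb-healthcheck port (50000 by default) over HTTP.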
4. Register instances
Select instances and click “Add to registered”
5. Review and create load balancer
Testing the solution
From a machine on the same network, try to connect to the geth cluster through the ALB:
~$ curl -H "Content-Type: application/json" --data "{\"jsonrpc\":\"2.0\",\"method\":\"eth_syncing\",\"params\":[],\"id\":42}" http://<your_alb_uri>:8545
You should get a response similar to this:
{"jsonrpc":"2.0","id":42,"result":false}
To test failover, stop the EC2 instances one at a time and rerun the curl command above to check connectivity.
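If you prefer to script the check, the same eth_syncing request can be sent from Python using only the standard library. This is a rough sketch: build_payload, is_synced and probe are hypothetical names, and the URL you pass to probe is whatever address your load balancer exposes.

```python
import json
import urllib.request
from typing import Any, List, Optional


def build_payload(
    method: str, params: Optional[List[Any]] = None, request_id: int = 42
) -> bytes:
    """Encode a JSON-RPC 2.0 request body."""
    body = {
        "jsonrpc": "2.0",
        "method": method,
        "params": params or [],
        "id": request_id,
    }
    return json.dumps(body).encode("utf-8")


def is_synced(response_body: str) -> bool:
    """eth_syncing returns false once the node is synced, an object otherwise."""
    return json.loads(response_body).get("result") is False


def probe(url: str, timeout: float = 5.0) -> bool:
    """Send eth_syncing to the given endpoint and report whether it is synced."""
    request = urllib.request.Request(
        url,
        data=build_payload("eth_syncing"),  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return is_synced(response.read().decode("utf-8"))
```

Calling probe in a loop while you stop one instance shows how many requests fail before the load balancer takes the unhealthy node out of rotation.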