How To — Grafana Monitoring Dashboard for Shipchain Validator nodes

Shipmate.FR.nl
Shipchain (un) Official Community
Aug 24, 2020
Example of a fully functioning Grafana Dashboard

At the end of July 2020, Shipchain launched its mainnet, a public delegated Proof of Stake (dPoS) sidechain of the Ethereum network (http://blog.shipchain.io/announcing-the-launch-of-shipchain-mainnet/).

As in every dPoS blockchain, blocks are validated by a few dozen nodes at most, called Validators, as a trade-off between decentralization and speed.

On top of securing their nodes, all Validators must guarantee maximum uptime to the Delegators who decided to stake their tokens with them, since staking rewards are only allocated while the node is validating blocks.

In a humble attempt to roll out best practices amongst all Shipchain Validators, this article provides a step-by-step procedure to set up a Grafana Dashboard, using Prometheus metrics together with the InfluxDB and Telegraf packages. This is the prerequisite for the next article, which will look into activating alerts on Grafana when critical parameters like Vote Percentage or Disk Space call for immediate intervention from the sysadmin.

We will also cover in a third and last article how to set up a Dead Man's Snitch on a Validator node, relayed in real time to your mobile phone through PagerDuty. This complementary setup is crucial: it lets any sysadmin be properly notified should all services (or the server itself) be completely down.

Now let’s get started…

We will first install and configure InfluxDB and Telegraf, then the Grafana server, and finally a few cronjobs on your node. The next steps will then cover how to set up a pre-defined Dashboard like the one above.

STEP 1: Install the InfluxDB package

InfluxDB is the database where all data collected for the Dashboard will be stored. Should any of these commands fail, feel free to refer to the official documentation: https://docs.influxdata.com/influxdb/v1.8/introduction/install/

sudo curl -sL https://repos.influxdata.com/influxdb.key | sudo apt-key add -
source /etc/lsb-release
echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
sudo apt update
sudo apt install influxdb -y
sudo systemctl start influxdb
sudo systemctl enable influxdb
sudo apt install influxdb-client

As a quick sanity check, run the next 2 commands and make sure the InfluxDB ports ‘8088’ and ‘8086’ appear in the ‘LISTEN’ state (note: you do not need to set up any firewall rules for these ports):

sudo apt install net-tools
netstat -plntu
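If you would rather not scan the whole netstat listing by eye, the check can be scripted. The snippet below is only a minimal sketch: `port_is_listening` is a hypothetical helper name of my own, and it assumes the usual Linux `netstat -plntu` column layout (local address in column 4, state in column 6).

```shell
# Hypothetical helper (not part of InfluxDB): reads `netstat -plntu` output on
# stdin and succeeds if the given TCP port is in the LISTEN state.
# Assumes the usual Linux layout: local address in column 4, state in column 6.
port_is_listening() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $6 == "LISTEN" {
            n = split($4, parts, ":")   # the port follows the last ":"
            if (parts[n] == port) found = 1
        }
        END { exit (found ? 0 : 1) }'
}

# Example usage on the node itself:
#   sudo netstat -plntu | port_is_listening 8086 && echo "8086 is listening"
#   sudo netstat -plntu | port_is_listening 8088 && echo "8088 is listening"
```

The helper only parses text, so it can also be tried on a saved copy of the netstat output.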

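The next commands open the influx shell, create a database and a user both called ‘telegraf’ with a password of your choosing (just replace the text between the ‘ ‘), and then list the databases and users so you can confirm that both were created: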

influx
create database telegraf
create user telegraf with password 'yourpassword'
show databases
show users
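The same verification can be done without entering the interactive shell, using the `-execute` flag of the InfluxDB 1.x `influx` client. This is a small sketch rather than an official procedure, and `has_database` is a hypothetical helper name.

```shell
# Hypothetical helper: reads the output of `influx -execute 'SHOW DATABASES'`
# on stdin and succeeds if one line is exactly the given database name.
has_database() {
    grep -qxF "$1"
}

# Example usage on the node (InfluxDB 1.x client):
#   influx -execute 'SHOW DATABASES' | has_database telegraf && echo "database exists"
#   influx -execute 'SHOW USERS'
```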

STEP 2: Install the Telegraf package

Telegraf pulls data from multiple sources and stores it in the InfluxDB database. Since this tutorial intends to deliver a turnkey, pre-defined Dashboard for a Shipchain Validator node, you will find a pre-defined telegraf.conf settings file at this address:

https://drive.google.com/file/d/1w0UF6s7YyV67daVtWaPBKENG-o8hE7_G/view?usp=sharing

There is a wide variety of plugins that can be added in the telegraf.conf file to customize your dashboard. Here is where you can read more about them: https://v2.docs.influxdata.com/v2.0/reference/telegraf-plugins/#:~:text=Telegraf%20is%20a%20plugin%2Ddriven,output%2C%20aggregator%2C%20and%20processor.

sudo apt-get update && sudo apt-get install telegraf
sudo service telegraf start
sudo mv /etc/telegraf/telegraf.conf /etc/telegraf/backup_telegraf.conf
sudo vim /etc/telegraf/telegraf.conf # paste the content of the pre-defined telegraf.conf file downloaded from the link above
sudo systemctl restart telegraf
sudo systemctl status telegraf
The parts in yellow can be customized according to your own installation.
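Before restarting Telegraf it can help to double-check the edited file. The sketch below assumes that the pre-defined telegraf.conf uses the standard `[[outputs.influxdb]]` section with a `database = "..."` key, which I cannot guarantee for every version of the file. `check_telegraf_conf` is a hypothetical helper name.

```shell
# Hypothetical helper: checks that a telegraf.conf declares an InfluxDB output
# and writes to the expected database. Commented-out lines are ignored.
check_telegraf_conf() {
    conf="$1"
    db="$2"
    if ! grep -v '^[[:space:]]*#' "$conf" | grep -qF '[[outputs.influxdb]]'; then
        echo "missing [[outputs.influxdb]] section"
        return 1
    fi
    if ! grep -v '^[[:space:]]*#' "$conf" |
            grep -Eq "^[[:space:]]*database[[:space:]]*=[[:space:]]*\"$db\""; then
        echo "database is not set to \"$db\""
        return 1
    fi
    echo "ok"
}

# Example usage on the node:
#   check_telegraf_conf /etc/telegraf/telegraf.conf telegraf
# Telegraf can also run every input once and print the collected metrics
# without sending them to InfluxDB:
#   telegraf --config /etc/telegraf/telegraf.conf --test
```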

STEP 3: Install the Grafana Server

Should any of these commands below fail or become outdated, feel free to refer to the official documentation: https://grafana.com/docs/grafana/latest/installation/debian/

The last 2 commands should ensure that the Grafana server is accessible from the outside, but the steps may differ depending on your own node setup and security measures.

sudo apt-get update
sudo apt-get install -y adduser libfontconfig1
wget https://dl.grafana.com/oss/release/grafana_7.0.1_amd64.deb
sudo dpkg -i grafana_7.0.1_amd64.deb
sudo systemctl enable grafana-server
sudo systemctl start grafana-server
sudo systemctl status grafana-server
sudo ufw allow 3000
sudo ufw default allow outgoing
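To confirm that the server answers before opening a browser, Grafana exposes a `/api/health` endpoint. The snippet below is a minimal sketch that assumes the default port 3000. `grafana_is_healthy` is a hypothetical helper name.

```shell
# Hypothetical helper: reads the JSON returned by Grafana's /api/health
# endpoint on stdin and succeeds if the database is reported as "ok".
grafana_is_healthy() {
    grep -Eq '"database"[[:space:]]*:[[:space:]]*"ok"'
}

# Example usage on the node (3000 is Grafana's default port):
#   curl -s http://localhost:3000/api/health | grafana_is_healthy && echo "Grafana is up"
```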

STEP 4: Set up the Cronjobs

This will set up automatic tasks running every minute on your server, storing the data that appears when typing ‘hydra client status’ in a few txt files. A Telegraf plugin will pick up the content of these txt files and make it available as variables in Grafana.

sudo apt-get install jq
crontab -l
crontab -e

Note that the first line of the crontab needs to look like this:

PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/snap/bin

The content of this line will differ for every user. Fortunately, you can check which PATH is correct for your own installation with the following command:

type hydra

This command should return something like “hydra is hashed (/usr/local/bin/hydra)”. In that case, your PATH needs to include at least /usr/local/bin. The next lines of the crontab should then contain the following 3 commands (please adapt the paths to your own installation, ‘niko’ being my own username, and remember to amend the telegraf.conf file accordingly).

*/1 * * * * hydra -o json client status > /home/niko/hydraoutput.json
*/1 * * * * hydra -o json client status | jq .is_caught_up 1>/home/niko/is_caught_up.txt
*/1 * * * * hydra -o json client status | jq .is_a_validator 1>/home/niko/is_a_validator.txt
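One possible refinement, which is not required for the Dashboard to work: with a plain `>` redirect, the target file is emptied as soon as the command starts, so Telegraf could in principle read it while it is still empty. The sketch below writes to a temporary file first and then renames it into place. `atomic_write` is a hypothetical helper name, and wiring it into your crontab is left to your own setup.

```shell
# Hypothetical helper: writes stdin to a temporary file next to the target,
# then renames it into place, so a reader never sees a half-written file.
atomic_write() {
    dest="$1"
    tmp=$(mktemp "${dest}.XXXXXX") || return 1
    cat > "$tmp"
    chmod 644 "$tmp"    # mktemp creates the file as 0600; keep it readable for Telegraf
    mv "$tmp" "$dest"
}

# Example usage from a script that defines the function above:
#   hydra -o json client status | jq .is_caught_up | atomic_write /home/niko/is_caught_up.txt
```

The rename is atomic because the temporary file is created in the same directory as the target.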

STEP 5: Activate Prometheus on your node

Please check that Prometheus is activated on your node. Open the /shipchain-mainnet/chaindata/config/config.toml file and, under the [instrumentation] section, make sure that prometheus is set to true (prometheus = true).
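This check can also be scripted. The sketch below assumes a plain TOML layout with one `key = value` pair per line. `prometheus_enabled` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if prometheus is set to true inside the
# [instrumentation] section of the given config.toml.
prometheus_enabled() {
    awk '
        # Every line starting with "[" opens a new section.
        substr($0, 1, 1) == "[" { in_section = (index($0, "[instrumentation]") == 1) }
        in_section && /^[ \t]*prometheus[ \t]*=[ \t]*true/ { found = 1 }
        END { exit (found ? 0 : 1) }' "$1"
}

# Example usage on the node:
#   prometheus_enabled /shipchain-mainnet/chaindata/config/config.toml && echo "Prometheus is on"
```

Keys such as prometheus_listen_addr are not matched, because the pattern requires the key to be exactly `prometheus`.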

STEP 6: Set up the Grafana Data Source

Note: The instructions below are copied from the tutorial available here (credits and copyrights to its authors): https://www.howtoforge.com/tutorial/how-to-install-tig-stack-telegraf-influxdb-and-grafana-on-ubuntu-1804/

Open your web browser and type your server IP address with port 3000.

http://YOURIP:3000/

Login with the default user ‘admin’ and password ‘admin’.

You will now be prompted to change the default password. Type your new password and click the ‘Save’ button.

You will then be redirected to the default Grafana Dashboard.

Click the ‘Add data source’ button to add the influxdb data source.

Enter the InfluxDB server configuration details.

Scroll to the bottom of the page and enter the InfluxDB database settings.

  • Database: telegraf (unless you named it differently at step 1)
  • User: telegraf (unless you named it differently at step 1)
  • Password: ‘xxxxx’ (this is the password you chose when creating the InfluxDB user and entered in the telegraf.conf file)

Click the ‘Save and Test’ button and make sure you get the ‘Data source is working’ result.

The InfluxDB data source has been added to the Grafana server.
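For those who prefer the command line, the same data source can in principle be created through Grafana's HTTP API (POST /api/datasources). The sketch below rests on assumptions: the data source name ‘InfluxDB’, the InfluxDB URL http://localhost:8086 and the `secureJsonData.password` field are my own choices and should be checked against your Grafana version. `datasource_payload` is a hypothetical helper name.

```shell
# Hypothetical helper: prints a JSON body for Grafana's POST /api/datasources.
# Arguments: InfluxDB URL, database, user, password.
# Values are inserted as-is, so they must not contain double quotes or backslashes.
datasource_payload() {
    printf '{"name":"InfluxDB","type":"influxdb","access":"proxy","url":"%s","database":"%s","user":"%s","secureJsonData":{"password":"%s"}}\n' \
        "$1" "$2" "$3" "$4"
}

# Example usage on the node (replace the Grafana admin and InfluxDB passwords):
#   datasource_payload http://localhost:8086 telegraf telegraf 'yourpassword' |
#     curl -s -u admin:YOURADMINPASSWORD -H 'Content-Type: application/json' \
#          -X POST http://localhost:3000/api/datasources -d @-
```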

STEP 7: Set up the Grafana Dashboard

Please first download the .json file available at the link below. It contains a pre-defined dashboard which you can take inspiration from to build your own. Then import it.

https://drive.google.com/file/d/14kkjURGG3zYifKz4QJHP7V6nyVCc-tF5/view?usp=sharing

Here is how to import a dashboard on Grafana.

Please follow the instructions there to install the plugin that displays the Shipchain logo: https://grafana.com/grafana/plugins/bessler-pictureit-panel/installation
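At the time of writing, the installation instructions boil down to a `grafana-cli` command followed by a restart of the Grafana server. Treat the sketch below as a convenience and rely on the plugin page if the two disagree. `plugin_installed` is a hypothetical helper name.

```shell
# Hypothetical helper: reads `grafana-cli plugins ls` output on stdin and
# succeeds if the given plugin id is listed.
plugin_installed() {
    grep -qF "$1"
}

# Example usage on the node:
#   sudo grafana-cli plugins install bessler-pictureit-panel
#   sudo systemctl restart grafana-server
#   sudo grafana-cli plugins ls | plugin_installed bessler-pictureit-panel && echo "plugin installed"
```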

LAST STEP: Enjoy!

Example of a fully functioning Grafana Dashboard

If all went well, you should now be looking at something like this. Feel free to customize it to your needs (and share the results with the team!) by learning from the various existing examples (some basic SQL knowledge would be ideal but is not really required).

You are now ready for the next article: how to set up alerts on any of these variables that will wake you and your about-to-be-upset girlfriend/wife/husband/dog in the middle of the night if something goes wrong on your node.

Stay tuned!

shipmateFRnl
