Monitoring your StaFi and StaFiHub Validator

Monitor your CPU, RAM, Network and StaFi Chain stats

StakingBridge
Mar 23, 2022

This solution uses Telegraf, Prometheus and Grafana to give users and node operators a monitoring tool that tracks CPU, RAM, network interfaces and I/O wait, along with metrics from StaFi Chain, all displayed on a public dashboard. Why should you monitor your node using the public dashboard?

  1. It helps you control resource usage on your server.
  2. It lets you detect problems, often before they affect your node.
  3. It maximizes the security of the StaFi network and minimizes the risk of slashing.
  4. Transparency: it lets everyone see how stably your validator runs.

INSTALL TELEGRAF IN YOUR NODE SERVER AND MONITOR STAFI (MAINNET)

RHEL 8 / CentOS 8 / Rocky Linux

sudo yum -y update
cat <<EOF | sudo tee /etc/yum.repos.d/influxdb.repo
[influxdb]
name = InfluxDB Repository - RHEL
baseurl = https://repos.influxdata.com/rhel/7/x86_64/stable/
enabled = 1
gpgcheck = 1
gpgkey = https://repos.influxdata.com/influxdb.key
EOF
sudo dnf -y install telegraf

UBUNTU 20.04

wget -qO- https://repos.influxdata.com/influxdb.key | sudo apt-key add -
source /etc/lsb-release
echo "deb https://repos.influxdata.com/${DISTRIB_ID,,} ${DISTRIB_CODENAME} stable" | sudo tee /etc/apt/sources.list.d/influxdb.list
sudo apt update
sudo apt install telegraf
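
Before moving on, it can help to confirm that the package actually put the telegraf binary on your PATH. The snippet below is only a sketch and not part of the original guide: require_cmd is a hypothetical helper name introduced here for illustration, and the only Telegraf call it relies on is telegraf --version.

```shell
# Hypothetical helper (an assumption, not from the original guide):
# report whether a command is available on the PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Example usage after installing the package:
#   require_cmd telegraf && telegraf --version
```

If the helper prints "missing: telegraf", re-check the repository step for your distribution before editing any configuration.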

2. Now that Telegraf is installed, modify the telegraf.conf file so that it points to the InfluxDB instance that feeds stafimonitor.stakingbridge.com, and set the alias used to track your node on the public dashboard.

Modify the file /etc/telegraf/telegraf.conf and use this 👉 config file.

Set hostname to the name that should identify your node on the dashboard:

hostname = "YOUR_NODE_ALIAS"
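
If you prefer not to edit the file by hand, the alias can also be set from the shell. This is a minimal sketch under stated assumptions: set_telegraf_alias is a hypothetical helper name, the alias "MyValidator-01" is made up, and the function assumes the file already contains a hostname = ... line and that the alias has no "|" or "&" characters.

```shell
# Hypothetical helper (an assumption, not from the original guide):
# rewrite the `hostname = ...` line of a Telegraf config file.
# A backup of the original file is kept next to it with a .bak suffix.
set_telegraf_alias() {
  local conf="$1" alias="$2"
  # Keep the `hostname = ` prefix and replace the rest of the line
  # (including any trailing comment) with the quoted alias.
  sed -i.bak -E "s|^([[:space:]]*hostname[[:space:]]*=[[:space:]]*).*|\1\"${alias}\"|" "$conf"
}

# Example usage (assumed alias; run as root because of the file location):
#   set_telegraf_alias /etc/telegraf/telegraf.conf "MyValidator-01"
```

Pick an alias that is unique on the public dashboard, since it is the name you will search for later.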

The file should look like this:

# Global Agent Configuration
[agent]
hostname = "YOUR_NODE_ALIAS" # set this to a name you want to identify your node in the grafana dashboard
flush_interval = "15s"
interval = "15s"
# Input Plugins
[[inputs.cpu]]
percpu = true
totalcpu = true
collect_cpu_time = false
report_active = false
[[inputs.disk]]
ignore_fs = ["devtmpfs", "devfs"]
[[inputs.io]]
[[inputs.mem]]
[[inputs.net]]
[[inputs.system]]
[[inputs.swap]]
[[inputs.netstat]]
[[inputs.processes]]
[[inputs.kernel]]
[[inputs.diskio]]
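[[inputs.prometheus]]
# URLs to scrape metrics from: the StaFi node's Prometheus exporter
urls = ["http://localhost:9615"]
[[inputs.prometheus]]
# StaFiHub node metrics (matches prometheus_listen_addr in the StaFiHub section)
urls = ["http://localhost:26660"]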
# Output Plugin InfluxDB
[[outputs.influxdb]]
database = "metricsdb"
urls = [ "http://stafimonitor.stakingbridge.com:8086" ]
username = "metrics"
password = "password"
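
A quick sanity check of the endpoints named in the config can save a restart cycle. The sketch below is not part of the original guide: list_config_urls and check_endpoints are hypothetical helper names, and "reachable" here only means that curl received some HTTP response, not that valid metrics were returned.

```shell
# Hypothetical helpers (assumptions, not from the original guide).

# Print every URL found on a `urls = [...]` line of a Telegraf config.
list_config_urls() {
  grep -E '^[[:space:]]*urls[[:space:]]*=' "$1" | grep -oE 'https?://[^" ]+'
}

# Try each URL once and report whether anything answered within 5 seconds.
check_endpoints() {
  local url
  for url in $(list_config_urls "$1"); do
    if curl -s -o /dev/null --max-time 5 "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
    fi
  done
}

# Example usage:
#   check_endpoints /etc/telegraf/telegraf.conf
# Telegraf can also do a single dry run of the inputs and print the
# gathered metrics to stdout:
#   telegraf --config /etc/telegraf/telegraf.conf --test
```

A FAIL on port 9615 or 26660 usually just means the corresponding node is not running on this machine, which is expected if you only operate one of the two chains.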

3. Once the file is edited, start Telegraf to begin monitoring.

sudo systemctl start telegraf
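
To confirm that the service actually came up, you can ask systemd for its state. This is a sketch rather than an official procedure: telegraf_health is a hypothetical helper name, and it assumes the unit is called telegraf, as in the command above.

```shell
# Hypothetical helper (an assumption, not from the original guide):
# print the service state and, if it is not active, the latest log lines.
telegraf_health() {
  local state
  state="$(systemctl is-active telegraf 2>/dev/null)"
  if [ "$state" = "active" ]; then
    echo "telegraf is running"
  else
    echo "telegraf is not running (state: ${state:-unknown}); recent log lines:"
    journalctl -u telegraf -n 20 --no-pager
    return 1
  fi
}

# Example usage:
#   telegraf_health
# To have Telegraf start again after a reboot:
#   sudo systemctl enable telegraf
```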

⚠️ On some distributions, InfluxDB must be installed for Telegraf to run correctly. If so, a standard installation is enough; no further configuration is needed.

CONFIGURE AND MONITOR STAFIHUB (TESTNET)

  1. First, you need a running StaFiHub node. Full instructions are here: https://docs.stafihub.io/welcome-to-stafihub/developer/getting-started/join-the-public-testnet
  2. Edit config.toml and set these three options in the [instrumentation] section (usually at the end of the file): prometheus = true, prometheus_listen_addr = "127.0.0.1:26660" and namespace = "tendermint_testnet".
[instrumentation]
# When true, Prometheus metrics are served under /metrics on
# PrometheusListenAddr.
# Check out the documentation for the list of available metrics.
prometheus = true
# Address to listen for Prometheus collector(s) connections
prometheus_listen_addr = "127.0.0.1:26660"
# Maximum number of simultaneous connections.
# If you want to accept a larger number than the default, make sure
# you increase your OS limits.
# 0 - unlimited.
max_open_connections = 3
# Instrumentation namespace
namespace = "tendermint_testnet"
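
After restarting the StaFiHub node, one way to check that the namespace took effect is to count the metrics that carry it as a prefix. The sketch below is an assumption-based illustration: count_namespace_metrics is a hypothetical helper name, and it reads Prometheus text output from standard input so it can be fed by curl.

```shell
# Hypothetical helper (an assumption, not from the original guide):
# count metric lines on stdin whose name starts with "<namespace>_".
# Comment lines (# HELP / # TYPE) start with "#" and are not counted.
count_namespace_metrics() {
  local ns="$1"
  # grep -c prints 0 and exits non-zero when nothing matches; `|| true`
  # keeps the function's exit status at zero in that case.
  grep -c "^${ns}_" || true
}

# Example usage against the address configured above:
#   curl -s http://127.0.0.1:26660/metrics | count_namespace_metrics tendermint_testnet
```

A count of 0 suggests that either Prometheus is still disabled or the namespace in the file differs from the one you passed to the helper.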


Monitor your node: https://stafimonitor.stakingbridge.com/

At the top of the dashboard, use the search box labeled "Node" and look for the alias you configured in the telegraf.conf file. For example, searching for "Stakingbridge_TR-3970X" shows the stats for the stakingbridge.com validator.

⚠️ Please note that this tool is at an experimental stage; you may encounter errors, inaccurate metrics or bugs. Stakingbridge.com is not responsible for any inconvenience caused by the use of this tool.
