Protofire Releases Monitoring Templates for Multicloud Blockchains

Published in Protofire Blog

7 min read · Jul 19, 2021

Now, you can rely on 48 configuration templates to set up monitoring for cross-cloud blockchain networks in a few hours instead of weeks.

The challenges with monitoring

Monitoring tools are essential for maintaining healthy and robust operations. When running nodes on public (or permissioned) blockchain networks, operators need to configure monitoring tools for the network or node binary itself, as well as the underlying infrastructure it is built on.

Public blockchain networks offer many beneficial attributes, including, but not limited to, decentralization. The truth remains that in many cases, centralized parties are still utilized for activities such as node hosting, blockchain data APIs, block explorers, crypto market pricing and volume reporting, etc. Since many consumer-facing applications ultimately rely on the provider’s copy of the ledger (a node), monitoring controls — that allow operators to keep tabs on availability, block height, memory and processor utilization for node instances, database storage, as well as other metrics — are critical to a robust and reliable service offering.

Typically, when a new permissioned blockchain network is developed, it starts off with a handful of nodes hosted on a single cloud platform, such as Amazon Web Services (AWS), Google Cloud Platform (GCP), Microsoft Azure, etc. This enables the nodes to take advantage of scalability, stability, and security features specific to each cloud platform. To further improve stability, operators can expand the blockchain network with new nodes hosted across multiple cloud platforms.

In hosting nodes for public blockchains, operators have the option of launching nodes on elastic infrastructure services, as well as some managed service offerings, such as AWS Managed Blockchain.

Once the number of nodes in a permissioned blockchain network grows, and the nodes are scattered across multiple clouds, monitoring becomes significantly more complex. Additionally, when a single party hosts multiple nodes of different public blockchains (and builds apps that rely on these nodes for current and historical data), the need for proper monitoring becomes abundantly clear. These complications are compounded when an operator runs multiple blockchain networks or nodes of different public chains that are hosted cross-cloud.

Since there are no blockchain-specific monitoring tools, operators will have to use general-purpose tools, such as Logstash, Telegraf, Grafana, etc., to create a monitoring stack. Then, operators will need to write and test configuration files that are unique to each monitoring setup. Without broad experience with blockchain networks, as well as the aforementioned tools, this process may take days, if not weeks.

To simplify the deployment of monitoring tools for blockchain networks, Protofire is open-sourcing its monitoring templates.

48 preconfigured templates for 10 blockchains

Mature monitoring tools are essential to Protofire’s operations as a blockchain development workshop. Initially, our team did not find any open-source solution that could easily be integrated into our workflows. Protofire relied on Amazon CloudWatch to gather metrics such as CPU, memory, and disk space. While it provided a good starting point, it could not cover all the metrics we needed to monitor.

In 2019, Protofire partnered with Armanino, one of the leading accounting and consulting firms in the USA, to develop a number of blockchain-driven auditing solutions, such as TrustExplorer. As part of the engagement, Protofire also helped Armanino set up an AWS-based infrastructure with a strong focus on security and enabled advanced monitoring. For this purpose, we created a monitoring stack using Amazon Elasticsearch Service, Logstash, and Amazon CloudWatch.

Thanks to the collaboration with Armanino, Protofire realized that we had outgrown our original monitoring stack and needed to move forward to keep up with an expanding infrastructure. However, we could not rely on the same stack used by Armanino, because our infrastructure is distributed across four clouds: AWS, GCP, Microsoft Azure, and DigitalOcean.

Over time, Protofire took part in other blockchain projects, such as Meter, Avalanche, and Secret Network. Each project required time-consuming work to extend our monitoring system. As a result, our team created a collection of 48 templates that simplify configuration of monitoring tools for the following blockchain networks:

Let’s see how to kick this off in practice!

Setting up the monitoring stack

To make use of our monitoring templates, you need to choose the tooling that you will build the monitoring stack on. Our open-source solution allows you to choose between Amazon CloudWatch and Telegraf for logs and metrics ingestion, InfluxDB and Amazon Elasticsearch for data storage, as well as Grafana and Amazon CloudWatch for visualization.

After choosing the stack, refer to the specific section in our GitHub wiki to get step-by-step instructions on how to configure the stack.

If, say, you’ve opted for the Telegraf and InfluxDB stack, proceed as follows:

Install InfluxDB and Grafana.
Create a database instance and two users: one for Telegraf agents with WRITE privileges and a second one for the Grafana process with READ privileges.

To interact with the InfluxDB server, use the command below.

influx

Now, you are connected to the default InfluxDB server on port 8086. To create a new database, as well as the telegraf and grafana users with passwords like STRONG_PASSWORD and ANOTHER_STRONG_PASSWORD, you can use the following InfluxDB queries:

CREATE DATABASE telegraf
CREATE USER telegraf WITH PASSWORD 'STRONG_PASSWORD'
CREATE USER grafana WITH PASSWORD 'ANOTHER_STRONG_PASSWORD'
GRANT WRITE ON "telegraf" TO "telegraf"
GRANT READ ON "telegraf" TO "grafana"
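
If you prefer to script this step, the same statements can be run non-interactively. The sketch below assumes InfluxDB 1.x and its influx CLI with the -execute flag. The names influx_bootstrap_statements and apply_influx_statements are hypothetical helpers, not part of the Protofire templates, and passwords containing single quotes are not handled.

```shell
# Sketch only: hypothetical helpers, not part of the Protofire templates.

# Print the InfluxQL statements from the text, one per line.
# Arguments: database, write user, write password, read user, read password.
influx_bootstrap_statements() {
  db=$1; writer=$2; writer_pw=$3; reader=$4; reader_pw=$5
  printf 'CREATE DATABASE "%s"\n' "$db"
  printf "CREATE USER \"%s\" WITH PASSWORD '%s'\n" "$writer" "$writer_pw"
  printf "CREATE USER \"%s\" WITH PASSWORD '%s'\n" "$reader" "$reader_pw"
  printf 'GRANT WRITE ON "%s" TO "%s"\n' "$db" "$writer"
  printf 'GRANT READ ON "%s" TO "%s"\n' "$db" "$reader"
}

# Read statements from stdin and pass each one to the InfluxDB 1.x CLI.
# Assumes `influx` is on PATH; stops at the first statement that fails.
apply_influx_statements() {
  while IFS= read -r stmt; do
    influx -execute "$stmt" || return 1
  done
}
```

For instance, influx_bootstrap_statements telegraf telegraf "$TELEGRAF_PW" grafana "$GRAFANA_PW" | apply_influx_statements would create everything in one go, and influx -execute 'SHOW GRANTS FOR "grafana"' lets you confirm the READ privilege afterwards. Keep in mind that InfluxDB 1.x only enforces these privileges once authentication is enabled in its configuration; if it already is, add -username and -password to the influx calls.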

More information about InfluxDB authorization is available in their official documentation.

Next, create a data source by logging into your Grafana instance and choosing Configuration -> Data Sources -> Add data source from the menu bar on the left. Choose InfluxDB as a data source type. To configure additional details, you have to define the following parameters:

URL (http://localhost:8086 can be used if Grafana and InfluxDB are on the same server.)
Database: the name of the database you created using influx
User: the username of the user with READ privileges
Password

After that, you can Save and Test the data source.
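
If Save and Test fails, it can help to check the same connection details from the Grafana host on the command line. The sketch below assumes curl is installed and that you are running InfluxDB 1.x, whose HTTP API exposes /ping and /query; both function names are hypothetical.

```shell
# Sketch only: hypothetical helpers for checking the data source parameters.

# Succeeds if the InfluxDB 1.x /ping endpoint answers with HTTP 204.
# Argument: base URL, e.g. http://localhost:8086
influx_is_up() {
  code=$(curl -s -o /dev/null -w '%{http_code}' "$1/ping") || return 1
  [ "$code" = "204" ]
}

# Succeeds if the given user can run SHOW MEASUREMENTS on the database.
# --fail makes curl exit non-zero on HTTP errors such as 401.
# Arguments: base URL, database, user, password
influx_can_read() {
  curl -s --fail -G "$1/query" -u "$3:$4" \
    --data-urlencode "db=$2" \
    --data-urlencode 'q=SHOW MEASUREMENTS' >/dev/null
}
```

A call such as influx_is_up http://localhost:8086 && influx_can_read http://localhost:8086 telegraf grafana "$GRAFANA_PW" exercises the URL, database, user, and password that the Grafana form asks for.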

Then, install Telegraf. The tool has two configuration files. One is generic and is used to gather common system metrics, such as CPU, memory, and disk space utilization. The other file gathers blockchain-specific metrics, such as block number, node health status, import speed, peer count, etc.

Once you install Telegraf, you will see a default configuration in the /etc/telegraf/telegraf.conf file. While it is possible to use the default configuration, your database will get bloated with irrelevant metrics. To collect only useful metrics, our team replaced the default /etc/telegraf/telegraf.conf file with a custom one. You can get this custom file by running the command below:

sudo wget -O /etc/telegraf/telegraf.conf https://raw.githubusercontent.com/protofire/monitoring/main/telegraf/telegraf.conf

The configuration file also contains InfluxDB database credentials. So, if you are using our custom file, open it in your favorite text editor and replace the database credentials with your own (you will need a user with WRITE privileges). Then, restart Telegraf to apply the changes in the configuration files.
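
A restart with a broken configuration leaves the agent down, so it can be worth checking the files first. The sketch below is one way to do that, assuming a systemd-based host and a package install that created /etc/telegraf/telegraf.d. The function name is hypothetical. Telegraf's --test flag gathers metrics once and prints them instead of sending them to the outputs.

```shell
# Sketch only: hypothetical helper; assumes systemd and the default package paths.

# Check the configuration with a one-off test run, then restart the service.
# Arguments (optional): main config file, config directory
check_and_restart_telegraf() {
  conf=${1:-/etc/telegraf/telegraf.conf}
  dir=${2:-/etc/telegraf/telegraf.d}
  if ! telegraf --config "$conf" --config-directory "$dir" --test >/dev/null 2>&1; then
    echo "Telegraf configuration check failed, not restarting" >&2
    return 1
  fi
  sudo systemctl restart telegraf
}
```

Running check_and_restart_telegraf with no arguments uses the default paths from this article. Note that the test run only exercises the inputs, so wrong InfluxDB credentials would still show up only in the Telegraf logs after the restart.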

Now, you can create a dashboard in Grafana to visualize all the metrics from Telegraf. In the Grafana UI, choose Create -> Import -> Upload JSON file. Copy and paste the JSON file from this GitHub repo.

After these steps, you can start gathering system metrics, but not the blockchain-specific ones. In this wiki, you will be able to find code examples for 10 blockchain networks listed above, as well as instructions on how to use them.

For example, if you are using the xDAI network bridge, you may want to run the following command:

sudo wget -O /etc/telegraf/telegraf.d/xdai.conf https://raw.githubusercontent.com/protofire/monitoring/main/telegraf/xdai.conf

This will enable you to collect the block number from the xDAI OpenEthereum node.
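
To get a feel for what such a file collects, here is a minimal sketch of reading the block number by hand. It assumes the node exposes the standard Ethereum JSON-RPC over HTTP (OpenEthereum listens on port 8545 by default). The function names are hypothetical, and the actual xdai.conf template may gather the value differently.

```shell
# Sketch only: hypothetical helpers, independent of the xdai.conf template.

# Extract the hex "result" field from a JSON-RPC response and print it as decimal.
# Returns non-zero if the response carries no hex result (for example, an error).
block_number_from_response() {
  hex=$(printf '%s\n' "$1" |
    sed -n 's/.*"result"[[:space:]]*:[[:space:]]*"\(0x[0-9a-fA-F]\{1,\}\)".*/\1/p')
  [ -n "$hex" ] || return 1
  printf '%d\n' "$hex"
}

# Ask a node for its latest block number via eth_blockNumber.
# Argument (optional): RPC URL; the default is an assumption about your setup.
latest_block() {
  url=${1:-http://localhost:8545}
  resp=$(curl -s -X POST -H 'Content-Type: application/json' \
    --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
    "$url") || return 1
  block_number_from_response "$resp"
}
```

The node returns the height as a hex string, so a response with "result":"0x10d4f" is printed as 68943. Comparing the output of latest_block with the value shown on your Grafana dashboard is a quick way to confirm that the metric is flowing end to end.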

The whole deployment process described above usually takes 2–3 hours to complete, while writing similar configuration files from scratch may take 40+ hours. By releasing these templates, both Protofire and Armanino strive to contribute to the ecosystem by saving time for entrepreneurs who are just joining the blockchain networks that Protofire and Armanino already participate in. Feel free to use our templates, provide your feedback by opening issues, and submit your pull requests to our repositories. Together we will be able to make these templates even better.

About the experts

Arsenii Petrovich is a DevOps Tech Lead at Protofire specializing in building highly distributed and automated blockchain environments. Arsenii is a certified architect in Amazon Web Services, Microsoft Azure, and Google Cloud Platform, as well as a certified Kubernetes System Administrator.

Dzmitry Kliapkou is a DevOps Engineer at Protofire with a strong Linux background and 6+ years of experience in administering, designing, implementing, managing, and monitoring systems. He is primarily focused on making the blockchain infrastructure hosted by Protofire secure and robust. Dzmitry is also a Google Cloud Certified Professional Cloud Architect.

Written by Protofire.io