Geth hosting with Splunk

Antoine Toulme
Splunk DLT
May 19, 2021

Blockchain offers an ideal of independence by inviting participants to join a public, permissionless network of peers. As a participant, you create and maintain your own view of the data that transits on the chain, generally consisting of blocks and transactions.

This article will show you how to run the Ethereum client Geth, developed by the Ethereum Foundation, and demonstrate how to expose and analyze its operational metrics, its logs, and, most importantly, its ledger data.

Running Geth

You can run Geth in a multitude of well-documented ways, and you should choose the setup that works best for you. After listening again to Péter Szilágyi’s advice on operational best practices, it seems best to run Geth directly as a service. This allows you to monitor the host metrics knowing no other service is competing for attention on the box. The installation avoids Docker to stay as close to the metal as possible.

The service is rolled out using Ansible, which installs Geth with this service file:

[Unit]
Description=Go-Ethereum
Requires=network.target
After=syslog.target network.target
[Service]
Type=simple
ExecStart=sh -c "geth \
--{{ network }} \
--datadir /gethdata/ \
--nousb \
--ipcdisable \
--metrics \
--metrics.addr=127.0.0.1 \
--metrics.port=6060 \
--http \
--http.api=admin,db,eth,debug,miner,net,txpool,personal,web3 \
--http.port=18545 \
--http.vhosts=localhost \
2>> /gethdata/geth.log"
[Install]
WantedBy=multi-user.target

Note the deployment uses a Jinja2 template so it can interpolate the network based on how Ansible is configured.

Geth also uses a /gethdata folder to store data and the log file.

The setup exposes as many JSON-RPC APIs as possible in order to collect more data (see the section below, Ingesting the ledger data). You should not expose all these methods to external callers: anonymous callers could abuse your chain data, reset your chain head, or gain access to accounts. We recommend using a JSON-RPC proxy application that whitelists callers and filters methods.
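To make the filtering concrete, here is a minimal sketch of the kind of method whitelisting such a proxy could apply before forwarding a request to Geth. The allowed set and function names are illustrative, not taken from any particular proxy:

```python
# Sketch: reject JSON-RPC requests whose method is not whitelisted.
import json

# Illustrative whitelist; tailor it to what your callers actually need.
ALLOWED_METHODS = {
    "eth_blockNumber",
    "eth_getBlockByNumber",
    "eth_getTransactionReceipt",
    "net_version",
}

def filter_request(raw_body: str) -> bool:
    """Return True if the JSON-RPC request only uses whitelisted methods."""
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError:
        return False
    # JSON-RPC allows batch requests (a list of calls); check each one.
    calls = body if isinstance(body, list) else [body]
    return all(
        isinstance(call, dict) and call.get("method") in ALLOWED_METHODS
        for call in calls
    )

# admin_* and personal_* calls never reach the node:
assert filter_request('{"jsonrpc":"2.0","method":"eth_blockNumber","id":1}')
assert not filter_request('{"jsonrpc":"2.0","method":"personal_unlockAccount","id":2}')
```

A real deployment would put this check in a reverse proxy in front of port 18545, combined with caller authentication.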

Collecting metrics and logs

Alongside Geth, the Splunk OpenTelemetry Collector runs to gather both logs from the Geth log file and metrics by connecting to the Geth node’s Prometheus port, sending them to the Splunk backend.

OpenTelemetry is an open source project that aims to standardize how metrics, traces, and logs are collected. Splunk’s distribution comes with great documentation and offers Debian repositories for a safe installation.

In this particular case, the collector is installed manually from the Debian package. You could also use the simple installer script, among other options.

The OpenTelemetry Collector takes a YAML file as its configuration:
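The following is a minimal sketch of such a configuration, reconstructed from the notes below and the Splunk OTel Collector’s standard receiver/exporter layout; receiver names, the regex, and pipeline wiring are assumptions, not the exact file used in this deployment:

```yaml
receivers:
  hostmetrics:
    collection_interval: 10s
    scrapers:
      cpu:
      memory:
      disk:
      filesystem:
      network:
  prometheus/geth:
    config:
      scrape_configs:
        - job_name: geth
          scrape_interval: 10s
          metrics_path: /debug/metrics/prometheus
          static_configs:
            - targets: ["127.0.0.1:6060"]
  filelog:
    include: [/gethdata/geth.log]
    operators:
      - type: regex_parser
        regex: '^(?P<severity>[A-Z]+)\s*\[(?P<timestamp>[^\]]+)\]'
exporters:
  splunk_hec:
    endpoint: "${SPLUNK_HEC_URL}"
    token: "${SPLUNK_HEC_TOKEN}"
service:
  pipelines:
    metrics:
      receivers: [hostmetrics, prometheus/geth]
      exporters: [splunk_hec]
    logs:
      receivers: [filelog]
      exporters: [splunk_hec]
```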

A few things to note in this configuration:

  • All host metrics are monitored with the hostmetrics receiver. It gives detailed memory, CPU and disk information.
  • The Geth client is scraped every 10s on localhost:6060.
  • The filelog receiver tails the Geth log file, parsing each line to extract the timestamp and sending the full content to Splunk.

All environment variables are filled out in a separate file (/etc/otel/collector/splunk-otel-collector.conf), per the installation instructions.
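The timestamp extraction performed on the log file can be illustrated with a small sketch. The pattern below approximates Geth’s default log line format; it is not the collector’s actual operator configuration:

```python
# Geth log lines start with a level and a bracketed timestamp, e.g.:
#   INFO [05-19|10:20:30.123] Imported new chain segment  blocks=1
import re

LOG_LINE = re.compile(
    r'^(?P<level>[A-Z]+)\s*'
    r'\[(?P<ts>\d{2}-\d{2}\|\d{2}:\d{2}:\d{2}\.\d{3})\]\s*'
    r'(?P<msg>.*)$'
)

def parse_geth_line(line: str):
    """Split a Geth log line into level, timestamp, and message."""
    m = LOG_LINE.match(line)
    return m.groupdict() if m else None

parsed = parse_geth_line("INFO [05-19|10:20:30.123] Imported new chain segment  blocks=1")
assert parsed["level"] == "INFO"
assert parsed["ts"] == "05-19|10:20:30.123"
```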

Ingesting the ledger data

Now that basic metrics are in place, Ethlogger comes into the picture to ingest the ledger data and Geth metadata.

Ethlogger runs as a Docker container to avoid installing Node.js on the VM. It is configured with host network mode so it can access the local Geth port, and with a restart policy of always.

The Ansible task looks like this:

- name: Run ethlogger container
  community.docker.docker_container:
    name: ethlogger
    image: ghcr.io/splunkdlt/ethlogger:3.0.1
    volumes:
      - /gethdata/ethlogger:/app
    state: started
    restart_policy: always
    network_mode: host
    env:
      COLLECT_PEER_INFO: "true"
      COLLECT_PENDING_TX: "false"
      ETH_RPC_URL: "http://127.0.0.1:18545"
      NETWORK_NAME: "{{ network }}"
      START_AT_BLOCK: "latest"
      SPLUNK_HEC_URL: "{{ splunkHecUrl }}"
      SPLUNK_HEC_TOKEN: "{{ splunkHecToken }}"
      SPLUNK_EVENTS_INDEX: "{{ splunkIndex }}"
      SPLUNK_METRICS_INDEX: "{{ splunkMetricsIndex }}"
      SPLUNK_HEC_REJECT_INVALID_CERTS: "{{ splunkRejectInvalidCerts }}"

Building the battle station

All the crucial operational data is streaming to the Splunk instance.

Splunk users can consult logs, metrics, and ledger data together.

Metrics and logs showing the overall health of the Geth node

Prometheus metrics from Geth are combined with resource usage metrics to understand the patterns of use and see what is impacting the node.

Splunk users can search logs with great flexibility. For example, let’s see how many times the node went through a reorg:
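Assuming logs land in an index named geth, a search along these lines would count reorg events over time; the index name is an assumption, while the message text matches Geth’s standard “Chain reorg detected” log line:

```
index=geth "Chain reorg detected" | timechart count
```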

From there, one can set alerts and watch the nodes. If something goes astray, the team will be notified and can look at dashboards to find out what is going on.

Data analysts can see how users send transactions to the chain, examining the parameters of a smart contract call.

Calling an approve function on an ERC20 contract

This is just the tip of the iceberg of possibilities offered by Splunk. Here is a sneak peek showing a graph of transactions at EthDenver:

EthDenver analytics!

My colleague Stephen talks more about this in this blog post.

That’s it folks! For more on how we leverage Splunk’s unique data platform capabilities, please visit our website or email us at blockchain@splunk.com.
