Published in The Aleph

Monitor Lizard from Palawan is licensed under CC BY 2.0

Monitor your Octez Node on Kubernetes

Monitoring your nodes is an essential part of your Tezos operations. Since version 14, Octez has exposed a native metrics port.

A “metrics port” is a simple HTTP endpoint that, when queried, returns various data about the node's status. The data comes in a standard text format that can then be fed to an aggregator like Prometheus.
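To make the format concrete, here is a minimal sketch of what a scrape looks like and how a single value can be pulled out of it. The sample payload and the `example_*` metric names are made up for illustration; they are not captured from an Octez node. On a real node you would pipe in the response of the metrics endpoint (for instance with `curl -s`) instead of the sample.

```shell
#!/bin/sh
# Print the value of one un-labelled metric from Prometheus text format on stdin.
# Lines starting with '#' are HELP/TYPE comments; their first field is '#',
# so they never match the requested metric name.
metric_value() {
  awk -v name="$1" '$1 == name { print $2; exit }'
}

# Illustrative sample only: metric names and values are invented for this sketch.
sample_metrics() {
  cat <<'EOF'
# HELP example_chain_head_level Level of the current head
# TYPE example_chain_head_level gauge
example_chain_head_level 2850000
# HELP example_p2p_connections Number of active peer connections
# TYPE example_p2p_connections gauge
example_p2p_connections 42
EOF
}

sample_metrics | metric_value example_p2p_connections
```

This simple matcher only handles metrics without labels; Prometheus itself parses the full format, including labelled series.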

The tutorial on OpenTezos explains how to set up metrics on your server. However, metrics really shine in a container cluster environment. In this tutorial, we will explain how to deploy a Tezos Node in a Kubernetes cluster on DigitalOcean, set up monitoring, and deploy Grafazos, a custom Grafana dashboard for Octez.

Kubernetes

Kubernetes is a “cloud operating system”. It has a reputation for being overkill for many applications, but it is a mature interface for deploying workloads in the cloud in a vendor-agnostic way. The concept of “metrics” comes from the same lineage of ideas that gave us Kubernetes.

The Tezos-k8s project by Oxhead Alpha is a Swiss Army knife that lets you deploy Tezos workloads in the cloud: baker, RPC service, payout engine, even a private chain. Give us a star on GitHub.

Here we demonstrate a deployment with a single Octez mainnet node on DigitalOcean, an affordable cloud platform.

Deploy a Kubernetes Cluster

First, create a new Kubernetes Cluster from the DigitalOcean console. We chose the following config:

Once provisioned, download the kubeconfig file from the “Actions” menu on the top right:

Download the kubectl utility and the k9s tool.

You can now visualize your cluster using k9s:

k9s --kubeconfig=k8s-1-24-4-do-0-sfo3-1666988274276-kubeconfig.yaml

Export your kubeconfig path as an environment variable:

export KUBECONFIG=k8s-1-24-4-do-0-sfo3-1666988274276-kubeconfig.yaml

(replace the yaml file name with the file you downloaded from the console)
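Because the export above uses a relative file name, it only works from the directory that contains the file. A small sanity check like the sketch below can catch that before you run any cluster command. The `check_kubeconfig` helper is our own illustration, not part of kubectl.

```shell
#!/bin/sh
# Return 0 if KUBECONFIG is set and names a readable file, 1 otherwise.
check_kubeconfig() {
  if [ -z "${KUBECONFIG:-}" ]; then
    echo "KUBECONFIG is not set" >&2
    return 1
  fi
  if [ ! -r "$KUBECONFIG" ]; then
    echo "KUBECONFIG points to a missing or unreadable file: $KUBECONFIG" >&2
    return 1
  fi
  echo "Using kubeconfig: $KUBECONFIG"
}

# Typical use, once the variable is exported:
#   check_kubeconfig && kubectl get nodes
```

Exporting an absolute path instead of a bare file name makes the setting work from any directory.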

Deploy the Monitoring Stack into your Cluster

Install the Helm command line tool. Helm is a package manager for Kubernetes.

Then install the kube-prometheus-stack chart. It bundles the Prometheus Operator and deploys a complete monitoring stack, including Prometheus and Grafana, in one step.

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm install dok8s-monitoring prometheus-community/kube-prometheus-stack

Then look at the k9s console: your cluster now has Prometheus and Grafana pods running.
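If you prefer the command line to k9s, the same check can be sketched with kubectl. The `pods_not_ready` helper below is a hypothetical name of ours; it relies on the default column layout of `kubectl get pods` (NAME, READY, STATUS, RESTARTS, AGE) and assumes the charts were installed in the default namespace, as in the commands above.

```shell
#!/bin/sh
# Read `kubectl get pods --no-headers` output on stdin and print the names of
# pods whose STATUS column (3rd field) is neither "Running" nor "Completed".
pods_not_ready() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Against a live cluster:
#   kubectl get pods --no-headers | pods_not_ready
# No output means every pod has reached a steady state.
```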

Install Octez

The Tezos chart lets you deploy an Octez node on any network, or even a set of nodes on a private, isolated chain.

Here we will start a Tezos mainnet node with monitoring enabled.

First, create a values.yaml file with the content shown below:

serviceMonitor:
  enabled: true
  labels:
    release: dok8s-monitoring

This instructs Helm to install the tezos chart with a properly labelled “Service Monitor”. A Service Monitor is a Kubernetes abstraction that designates the Octez node as a target to be scraped by Prometheus.
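The `release` label has to match the name you gave the monitoring release (`dok8s-monitoring` above), since, to our knowledge, the kube-prometheus-stack chart by default only picks up Service Monitors carrying that label. If you chose a different release name, a small generator like the sketch below keeps the two in sync. The `write_values` helper is our own illustration, not part of Helm or the Tezos chart.

```shell
#!/bin/sh
# Print a values file whose Service Monitor label matches the monitoring release.
# $1: Helm release name of the monitoring stack (dok8s-monitoring in this tutorial).
write_values() {
  cat <<EOF
serviceMonitor:
  enabled: true
  labels:
    release: $1
EOF
}

write_values dok8s-monitoring > values.yaml
```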

Install the chart:

helm repo add oxheadalpha https://oxheadalpha.github.io/tezos-helm-charts/
helm install -f values.yaml tezos oxheadalpha/tezos-chain

In k9s, you should now see a Tezos pod named “rolling-node-0”. After a while, it will turn blue (healthy), which means that it has downloaded a snapshot and synced:

Let’s verify that Prometheus is scraping your node: open a port-forward to port 9090 of the pod called prometheus-dok8s-monitoring-kube-prom-prometheus-0. With k9s, this is easily done with Ctrl+F.

Then, in your browser, go to http://localhost:9090/targets and verify that the tezos-service-monitor target has one endpoint up:
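The same information is available from the Prometheus HTTP API at `/api/v1/targets`, which is handy for scripting. The sketch below counts the targets reported as healthy. The `count_up_targets` helper is a hypothetical name of ours, and it does a plain text match on the compact JSON that Prometheus returns rather than parsing it properly.

```shell
#!/bin/sh
# Count targets reported as healthy in a /api/v1/targets JSON response on stdin.
# This is a text match on the "health" field, not a full JSON parser.
count_up_targets() {
  grep -o '"health":"up"' | wc -l | tr -d ' '
}

# Without k9s, the port-forward can be opened in another terminal with:
#   kubectl port-forward pod/prometheus-dok8s-monitoring-kube-prom-prometheus-0 9090:9090
# Then:
#   curl -s http://localhost:9090/api/v1/targets | count_up_targets
```

Note that this counts every healthy target in the cluster, not only the Tezos one, so expect a number greater than one once the monitoring stack is scraping its own components.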

Installing Grafazos

The Prometheus Operator also installed Grafana into your cluster. Open a port-forward to port 3000 of the Grafana pod with Ctrl+F:

Then go to http://localhost:3000 in your browser. The login is admin and the password is prom-operator.

Navigate to the Dashboards menu and click “+ Import”.

Grab the JSON file at https://gitlab.com/nomadic-labs/grafazos/-/packages and upload it.

That’s it! You now have a Grafazos dashboard. Save it with the 💾 icon:

Going further

Here are some suggestions as to what to do next:

Wrapping Up

Go back to the DigitalOcean console and destroy your Kubernetes Cluster and its associated Persistent Volumes.
