Monitor Your Go Process Internal Metrics in Minutes

Gil Adda
CyberArk Engineering
6 min read · Feb 28, 2024


The Go programming language (or simply Go) provides the ability to get process runtime metrics such as memory used, garbage collector cycles and more.
Go 1.16 introduced a new package called runtime/metrics. It improves on the older approach by providing a stable interface for retrieving process metrics and their metadata, such as memory stats, without code changes on the consumer's part. Keep in mind that metrics added in future versions are consumed in exactly the same way.

In this post, I describe the basic kinds of process metrics and show, with code examples, how to expose them to a Prometheus server and an OpenTelemetry collector. I also discuss future extensions, existing gaps and my personal opinion on the package.

There are three basic types of process metrics:

  • Counters. Values that increase monotonically and from which a rate can be calculated. For example, the number of visitors to your website can be measured as a total of 1 million, or as a rate of 10,000 per day (I hope that your website gets more).
  • Gauges. Gauges represent a current value, such as current speed, pressure, etc.
  • Histograms. A representation of data that buckets a range of outcomes (counters or gauges) into columns along the x-axis (a short code sketch follows this list).
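
To make the three types concrete, here is a minimal sketch using the Prometheus Go client library; the metric names and values are hypothetical and only for illustration:

import "github.com/prometheus/client_golang/prometheus"

var (
    // Counter: only goes up; rates (e.g., visits per day) are derived from it.
    visits = prometheus.NewCounter(prometheus.CounterOpts{
        Name: "page_visits_total",
        Help: "Total number of visits to the website.",
    })
    // Gauge: a current value that can go up or down.
    queueDepth = prometheus.NewGauge(prometheus.GaugeOpts{
        Name: "queue_depth",
        Help: "Current number of items waiting in the queue.",
    })
    // Histogram: observations bucketed into ranges along the x-axis.
    requestDuration = prometheus.NewHistogram(prometheus.HistogramOpts{
        Name:    "request_duration_seconds",
        Help:    "Distribution of request durations.",
        Buckets: prometheus.DefBuckets,
    })
)

func init() {
    // Register the example metrics with the default Prometheus registry.
    prometheus.MustRegister(visits, queueDepth, requestDuration)
}

func recordExample() {
    visits.Inc()                   // counter only increases
    queueDepth.Set(42)             // gauge is set to the current value
    requestDuration.Observe(0.237) // histogram records one observation
}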

Accessing Runtime Metrics Before the Metrics Package

The code example below shows how runtime metrics were exposed prior to Go version 1.16.

In those versions, we needed to collect each metric from a different function and understand on our own its description, metric type (counter / gauge) and numerical type (integer, float, etc.).

// PrintOldStyleMetrics prints metrics based on the runtime package.
// Each metric is retrieved through a different function.
func PrintOldStyleMetrics() {
    fmt.Printf("Operating system is: %s\n", runtime.GOOS)
    // Number of goroutines, maximum CPUs that may execute Go code simultaneously, and host CPU count.
    numGoroutines := runtime.NumGoroutine()
    fmt.Printf("The program is using %d goroutines\n", numGoroutines)
    maxThreads := runtime.GOMAXPROCS(0)
    fmt.Printf("The program is configured to use at most %d CPUs (GOMAXPROCS)\n", maxThreads)
    numCPUs := runtime.NumCPU()
    fmt.Printf("The host has %d CPUs\n", numCPUs)
}

You can view the full code example here:

The New Go Runtime Metrics Package

The new runtime/metrics package (available in Go 1.16 and later) provides live information on process metrics such as:

  • Estimated total CPU time spent running user Go code (/cpu/classes/user:cpu-seconds)
  • Cumulative sum of memory allocated to the heap by the application (/gc/heap/allocs:bytes)

Before this new package, metrics were exposed through separate functions and formats, such as runtime.ReadMemStats(..) or runtime.NumCgoCall().

The new package, runtime/metrics, exposes the list of metrics and their metadata through a single API call: metrics.All(). Documentation of the supported metrics is provided here and is updated for each version.

The returned metadata struct (metrics.Description) contains the following properties (a small snippet follows the list):

  • Name. The name of the metric which includes the unit. A complete name might look like “/memory/heap/free:bytes”.
  • Description. A human-readable description of the metric: what it measures, how it behaves, and so on.
  • Kind. The kind of value the metric reports (e.g., uint64, float64 or a float64 histogram).
  • Cumulative. Indicates whether or not it’s useful to compute a rate from this value. This value distinguishes between a Counter (true) and a Gauge (false).
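
A quick way to see these fields is to iterate over the result of metrics.All(); the small sketch below assumes the runtime/metrics and fmt imports:

// Print the metadata that runtime/metrics publishes for every supported metric.
for _, d := range metrics.All() {
    // d.Kind additionally reports the value kind (uint64, float64 or histogram).
    fmt.Printf("%s (cumulative=%t)\n    %s\n", d.Name, d.Cumulative, d.Description)
}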

To get the runtime values of the metrics, you call metrics.Read(samples), passing a slice of samples named after the metrics you requested.
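
For example, the following sketch (not taken from the linked repository) reads every supported metric and prints its value according to its kind:

// Build a sample for every supported metric and read their current values.
descs := metrics.All()
samples := make([]metrics.Sample, len(descs))
for i := range samples {
    samples[i].Name = descs[i].Name
}
metrics.Read(samples)

for _, s := range samples {
    switch s.Value.Kind() {
    case metrics.KindUint64:
        fmt.Printf("%s: %d\n", s.Name, s.Value.Uint64())
    case metrics.KindFloat64:
        fmt.Printf("%s: %f\n", s.Name, s.Value.Float64())
    case metrics.KindFloat64Histogram:
        // Histogram values carry bucket boundaries and per-bucket counts.
        fmt.Printf("%s: histogram with %d buckets\n", s.Name, len(s.Value.Float64Histogram().Counts))
    default:
        fmt.Printf("%s: unsupported metric kind\n", s.Name)
    }
}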

In the code example below, we can see how to collect the metrics from the new package and expose them through a Prometheus endpoint:

// main registers the runtime metrics and starts the Prometheus metrics endpoint
func main() {
    // Get descriptions for all supported metrics.
    metricsMeta := metrics.All()
    // Register the metrics so the Prometheus client retrieves their values on each scrape.
    addMetricsToPrometheusRegistry(metricsMeta)
    // Expose the Prometheus metrics endpoint on port 2112;
    // browse to http://<ip-address>:2112/metrics or point Prometheus at that address.
    http.Handle("/metrics", promhttp.Handler())
    http.ListenAndServe(":2112", nil)
}

To expose the metrics in Prometheus format, you can import the Prometheus Go client library by running the commands described here.
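
If the embedded commands are not visible, the library is normally added with the standard Go module tooling; the module path below is the commonly published one:

go get github.com/prometheus/client_golang/prometheus
go get github.com/prometheus/client_golang/prometheus/promhttp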

The example code that instruments the runtime/metrics package using the Prometheus library is here.

The relevant instrumentation of these metrics starts with this function:

addMetricsToPrometheusRegistry(metricsMeta)

Once the metrics are configured in the Prometheus registry, they are collected when polling the Prometheus endpoint. The example above supports future metrics of the runtime/metrics package and exposes counters and gauges (histograms are excluded).
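
As a rough illustration of how such a registration can work (this is a sketch, not the repository's actual implementation, and it assumes the runtime/metrics, strings and Prometheus client imports), each non-histogram metric can be wrapped in a collector whose callback re-reads the value on every scrape:

// registerRuntimeMetrics is a hypothetical helper; the linked repository may differ.
func registerRuntimeMetrics(descs []metrics.Description) {
    for _, d := range descs {
        if d.Kind == metrics.KindFloat64Histogram {
            continue // histograms are excluded, as in the example above
        }
        // Convert a name like "/memory/classes/total:bytes" into a Prometheus-valid name.
        name := strings.NewReplacer("/", "_", ":", "_", "-", "_").
            Replace(strings.TrimPrefix(d.Name, "/"))
        sample := []metrics.Sample{{Name: d.Name}}
        read := func() float64 {
            metrics.Read(sample)
            switch sample[0].Value.Kind() {
            case metrics.KindUint64:
                return float64(sample[0].Value.Uint64())
            case metrics.KindFloat64:
                return sample[0].Value.Float64()
            default:
                return 0 // metric not supported by this runtime version
            }
        }
        if d.Cumulative {
            // Counter: monotonically increasing, rates can be computed from it.
            prometheus.MustRegister(prometheus.NewCounterFunc(
                prometheus.CounterOpts{Name: name, Help: d.Description}, read))
        } else {
            // Gauge: a point-in-time value.
            prometheus.MustRegister(prometheus.NewGaugeFunc(
                prometheus.GaugeOpts{Name: name, Help: d.Description}, read))
        }
    }
}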

This repo also contains an example of exposing the metrics using the OpenTelemetry instrumentation library, here.

Monitoring and Visualizing Our Process's Reported Values

Once we expose our process metrics, whether from the runtime/metrics package or our own custom ones, we want to store and visualize them. An easy way to do this is to use a Prometheus server to collect the metrics and store their values. They can then be visualized with Grafana, a server that builds dashboards and panels from different metric sources (including Prometheus).

The architecture of such a setup is depicted in the diagram below:

Architecture to visualize process metrics

Another approach to instrumentation is to use the OpenTelemetry instrumentation libraries together with the OpenTelemetry Collector agent. The agent can collect, process and ship the metrics to multiple backends.

The architecture is depicted in the diagram below:

Multiple Observability Tools Architecture

Visualization Using Prometheus UI

Once you install a Prometheus server, you can visualize the metrics using the Prometheus UI.

For example, let's say the Prometheus server is configured to scrape metrics from 'localhost:2112/metrics' (the endpoint that our code exposes).
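
A minimal scrape configuration for that setup could look like the snippet below (the job name is arbitrary, and the default scrape path is already /metrics):

scrape_configs:
  - job_name: "go-process"
    static_configs:
      - targets: ["localhost:2112"]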

After running the Prometheus server and letting it collect a few cycles of metrics, you can see current and past values in the Prometheus UI. In this example, you can see in the graph the “memory/classes/total” metric, converted to the Prometheus-valid name “memory_classes_total”.

Prometheus graph of memory/classes/total

My Own Thoughts About Runtime Metrics and Process Static Configuration

The new runtime/metrics package improves the way we expose Go process metrics. They are now provided in a generic and comprehensive way, which makes it possible to send metrics without explicitly specifying each metric's name. It is comprehensive because the metadata attached to each metric describes its kind and helps convert it to the right type, such as counter, gauge or histogram. This approach is future proof: new metrics require no code changes on your part.

The example code shows how to consume the metrics from this package and how to integrate them with a Prometheus server, or with other backends through the OpenTelemetry Collector. Integrating with other monitoring servers such as Nagios, Datadog or InfluxDB should be similar.

A Shout Out to a Responsive Go Development Team

I had the opportunity to discuss with the Go development team additional metrics that were not available in this package (NumCgoCall, InUseBytes). I was delighted to discover that the team included these metrics in subsequent versions, along with CPU metrics!

I would like to personally thank Sean Liao, Michael Knyszek, Russ Cox and the others who participated in this effort for their responsiveness to the community's needs.

During those discussions, the team shared their philosophy that this package should contain changing metrics, not static configuration values such as the Go version or the maximum number of CPUs (GOMAXPROCS).

However, this generic approach of values accompanied by metadata is useful for static configuration parameters as well, and could help identify issues such as misconfiguration, gaps between specific Go versions and more. I believe that exposing them in a new package (for example, runtime/config) would help diagnose issues better, just as the runtime/metrics package does.
