Can you run Tetragon on HashiCorp Nomad? — Part 1

Of course you can! And I’m going to show you how. This guide comes in two parts: Part 1 covers setting up Tetragon on Nomad, and Part 2 turns the spotlight on Tracing Policies.

Glen Yu
Jun 13, 2024
Peanut butter & jelly?

Tetragon

While Tetragon has gained recognition within the Kubernetes community, its core functionality transcends that specific environment. It offers tools like Custom Resource Definitions (CRDs) and APIs to simplify deploying and managing Tracing Policies within Kubernetes, but its true nature is better described as “Kubernetes-aware.” This means Tetragon isn’t limited to Kubernetes. You can leverage it in other settings like Docker, run it independently as a systemd service, or even use it with Nomad, another popular workload orchestrator.

Source: tetragon.io

Nomad

Nomad, a versatile workload orchestrator, has always impressed me with its ability to manage diverse workloads. Beyond containerized applications, Nomad can run other kinds of tasks (Java applications, QEMU virtual machines, plain executables) as long as there is a Task Driver for them and the corresponding binary or runtime is available on the client. In fact, I’ve written several articles exploring Nomad’s flexibility and user-friendliness. If you’re interested in learning more, feel free to check them out!

Setup requirements

There are a few notable settings in the Nomad client’s configuration that are required for this to work. I have included them in my Packer & GCP with GitHub Actions repo:

  1. Enable privileged Docker jobs
  2. Add host volume mount for BTF (read-only)
  3. Add host volume mount for Tetragon JSON log path (read-write)
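As a rough preflight check for the three settings above, the sketch below inspects a node description and reports what is missing. It assumes the input is the JSON that Nomad’s HTTP API returns for a node, with host volumes under `HostVolumes` and driver details under `Attributes`. The attribute name `driver.docker.privileged.enabled` and the volume names `btf` and `tetragon-logs` are assumptions for illustration, so check them against your own client config before relying on this.

```python
from typing import Dict, List

# Hypothetical host volume names mapped to whether each must be read-only.
REQUIRED_VOLUMES: Dict[str, bool] = {"btf": True, "tetragon-logs": False}

# Assumed name of the node attribute that reports privileged Docker support.
PRIVILEGED_ATTR = "driver.docker.privileged.enabled"


def missing_requirements(node: dict,
                         required: Dict[str, bool] = REQUIRED_VOLUMES) -> List[str]:
    """Return human-readable problems found in a node description."""
    problems: List[str] = []

    # 1. Privileged Docker jobs must be enabled on the client.
    attrs = node.get("Attributes") or {}
    if str(attrs.get(PRIVILEGED_ATTR, "")).lower() not in ("true", "1"):
        problems.append("privileged Docker jobs are not enabled")

    # 2 and 3. Each host volume must exist with the expected access mode.
    volumes = node.get("HostVolumes") or {}
    for name, read_only in required.items():
        vol = volumes.get(name)
        if vol is None:
            problems.append(f"host volume '{name}' is not defined")
        elif bool(vol.get("ReadOnly")) != read_only:
            mode = "read-only" if read_only else "read-write"
            problems.append(f"host volume '{name}' should be {mode}")
    return problems
```

An empty list means the node looks ready; anything else names the setting to revisit in the client configuration.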

Additional (optional/recommended) setup requirements

The Override action in Tracing Policies (more on Tracing Policies in Part 2) requires a kernel compiled with the CONFIG_BPF_KPROBE_OVERRIDE option. This option is not currently available on Google Compute Engine VMs, and its presence may vary across cloud providers. Unfortunately, in public cloud environments, you typically lack control over kernel configuration. However, if you manage your own datacenter and have kernel customization freedom, enabling CONFIG_BPF_KPROBE_OVERRIDE during compilation is recommended.
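If you are unsure whether your kernel was built with this option, a quick check along these lines may help. It is a minimal sketch that assumes the kernel config is exposed at `/boot/config-<release>` or `/proc/config.gz`, which is common but not guaranteed on every distribution. The function names are my own.

```python
import gzip
import os
import platform
import re
from typing import Optional

OPTION = "CONFIG_BPF_KPROBE_OVERRIDE"


def option_enabled(config_text: str, option: str = OPTION) -> bool:
    """Return True if `option` is set to `y` in a kernel config dump."""
    pattern = re.compile(rf"^{re.escape(option)}=y\s*$", re.MULTILINE)
    return bool(pattern.search(config_text))


def read_kernel_config() -> Optional[str]:
    """Read the running kernel's config from common locations, if any exists."""
    candidates = [f"/boot/config-{platform.release()}", "/proc/config.gz"]
    for path in candidates:
        if not os.path.exists(path):
            continue
        opener = gzip.open if path.endswith(".gz") else open
        try:
            with opener(path, "rt") as fh:
                return fh.read()
        except OSError:
            continue  # unreadable here; try the next location
    return None


if __name__ == "__main__":
    text = read_kernel_config()
    if text is None:
        print("kernel config not found; cannot tell")
    else:
        print(f"{OPTION} enabled: {option_enabled(text)}")
```

A disabled option normally appears as a `# ... is not set` comment line, which the pattern deliberately does not match.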

Tetragon deployment

The following Nomad job specification (jobspec) defines a Tetragon deployment similar to its Kubernetes counterpart. It includes an agent container responsible for the core Tetragon functionality. A separate container, named export-stdout, continuously reads the agent’s log file (/var/log/tetragon/tetragon.log) and writes its contents to stdout.

There’s a lot to unpack, so let me break things down for you:

  • type = system is the equivalent of a DaemonSet in Kubernetes. Nomad will ensure there is one instance running per Nomad client.
  • group is the equivalent of a pod in Kubernetes, while task is the equivalent of a container. Here we are running two containers: the aforementioned agent and export-stdout.
  • volume_mount blocks reference the volume blocks. The source name is not arbitrary: it must match a host volume defined in the Nomad client config (see Setup requirements above). One of the volume mounts is for Tetragon’s logs so that they persist.

Below is an image of the Nomad UI showing the stdout of the export-stdout container:

Nomad UI showing export-stdout container’s logs
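To make the export-stdout container’s role concrete, here is a small sketch that approximates its behavior: poll the log file and print each newly appended line. This is not how the actual container is implemented. The function names and polling interval are my own, and log rotation is not handled.

```python
import time
from typing import List, Tuple


def read_appended(path: str, offset: int) -> Tuple[List[str], int]:
    """Return complete lines appended after byte `offset`, plus the new offset."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        chunk = fh.read()
    # Consume only up to the last newline so a half-written line is retried later.
    end = chunk.rfind(b"\n") + 1
    text = chunk[:end].decode("utf-8", errors="replace")
    lines = text.split("\n")[:-1]  # drop the empty piece after the final newline
    return lines, offset + end


def follow(path: str, poll_seconds: float = 1.0) -> None:
    """Print new lines from `path` forever, similar in spirit to `tail -f`."""
    offset = 0
    while True:
        try:
            lines, offset = read_appended(path, offset)
        except FileNotFoundError:
            lines = []  # the agent may not have created the file yet
        for line in lines:
            print(line, flush=True)
        time.sleep(poll_seconds)
```

Calling `follow("/var/log/tetragon/tetragon.log")` inside a container that mounts the same log volume would stream the agent’s events to stdout, which is what makes them visible in the Nomad UI.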

Dude, where’s my networking?

You may notice I did not specify a network block in the Tetragon jobspec. This is because, unlike its Kubernetes counterpart, where pods within a cluster share a pod-specific CIDR range, each Tetragon agent is isolated to the Nomad client it is deployed on (i.e. standalone mode). Hence there are no ports that need exposing; it just works directly with the kernel probes (kprobes), which you will learn about in more detail in Part 2.

Additional Tetragon on Nomad caveats

While Tetragon functions in standalone mode on VMs, some advanced features leverage Kubernetes concepts like pods and namespaces for event filtering. However, Nomad’s equivalent concepts are not recognized and hence you will not be able to filter your events by pod/group name or by namespace.

Nomad, as a versatile workload orchestrator, manages various tasks through task drivers (e.g., Java, QEMU). It also runs built-in health and version checks for these drivers, and those checks generate events that Tetragon picks up. They are a normal part of Nomad’s operation, but unfortunately they show up as noise in your Tetragon logs.
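One way to cope with this noise is to filter the exported JSON events after the fact. The sketch below assumes each log line is a JSON object whose event payload (for example `process_exec`) carries a `process.binary` field. The binary paths in `NOISY_BINARIES` are hypothetical placeholders, so replace them with whatever actually shows up in your own logs.

```python
import json
from typing import Iterable, Iterator, Optional, Set

# Hypothetical examples of binaries run by driver checks; adjust to your logs.
NOISY_BINARIES: Set[str] = {"/usr/bin/java", "/usr/bin/qemu-system-x86_64"}


def event_binary(event: dict) -> Optional[str]:
    """Return the process binary of an event, regardless of the event type."""
    for payload in event.values():
        # Skip scalar fields; only payload objects carry a nested `process`.
        if isinstance(payload, dict) and isinstance(payload.get("process"), dict):
            return payload["process"].get("binary")
    return None


def drop_noise(lines: Iterable[str],
               noisy: Set[str] = NOISY_BINARIES) -> Iterator[dict]:
    """Yield parsed events whose process binary is not in `noisy`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore partial or malformed lines
        if not isinstance(event, dict):
            continue
        if event_binary(event) in noisy:
            continue
        yield event
```

Filtering on the binary alone is blunt: it also hides legitimate runs of the same program, so matching on the parent process as well may be a better fit in practice.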

Below is an example of the events output using the Tetragon CLI, tetra, from within the Tetragon agent container:

Sample events output from tetra CLI

What’s next?

I’m currently working on ways to expand Tetragon’s functionality within the Nomad environment. The goal is to achieve a user experience comparable to what you get on Kubernetes.

Part 2 of this guide (arriving next week) will delve into the power of Tracing Policies. You will learn how to leverage them to enforce runtime security within your Nomad workloads.

EDIT 2024-06-20: Part 2 of my guide can be found here!
