Kernel Observability with OpenTelemetry — eBPF

Emre Yalvac
Published in hepsiburadatech
Oct 30, 2023

Kernel observability has rapidly become a cornerstone of efficient systems monitoring, and with the integration of OpenTelemetry and eBPF, developers are now equipped with more powerful tools than ever before. Let’s delve into how these technologies are revolutionizing the way we understand and optimize our systems.

What is OpenTelemetry?

OpenTelemetry is a project of the CNCF (Cloud Native Computing Foundation), the foundation also behind Kubernetes. It is a vendor-neutral standard and toolset (APIs and SDKs) for observing how services and infrastructure operate. It encompasses the capabilities to collect and transmit traces, metrics, and logs from distributed systems and microservices. This gives developers and DevOps teams in-depth information about their services, allowing them to better understand how those services interact.

eBPF?

eBPF (extended Berkeley Packet Filter) was originally released in kernel version 3.18 as a technology for packet filtering in Linux, but it has since gained many more capabilities. Today, eBPF is used in the Linux kernel for a variety of observation and enhancement tasks such as networking, debugging, and tracing. In layman’s terms, it lets you attach small programs to kernel events such as syscalls.
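To make this concrete, here is a small sketch of the kind of program you can attach with bpftrace, a popular eBPF front end (bpftrace itself and root privileges are assumptions on my part, not something the package requires):

```shell
# An eBPF program (in bpftrace syntax) that hooks the openat() syscall
# tracepoint and prints which process opened which file.
PROG='tracepoint:syscalls:sys_enter_openat { printf("%s %s\n", comm, str(args->filename)); }'
# Attaching it requires root privileges:
#   sudo bpftrace -e "$PROG"
echo "$PROG"
```

Hooks like this are exactly the kernel-level vantage point that the collectors described below build on.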

For more detailed information:
https://www.youtube.com/watch?v=WgFH7bzpRvo

What are the differences?

OpenTelemetry provides a standard infrastructure for performance and error tracking at the application level in distributed systems. eBPF allows for dynamic monitoring and observation in the Linux kernel.

How can they be combined?

With OpenTelemetry, you can capture application-level signals such as HTTP calls and, by correlating them with network events and actions like syscalls through eBPF, achieve deeper performance and error observation. Many scenarios can be built in this context (anomaly detection, debugging, etc.).

When it comes to telemetry standards, OpenTelemetry is the clear leader. Among the many telemetry collection techniques, it sets the benchmark for observability.

For instance, we can use eBPF to gather telemetry in a format tailored for OpenTelemetry. This broadens the spectrum of data OpenTelemetry can capture, accommodating everything from application-level signals to intricate kernel telemetry.

What is the purpose of the opentelemetry-ebpf package?

The primary aim of this package is to amass metrics from various components, be it operating systems, cloud infrastructures, or container ecosystems. With the diverse environments and platforms today’s applications run on, having a consolidated tool that can pull metrics from these different sources is indispensable.

However, raw metrics, while informative, can be overwhelming and challenging to decipher. Herein lies the package’s main strength. It is adept at converting these raw metrics into a more digestible format that adheres to the OpenTelemetry standard. This transformation ensures that the data is both actionable and consistent with other telemetry data, thereby simplifying analysis and troubleshooting.

Let’s talk about the components of the package:

kernel-collector

This component is essential for tapping into the heartbeat of your system. It leverages eBPF to delve deep into the Linux Kernel, collecting crucial low-level metrics that give insights into system performance, resource usage, and other kernel-specific activities.

k8s-collector

As implied by its name, this component specializes in the realm of Kubernetes. It fetches metrics pertinent to Kubernetes by seamlessly interfacing with Kubernetes APIs. This ensures that you have a comprehensive view of your container orchestration environment.

cloud-collector

As businesses migrate to cloud infrastructures, monitoring cloud resources becomes vital. The cloud-collector is designed to gather metrics straight from your cloud service provider, ensuring that you’re always informed about your cloud-based assets’ state. At present, integrations with prominent providers like AWS and GCP are available.

reducer

After collecting a wealth of data, it’s essential to make it coherent and actionable. The reducer steps in here, transforming raw metrics from various collectors into a standardized format compliant with OpenTelemetry. This ensures that the data is both consistent and ready for analysis, irrespective of its source.

Let’s compile and use the reducer & collectors

First, we need to pull and tag the image containing all the packages required for the build:

docker pull quay.io/splunko11ytest/network-explorer-debug/build-env
docker tag quay.io/splunko11ytest/network-explorer-debug/build-env build-env:latest

After that, we need to clone the repository where the package was released and fetch its submodules:

git clone https://github.com/open-telemetry/opentelemetry-ebpf.git
cd opentelemetry-ebpf
git submodule update --init --recursive

We create a folder outside the main directory for the compile output:

mkdir -p ../build

Finally, we mount and run the image we pulled:

docker run -it --rm \
--env EBPF_NET_SRC_ROOT=/root/src \
--mount type=bind,source=$PWD,destination=/root/src,readonly \
--mount type=bind,source=$PWD/../build,destination=/root/out \
--mount type=bind,source=/var/run/docker.sock,destination=/var/run/docker.sock \
--name benv \
build-env

With this image, we now have all the necessary dependencies at hand. To compile the package, we run the build.sh script that comes with the image:

./build.sh --help

This command shows how the reducer and collector packages mentioned above can be compiled. Here, we compile the entire package and move on:

./build.sh --cmake
cd out
make render_compiler

Finally, we create binaries that we can execute with ‘make’.
Now we have a folder named ‘out’, and inside it are the executables for the reducer & collectors.

As an example, let’s do a small demonstration using the reducer and k8s-collector pair:

cd out/reducer
./reducer --log-console

The reducer listens on port 8000 by default and writes the metrics sent by the collectors directly to the console.

At this point, instead of the console, we can also export the metrics to Prometheus:

reducer --prom=0.0.0.0:7010

On the Prometheus side, the matching scrape configuration looks like this:

scrape_configs:
  - job_name: 'opentelemetry-ebpf-reducer'
    static_configs:
      - targets:
          - '192.168.0.101:7010'

Additionally, we can direct these metrics straight to a real opentelemetry-collector:

reducer --disable-prometheus-metrics --enable-otlp-grpc-metrics --otlp-grpc-metrics-host=192.168.0.212
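On the receiving side, a minimal opentelemetry-collector configuration could look like the sketch below. This is an assumption about a typical setup rather than part of the package: 4317 is the standard OTLP gRPC port, and the debug exporter simply prints incoming metrics (older collector releases name it logging):

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317   # standard OTLP gRPC port

exporters:
  debug:                         # prints received telemetry to stdout
    verbosity: detailed

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [debug]
```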

After getting the reducer up and running, we can start the collectors to initiate the data flow:

cd out/collector/k8s
./k8s-relay

The k8s-relay binary sends the general and pod metrics from the Kubernetes cluster you are currently on to the reducer. The same setup applies to the kernel and cloud collectors as well.
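For instance, the kernel-collector is pointed at the reducer through environment variables. The EBPF_NET_* names below follow the opentelemetry-ebpf repository’s conventions, while the host and port values are assumptions for a reducer running locally on its default port:

```shell
# Tell the collector where the reducer’s intake endpoint lives.
export EBPF_NET_INTAKE_HOST=127.0.0.1   # reducer address (local assumption)
export EBPF_NET_INTAKE_PORT=8000        # reducer’s default intake port
# eBPF programs need elevated privileges, so run the collector as root:
#   sudo -E ./kernel-collector --log-console
echo "intake -> ${EBPF_NET_INTAKE_HOST}:${EBPF_NET_INTAKE_PORT}"
# prints: intake -> 127.0.0.1:8000
```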

Many values are not displayed due to the sensitive information they contain.

Rosetta

The image mentioned above is built for the x86-64 architecture. When working with it on an Apple silicon Mac, you need Rosetta, the translation layer between architectures developed by Apple. Otherwise, you won’t be able to use the eBPF and build tools:

softwareupdate --install-rosetta --agree-to-license
