Episode-XVII [BKeeper]

Fatih Nar
Published in Open 5G HyperCore
8 min read · Sep 1, 2023

Authors: [Fatih E. NAR, Dave Tucker, Arkady Kanevsky at Red Hat], [Dave Cremins, Vivek Kashyap, John Browne at Intel]

DALL·E: Beekeeper with Hives on the Boat

Update [December 2023]: The bpfd project has been renamed to bpfman to make its purpose clear and self-explanatory. Red Hat has initiated its donation under the CNCF Sandbox initiative.

1.0 Introduction

In “Episode XVI: The Eye of the BEE-Holder,” we delved into potential data sources and their augmentation for comprehensive observability solutions bolstered by AIOps capabilities.

One of the standout methods for enhancing observability, as highlighted in Episode XVI Section 3.3, is the utilization of eBPF probes and applications. Through our journey of designing eBPF applications and critically assessing others, we discerned that they could occasionally exert profound impacts on the core framework of the application platform, particularly Kubernetes (K8s) and its interlinked services. In some cases, these core K8s services teetered on the brink of dysfunction or became overshadowed by dominant eBPF loaders exerting authority across nodes.

This article dissects the intricacies of deploying eBPF programs for observability.

Our aim? We are glad you asked 🙂: to leverage eBPF technology while ensuring optimal performance, without compromising the stability and integrity of the application platform or jeopardizing the applications it hosts.

2.0 Background

The eBPF (Extended Berkeley Packet Filter) functionality in the Linux kernel is immensely powerful, and the `CAP_BPF` capability grants access to a significant subset of it. In return for such potent permissions, `CAP_BPF` offers capabilities such as:

1. Kernel Observability: eBPF allows for the observation of kernel internals. A process with the appropriate permissions can attach eBPF programs to various kernel hooks, tracing system calls, network packets, scheduler events, and more. This can reveal sensitive information about system operations.

2. Kernel Behavior Modification: eBPF programs can modify the kernel’s behavior beyond mere observation. For example, an eBPF program can alter or block network packets. In the hands of a malicious entity, this can be misused for nefarious purposes like data tampering.

3. Access to eBPF Maps: eBPF maps are key-value stores in kernel space with which eBPF programs can interact. A process with `CAP_BPF` can read from and write to these maps, potentially changing data or behavior defined by other eBPF programs.

Figure-1 eBPF deployment(s) in Kubernetes with chaos

4. Unrestricted Program Loading: Without additional controls, a process with `CAP_BPF` can load and attach eBPF programs to many kernel hooks. There isn’t an inherent mechanism to restrict which eBPF programs can be loaded or which hooks can be attached.

5. Potential for Resource Exhaustion: eBPF programs and maps consume system resources. A process that can freely load and manage eBPF entities could potentially exhaust system (hardware and/ or software) resources, leading to a Denial of Service (DoS) situation.

6. Indirect Access to Other Capabilities: Loading some eBPF program types can require other capabilities. For instance, attaching an XDP or TC eBPF program to a network interface would need an application to be granted `CAP_NET_ADMIN`.

The scope of these capabilities is much broader than just `CAP_BPF`, which goes against the principle of least privilege.

Given these powers, `CAP_BPF` should be considered equivalent to full root privileges and handled and used carefully.

3.0 Wrong Way [Chaos of Bees]

The challenges and disruptive impacts of using external (i.e., unregulated) eBPF loaders include:

[I] Privileged access requirements:

[I.A] eBPF loaders necessitate privileged pods.

[I.B] eBPF-enabled apps mandate, at the least,
CAP_BPF permissions (details described in the previous section),
with the possibility of additional permissions based on
the specific program type.

[I.C] Linux capabilities are expansive,
making it tricky to limit a pod to the most essential privileges.
This vastness can lead to inadvertent or deliberate system compromises
(such as container escape cases).
Figure-2 DaemonSet to Daemon Acts (Link1, Link2, Link3)
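To make [I] concrete, here is a hedged sketch of the kind of pod spec a standalone eBPF loader typically requires; the pod name and image are hypothetical, and the exact capability set depends on the program types being loaded (e.g., `NET_ADMIN` for XDP/TC attach):

```yaml
# Hypothetical pod spec for a standalone eBPF loader (the "wrong way"):
# the container must carry CAP_BPF plus additional capabilities,
# far more than the principle of least privilege would allow.
apiVersion: v1
kind: Pod
metadata:
  name: ebpf-loader                         # hypothetical name
spec:
  containers:
    - name: loader
      image: example.com/ebpf-loader:latest # hypothetical image
      securityContext:
        capabilities:
          add: ["BPF", "NET_ADMIN", "PERFMON"]
      volumeMounts:
        - name: bpffs
          mountPath: /sys/fs/bpf            # access to pinned programs/maps
  volumes:
    - name: bpffs
      hostPath:
        path: /sys/fs/bpf
```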

[II] Handling Multiple eBPF Programs:

[II.A] Not every eBPF hook supports numerous programs simultaneously.

[II.B] Specific software operating with eBPF may assume exclusive
access to an eBPF hook, potentially leading to the unintended
displacement of existing programs upon attachment,
resulting in unpredictable or silent failures.
Figure-3 BPF Kernel Datapath Revamped Incident Reporting (Link3)

[III] Multi-Tenancy Problem:

[III.A] eBPF lacks namespacing, allowing privileged access to 
all kernel-stored information. Traditionally, there were no ACLs
or RBAC controls to limit the loading of specific probe types.

[IV] Debugging Deployment Challenges:

[IV.A] Cluster administrators have no built-in way to know which
eBPF programs are in use within a cluster.

[IV.B] Interactions between various eBPF programs can lead to
unexpected complications (example given II.B above).

[IV.C] SSH access or a privileged pod becomes indispensable
to ascertain the status of eBPF programs on each node
within the cluster.

[V] Lifecycle Management Complexities:

[V.A] While userspace libraries 
(such as libbpf, bcc, bpftool, pyebpf, bpftrace etc.)
support basic eBPF program loading and unloading,
supplementary coding is often essential for holistic lifecycle
management and continuous impact management.

[V.B] Deployment within Kubernetes is more complex.
It demands the development of a daemon to load the eBPF bytecode,
subsequently deploying it via a DaemonSet.
This process necessitates a deep understanding of the eBPF program
lifecycle to ensure consistent program loading and
seamless handling of pod restarts, upgrades, and movement.
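As a hedged illustration of [V.B], a hand-rolled loader deployment might look like the DaemonSet skeleton below (all names and images are hypothetical); note that the loader binary itself must still pin its programs and maps under `/sys/fs/bpf` so they survive pod restarts and upgrades:

```yaml
# Hypothetical hand-rolled deployment of a custom eBPF loader daemon:
# a DaemonSet places one loader pod per node, and lifecycle concerns
# (pinning, re-attachment on restart, upgrade handling) fall entirely
# on the loader's own code.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: my-ebpf-loader                          # hypothetical name
spec:
  selector:
    matchLabels:
      app: my-ebpf-loader
  template:
    metadata:
      labels:
        app: my-ebpf-loader
    spec:
      containers:
        - name: loader
          image: example.com/my-ebpf-loader:v1  # hypothetical image
          securityContext:
            privileged: true                    # typical for hand-rolled loaders
          volumeMounts:
            - name: bpffs
              mountPath: /sys/fs/bpf            # pin here to survive restarts
      volumes:
        - name: bpffs
          hostPath:
            path: /sys/fs/bpf
```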

[VI] Version Control Limitations:

[VI.A] In current eBPF-enabled K8s deployments, 
the eBPF Program is typically integrated into the userspace binary,
which manages and communicates with it.

This integration hinders the ability to maintain precise versioning
control of the bpfProgram in relation to its corresponding
userspace component.

[VII] Performance Considerations:

[VII.A] Since probes run in kernel space, 
they hold a lot of power for good and evil.
With so much power comes heavy responsibility.
Probes should be very fast and small:
read only what you need, write it quickly into a map,
and avoid expensive processing at all costs!

[VII.B] Poorly performing probes can impact all areas of the system.
For example, a user could load an uprobe to trace
`free` and `malloc` in libc, adding a trap into
the kernel (to execute the probe) at a very high frequency
and causing the performance of all programs that link to libc to suffer.

4.0 The Kubernetes Native Way [Right-Way, Organized Beehive]

bpfman is an innovative software stack designed to simplify the loading, unloading, modification, and monitoring of eBPF programs, whether on an individual host or across a Kubernetes cluster. It’s developed in Rust and built upon the Aya eBPF library.

Figure-4 eBPF Deployment(s) in Kubernetes the Right Way

Core Components:

[1] bpfman Daemon:

The bpfman daemon is a system-wide service enabling the loading, 
unloading, modification, and monitoring of eBPF programs,
accessible through a gRPC API.

[2] eBPF CRDs:

[2.1] Offers a collection of Custom Resource Definitions (CRDs), 
such as XdpProgram and TcProgram, expressing the intent
to load specific eBPF programs.

[2.2] Features a bpfman-generated CRD (BpfProgram) that
depicts the runtime state of loaded programs.
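For illustration, a minimal XdpProgram resource might look like the sketch below, modeled on the bpfman examples; field names follow the v1alpha1 API at the time of writing and may differ across versions:

```yaml
apiVersion: bpfman.io/v1alpha1
kind: XdpProgram
metadata:
  name: xdp-pass-all-nodes
spec:
  bpffunctionname: pass        # entry function in the bytecode
  nodeselector: {}             # empty selector targets all nodes
  interfaceselector:
    primarynodeinterface: true # attach to each node's primary interface
  priority: 0
  bytecode:
    image:
      url: quay.io/bpfman-bytecode/xdp_pass:latest
```

The bpfman-agent reconciles this intent on each matching node and reports the resulting runtime state through the generated BpfProgram CRD.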

[3] bpfman-agent:

The bpfman-agent runs within a container in the bpfman DaemonSet; 
it ensures that the desired eBPF programs are loaded
as required on each node.

[4] bpfman-operator:

Leveraging the Operator SDK, this operator oversees the installation 
and lifecycle of both the bpfman-agent and associated CRDs within
a Kubernetes environment.
Figure-5 bpfman Architecture

Critical benefits of a central, authoritative eBPF loader (a single-pane solution) such as bpfman:

Security:

> Maintaining stringent oversight of the bpfman daemon ensures it exclusively 
possesses the necessary privileges to load eBPF programs,
thereby preserving the unprivileged status of user pods.
While we're committed to minimizing the spread of CAP_BPF privileges
within pods, we anticipate that it might never be entirely eliminated.
Nevertheless, our aim/recommendation is to alert cluster administrators
about any "rogue" eBPF programs in the K8s cluster.

> Administrators gain increased control over program loading and can set
rules for the eBPF program networking sequence.

> Facilitating bytecode ownership verification through container
image signing.

> API access is managed through standard RBAC methods,
restricting which eBPF features and pods can use hooks.
This lets you define which programs to load and sets the stage
for future developments. Soon, bpfman aims to provide refined control,
such as inserting uprobes into "myapp" processes
within pods labeled "myapp" while limiting access to
specific functions.
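As a sketch of the RBAC point above (the role name and team split are hypothetical), a ClusterRole could let one team manage TC programs while only viewing XDP programs:

```yaml
# Hypothetical RBAC policy gating access to bpfman's CRDs with
# standard Kubernetes mechanisms: full control over TcPrograms,
# read-only access to XdpPrograms.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: team-a-tc-editor       # hypothetical role name
rules:
  - apiGroups: ["bpfman.io"]
    resources: ["tcprograms"]
    verbs: ["get", "list", "watch", "create", "update", "delete"]
  - apiGroups: ["bpfman.io"]
    resources: ["xdpprograms"]
    verbs: ["get", "list", "watch"]
```

Bound to a team's service accounts via a ClusterRoleBinding, this replaces the all-or-nothing privilege model of standalone loaders with per-resource authorization.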

> As part of our commitment to ensuring the security of the
code bpfman loads, we are excited about our upcoming integration
with Sigstore/Rekor, which will vouch for the authenticity
of the eBPF code.

Visibility & Debuggability:

> Enhanced insight into active eBPF programs on a system, 
optimizing debuggability for various stakeholders.

> It offers a comprehensive view of all eBPF programs
across cluster nodes.

> Can assist with collecting performance data from
eBPF programs to help debug system performance/instability issues.

Multi-program Support:

> Enables multiple eBPF programs from various users 
to coexist harmoniously.

> Utilizes the libxdp multiprog protocol for allowing
multiple XDP programs on a single interface, with similar support
for TC programs.
Figure-6 bpfman Operator

Productivity:

> Streamlines eBPF program deployment and management within a 
Kubernetes setting.

> Developers can focus on core tasks, leaving program lifecycle
concerns like loading, attaching, and pin management to bpfman.

> Supports using libraries like Cilium, libbpf, and Aya
for eBPF development, but with loading/unloading streamlined by bpfman.

> Introduces eBPF Bytecode Image Specifications for intricate
versioning control of userspace and kernelspace programs.

5.0 Conclusion

While the potential pitfalls and challenges of eBPF deployment are undeniable, as explored in this article’s initial sections, a path exists for the seamless integration of eBPF into K8s platforms. With tools like bpfman at our disposal, we can craft a solution that keeps the delicate balance within the K8s application platform intact.

The juxtaposition of the “wrong way” versus the “right way” elucidated that while the journey of eBPF integration is riddled with complexities, the destination — when approached with the right tools and perspective — can be one of security, visibility and enhanced productivity.

By implementing & using bpfman, we’re not just addressing the challenges head-on; we’re moving towards an era where eBPF management becomes fluid and adaptable without breaking the integrity of application platforms and/or impacting overall performance.

Understanding eBPF within Kubernetes demands respect for its power, nuances, and the tools and techniques to harness its full potential. Through this article, we hope to have illuminated that path, taking one step closer to achieving harmony of performance and stability.
