Introducing Walmart’s L3AF Project: How do we use eBPF to provide network visibility in a multi-cloud environment
This is the first blog in a three-part series introducing the L3AF project that provides Kernel Function as a Service using eBPF and related technologies.
Getting started with eBPF
Traditionally, applications running in userspace make system calls to access kernel resources. eBPF presents a new model that allows us to run custom sandboxed code in the kernel. With eBPF, kernel functions can be extended or customized through simple programs. These programs can be attached to kernel events so that they are executed whenever the event occurs. To give an analogy, eBPF programs are to the kernel what plugins are to proxies or web servers.
Let us look at this in a little bit more detail to see how eBPF makes this possible. eBPF runs as a mini-VM inside the kernel. This mini-VM provides a sandboxed environment that has out-of-the-box integrations with low-level network hooks such as XDP/TC as well as probing mechanisms such as kprobes, uprobes, and tracepoints. With this architecture, it is now possible to write efficient eBPF programs and run them in the kernel. eBPF kernel programs are written in C and compiled to eBPF bytecode.
eBPF also provides a safe and secure way to do all of this inside the kernel. A verification step ensures that the eBPF program is safe to run by validating several conditions; for example, that the program does not crash and that it always runs to completion without getting stuck in an infinite loop. The Just-in-Time (JIT) compilation step then translates the generic bytecode of the program into the machine-specific instruction set to optimize execution speed. This makes eBPF programs run as efficiently as natively compiled kernel code.
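The "always runs to completion" condition can be illustrated with a toy model. The sketch below is not the kernel verifier: it assumes a made-up instruction format (an opcode plus a relative jump offset) and applies a single simplified rule, rejecting any backward or out-of-range jump. The real verifier checks far more than this, and newer kernels also accept loops that can be proven to be bounded.

```python
from typing import List, Tuple

# Hypothetical toy instruction: (opcode, relative_jump_offset).
# Only "jmp" and "jeq" use the offset; "exit" ends the program.
Instruction = Tuple[str, int]


def toy_verify(program: List[Instruction]) -> bool:
    """Return True if the toy program provably terminates.

    Rule 1: every jump must go forward and land inside the program.
    Rule 2: the last instruction must be "exit", so execution cannot
            fall off the end.
    With only forward jumps the program counter strictly increases,
    so the program finishes in at most len(program) steps.
    """
    if not program or program[-1][0] != "exit":
        return False
    for pc, (opcode, offset) in enumerate(program):
        if opcode in ("jmp", "jeq"):
            target = pc + 1 + offset
            if offset < 0 or target >= len(program):
                return False
    return True


safe = [("mov", 0), ("jeq", 1), ("add", 0), ("exit", 0)]
looping = [("mov", 0), ("jmp", -2), ("exit", 0)]
```

Here `toy_verify(safe)` accepts the program, while `toy_verify(looping)` rejects it because the jump goes backward and could repeat forever.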
For more details on eBPF, please refer to https://ebpf.io/.
Walmart Global Tech is developing some of the most cutting-edge products in the realm of eBPF under a project called L3AF. L3AF provides eBPF-based networking and observability solutions with the help of an advanced control plane written in Go.
In the realm of networking, L3AF enables Kernel Function as a Service by providing complete lifecycle management of eBPF programs that instrument, inspect, and interdict traffic. These eBPF programs use low-level network hooks such as XDP and TC to give us an ultra-high performance programmable network data plane that executes prior to the higher and slower layers of the Linux networking stack.
On the observability side, L3AF provides a list of curated metrics by collecting and aggregating custom information generated at the source of the event in the kernel. These metrics provide detailed insight into cluster/node utilization and downstream/upstream network performance, as well as traffic distribution across multiple clouds. L3AF provides deeper visibility into system performance than other programs, which rely on static counters and gauges exposed by the operating system (such as /proc). L3AF also offers out-of-the-box integration with Prometheus by maintaining full compatibility, including support for PromQL.
In this blog series, we focus on L3AF’s networking solutions, which enable Kernel Function as a Service (KFaaS).
Network Landscape, Challenges, and Solutions
As enterprises adopt a hybrid cloud strategy and migrate workloads from private to public clouds, a few interesting opportunities become apparent:
- There is an immediate need to get the same level of network visibility in the public cloud as in the private cloud.
- A hybrid cloud environment requires greater control over traffic due to the limitations and costs associated with networks.
- It is essential to replace hardware-based packet processing solutions in private clouds with software-based solutions to achieve feature parity with public clouds as enterprises move towards symmetric deployment.
Let’s dive deeper into some of these functionality gaps and their eBPF-based solutions:
Traffic Flow Logs
As enterprises start serving live traffic out of public clouds, it is increasingly important for them to export Traffic Flow data to security solutions that provide advanced threat protection across the extended network and cloud.
Private clouds provide Traffic Flow data through dedicated network appliances (hardware-based solutions). However, tenants in public clouds do not enjoy a similar level of access or network visibility, since the infrastructure layer is shared. We considered options to address this, such as adding a network hop to process Traffic Flow data (using the NetFlow protocol). However, such a configuration increases traffic latency and adds another layer to manage in the traffic ingress stack.
Instead, the L3AF project developed an eBPF program (Kernel Function) that extracts and exports flow metadata directly from Linux-based Edge Proxy servers.
As shown in the diagram, the eBPF Kernel Function (attached at the TC hook) retrieves flow attributes from every ingress and egress packet and updates them in an eBPF map as flow records. eBPF maps are generic key-value stores of various types that can share data between user space and kernel space. The flow record map uses a 5-tuple (sa, da, sp, dp, proto: source address, destination address, source port, destination port, and protocol) as the key, and other flow attributes such as packet/byte counters, last-seen time, and TCP flags as values.
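To make the shape of a flow record concrete, here is a minimal sketch in Python. It is only a model: the actual Kernel Function is an eBPF program running at the TC hook, a plain dict stands in for the eBPF map, and the names `FlowRecord` and `update_flow` are hypothetical, not taken from L3AF.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Key: (source addr, dest addr, source port, dest port, protocol)
FlowKey = Tuple[str, str, int, int, int]


@dataclass
class FlowRecord:
    packets: int = 0
    octets: int = 0
    last_seen: float = 0.0
    tcp_flags: int = 0  # bitwise OR of all TCP flags seen on the flow


def update_flow(flow_map: Dict[FlowKey, FlowRecord], key: FlowKey,
                length: int, timestamp: float, tcp_flags: int = 0) -> None:
    """Create the record on first sight, then accumulate per-packet data."""
    record = flow_map.setdefault(key, FlowRecord())
    record.packets += 1
    record.octets += length
    record.last_seen = timestamp
    record.tcp_flags |= tcp_flags


flow_map: Dict[FlowKey, FlowRecord] = {}
key = ("10.0.0.1", "10.0.0.2", 40000, 443, 6)  # protocol 6 = TCP
update_flow(flow_map, key, length=60, timestamp=1.0, tcp_flags=0x02)    # SYN
update_flow(flow_map, key, length=1500, timestamp=1.5, tcp_flags=0x10)  # ACK
```

After the two calls, the single record for `key` holds 2 packets, 1560 bytes, and the combined SYN and ACK flag bits.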
The flow exporter in userspace reads the flow records periodically and calculates flow statistics since the last poll. To compute the delta in counters between consecutive reads, it preserves the flow state after every read in a separate "last record" map. This allows us to accurately track the active flows.
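The delta calculation can be sketched as follows. This is an illustrative model rather than the L3AF exporter: both maps are plain dicts, counters are reduced to a (packets, bytes) pair, and the function name `poll_deltas` is hypothetical.

```python
from typing import Dict, Tuple

FlowKey = Tuple[str, str, int, int, int]
Counters = Tuple[int, int]  # cumulative (packets, octets)


def poll_deltas(current: Dict[FlowKey, Counters],
                last_record: Dict[FlowKey, Counters]) -> Dict[FlowKey, Counters]:
    """Return per-flow counter deltas since the previous poll.

    `last_record` is updated in place so the next poll starts from
    the state observed now. Flows with no new packets are omitted,
    which leaves only the active flows in the result.
    """
    deltas: Dict[FlowKey, Counters] = {}
    for key, (packets, octets) in current.items():
        prev_packets, prev_octets = last_record.get(key, (0, 0))
        if packets > prev_packets:
            deltas[key] = (packets - prev_packets, octets - prev_octets)
        last_record[key] = (packets, octets)
    return deltas


key = ("10.0.0.1", "10.0.0.2", 40000, 443, 6)
last_record: Dict[FlowKey, Counters] = {}
first = poll_deltas({key: (2, 1560)}, last_record)
second = poll_deltas({key: (5, 4000)}, last_record)
third = poll_deltas({key: (5, 4000)}, last_record)  # flow went idle
```

The first poll reports the full counters, the second reports only the 3 packets and 2440 bytes that arrived in between, and the third reports nothing because the flow was idle.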
Once the necessary flow attributes are in place, an IPFIX message is created as per the IPFIX protocol specification (RFC 7011) and exported to any security threat detection and analysis tool.
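As a rough sketch of what such a message looks like on the wire, the code below packs a 16-byte IPFIX message header followed by one Template Set and one Data Set. The choice of fields, the template ID 256, and the function names are assumptions made for illustration; the template that L3AF actually exports is not described here. The information element IDs are the standard IANA ones for the listed fields.

```python
import socket
import struct
from typing import List, Tuple

IPFIX_VERSION = 10
TEMPLATE_SET_ID = 2
TEMPLATE_ID = 256  # hypothetical; Data Set IDs must be 256 or higher

# (IANA information element ID, field length in octets)
FIELDS: List[Tuple[int, int]] = [
    (8, 4),   # sourceIPv4Address
    (12, 4),  # destinationIPv4Address
    (7, 2),   # sourceTransportPort
    (11, 2),  # destinationTransportPort
    (4, 1),   # protocolIdentifier
    (2, 8),   # packetDeltaCount
    (1, 8),   # octetDeltaCount
]
RECORD_FORMAT = "!4s4sHHBQQ"  # network byte order, matches FIELDS


def build_set(set_id: int, body: bytes) -> bytes:
    # Set header: Set ID, then total Set length including this header.
    return struct.pack("!HH", set_id, 4 + len(body)) + body


def build_message(flows, export_time: int, sequence: int,
                  domain_id: int) -> bytes:
    template = struct.pack("!HH", TEMPLATE_ID, len(FIELDS))
    for element_id, length in FIELDS:
        template += struct.pack("!HH", element_id, length)
    data = b""
    for sa, da, sp, dp, proto, packets, octets in flows:
        data += struct.pack(RECORD_FORMAT, socket.inet_aton(sa),
                            socket.inet_aton(da), sp, dp, proto,
                            packets, octets)
    sets = build_set(TEMPLATE_SET_ID, template) + build_set(TEMPLATE_ID, data)
    # Message header: version, total length, export time,
    # sequence number, observation domain ID.
    header = struct.pack("!HHIII", IPFIX_VERSION, 16 + len(sets),
                         export_time, sequence, domain_id)
    return header + sets


message = build_message([("10.0.0.1", "10.0.0.2", 40000, 443, 6, 3, 2440)],
                        export_time=1_700_000_000, sequence=0, domain_id=1)
```

A real exporter would typically send the template only periodically rather than with every message, and would send the result to a collector over UDP, TCP, or SCTP.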
Key Benefits are as follows:
- In addition to saving infrastructure costs, L3AF’s solution significantly lowers the overall latency of traffic by eliminating the need to pass through an additional network hop.
- This solution uses the more flexible, open-standard IPFIX protocol instead of NetFlow. Unlike NetFlow, IPFIX allows the export of variable-length custom fields (such as a URL), which provide enhanced visibility.
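The variable-length encoding mentioned in the last point can be sketched briefly. RFC 7011 prefixes each variable-length value with its length: a single byte for values shorter than 255 bytes, or the marker byte 255 followed by a two-byte length for longer values (the template declares such a field with the reserved length 65535). The function name and the example URL path are hypothetical.

```python
import struct


def encode_variable_length(value: bytes) -> bytes:
    """Encode one variable-length IPFIX field value."""
    if len(value) < 255:
        # Short form: one length octet, then the content.
        return struct.pack("!B", len(value)) + value
    if len(value) > 0xFFFF:
        raise ValueError("value does not fit a 16-bit length")
    # Long form: marker octet 255, two-octet length, then the content.
    return struct.pack("!BH", 255, len(value)) + value


short_field = encode_variable_length(b"/cart/checkout")
long_field = encode_variable_length(b"a" * 300)
```

The short value costs one extra byte on the wire and the long one costs three, so records only carry as many bytes as each URL actually needs.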
Traffic Mirroring
The best way to succeed in a business is by providing an amazing customer experience. The quality of the overall experience is often what influences customers when they shop online. At Walmart, we want to have visibility into how our customers are interacting with our site.
We have a few analytics solutions that can operate on the data streams and provide the needed analysis. But these solutions need the data of interest and that interest changes from time to time. There is an opportunity to save valuable time and money by automating the process of collecting this data.
One of the most effective places to collect this data of interest in the public cloud is the edge proxy servers. However, the edge proxy is also a critical, performance-sensitive hop that handles all of the ingress traffic to the site.
So, we started exploring some of the commercial solutions, a few of which are listed here:
- Running a stand-alone agent that would mirror 100% of traffic on the proxy VMs. However, this would incur:
1. Significant traffic expenses, as we would mirror 100% of the data.
2. Additional licensing cost (charged per NIC that we decide to mirror).
3. Overhead on the resources of the host.
- Using traffic mirroring services that are offered natively by the public cloud. However, this isn’t a consistent solution as many flavors of the public cloud either do not offer this solution or do not offer the necessary capability to filter the data of interest.
To overcome these limitations, the L3AF project developed an eBPF-based software solution that combines the filtering and mirroring functionalities. This solution supports one or more custom filters on the 5-tuple (sa, da, sp, dp, proto) and also allows us to capture only header data (if required), thereby limiting bandwidth utilization.
Additionally, given that eBPF is very lightweight, highly performant, and safe, this solution has been implemented at the source (i.e., on the edge proxy). On the edge proxy, we attach the mirroring function to the primary NIC that processes the actual traffic. The function examines every incoming and outgoing packet using the TC hook and matches it against the 5-tuple filters. If the match is successful, it clones the packet and redirects the clone to a secondary NIC, which forwards the traffic to the analytics systems over a GUE tunnel.
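The filter-then-clone logic can be modeled in a few lines. This is a sketch of the decision only, not the eBPF program itself: the `MirrorFilter` class, the convention that an unset field matches anything, and the `header_bytes` parameter are assumptions introduced for illustration, and GUE encapsulation is left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

FiveTuple = Tuple[str, str, int, int, int]


@dataclass
class MirrorFilter:
    # None means "match anything" for that field (an assumed convention).
    sa: Optional[str] = None
    da: Optional[str] = None
    sp: Optional[int] = None
    dp: Optional[int] = None
    proto: Optional[int] = None

    def matches(self, packet_tuple: FiveTuple) -> bool:
        wanted = (self.sa, self.da, self.sp, self.dp, self.proto)
        return all(w is None or w == got
                   for w, got in zip(wanted, packet_tuple))


def mirror(packet_tuple: FiveTuple, packet: bytes,
           filters: List[MirrorFilter],
           header_bytes: Optional[int] = None) -> Optional[bytes]:
    """Return the clone to send to the secondary NIC, or None if unmatched."""
    if not any(f.matches(packet_tuple) for f in filters):
        return None
    clone = bytes(packet)  # the original packet continues untouched
    if header_bytes is not None:
        clone = clone[:header_bytes]  # header-only capture saves bandwidth
    return clone


filters = [MirrorFilter(dp=443, proto=6)]  # mirror TCP traffic to port 443
https_flow = ("198.51.100.7", "10.0.0.2", 40000, 443, 6)
dns_flow = ("198.51.100.7", "10.0.0.2", 40000, 53, 17)
packet = b"H" * 54 + b"body" * 100

mirrored = mirror(https_flow, packet, filters, header_bytes=54)
skipped = mirror(dns_flow, packet, filters, header_bytes=54)
```

Only the HTTPS packet is cloned, and the clone carries just the first 54 bytes instead of the full 454-byte packet.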
Key Benefits are as follows:
- Unlike the hardware-based solutions (taps) primarily used in the private cloud, we achieved this use case through a software-based solution. It runs directly on the host, thus eliminating an additional point of hardware failure and saving on infrastructure cost.
- Traditional cloud solutions mirror 100% of traffic to an aggregation layer that sits in a centralized cloud. So, there is an additional traffic cost involved in sending the traffic and receiving optimized traffic back into our cloud subscription. The customized filtering capability allows us to mirror only traffic we care about to the analytics system.
- Since eBPF solutions can be dynamically programmed to attach/detach, they provide the flexibility to selectively choose the applications/domains that we want visibility into, on the fly, in a seamless manner.
We have reviewed a few network visibility use-cases in this blog post. Our next blog in this three-part series will focus on how L3AF uses XDP to provide high-performance networking solutions at scale.
This blog is written with inputs from Ragalahari and Kanthi, who are engineers on the L3AF Project.