Introducing Walmart’s L3AF Project: XDP based packet processing at scale

Karan Dalal
Walmart Global Tech Blog
6 min read · Aug 16, 2021


This is the second blog in a three-part series introducing the L3AF project that provides Kernel Function as a Service using eBPF and related technologies. You can read through the first blog here.

With the advent of XDP and eBPF, it is now possible to achieve high-performance packet processing in the kernel data path. XDP allows us to attach an eBPF program to a low-level hook inside the kernel. This XDP hook implemented by the network driver provides a programmable layer before the Linux networking stack.

When a packet arrives, the network driver executes the eBPF program attached to the main XDP hook. This framework allows us to run custom eBPF programs at the earliest possible point after the packet is received from the hardware, ensuring ultra-high performance. Some smart NICs also support offloaded XDP, which runs the program on the NIC itself without using the host CPU at all.
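To make the hook concrete, here is a minimal userspace sketch in plain C of the per-packet verdict logic an XDP program applies. In a real eBPF program this function would be attached to the driver's XDP hook and operate on `struct xdp_md`; the denylist check is purely illustrative. The `XDP_DROP`/`XDP_PASS`/`XDP_TX` values match the kernel's `enum xdp_action`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Verdict values, matching enum xdp_action in <linux/bpf.h>. */
enum xdp_action { XDP_DROP = 1, XDP_PASS = 2, XDP_TX = 3 };

#define ETH_HLEN 14
#define ETH_P_IP 0x0800

/* Inspect the raw frame and decide a verdict before the packet
 * ever reaches the Linux networking stack. */
static int xdp_verdict(const uint8_t *data, size_t len, uint32_t blocked_saddr)
{
    if (len < ETH_HLEN + 20)
        return XDP_DROP;                 /* too short for Ethernet + IPv4 */
    uint16_t proto = (uint16_t)(data[12] << 8 | data[13]);
    if (proto != ETH_P_IP)
        return XDP_PASS;                 /* non-IPv4: hand to the kernel stack */
    uint32_t saddr = (uint32_t)data[26] << 24 | (uint32_t)data[27] << 16
                   | (uint32_t)data[28] << 8  |  data[29];
    if (saddr == blocked_saddr)
        return XDP_DROP;                 /* filtered before the stack sees it */
    return XDP_PASS;
}
```

Because the verdict is returned from the driver's receive path, a dropped packet never costs an skb allocation, which is where much of XDP's performance advantage comes from.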

XDP in action

In the next couple of sections, we will discuss a few functionality gaps that can be addressed by leveraging XDP/eBPF to take direct action in the traffic path.

eBPF Load-Balancer (L4)

Several internet companies serve millions of requests every second out of their edge networks. The Layer 4 load balancer, sitting at critical points in the traffic flow, takes an extremely high volume of traffic. Since the L4 LB must process every incoming packet, the solution needs to be highly performant, and it must also provide the flexibility and scalability required in production environments.

Traditionally, L4 load balancers have been hardware-based, primarily to meet the high-performance requirement. However, a hardware-centric approach limits the system's flexibility and brings a lack of agility, scalability, and elasticity. As applications increase in number, complexity, and importance, it is vital that the infrastructure layer is app-focused and not confined by hardware configurations.

All of these performance needs can be met in software using eBPF. The L3AF project has leveraged Katran to develop an eBPF-based load balancer offering (eLB) that is implemented using XDP in a hairpin model on a single NIC. The load balancer redirects traffic to the backend at the device-driver level using XDP_TX.
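The hairpin step can be sketched as follows. This is an illustrative plain-C model, not the actual eLB code: after a backend is chosen, the program rewrites the Ethernet addresses and returns XDP_TX, which transmits the modified frame back out the same NIC it arrived on. The MAC values and frame layout here are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Verdict values, matching enum xdp_action in <linux/bpf.h>. */
enum xdp_action { XDP_DROP = 1, XDP_PASS = 2, XDP_TX = 3 };

/* Redirect the frame toward the chosen backend at the driver level. */
static int xdp_hairpin_tx(uint8_t *frame, size_t len,
                          const uint8_t backend_mac[6], const uint8_t own_mac[6])
{
    if (len < 14)
        return XDP_DROP;                /* not even a full Ethernet header */
    memcpy(frame, backend_mac, 6);      /* dst MAC: next hop toward backend */
    memcpy(frame + 6, own_mac, 6);      /* src MAC: this eLB node's NIC */
    return XDP_TX;                      /* bounce out the same interface */
}
```

Because the frame is retransmitted from the driver's receive queue, the packet never traverses the host networking stack on the load balancer at all.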

eLB — XDP based L4 Load-Balancer

One of the key features that we wanted to enable in Walmart’s production environment is DSR (Direct Server Return) so that we can send responses directly to the client. This ensures that the eLB does not need to handle return packets, which are typically larger in size.

To implement DSR, we developed an LB Agent that runs on our fleet of hypervisors in the private cloud. For other environment types, we run the agent on VMs (Virtual Machines), bare-metal servers, etc., depending on the use case. The agent can run on any Linux-based commodity hardware.

When a client requests an application service, the router receives a VIP packet. The router then forwards the packet to one of the eLB nodes in the cluster through ECMP: all the eLB nodes are BGP peered with the router (we run GoBGP on the LB nodes) and announce the VIP with the same cost, so the router spreads flows across them. When eLB receives the packet, it runs the Maglev algorithm, selects an endpoint from the set of service endpoints associated with the VIP, and encapsulates the packet using Generic UDP Encapsulation (GUE) with the outer IP header destined to the service endpoint. In our private cloud environment, encapsulation is necessary because we need to route packets to backends that are in a different L2 domain than that of the corresponding eLB node.
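The endpoint-selection step above can be sketched in plain C. This is a simplified userspace model of Maglev-style consistent hashing; the table size, hash functions, and backend names are illustrative assumptions, not Katran's actual parameters (production Maglev tables use a much larger prime size, e.g. 65537).

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_SIZE 13          /* Maglev requires a prime table size */
#define MAX_BACKENDS 8

/* Simple FNV-1a string hash, seeded to derive two independent hashes. */
static uint32_t fnv1a(const char *s, uint32_t seed)
{
    uint32_t h = 2166136261u ^ seed;
    while (*s) { h ^= (uint8_t)*s++; h *= 16777619u; }
    return h;
}

/* Build the Maglev lookup table: backends claim their preferred slots
 * in turn until every slot maps to some backend. */
static void maglev_build(const char *backends[], int n, int table[TABLE_SIZE])
{
    uint32_t offset[MAX_BACKENDS], skip[MAX_BACKENDS], next[MAX_BACKENDS] = {0};
    for (int i = 0; i < n; i++) {
        offset[i] = fnv1a(backends[i], 0) % TABLE_SIZE;
        skip[i]   = fnv1a(backends[i], 1) % (TABLE_SIZE - 1) + 1;
    }
    for (int j = 0; j < TABLE_SIZE; j++) table[j] = -1;
    for (int filled = 0; filled < TABLE_SIZE; ) {
        for (int i = 0; i < n && filled < TABLE_SIZE; i++) {
            uint32_t slot;
            do {                               /* next preferred, unclaimed slot */
                slot = (offset[i] + next[i] * skip[i]) % TABLE_SIZE;
                next[i]++;
            } while (table[slot] != -1);
            table[slot] = i;
            filled++;
        }
    }
}

/* Per-packet lookup: hash the flow identity (here just a string standing
 * in for the 5-tuple) to a slot, yielding a stable backend choice. */
static int maglev_pick(const char *flow, const int table[TABLE_SIZE])
{
    return table[fnv1a(flow, 0) % TABLE_SIZE];
}
```

The property that matters for an L4 LB is that the same flow always hashes to the same backend, and that adding or removing a backend disturbs only a small fraction of the table.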

LB Agent to enable DSR

Application nodes receive the encapsulated traffic forwarded by eLB. The lifecycle of most of our virtualized app workloads is managed directly by the individual app teams, so to keep the solution transparent to applications, the agent is designed to run on the tap devices corresponding to the VMs. The agent decapsulates the incoming traffic and forwards it to the application. On the return path, the agent SNATs the outgoing traffic to the VIP using connection tracking, all of which is done using eBPF (XDP and TC hooks).
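A minimal userspace sketch of the agent's two hooks, under stated assumptions: the header sizes, the GUE header length, and the shape of the conntrack map are illustrative, not the actual L3AF implementation. Ingress (modeling the XDP hook) strips the outer headers and records which VIP the flow belongs to; egress (modeling the TC hook) rewrites the reply's source address back to that VIP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OUTER_HDR 36   /* outer IPv4 (20) + UDP (8) + GUE (assumed 8) */
#define CT_SIZE 64

/* One tracked flow; the array below stands in for a BPF hash map. */
struct ct_entry { uint32_t daddr; uint16_t dport; uint32_t vip; int used; };
static struct ct_entry ct[CT_SIZE];

static unsigned ct_slot(uint32_t addr, uint16_t port)
{
    return (addr * 2654435761u ^ port) % CT_SIZE;
}

/* Ingress (XDP on the tap device): strip the outer headers, remember
 * which VIP this flow belongs to, and deliver the inner packet. */
static size_t ingress_decap(uint8_t *pkt, size_t len, uint32_t vip,
                            uint32_t inner_daddr, uint16_t inner_dport)
{
    if (len <= OUTER_HDR) return 0;
    ct[ct_slot(inner_daddr, inner_dport)] =
        (struct ct_entry){ inner_daddr, inner_dport, vip, 1 };
    memmove(pkt, pkt + OUTER_HDR, len - OUTER_HDR);
    return len - OUTER_HDR;
}

/* Egress (TC hook): the app replies from its own address; rewrite the
 * source to the VIP so the client sees a reply from the address it hit. */
static uint32_t egress_snat(uint32_t saddr, uint16_t sport)
{
    struct ct_entry *e = &ct[ct_slot(saddr, sport)];
    if (e->used && e->daddr == saddr && e->dport == sport)
        return e->vip;
    return saddr;    /* no tracked flow: leave untouched */
}
```

The key design point this models is that the application itself never sees the encapsulation or the VIP rewrite; DSR stays entirely inside the eBPF data path on the hypervisor.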

The LB agent is provisioned dynamically by our control plane, and the solution is resilient to changes at the infrastructure layer (scaling, migration, etc.).

Key Benefits are as follows:

  • eLB is helping us replace hardware-based solutions that limit the system’s flexibility with a modern software-based solution. Enabling DSR not only eliminates centralized chokepoints in our network but also helps us improve the overall site latency.
  • eLB leverages XDP, which gives much better performance than other software technologies such as DPDK and LVS. Coupled with DSR, this enables us to significantly reduce our L4 LB infrastructure footprint.

Connection and Connection Rate Limiting

As enterprises increase their digital footprint, it is vital to have safeguards in place against sudden bursts of traffic or cyber-attacks. A connection-limiting and connection-rate-limiting feature lets us cap the number of concurrent TCP connections and the rate at which new TCP connections are established, respectively. We also want these limits to be tunable, based on adequate benchmarking, to ensure that upstream systems are not overwhelmed.

The L3AF project has developed eBPF/XDP programs that perform connection limiting and connection rate limiting.

Connection Rate Limiting Solution

Connection rate limiting uses a sliding-window approach to manage connections, which is much easier to use and understand than token-bucket or leaky-bucket algorithms. The program only expects the traffic rate as input, whereas some other algorithms also need a traffic burst size, which is difficult to determine in complex systems. The program uses the traffic patterns it observes on the system to do the necessary calculations internally.
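A userspace sketch of the sliding-window check, applied here per new SYN. Only a target rate is configured; the effective limit blends the previous window's count with the current one, which is why no separate burst parameter is needed. The one-second window size is an assumption for the example, not the actual L3AF default.

```c
#include <assert.h>
#include <stdint.h>

#define WINDOW_NS 1000000000ull   /* assumed 1-second windows */

struct rate_limiter {
    uint64_t rate;            /* allowed new connections per window */
    uint64_t window_start;    /* start of the current window */
    uint64_t curr, prev;      /* SYNs counted in current / previous window */
};

/* Returns 1 to admit the new connection, 0 to drop the SYN. */
static int allow_conn(struct rate_limiter *rl, uint64_t now_ns)
{
    if (now_ns - rl->window_start >= WINDOW_NS) {
        /* roll the window; if more than one window elapsed, prev is stale */
        rl->prev = (now_ns - rl->window_start < 2 * WINDOW_NS) ? rl->curr : 0;
        rl->curr = 0;
        rl->window_start = now_ns - (now_ns - rl->window_start) % WINDOW_NS;
    }
    /* weight the previous window by how much of it still overlaps the
     * sliding interval ending now */
    uint64_t elapsed = now_ns - rl->window_start;
    uint64_t est = rl->curr + rl->prev * (WINDOW_NS - elapsed) / WINDOW_NS;
    if (est >= rl->rate)
        return 0;
    rl->curr++;
    return 1;
}
```

In the real program the counters would live in BPF maps and `now_ns` would come from `bpf_ktime_get_ns()`; the arithmetic is the same.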

The max-connection-limit program uses tracepoints/kprobes to track the number of concurrent connections and feeds that count back (using BPF maps) to an XDP function, which drops/resets new connections once the configured maximum is exceeded.
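The feedback loop can be sketched like this in plain C. A single global counter stands in for the BPF map entry shared between the tracepoint handlers and the XDP program; the real accounting would be per listener, so this is a deliberate simplification.

```c
#include <assert.h>
#include <stdint.h>

/* Verdict values, matching enum xdp_action in <linux/bpf.h>. */
enum xdp_action { XDP_DROP = 1, XDP_PASS = 2 };

static uint64_t active_conns;     /* stands in for a shared BPF map entry */
static uint64_t max_conns = 3;    /* configured limit */

/* Called from the socket-accept tracepoint: a connection was established. */
static void on_accept(void) { active_conns++; }

/* Called from the socket-close tracepoint: a connection went away. */
static void on_close(void)  { if (active_conns) active_conns--; }

/* XDP side: consult the shared count before admitting a new SYN. */
static int xdp_check_syn(void)
{
    return active_conns >= max_conns ? XDP_DROP : XDP_PASS;
}
```

The split mirrors the real design: the slow bookkeeping (connection lifecycle) happens in tracepoint/kprobe context, while the fast path in XDP only does a map read per SYN.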

Connection Limiting Solution

Key Benefits are as follows:

  • Adding the connection/rate-limiting functionality to our edge proxies and load balancers protects our compute resources from being overwhelmed by a sudden burst of traffic beyond what they can handle.
  • By using XDP, we are able to drop connections at much higher rates compared to other solutions.

Both of the above XDP functions and the XDP-based eLB can be run in a chained fashion to work cooperatively with each other. This ensures that illegitimate traffic is dropped by the connection and connection-rate-limit functions, so eLB has no undesired effects even under adverse conditions.
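The chaining idea can be sketched as a dispatch loop in plain C: each stage returns a verdict, and the chain stops at the first non-PASS result, so eLB only ever sees traffic that survived the limiting stages. The loop and the stage implementations are illustrative; a real deployment would chain the programs via eBPF mechanisms such as tail calls rather than a userspace loop.

```c
#include <assert.h>
#include <stddef.h>

/* Verdict values, matching enum xdp_action in <linux/bpf.h>. */
enum xdp_action { XDP_DROP = 1, XDP_PASS = 2, XDP_TX = 3 };

typedef int (*xdp_prog)(void *pkt);

/* Run the programs in order; any non-PASS verdict ends the chain. */
static int run_chain(xdp_prog progs[], size_t n, void *pkt)
{
    for (size_t i = 0; i < n; i++) {
        int verdict = progs[i](pkt);
        if (verdict != XDP_PASS)
            return verdict;
    }
    return XDP_PASS;
}

/* Toy stages in the order the blog describes. The packet is modeled as
 * just an int carrying a per-second SYN count for the rate-limit check. */
static int rate_limit(void *pkt) { return *(int *)pkt > 100 ? XDP_DROP : XDP_PASS; }
static int max_limit(void *pkt)  { (void)pkt; return XDP_PASS; }
static int elb(void *pkt)        { (void)pkt; return XDP_TX; }  /* forward */
```

Ordering the chain as rate-limit, then max-limit, then eLB means the cheapest rejections happen first and the load balancer never spends cycles on traffic that was going to be dropped anyway.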

In the next blog, we discuss how to orchestrate kernel functions in a desired sequence (for example, rate-limit -> max-limit -> eLB).

This blog is written with inputs from Ragalahari, Kanthi, and Rishabh who are engineers on the L3AF Project.


Karan Dalal
Building traffic platforms for the world's largest retailer. Passionate about systems engineering and reliability.