iptables vs. GoXDP: The Ultimate Packet Filtering Benchmark Setup and Results

Ali Hussein Safar
Mar 23, 2024

The Linux Netfilter framework enables user-space applications to register packet-processing rules that are applied by the Linux kernel network stack, which allows efficient network forwarding and filtering. Iptables, one of the most common firewall tools used to filter packets, relies on Netfilter for packet processing. However, a big limitation of the iptables architecture is its packet-matching process. For instance, if the INPUT chain of the filter table contains 5K rules, every single packet is checked against these rules until the first match is found. This introduces unnecessary CPU processing, because most of the rules will likely not match before the first matching rule is reached. Even with the performance gain of using ipset, iptables cannot cope with the huge number of packets received during DDoS attacks. Therefore, the solution is to build firewall tools using Linux XDP (like GoXDP or xdp-filters). This blog describes the benchmark environment setup and the results for two Linux firewall tools (iptables and GoXDP), in order to determine the maximum number of packets that can be dropped per second when CPU utilization reaches 100%.
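
As a quick illustration of the ipset optimization mentioned above, the following sketch (the set name blocked_src and the addresses are illustrative) replaces thousands of per-address rules with a single iptables rule that matches against a hash-based set:

# create a hash-based set of offending source addresses
ipset create blocked_src hash:ip
ipset add blocked_src 100.2.2.5
ipset add blocked_src 100.2.2.6
# drop everything from the set with one rule instead of one rule per address
iptables -A INPUT -m set --match-set blocked_src src -j DROP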

Environment Setup

The benchmark environment consists of 3 physical servers, each with the following resources:

1- A single-socket Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz with 10 cores / 20 threads (Hyper-Threading enabled).
2- Two Intel Ethernet 10Gb 2P X520 adapters configured as a bonded interface (maximum aggregate throughput of 20 Gb/s); the bonding setup can be checked as shown below.
3- A single 32 GB DDR4 DIMM running at 2666 MHz.
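
The bonding configuration can be verified with the following command (a quick check, assuming the bond interface is named bond0):

cat /proc/net/bonding/bond0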

The following figure describes the network connectivity:

The figure describes a typical setup in which the services and the firewall run on node 1, while nodes 2 and 3 generate the DDoS UDP traffic.

Implementation

Driving the CPU utilization of node 1 to 100% would require a large number of nodes to generate enough traffic. Since only two nodes are available to generate the traffic and mimic a DDoS attack, it is more reasonable to measure the number of dropped packets per second when the CPU utilization of a single core reaches 100%. This can be achieved using Receive Side Scaling (RSS) and the Intel Ethernet Flow Director.

Receive Side Scaling (RSS)

When the network card receives a packet, the packet is forwarded to one of the Rx queues, transferred to memory using Direct Memory Access (DMA), and then processed by the corresponding CPU core. RSS is the technique responsible for determining which Rx queue a packet is forwarded to. RSS is implemented using a hash indirection table that resides in the NIC's hardware and a hash function (the Toeplitz hash) whose input consists of the packet's source IP address, destination IP address, source port, and destination port. According to the result of the hash function, the packet is forwarded to one of the Rx queues. Following this approach, the NIC can distribute the traffic approximately evenly across the Rx queues, allowing packet processing to scale almost linearly simply by increasing the number of Rx queues. Some advanced network cards also support RSS filters, which allow rules to be written that direct traffic to specific Rx queues. Intel Ethernet Flow Director, a feature supported by most advanced Intel NICs, uses such filters to steer packets to Rx queues based on predefined rules. For instance, packets with a specific source IP address can be forwarded to a specific Rx queue.
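
The RSS configuration of the benchmark NICs can be inspected with ethtool (shown here for ens2f0; the output is driver-dependent and omitted):

# number of configured Rx/Tx queues (channels)
ethtool --show-channels ens2f0
# RSS hash indirection table and hash key
ethtool --show-rxfh-indir ens2f0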

Configuration

The following steps will describe the required configuration of node 1:

1- Check whether the Intel Ethernet Flow Director is enabled using the following commands:

[root@netfilter1 ~]# ethtool --show-features ens2f0 | grep ntuple
ntuple-filters: on
[root@netfilter1 ~]# ethtool --show-features ens2f1 | grep ntuple
ntuple-filters: on

Note: If the ntuple-filters feature is reported as on or off, Intel Ethernet Flow Director is supported by the network adapter. However, if it is reported as off [fixed], then Intel Ethernet Flow Director is not supported. The following command can be used to enable the feature:

ethtool --features <ifName> ntuple-filters on

2- Redirect all UDP packets from nodes 2 and 3 to Rx queue number 1:

[root@netfilter1 ~]# ethtool --config-ntuple ens2f0  flow-type udp4 src-ip 100.2.2.5 action 1
Added rule with ID 2045
[root@netfilter1 ~]# ethtool --config-ntuple ens2f0 flow-type udp4 src-ip 100.2.2.6 action 1
Added rule with ID 2044
[root@netfilter1 ~]# ethtool --config-ntuple ens2f1 flow-type udp4 src-ip 100.2.2.5 action 1
Added rule with ID 2045
[root@netfilter1 ~]# ethtool --config-ntuple ens2f1 flow-type udp4 src-ip 100.2.2.6 action 1
Added rule with ID 2044

Make sure that the rules were correctly added on both network interfaces:

[root@netfilter1 ~]# ethtool --show-ntuple ens2f0
20 RX rings available
Total 2 rules
Filter: 2044
Rule Type: UDP over IPv4
Src IP addr: 100.2.2.6 mask: 0.0.0.0
Dest IP addr: 0.0.0.0 mask: 255.255.255.255
TOS: 0x0 mask: 0xff
Src port: 0 mask: 0xffff
Dest port: 0 mask: 0xffff
VLAN EtherType: 0x0 mask: 0xffff
VLAN: 0x0 mask: 0xffff
User-defined: 0x0 mask: 0xffffffffffffffff
Action: Direct to queue 1

Filter: 2045
Rule Type: UDP over IPv4
Src IP addr: 100.2.2.5 mask: 0.0.0.0
Dest IP addr: 0.0.0.0 mask: 255.255.255.255
TOS: 0x0 mask: 0xff
Src port: 0 mask: 0xffff
Dest port: 0 mask: 0xffff
VLAN EtherType: 0x0 mask: 0xffff
VLAN: 0x0 mask: 0xffff
User-defined: 0x0 mask: 0xffffffffffffffff
Action: Direct to queue 1
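
After the benchmark, the rules can be removed again by referencing the rule IDs reported above, for example:

ethtool --config-ntuple ens2f0 delete 2044
ethtool --config-ntuple ens2f0 delete 2045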

3- Stop the irqbalance service so that the CPU core handling the Rx queue 1 packets can be specified manually.

systemctl stop irqbalance.service

4- The set_irq_affinity.sh script can be used to map each Rx queue to the corresponding CPU core automatically:

./set_irq_affinity.sh all ens2f0
./set_irq_affinity.sh all ens2f1

The set_irq_affinity.sh script ensures that each CPU core processes the packets of a single Rx queue of the network card. This means CPU core 1 will respond to the interrupt requests of Rx queue 1 on both network cards (ens2f0, ens2f1). To verify that the configuration is correct, the following command can be used to get the interrupt numbers of the network card queues:

[root@netfilter1 ~]# cat /proc/interrupts | grep 'ens2f.-TxRx' | awk '{ print $1,$NF }'
61: ens2f0-TxRx-0
62: ens2f0-TxRx-1
63: ens2f0-TxRx-2
64: ens2f0-TxRx-3
65: ens2f0-TxRx-4
66: ens2f0-TxRx-5
67: ens2f0-TxRx-6
68: ens2f0-TxRx-7
69: ens2f0-TxRx-8
70: ens2f0-TxRx-9
71: ens2f0-TxRx-10
72: ens2f0-TxRx-11
73: ens2f0-TxRx-12
74: ens2f0-TxRx-13
75: ens2f0-TxRx-14
76: ens2f0-TxRx-15
77: ens2f0-TxRx-16
78: ens2f0-TxRx-17
79: ens2f0-TxRx-18
80: ens2f0-TxRx-19
83: ens2f1-TxRx-0
84: ens2f1-TxRx-1
85: ens2f1-TxRx-2
86: ens2f1-TxRx-3
87: ens2f1-TxRx-4
88: ens2f1-TxRx-5
89: ens2f1-TxRx-6
90: ens2f1-TxRx-7
91: ens2f1-TxRx-8
92: ens2f1-TxRx-9
93: ens2f1-TxRx-10
94: ens2f1-TxRx-11
95: ens2f1-TxRx-12
96: ens2f1-TxRx-13
97: ens2f1-TxRx-14
98: ens2f1-TxRx-15
99: ens2f1-TxRx-16
100: ens2f1-TxRx-17
101: ens2f1-TxRx-18
102: ens2f1-TxRx-19

Since the interrupt numbers range from 61 to 102, the following command is used to get the SMP affinity of each interrupt:

[root@netfilter1 ~]# for i in {61..102}; do cat /proc/irq/$i/smp_affinity;done
00001
00002
00004
00008
00010
00020
00040
00080
00100
00200
00400
00800
01000
02000
04000
08000
10000
20000
40000
80000
fffff //this can be ignored because interrupt 81 does not belong to the network card
cat: /proc/irq/82/smp_affinity: No such file or directory //this can be ignored
00001
00002
00004
00008
00010
00020
00040
00080
00100
00200
00400
00800
01000
02000
04000
08000
10000
20000
40000
80000

From the output of the command, the SMP affinity of Rx queue 1 on both interfaces (interrupts 62 and 84) has the value 00002, which in binary is “0000 0000 0000 0000 0010”. This means that CPU core 1 is responsible for handling the interrupts of Rx queue 1 on both interfaces (ens2f0, ens2f1).
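
If the set_irq_affinity.sh script is not available, the same pinning can be done manually by writing the CPU mask to the smp_affinity file of each interrupt. A minimal sketch for the IRQ numbers observed above (62 and 84), where the hexadecimal mask 2 selects CPU core 1:

echo 2 > /proc/irq/62/smp_affinity
echo 2 > /proc/irq/84/smp_affinity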

It is crucial to mention that the same CPU core that handles the interrupt requests of an Rx queue is also the core that processes those packets in the Linux kernel's network stack.

For nodes 2 and 3, no special configuration is required; only the hping3 command needs to be installed and used as follows:

timeout 30 hping3 --flood --destport 53 --udp 100.2.2.4
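
While the flood is running, the load on node 1 can be observed with standard tools, for example (mpstat is part of the sysstat package):

# per-core CPU utilization; core 1 should approach 100%
mpstat -P ALL 1
# per-rule packet and byte counters of the INPUT chain (iptables benchmark only)
iptables -vnL INPUT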

Results

The following table includes the benchmark results:

The iptables INPUT chain could not cope with the huge number of received packets, even though there were only two rules in the chain. As a result, over 9 million packets were overwritten at the network card's Rx queue (the receive ring buffer). On the other hand, GoXDP in native mode achieved high performance with only 15% CPU utilization during the benchmark, which makes it well suited for DDoS mitigation systems.
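
As a side note, Rx drop statistics of this kind can be read from the NIC and the kernel with the following commands (counter names vary between drivers; ens2f0 as above):

# driver-level statistics, including ring buffer drops
ethtool -S ens2f0 | grep -i drop
# interface-level Rx/Tx statistics
ip -s link show ens2f0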

Conclusion

Linux iptables can be ideal for general packet processing and filtering, but it performs poorly under heavy load, especially during DDoS attacks. In contrast, Linux XDP introduces a new approach to packet filtering by processing packets at the network card's driver (native mode), before the sk_buff is created, which provides a huge performance gain.

Resources

https://github.com/ahsifer/goxdp
https://www.kernel.org/doc/Documentation/networking/scaling.txt
https://www.intel.com/content/www/us/en/developer/articles/training/setting-up-intel-ethernet-flow-director.html
https://www.tigera.io/learn/guides/ebpf/ebpf-xdp/
https://www.flaticon.com
https://medium.com/@ahsifer/introducing-goxdp-utilizing-the-power-of-xdp-for-advanced-linux-firewall-07de694310da

Thank you for reading my article. I hope you found it informative and helpful. If you enjoyed the article and would like to support my work, follow me or buy me a coffee at https://www.buymeacoffee.com/ahsifer. Your support is greatly appreciated.
