100Gbps Network DPI, Content Extraction on Xilinx’s FPGA
Deep packet inspection (DPI) is an advanced method of examining and managing network traffic. It is a form of packet filtering that locates, identifies, classifies, reroutes, or blocks packets with specific data or code payloads that conventional packet filtering, which examines only packet headers, cannot detect.
DPI combines the functionality of an intrusion detection system (IDS) and an intrusion prevention system (IPS) with a traditional stateful firewall. This combination makes it possible to detect certain attacks that neither the IDS/IPS nor the stateful firewall can catch on their own. Stateful firewalls, while able to see the beginning and end of a packet flow, cannot by themselves catch events that would be out of bounds for a particular application. IDSs can detect intrusions, but they have very limited ability to block an attack. DPI is used to prevent attacks from viruses and worms at wire speed. More specifically, DPI can be effective against buffer overflow attacks, denial-of-service (DoS) attacks, sophisticated intrusions, and a small percentage of worms that fit within a single packet.
DPI-enabled devices can look at Layer 2 and beyond Layer 3 of the OSI model. In some cases, DPI can be invoked to look through Layers 2–7 of the OSI model. This includes headers and data protocol structures, as well as the payload of the message. DPI functionality is invoked when a device inspects, or takes action based on, information beyond Layer 3 of the OSI model. DPI can identify and classify traffic based on a signature database that includes information extracted from the data part of a packet, allowing finer control than classification based only on header information. In many cases, however, endpoints can use encryption and obfuscation techniques to evade DPI.
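To make signature-based classification concrete, here is a minimal Python sketch that matches a packet payload against a small signature database. The signature names and patterns are invented for illustration and are not taken from any real DPI product; a production system would hold far more rules and would not evaluate them one by one in software.

```python
import re

# Hypothetical signature database: label -> compiled byte pattern.
# These three rules are illustrative assumptions, not real product rules.
SIGNATURES = {
    "http_request": re.compile(rb"^(GET|POST|PUT|DELETE|HEAD) \S+ HTTP/1\.[01]\r\n"),
    "sql_injection": re.compile(rb"(?i)union\s+select"),
    "nop_sled": re.compile(rb"\x90{16,}"),
}


def classify_payload(payload: bytes) -> list[str]:
    """Return the label of every signature that matches the payload."""
    return [
        label
        for label, pattern in SIGNATURES.items()
        if pattern.search(payload)
    ]
```

A header-only filter would treat a benign request and a request carrying `UNION SELECT` in its body identically, since both share the same addresses and ports; only inspection of the payload tells them apart.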
A classified packet may be redirected, marked/tagged, blocked, rate limited, and of course, reported to a reporting agent in the network. In this way, HTTP errors of different classifications may be identified and forwarded for analysis. Many DPI devices can identify packet flows (rather than packet-by-packet analysis), allowing control actions based on accumulated flow information.
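Flow-based control can be sketched in a few lines of Python. The example below is an assumption-laden illustration: it keys flows by the usual 5-tuple, accumulates packet and byte counters, and returns a hypothetical action (`"forward"` or `"rate_limit"`) once a flow exceeds a byte budget. The class names, the action strings, and the threshold are all made up for this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The 5-tuple that identifies a flow."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


@dataclass
class FlowStats:
    packets: int = 0
    bytes: int = 0


class FlowTable:
    """Accumulates per-flow counters and decides an action per packet."""

    def __init__(self, byte_limit: int):
        self.byte_limit = byte_limit  # hypothetical per-flow byte budget
        self.flows = defaultdict(FlowStats)

    def observe(self, key: FlowKey, length: int) -> str:
        stats = self.flows[key]
        stats.packets += 1
        stats.bytes += length
        # The decision depends on accumulated flow state, not on this packet alone.
        return "rate_limit" if stats.bytes > self.byte_limit else "forward"
```

With `byte_limit=3000`, the first two 1500-byte packets of a flow are forwarded and the third triggers `rate_limit`, while packets of any other flow are still forwarded.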
TODAY’S TRAFFIC AND DPI
Today’s traffic and high-speed 100Gb links put severe pressure on vital security tools like Deep Packet Inspection (DPI) that inspect traffic to block data leaks and malware. One way of solving this problem is to distribute traffic from 100Gb network links across security tools running on lower-speed links. This approach narrows the gap between the higher data rate of the core network and the lower processing capacity of the tools, and it lets each tool do the job it is best at. Doing so, however, requires sophisticated load balancers in the enterprise infrastructure, which increases the administration cost and the TCO of the infrastructure. The underlying architecture for solving the Deep Packet Inspection problem on 100Gbps links is shown below:
FPGA’S ROLE IN DPI
Due to the increasing number of security vulnerabilities and network attacks, the number of Regular Expressions (RE) in DPI is continually growing. At the same time, the speed of networks is growing too — telecommunication companies started to deploy 100 Gbps links, the 400 Gbps Ethernet standard has recently been ratified, and large data centers already call for a 1 Tbps technology.
Consequently, despite many proposed optimizations, existing DPIs are still far from being able to process the traffic in current high-speed networks at line speed. The best software-based solution we are aware of achieves 100 Gbps throughput using a cluster of servers with a well-designed distribution of network traffic. Processing network traffic at such speeds in single-box DPIs is far beyond the capabilities of software-based solutions, so hardware acceleration is needed.
Field-programmable gate arrays (FPGAs) are a well-suited technology for accelerating DPI. They provide high computing power and flexibility for network traffic processing, and they are increasingly used in data centers for this purpose.
Why choose an FPGA as an acceleration platform? There are several reasons:
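A "well-designed distribution of network traffic" usually has to be flow-aware: every packet of a connection, in both directions, must reach the same inspection server, or that server cannot reassemble the stream. The Python sketch below shows one common way to do this, a symmetric hash over the flow's endpoints. It is only an illustration; the function name is hypothetical, and `zlib.crc32` stands in for whatever hash a real load balancer or NIC would use.

```python
import zlib


def pick_tool(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
              protocol: str, num_tools: int) -> int:
    """Map a packet to one of num_tools inspection servers.

    Sorting the two endpoints first makes the hash symmetric, so the
    request and the response of the same flow land on the same server.
    """
    first, second = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{first[0]}:{first[1]}|{second[0]}:{second[1]}|{protocol}".encode()
    return zlib.crc32(key) % num_tools
```

For example, with ten 10Gb tools behind a 100Gb link, `pick_tool("10.0.0.1", 1234, "10.0.0.2", 80, "tcp", 10)` and the reversed call for the reply packet return the same index.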
- Nearly as performant as an ASIC for specific workloads
- Flexible enough to reconfigure, change schemas, test the market, prove the solution, adjust development, and build a viable product based on customer feedback
FPGAs have drawbacks as well. Building a solution on FPGA silicon is difficult, much like creating an ASIC design, and this has led to the slow market adoption of FPGAs as a default computing unit.
Let’s take a more in-depth look at the FPGAs to understand what is under the hood of these chips.
A field-programmable gate array (FPGA) is an integrated circuit (IC) that can be programmed in the field after manufacture. The FPGA configuration is generally specified using a hardware description language (HDL), similar to that used for an Application-Specific Integrated Circuit (ASIC). FPGAs contain an array of programmable logic blocks and a hierarchy of “reconfigurable interconnects” that allow the blocks to be “wired together,” like many logic gates that can be inter-wired in different configurations. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs, logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory. Many FPGAs can be reprogrammed to implement different logic functions, allowing flexible, reconfigurable computing as performed in computer software. A simplified schematic view of an FPGA chip is shown below:
Logic blocks — allow designing digital circuits that perform computation.
Interconnect — allows connecting your logic blocks to develop complex and large designs.
IO Blocks — enable interacting with the different interfaces, network, storage, server’s bus.
- Everything is programmable.
- Everything is reconfigurable. Change your firmware in milliseconds.
- After a successful FPGA implementation of a design, you can move on to producing an ASIC immediately if necessary.
It turns out that FPGAs are excellent devices for interacting with the real world (network, storage) and for pipelining and parallelizing data processing.
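To make the idea of a configurable logic block more tangible, the sketch below models one in Python as a lookup table (LUT), the structure FPGA logic blocks are typically built around. This is a deliberately simplified software model, not a description of any particular Xilinx device: the "configuration" is just a truth table, and loading a different table turns the same block into a different gate.

```python
class LookupTable:
    """Software model of a k-input LUT: its configuration is a truth table."""

    def __init__(self, num_inputs: int, truth_table: list[int]):
        if len(truth_table) != 2 ** num_inputs:
            raise ValueError("truth table must have 2**num_inputs entries")
        self.num_inputs = num_inputs
        self.truth_table = list(truth_table)

    def evaluate(self, *inputs: int) -> int:
        if len(inputs) != self.num_inputs:
            raise ValueError("wrong number of inputs")
        # Treat the input bits as a binary address into the truth table.
        index = 0
        for bit in inputs:
            index = (index << 1) | (1 if bit else 0)
        return self.truth_table[index]


# The same structure becomes AND or XOR purely through its configuration.
and_gate = LookupTable(2, [0, 0, 0, 1])
xor_gate = LookupTable(2, [0, 1, 1, 0])
```

In a real device, many such blocks evaluate simultaneously and are chained through the interconnect, which is where the pipelining and parallelism come from.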
FPGA BASED PCRE COMPATIBLE REGULAR EXPRESSION IP CORE ATTACHED TO THE NETWORK
GRegeX is an implementation of a PCRE-compatible regular expression algorithm on an FPGA chip, achieving 12.8 GB/s throughput with a single IP core. A wide range of supported regular expression functions allows developers to configure the desired rules, which are handled in the chip without reducing throughput. The supported traffic can be doubled to 200Gbps by deploying two GRegeX cores in a single FPGA chip. More detailed information about GRegeX can be found at the following links:
A more detailed explanation on Medium:
The Fastest PCRE Compatible Regular Expression IP Core on Xilinx® Alveo™ Accelerator Card
The solution consists of two parts: the Regular Expression IP core on the FPGA side and the drivers on the host side. The data source can be the server’s NIC (using the Linux kernel or the DPDK library), the network interface available directly on the acceleration card, or any application running in the Linux environment that feeds data to the GRegeX drivers.
GRegeX achieves 12.8 GB/s throughput regardless of the regular expression rule set, whereas the speed of software implementations drops as rules use more complex constructs such as brackets and repetition operators.
GRegeX is a single-box solution that connects to the network and performs DPI at 100Gbps link speed.
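The link between the 12.8 GB/s per-core figure and the 100Gbps and 200Gbps link speeds is a unit conversion, worked through in the short Python sketch below. It assumes decimal units (1 GB = 10^9 bytes) and ignores protocol and PCIe overheads; the helper names are hypothetical.

```python
import math

BITS_PER_BYTE = 8
CORE_GBYTES_PER_S = 12.8  # per-core throughput quoted in the text


def gbytes_to_gbits(gbytes_per_s: float) -> float:
    """Convert gigabytes per second to gigabits per second."""
    return gbytes_per_s * BITS_PER_BYTE


def cores_needed(link_gbps: float,
                 core_gbytes_per_s: float = CORE_GBYTES_PER_S) -> int:
    """Smallest number of cores whose combined rate covers the link."""
    return math.ceil(link_gbps / gbytes_to_gbits(core_gbytes_per_s))
```

One core at 12.8 GB/s corresponds to 102.4 Gbps, which covers a 100Gbps link, and two cores give 204.8 Gbps, which covers 200Gbps.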
Xilinx SmartNIC/U50 with DPI/NAT/SSL inside a switch
Grovf Inc. is partnering with XCLOUD NETWORKS, which is a network configuration, monitoring, and troubleshooting automation company utilizing the OCP networking hardware with their proprietary software layer to bring the 100Gbps DPI solution directly into the network switch. All DPI configuration is accessible from the XCLOUD NETWORKS software layer.
Grovf’s solution, integrated with the XCLOUD NETWORKS software layer, provides 100Gbps DPI capability directly at the 100G switch layer. The software layer provides all the tools needed to interact with, configure, and maintain the DPI functionality. When DPI rules are configured in the software layer, traffic flowing through the switch is redirected to Grovf’s FPGA device, where the hardware-implemented GRegeX regular expressions are matched against it. A simplified architecture of the system is shown below.
FPGAs are very promising devices for DPI: they interact with network traffic directly at the chip level and provide additional logic to process that traffic within the chip, without transferring the data to the OS and software layer. Integrating this solution with the higher-level software stack that XCLOUD NETWORKS provides for network monitoring, troubleshooting, and automation directly improves the user experience of handling 100Gbps DPI with a single Half-Height, Half-Length PCIe FPGA device. The solution integrates seamlessly with the XCLOUD architecture and is fully controllable from the software stack layer.
Learn more about Grovf
Learn more about XCLOUD NETWORKS