Anatomy of DPDK Data Structure

Anubhav Choudhary
8 min readApr 13, 2019

--

src: dpdk.org

The Data Plane Development Kit (DPDK) is a library that accelerates the packet processing workloads and runs on a wide variety of CPU architectures.

The DPDK library provides a broad set of data structures which are suitable for different purposes like doing fast memory lookup, doing longest prefix match etc. These data structures are optimised to give the best performance on any of the supported CPU architecture. All the data structures are NUMA aware which means the required memory is always allocated from the RAM from the same socket where the CPU core is located. In this article, I will present some of the mostly used data-structures and I will showcase how those data-structures are designed along with its performance.

The data structures that are covered here are:

  • DPDK Hash (rte_hash)
  • DPDK LPM (rte_lpm)

DPDK Hash (rte_hash)

Overview
While writing a DPDK application, sometimes we need a lookup table to do some operations, for example, a table that gives the address of next-hop (or next destination) by taking an IP address as input. In this case, the lookup operation is done every time the application sees a new packet, thus the data structure used for the lookup must be very efficient with low latency.

DPDK provides a hash table support which enables the application to do fast memory lookups. A hash table stores key-value pair and the key can be used to extract the value efficiently. Asymptotically, it takes O(1) time to fetch any value using the key.

Another key feature of DPDK Hash is multithread support. The user just needs to enable multithread using a flag while creating the hash table. Irrespective of multithreading is enabled or not, the lookup operation is always thread-safe which means multiple threads can do the lookup operation on the same hash-table simultaneously. However, If the user wants to do write operation from multiple threads, the multithreading must be enabled.

Design
The DPDK hash maintains two types of table internally, they are :

  • Bucket Table
  • Key Table

The Bucket Table is a pre-allocated array of the buckets where the total number of the bucket is determined using the total number of entries we want to store. The Key Table is also a pre-allocated array where all the key-value pair are actually stored. The maximum capacity of elements in the Key Table is exactly the eight times that of Bucket Table, this is because each can bucket can hold up to 8 key-pair values.

When a new entry is added, the key of the entry is used to calculate the hash value. The hash value can be calculated any of the hash function like rte_hash_crc, rte_jhash or any custom hashing algorithm. The calculated hash value is then utilized to find the right bucket from the Bucket Table. Once right bucket is identified and there is an empty slot in it, the entry is directly stored there. Otherwise, the situation is called hash collision and it is resolved using the cuckoo hash algorithm.

Memory Requirement
The total memory required by a DPDK hash can be calculated by adding the size of the hash table and the size of all the data we want to store in it. In this section, we will find how to estimate the size of the hash table only.

The memory used by the hash table can be estimated using the parameters used while creating DPDK hash. The parameters which determine the memory requirement are the number of entries to be stored in the hash table and the size of the key. These two parameters are used to calculate the size of the bucket table and the size of the key table. The following equation shows how the size of the hash table can be estimated.

Size of Hash Table = Size of Hash Structure + Size of Bucket Table + Size of Key Table

The size of the hash structure is fixed which is equal to 192 bytes. And the size of the bucket table and the key table can be calculated as :

Size of Bucket Table = Number of buckets * Size of each bucket (i.e, 64 bytes)
Size of Key Table = Key entry size * Number of key slots
Where,
Number of Buckets = Number of entries (rounded to next 2 pow n) / Entries per bucket (i.e, 8 bytes)
Key entry size = Key size + Size of hash_entry (i.e, y bytes)
Number of key slots = Number of entries + 1

The following table demonstrates the calculation of the memory requirement :

Table 1.1: Hast table memory estimation example

Note: The estimated memory can be slightly lower than that of actual memory used due to extra padding done in the memory by the DPDK for memory alignment and cache alignment.

Performance

The following table shows the results of the performance test done for DPDK Hash. From the result, it is very clear that irrespective of the number of entries in the hash table, time to do operations like lookup, insert or delete takes almost constant time.

Table 1.2: DPDK hash performance test results

DPDK LPM (rte_lpm)

Overview
The LPM stands for “longest prefix match”, it is a special type of key-value pair lookup where we give an IPv4 address as a key to get the corresponding value. This data-structure is mostly used in routing applications where value represents the next-hop.

The LPM contains one or more rules, and each rule is a pair of an IP with prefix and a value. If the input IP matches with more than one rules then the rule with the highest prefix will be used to determine the required value. For example,

LPM  = { (4.0.0.0/8, A), (4.0.1.0/24, B) }The following table shows a few lookup results :
+----------+-------------+
| Input IP | Destination |
+----------+-------------+
| 4.0.0.1 | A |
| 4.0.0.2 | A |
| 4.0.1.1 | B |
| 4.0.1.2 | B |
+----------+-------------+
Explanation:
1. 4.0.0.1 is matching with only 1 rule. And the destination is A.
2. 4.0.1.1 is matching with both the rules. And the destination is B based on the longest prefix.

Design
The design and working principle of LPM are same as Trie data-structure. When a key is passed for the lookup operation, the key is broken down into bytes or words and then each word (or byte) will be used one by one to traverse over the data-structure to get the required value. For example, if the key is “tea”, then we can break it down into “t”, “e” and “a”. And if we use each character one by one to traverse the data structure in the below-shown figure, we will end up in the node which holds the value as 3.

Fig: Trie data-structure (src: wikipedia.org)

The time required to do a lookup operation in tries depends on the number of pieces into which a key is broken. It does not depend on the number of entries present in the tries. Asymptotically,

Time Complexity of Lookup = O(m)where m is the number of bytes/words into which key is broken.

if the length of the key is always constant, then we can say that lookup in trie takes O(1) or constant time.

In the case of DPDK LPM, the size of the key is always 32 bits as it only supports IPv4. For IPv6, we have to use DPDK LPM6. The key in LPM is broken down into two words with 24 bits and 8 bits each. And thus while doing lookup only two memory operations are sufficient. The LPM internally maintains two types of table.

  • 24-bit table
  • An array of 8-bit tables.
  • The rule table

The 24-bit table has 2 power 24 entries and it is possible to hit any entry in the table by using a 24-bit input in constant time. The 8-bit table is same as the 24-bit table, but it is has got only 256 (2 power 8) entries. The rule table does not play any role during lookup operation but, it helps LPM to keep track of duplicate rules.

When an IP address is passed for the lookup, first 24 bits are extracted from it and with the help of the 24-bit table, an appropriate entry is identified which gives the index of a suitable 8-bit table from the 8-bit table array. Once the 8-bit table is identified, the last 8-bits of the IP are used to perform one more lookup. And this lookup gives the final value or the next hop.

Memory Requirement

The DPDK LPM is memory intensive data-structure, the total memory required can be estimated by adding the size of LPM structure which includes the size of the 24-bit table along with other elements, the array of 8-bit tables and the rule table.

Size of DPDK LPM = Size of LPM structure + Size of array of 8-bit table + Size of Rule table

The size of LPM structure is fixed which is 67109248 Bytes (~64 MB) and the memory occupied by the array of 8-bit table and the rule table depends on the total number of 8-bit tables and the maximum number of rules configured while creating the DPDK LPM.

Size of array of 8-bit table = Number of 8-bit table * Size of 8bit table (i.e, 1024 Bytes)Size of Rule table = Number of maximum rules * Size of rule entry (ie, 8 Bytes)

The following table shows the demonstration of memory estimation for DPDK LPM.

Table 2.1: DPDK LPM memory estimation

Note: The estimated memory can be slightly lower than that of actual memory used due to extra padding done in the memory by the DPDK for memory alignment and cache alignment.

Performance
The lookup performance is amazingly fast and it remains constant with respect to the load on the LPM. However, the time to insert a new rule increases with the load and time to delete an existing rule is highest when the LPM is half filled.

Table 2.2: DPDK LPM performance test results

Comparison of DPDK Hash and DPDK LPM

  • Lookup time of LPM is ~15000x faster as compared to DPDK Hash.
  • Time to insert a new element in DPDK hash is constant with respect to load on the hash table. However, the time required to insert an entry in DPDK LPM increases with load.
  • The DPDK LPM has the additional capability of the doing longest prefix match. The DPDK Hash will always look for an exact key match.

References

--

--

Responses (2)