Part I — NVIDIA Mellanox Bluefield-2 SmartNIC Hands-On Tutorial: Install Drivers and Access the SmartNIC
I have gotten my hands dirty with an NVIDIA Bluefield-2 SmartNIC deployed at Cloudlab’s facility at Clemson. If you have ever considered buying a Bluefield SmartNIC, I can now show you how to test one and get your first impressions for free.
[UPDATE 08/2023]: I started to revise my tutorials here by reproducing them from scratch. The content below has been updated accordingly without explicitly mentioning it at every single instance.
What is Mellanox Bluefield-2
For some time now, the networking industry has been going through a big revolution. There have been many headlines lately about the end of Moore’s law and how the continuous, close-to-exponential improvement of general-purpose processors is slowing down. Meanwhile, the volume of data and the corresponding network traffic keep growing. To efficiently keep up with this increasing data processing need, common network interface cards, a.k.a. NICs, have become programmable. Programmable means that we can offload some parts of the data processing to the NIC (which is thus termed a SmartNIC), thereby freeing the host CPU to do the actual processing, i.e., data interpretation, visualization, content generation, etc.
There are different types of SmartNICs, and various branches of research or IT stakeholders look at them differently. Even the definition of “programmable” is (unfortunately) not set in stone. As with programmable network devices, “programmable” can mean remotely configurable, equipped with extended features, or explicitly that the device itself can be programmed like a general-purpose architecture.
This ambiguity applies to NICs and, more precisely, to SmartNICs, too. For instance, many traditional NICs (found in your laptop, server, etc.) are already “smart” to some extent. Some part of the packet processing is offloaded to the NIC, i.e., the NIC does more than just allocate packet buffers and move packets back and forth between the physical ports and the CPU. Your NIC already does checksum calculations and TCP segmentation offloading in hardware, just to mention the most common ones. Below, you can see all the features my laptop’s Wi-Fi interface supports, i.e., can do in hardware. According…