Streaming Sensor/Camera Data Using RDMA: A Paradigm Shift Beyond UDP and TCP

Khachik Sahakyan
Published in grovf · 3 min read · Aug 17, 2023

Introduction

Remote Direct Memory Access (RDMA) over Ethernet is fast becoming the go-to choice for data centers, high-performance computing (HPC), and other applications that need rapid data transport with minimal latency. One important emerging application is streaming sensor and camera data from Field Programmable Gate Arrays (FPGAs) to remote servers or Graphics Processing Unit (GPU) clusters. Before looking at that use case, let’s examine why RDMA holds potential advantages over traditional protocols like UDP and TCP.

Understanding the Traditional Protocols: UDP and TCP

  1. UDP (User Datagram Protocol): A connectionless protocol suited to applications where speed matters more than reliability. It is simple and adds minimal overhead. It works for many real-time streaming applications, but because it guarantees neither delivery nor ordering, it is not always the best choice for critical sensor data.
  2. TCP (Transmission Control Protocol): A connection-oriented protocol that ensures data arrives reliably and in the correct order. However, its protocol overhead and its reliance on the host software stack make it less suitable for real-time, high-speed data transport.
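
To make the UDP trade-off concrete, here is a minimal sketch of the bookkeeping a receiver has to do itself when the transport does not guarantee delivery. The wire format (a 4-byte sequence number in front of each payload) and the function names are hypothetical, chosen only for illustration. The sketch assumes datagrams arrive in order and that the counter never wraps.

```python
import struct

# Hypothetical wire format: 4-byte big-endian sequence number, then payload.
HEADER = struct.Struct("!I")


def pack_frame(seq: int, payload: bytes) -> bytes:
    """Prefix a payload with its sequence number."""
    return HEADER.pack(seq) + payload


def find_gaps(datagrams):
    """Return the sequence numbers missing from a received datagram stream.

    Assumes in-order arrival and no wrap-around, to keep the sketch short.
    """
    missing = []
    expected = None
    for dgram in datagrams:
        (seq,) = HEADER.unpack_from(dgram)
        if expected is not None and seq > expected:
            # Everything between the expected and the actual number was lost.
            missing.extend(range(expected, seq))
        expected = seq + 1
    return missing


sent = [pack_frame(i, b"sample") for i in range(6)]
# Simulate the network dropping datagrams 2 and 3.
received = [d for i, d in enumerate(sent) if i not in (2, 3)]
print(find_gaps(received))  # [2, 3]
```

With plain UDP the application can only notice the gap. Recovering the lost samples would need a retransmission scheme of its own, which is the work TCP does in software and a reliable RDMA transport does in hardware.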

Advantages of RDMA over UDP and TCP

  1. Latency: RDMA offers significantly lower latency than TCP or UDP because data transfers bypass the host software stack. When microseconds matter, as with real-time camera feeds or sensor data analysis, RDMA is the clear winner.
  2. Throughput: Because protocol processing is offloaded to hardware, RDMA can sustain high bandwidths and stream dense sensor and camera data without the host CPU becoming the bottleneck. This helps keep performance consistent as data loads increase.
  3. Reliability: Unlike UDP, RDMA can offer reliable delivery, and because reliability is handled in hardware it avoids the software overhead of TCP. This makes it suitable for applications that need both speed and guaranteed data delivery.
  4. Memory to Memory: RDMA can transfer data from the memory of the camera or sensor acquisition device directly into the memory of the remote server or GPU.
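
To get a feel for what "dense" camera data means in terms of link capacity, the short calculation below estimates the raw data rate of an uncompressed video stream and compares it with two common Ethernet speeds. The camera parameters are hypothetical, and the figures ignore protocol headers and blanking, so treat this as a rough sizing sketch rather than a measurement.

```python
def stream_gbps(width, height, bits_per_pixel, fps):
    """Raw (uncompressed) video data rate in gigabits per second."""
    return width * height * bits_per_pixel * fps / 1e9


def link_utilization(stream_rate_gbps, link_gbps):
    """Fraction of the link consumed by the stream (above 1.0 does not fit)."""
    return stream_rate_gbps / link_gbps


# Hypothetical camera: 3840x2160 pixels, 12 bits per pixel, 120 frames/s.
rate = stream_gbps(3840, 2160, 12, 120)
print(f"{rate:.2f} Gbit/s")                        # 11.94 Gbit/s
print(f"10 GbE: {link_utilization(rate, 10):.0%}")  # 119%
print(f"25 GbE: {link_utilization(rate, 25):.0%}")  # 48%
```

A single stream like this already exceeds a 10 GbE link, so the transport has to sustain close to line rate on a faster link without the receiving host becoming the limiting factor.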

Benefits of Avoiding Memory Copies and CPU Involvement

One of the primary advantages of RDMA is its ability to transfer data directly between the memories of the source and destination systems. Here’s why this is significant:

  1. Efficiency: Eliminating memory copies means data moves directly from the FPGA to the destination with no intermediate buffers. This accelerates data transfer, ensuring high-speed processing of camera or sensor data.
  2. Reduced CPU Load: In traditional data transfer mechanisms, CPU involvement is essential for packet processing, data copying, and managing network connections. With RDMA, CPU intervention is minimal, freeing up computational resources for other vital tasks. This is especially crucial in data-intensive applications where the CPU can be a bottleneck.
  3. Consistency: Avoiding memory copies and reducing CPU involvement makes transfer times more predictable. For applications like real-time video analysis or time-sensitive sensor data, consistency is as crucial as speed.
  4. Energy Efficiency: Less CPU intervention and fewer data copies mean reduced power consumption. For large data centers or continuous streaming applications, energy savings can be significant.
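
The cost of memory copies can be estimated with a simple model: the incoming data is written to host memory once by DMA, and every additional CPU copy reads it once and writes it once. The sketch below applies that model to an illustrative stream. The stream rate and the number of copies on the conventional path (two here, for example kernel buffer to user buffer, then user buffer to a staging buffer) are assumptions for illustration, not measured values.

```python
def memory_traffic_gbytes_per_s(stream_gbytes_per_s, cpu_copies):
    """Memory-bus traffic generated on the receiving host.

    One DMA write lands the data in memory; every additional CPU copy
    reads the data once and writes it once.
    """
    return stream_gbytes_per_s * (1 + 2 * cpu_copies)


stream = 1.5  # GB/s, illustrative stream rate

# Assumed conventional path with two CPU copies after the initial DMA write.
socket_path = memory_traffic_gbytes_per_s(stream, cpu_copies=2)
# Direct placement: the DMA write is the only time the data touches memory.
rdma_path = memory_traffic_gbytes_per_s(stream, cpu_copies=0)

print(socket_path)  # 7.5
print(rdma_path)    # 1.5
```

Under these assumptions the copying path generates five times the memory traffic of direct placement, and all of the extra traffic is driven by CPU cycles that could otherwise go to processing the data.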

Solution Architecture with the Grovf RDMA FPGA Implementation

The Grovf RDMA IP for FPGA is a fully functional, RDMA-compatible, hardware-implemented solution that can operate in both directions at the same time, making it a unique solution for this use case. Moreover, alongside the standard OFED Verbs API, the Grovf solution supports an FPGA-native API that lets FPGA developers initiate RDMA transfers directly from FPGA logic, bypassing the CPU and software stack. An example architecture for camera or sensor streaming using RDMA is shown in the picture below.

Picture 1. Example architecture for streaming camera or sensor data directly to remote host CPU or GPU memory, avoiding memory copies and CPU cycles.

Once the data is in host CPU or GPU memory, the user application can process it further with minimal latency. Examples include, but are not limited to, image processing, image reconstruction, machine learning on video data using GPUs, and real-time FFT of RF signals on GPUs.
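
When frames are placed directly into host memory, the application still needs a way to tell when a frame is complete. The article does not describe how the Grovf IP signals this, so the following is only a sketch of one common pattern, simulated in a single process: a ring of fixed-size slots in a preallocated buffer, where the writer stores the payload first and a sequence word last, and the reader polls that word. `FrameRing`, `remote_write`, and the slot layout are hypothetical names and choices, and the `bytearray` merely stands in for a registered memory region.

```python
import struct

SLOT_PAYLOAD = 16          # bytes of frame data per slot (tiny, for illustration)
SEQ = struct.Struct("<Q")  # 8-byte sequence word stored after the payload
SLOT_SIZE = SLOT_PAYLOAD + SEQ.size


class FrameRing:
    """Fixed ring of slots inside one preallocated buffer."""

    def __init__(self, slots):
        self.slots = slots
        self.buf = bytearray(slots * SLOT_SIZE)
        self.next_seq = 1  # sequence 0 means "slot never written"

    def _offset(self, seq):
        return ((seq - 1) % self.slots) * SLOT_SIZE

    def remote_write(self, seq, payload):
        """Simulate the remote writer: payload first, sequence word last."""
        if len(payload) > SLOT_PAYLOAD:
            raise ValueError("payload larger than slot")
        off = self._offset(seq)
        self.buf[off:off + SLOT_PAYLOAD] = payload.ljust(SLOT_PAYLOAD, b"\0")
        SEQ.pack_into(self.buf, off + SLOT_PAYLOAD, seq)

    def poll(self):
        """Return a view of the next frame without copying, or None."""
        off = self._offset(self.next_seq)
        (seq,) = SEQ.unpack_from(self.buf, off + SLOT_PAYLOAD)
        if seq != self.next_seq:
            return None  # frame not written yet (or the writer lapped us)
        self.next_seq += 1
        return memoryview(self.buf)[off:off + SLOT_PAYLOAD]


ring = FrameRing(slots=4)
print(ring.poll())                 # None: nothing has been written yet
ring.remote_write(1, b"frame-1")
view = ring.poll()
print(bytes(view).rstrip(b"\0"))   # b'frame-1'
```

The reader hands out a `memoryview` into the buffer rather than a copy, mirroring the idea that the application processes data where it landed. A real design would also need to handle the writer overrunning a slow reader, which this sketch only detects by returning `None`.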

Conclusion

RDMA offers a transformative approach to data streaming from FPGA-based sensors or cameras to remote servers or GPU clusters. As the demand for real-time, high-bandwidth, low-latency data transfer grows, transitioning to RDMA over traditional UDP and TCP becomes not just optimal but necessary. By ensuring efficient, rapid, and reliable data transfer without bogging down the CPU or incurring additional memory operations, RDMA stands out as the future of data streaming in high-performance environments.
