Grovf GCache: Boost your Key-Value Store Performance to 20 million queries/s with a single FPGA chip
Emerging shifts in IoT and Big Data are reshaping the future of the database industry. Cisco predicts that by 2030 the number of devices connected to the Internet will reach 500 billion. Given the exponential growth of Big Data, technological improvements such as faster data caching are more vital than ever. Because application memory is limited in size and accessed randomly, general-purpose processors handle this workload with poor energy efficiency. Hence the need for dedicated hardware that stores IoT-generated data, replacing software to accelerate the overall system.
To store and retrieve cached data efficiently, Grovf offers an FPGA-based key-value store, Grovf GCache. The main goals of our design are to drastically boost database performance and to minimize latency and power consumption.
Redis drawbacks and FPGA as a solution
Over the last decade, we have seen a growing trend towards non-relational databases, such as key-value stores, due to their flexibility and faster performance. A key-value database stores data as a set of unique keys, each of which has a corresponding value (a.k.a. “key-value pair”). This data structure gives it several advantages over traditional databases. The main benefit is fast response time for both write and read operations, by virtue of the data format. One of the most popular open-source key-value stores is Redis, an in-memory data structure store.
Its popularity in the market can be explained by its versatility and wide variety of use cases. Redis is often used for:
- User session data management
- Time series data
- Message queues for workflow
- Caching both static and interactive data
- Real-time analytics
On the other hand, Redis has its drawbacks: its bandwidth is not constant across packet sizes. The effective throughput depends on the network packet size. For small packets the throughput is significantly lower, and it increases as the packet size grows. To improve data transmission speed for all packet sizes, an FPGA can be used. FPGAs are integrated circuits with a large number of logic elements that can be reconfigured after manufacturing. Their circuits can be designed to fully exploit the parallelism in an application. This lets us design fully pipelined applications that achieve full line-rate processing in network applications. As a result, our solution achieves a 10Gbps line rate.
Grovf GCache overview
Most importantly, the Grovf GCache key-value store differentiates itself in the marketplace through its architecture and the algorithms implemented on the FPGA. To achieve the design goals, all key-value store algorithms are implemented entirely on the FPGA, without any interaction with the host computer. This approach allows us to place the circuits on the FPGA very close to each other and achieve better speed and latency results. A direct connection between DRAM and the FPGA is used to stream data to memory, store it, and then retrieve it on demand.
In the Grovf design, the network interface is implemented on the FPGA to minimize the latency between the network interface and the key-value store algorithms. As a result, computing performance improves significantly, and the processing overhead of passing transactions between separate systems (such as the network adapter and the CPU) is eliminated.
Unlike other key-value stores (such as Memcached and Redis), where the key-value pairs are stored in the hash table structure itself, we store only the key and a pointer to the corresponding key-value pair in the hash table, keeping the value storage separate. Consequently, the hash table size does not depend on the value size, and the hash table uses memory more efficiently.
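The separation of hash table and value storage can be illustrated in software. The following is only a minimal sketch, not the GCache hardware design: the entry layout, the CRC32 hash, linear probing, and the `SplitStore` name are all assumptions made for illustration. It shows that the table holds fixed-size (key, pointer) entries, so its size stays the same no matter how large the stored values are.

```python
import struct
import zlib

KEY_SIZE = 32                  # width of the key slot in bytes
EMPTY, TOMBSTONE = 0, 0xFF     # special key-length markers (hypothetical encoding)
# One table entry: key bytes, key length, value offset, value length.
ENTRY = struct.Struct(f"<{KEY_SIZE}sBQI")


class SplitStore:
    """Toy store: fixed-size hash table of (key, pointer) entries plus a
    separate append-only value region. Illustrative only."""

    def __init__(self, buckets=1024):
        self.buckets = buckets
        self.table = bytearray(ENTRY.size * buckets)  # never grows with value size
        self.values = bytearray()                     # values are stored here

    def _find(self, key):
        """Return (slot holding key or None, first reusable slot or None)."""
        free = None
        start = zlib.crc32(key) % self.buckets
        for i in range(self.buckets):                 # linear probing
            slot = (start + i) % self.buckets
            stored, klen, _, _ = ENTRY.unpack_from(self.table, slot * ENTRY.size)
            if klen == EMPTY:
                return None, (slot if free is None else free)
            if klen == TOMBSTONE:
                if free is None:
                    free = slot
            elif stored[:klen] == key:
                return slot, free
        return None, free

    def set(self, key, value):
        if not 1 <= len(key) <= KEY_SIZE:
            raise ValueError("key must be 1..32 bytes")
        slot, free = self._find(key)
        if slot is None:
            slot = free
        if slot is None:
            raise MemoryError("hash table is full")
        offset = len(self.values)
        self.values += value      # an overwritten key leaves its old value unreclaimed
        ENTRY.pack_into(self.table, slot * ENTRY.size, key, len(key), offset, len(value))

    def get(self, key):
        slot, _ = self._find(key)
        if slot is None:
            return None
        _, _, offset, length = ENTRY.unpack_from(self.table, slot * ENTRY.size)
        return bytes(self.values[offset:offset + length])

    def delete(self, key):
        slot, _ = self._find(key)
        if slot is None:
            return False
        # Mark the slot as deleted so that probing continues past it.
        ENTRY.pack_into(self.table, slot * ENTRY.size, b"", TOMBSTONE, 0, 0)
        return True


store = SplitStore(buckets=8)
table_bytes = len(store.table)
store.set(b"sensor:1", b"x" * 10_000)
assert store.get(b"sensor:1") == b"x" * 10_000
assert len(store.table) == table_bytes   # table size is unaffected by the large value
```

Because every entry has the same width, the table can be sized from the number of keys alone, which is the memory-efficiency argument made above.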
With regard to the design characteristics, the key size and the key-value pair size of Grovf GCache are limited to 32B and 64KB respectively, and the supported Memcached operations are set, get, and delete.
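To make these limits concrete, here is a hedged sketch of a client-side helper that builds the three supported commands and rejects requests outside the stated limits. It assumes the standard Memcached ASCII (text) protocol framing and assumes that the 64KB limit covers key plus value; the article does not say which protocol variant GCache speaks or how the pair size is counted, and the function names are hypothetical.

```python
MAX_KEY = 32             # bytes, key limit stated above
MAX_PAIR = 64 * 1024     # bytes; assumed here to cover key plus value


def _check_key(key: bytes) -> None:
    if not 1 <= len(key) <= MAX_KEY:
        raise ValueError(f"key must be 1..{MAX_KEY} bytes")
    # The Memcached text protocol forbids whitespace and control characters in keys.
    if any(b <= 32 or b == 127 for b in key):
        raise ValueError("key contains whitespace or control characters")


def set_cmd(key: bytes, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    _check_key(key)
    if len(key) + len(value) > MAX_PAIR:
        raise ValueError(f"key-value pair exceeds {MAX_PAIR} bytes")
    # set <key> <flags> <exptime> <bytes>\r\n<data block>\r\n
    return b"set %b %d %d %d\r\n%b\r\n" % (key, flags, exptime, len(value), value)


def get_cmd(key: bytes) -> bytes:
    _check_key(key)
    return b"get %b\r\n" % (key,)


def delete_cmd(key: bytes) -> bytes:
    _check_key(key)
    return b"delete %b\r\n" % (key,)


print(set_cmd(b"user:42", b"hello"))
print(get_cmd(b"user:42"))
print(delete_cmd(b"user:42"))
```

Validating on the client side avoids sending requests that the store would have to reject anyway.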
Performance benchmarking of Grovf GCache and Redis
To measure the best achievable performance, we benchmarked Grovf GCache (which achieves 10Gb/s throughput) against Redis.
Fig. 1 below shows the performance comparison for network packet sizes from 64B to 1000B, and Fig. 2 for packet sizes from 64B to 8000B.
As can be seen, for small packet sizes our solution is significantly faster than Redis: with a 100B packet size it achieves about 10 times the performance.
In terms of latency, Grovf GCache responds within 1.5–6us, while the latency of a Redis application is about 30us over a network connection. As a result, our design improves database performance by 5–20 times compared with Redis in latency terms (30us / 6us = 5 and 30us / 1.5us = 20).
As for the methodology behind this comparison, we ran the Redis benchmark on n1-highcpu-96 instances on Google Cloud. The Redis SET and GET results are the measured requests per second with pipelines of 10 commands.
In our experimental setup, we loaded the FPGA with 1M key-value pairs of constant size, then sent GET requests at the maximum rate the 10Gbps Ethernet connection allowed. We then repeated this experiment for different network packet sizes ranging from 64B to 8000B.
For benchmarking we used a Mellanox ConnectX network card and a Kintex-7 FPGA.
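The maximum rate a 10Gbps Ethernet link allows depends on the packet size, which is why the experiment sweeps it. As a rough sketch, the snippet below computes the upper bound on frames per second, assuming that "packet size" means the Ethernet frame size and that each frame costs an extra 20 bytes on the wire (8 bytes of preamble and start delimiter plus a 12-byte inter-frame gap). It also assumes one request per frame; if several requests share a frame, the request rate is higher than the frame rate. These assumptions are ours, not part of the measured setup.

```python
LINK_BPS = 10_000_000_000   # 10Gbps link, as in the setup above
WIRE_OVERHEAD = 20          # bytes per frame: 8 preamble/SFD + 12 inter-frame gap


def max_frames_per_second(frame_bytes: int, link_bps: int = LINK_BPS) -> float:
    """Upper bound on the frame rate for a given Ethernet frame size."""
    return link_bps / ((frame_bytes + WIRE_OVERHEAD) * 8)


# Sweep a few sizes from the 64B..8000B range used in the experiment.
for size in (64, 100, 1000, 8000):
    rate = max_frames_per_second(size)
    print(f"{size:>5} B frames: up to {rate / 1e6:.2f} M frames/s")
```

The bound falls quickly as frames grow, so small packets are where per-request processing cost matters most.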
Key-value stores, which are widely used for web applications, real-time applications, and in-memory data caching, now face real throughput bottlenecks. To improve their efficiency and reduce latency, Grovf offers a highly competitive key-value store solution. The results show that Grovf GCache can be used to improve computing efficiency for workloads such as IIoT-generated data and web server data caching.
In the future, we aim to increase the value size up to 1MB and to fully support the Memcached get operation, which will let us read multiple key-value pairs from memory with a single get command.