Our Investment in DragonflyDB — The Most Performant In-Memory Data Store

Astasia Myers · Published in Memory Leak · 3 min read · Mar 21, 2023

In the current TikTok era, users have ever-increasing expectations of application performance and responsiveness. Fast applications receive superior product reviews and outperform the competition, while those that feel sluggish fall behind. A site that loads in 1 second has a conversion rate 3x higher than a site that loads in 5 seconds and 5x higher than a site that loads in 10 seconds (Portent, 2022).

One widely adopted mechanism for improving application performance and scale is a data cache: a high-speed storage layer that holds a subset of data, typically transient in nature, so that future requests for that data are served faster than the data’s primary storage location could serve them. A cache minimizes the number of queries that must reach the backend database, the slower storage tier. Nearly every application uses caching infrastructure because it can have an immediate impact on user experience and business performance.
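
For readers newer to the pattern, here is a minimal cache-aside sketch in plain Python. The in-process dict stands in for a real in-memory store such as Redis or Dragonfly, and the sleep simulates a slow database query; all names and timings below are illustrative, not from the post:

```python
import time

# Toy primary store: simulates a slow backend database query.
def query_database(key: str) -> str:
    time.sleep(0.1)  # stand-in for network + disk latency
    return f"value-for-{key}"

# In-process cache with a simple TTL, standing in for a real
# in-memory store like Redis or Dragonfly.
cache: dict[str, tuple[str, float]] = {}
TTL_SECONDS = 30.0

def get(key: str) -> str:
    entry = cache.get(key)
    if entry is not None:
        value, expires_at = entry
        if time.monotonic() < expires_at:
            return value          # cache hit: no database round trip
    value = query_database(key)   # cache miss: fall through to the database
    cache[key] = (value, time.monotonic() + TTL_SECONDS)
    return value

get("user:42")   # miss: ~100 ms, populates the cache
get("user:42")   # hit: microseconds, served from memory
```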

While data caching has been part of the application stack for decades, the rise of globally distributed users and ever-shorter attention spans now places more demands on it than ever before. There is a need for speed. Starting from first principles, if a new cache (e.g., an in-memory data store) were created today, what would it look like?

Ex-Google engineers Oded Poncz and Roman Gershman both experienced firsthand the challenges of implementing and scaling a cache, so they asked this very question. Having worked on the AWS ElastiCache team, Roman saw directly the pain points users experienced with incumbents at scale. Through their experience and insights, they built DragonflyDB, a drop-in Redis replacement that scales vertically to support millions of operations per second and terabyte-sized workloads, all on a single instance.
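
To make the “drop-in” claim concrete, here is a small sketch using the standard redis-py client. It assumes a Dragonfly instance is running locally on port 6379 (the Redis default, which Dragonfly also uses); the key names are illustrative:

```python
import redis

# Because Dragonfly speaks the Redis wire protocol, an unmodified
# Redis client can talk to it with no code changes.
r = redis.Redis(host="localhost", port=6379, decode_responses=True)

r.set("greeting", "hello from dragonfly")
print(r.get("greeting"))   # -> "hello from dragonfly"
r.expire("greeting", 60)   # standard Redis commands work as-is
```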

Dragonfly implements novel algorithms and data structures on top of a multi-threaded, shared-nothing architecture. It uses an innovative hash table structure called dashtable to minimize memory overhead and tail latency. Dragonfly also uses bit-packing and denseSet techniques to compress in-memory data, making it on average 30% more memory efficient than Redis. It keeps memory usage consistent during snapshotting, eliminating the need to over-provision memory. Importantly, Dragonfly has a unique Least Frequently Recently Used (LFRU) cache policy that is resistant to fluctuations in traffic. These technical choices make it more memory efficient, with a higher hit ratio and throughput than alternatives. As a result, in one benchmark Dragonfly reached 25x the performance of Redis.
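
Dragonfly’s actual LFRU implementation lives inside the server, but the intuition behind frequency-and-recency-aware eviction can be sketched in a few lines. The toy cache below is entirely illustrative (not Dragonfly’s code): it evicts the key with the lowest access frequency, breaking ties by least-recent use, which is why a one-off traffic spike cannot flush out consistently popular keys:

```python
import time

class LFRUCache:
    """Toy eviction policy that scores entries by both access
    frequency and recency. Purely illustrative: Dragonfly's real
    LFRU design lives inside the server and differs in detail."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data: dict[str, str] = {}
        self.freq: dict[str, int] = {}
        self.last_used: dict[str, float] = {}

    def _score(self, key: str) -> tuple[int, float]:
        # Lowest frequency first, least-recently-used as tiebreaker.
        return (self.freq[key], self.last_used[key])

    def get(self, key: str) -> str | None:
        if key not in self.data:
            return None
        self.freq[key] += 1
        self.last_used[key] = time.monotonic()
        return self.data[key]

    def put(self, key: str, value: str) -> None:
        if key not in self.data and len(self.data) >= self.capacity:
            victim = min(self.data, key=self._score)
            for table in (self.data, self.freq, self.last_used):
                del table[victim]
        self.data[key] = value
        self.freq[key] = self.freq.get(key, 0) + 1
        self.last_used[key] = time.monotonic()
```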

Since launching in May 2022, Dragonfly has earned 17K+ GitHub stars, making it one of the fastest-growing GitHub repositories released in 2022.

Today, the company is announcing the general availability of Dragonfly 1.0, with snapshotting, replication, and high availability (try it here), along with its Series A from Quiet Capital and seed round from Redpoint Ventures. We are incredibly excited to support Oded, Roman, and the entire team on their mission to shape the future of memory stores by providing a well-designed, ultra-fast, and cost-effective solution for cloud workloads.

Join the DragonflyDB swarm! You can follow DragonflyDB on Twitter and join their Discord. They are actively hiring, so check out opportunities here.
