How Discord saves billions of messages

Siddharth Sabron
3 min read · May 16, 2023


Hey there, amazing followers! Today, I want to share an incredible story about Discord’s adventure in choosing and transitioning to a new database solution to handle their vast number of stored messages. 🌟💬

Back in 2017, Discord published an enlightening engineering post shedding light on how they store billions of messages. 🗞️✨ To truly grasp its content, it helps to first understand Cassandra and how it handles reads and writes. 📖💡

So, what can we take away from this enthralling journey? Let’s dive in and explore the key highlights: 📝🔎

Discord took a proactive approach by meticulously analyzing their read/write patterns, considering factors such as traffic and the read/write ratio across different types of channels. 📈📊 For instance, they noticed that voice chat-heavy channels accumulated significantly fewer messages compared to text chat-dominated channels. 🎙️📱

In order to streamline their operations, Discord sought a linearly scalable and fault-tolerant database, ensuring smooth sailing even during peak usage. 🌐🚀 To stay on top of their game, they set up alerts for when the P95 response time of their API exceeded 80ms. After all, in a messaging service like Discord, every second counts!️⏱️💬
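
To make that alert concrete, here's a minimal Python sketch of a nearest-rank P95 check against the 80ms threshold. This is not Discord's actual monitoring stack, and the sample latencies are made up. ⏱️

```python
import math

def p95(latencies_ms):
    """Return the 95th-percentile (nearest-rank) latency from samples in ms."""
    samples = sorted(latencies_ms)
    rank = math.ceil(0.95 * len(samples))  # nearest-rank method
    return samples[rank - 1]

samples = [12, 35, 41, 18, 95, 22, 60, 78, 30, 110]  # made-up API response times
if p95(samples) > 80:
    print("ALERT: API p95 response time above 80 ms")
```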

Discord made a significant switch from MongoDB to Cassandra. 🔄🔀 Cassandra is a KKV store: the partition key locates the partition, and the clustering key identifies a row within that partition. In their previous setup, Discord indexed messages in MongoDB with a compound key of (channel_id, created_at). However, they realized that created_at wasn’t a suitable clustering key, since multiple messages can share the same creation time. 😮😕 Fortunately, Discord had already implemented Snowflake, a chronologically sortable ID generation scheme. This allowed them to redefine their primary key as (channel_id, message_id), ensuring uniqueness while preserving chronological order. 🆔🔑
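
To see why that works, here's a small Python sketch of how a Snowflake ID already encodes its creation time in its high bits (layout as in Discord's public API docs; the example IDs below are purely illustrative). Sorting by message_id therefore sorts chronologically, while the low bits keep IDs unique even within the same millisecond. 🆔

```python
# The high bits of a Snowflake hold the creation time (ms since the Discord
# epoch); the low 22 bits hold worker/process/increment, which keep IDs unique.
DISCORD_EPOCH_MS = 1420070400000  # 2015-01-01T00:00:00Z

def snowflake_timestamp_ms(snowflake: int) -> int:
    """Extract the creation timestamp (Unix ms) from a Snowflake ID."""
    return (snowflake >> 22) + DISCORD_EPOCH_MS

a, b = 175928847299117063, 175928847299117064  # two IDs minted in the same ms
assert snowflake_timestamp_ms(a) == snowflake_timestamp_ms(b)  # equal created_at
assert a < b  # yet still unique and chronologically ordered as a clustering key
```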

To tackle the challenges posed by GC pressure and partition size limitations, Discord adjusted their primary key to include a new element: a time “bucket”, turning the partition key into the composite (channel_id, bucket) with message_id as the clustering key. 📦⏰ Analyzing their largest channels, they determined that each bucket would hold about 10 days of messages, striking a balance between partition size and resource management. 📆⚖️
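
As a rough illustration (an assumption-based sketch, not Discord's exact production code), the bucket can be derived straight from the message's Snowflake ID, so both writes and reads know which partition to hit: 📦

```python
# Derive a 10-day time bucket from a Snowflake message ID so the partition key
# becomes (channel_id, bucket) and no single channel partition grows without bound.
BUCKET_SIZE_MS = 10 * 24 * 60 * 60 * 1000  # 10 days in milliseconds

def make_bucket(message_id: int) -> int:
    """Map a Snowflake message ID to its 10-day bucket number."""
    timestamp_ms = message_id >> 22  # ms since the Discord epoch
    return timestamp_ms // BUCKET_SIZE_MS

# Fetching recent history then means reading the partition
# (channel_id, make_bucket(latest_message_id)) and walking back one bucket at a time.
```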

In Cassandra, when data is deleted, it’s not instantly removed from the disk. Instead, Cassandra creates tombstones, indicating the deletion. During reads, Cassandra sifts through these tombstones, resulting in increased latency. 😩 To combat this challenge, Discord optimized their deleting/writing process, shortening the lifespan of tombstones and reducing the number encountered during reads. They also conducted regular Cassandra repairs to maintain data consistency. Furthermore, Discord cleverly kept track of empty partitions, enhancing read performance by bypassing unnecessary data. 🧹🔍
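
Here's a hedged sketch of that last idea, using hypothetical helper names rather than Discord's real code (fetch_bucket stands in for an actual Cassandra query): remember which (channel_id, bucket) partitions turned out to be empty and skip them on later history reads. 🧹

```python
from typing import List, Set, Tuple

# Cache of partitions known to be empty; in production this would be shared state.
empty_buckets: Set[Tuple[int, int]] = set()

def fetch_bucket(channel_id: int, bucket: int) -> List[dict]:
    """Stand-in for SELECT ... WHERE channel_id = ? AND bucket = ?."""
    return []

def load_recent_messages(channel_id: int, newest_bucket: int, limit: int) -> List[dict]:
    messages: List[dict] = []
    bucket = newest_bucket
    while len(messages) < limit and bucket >= 0:
        if (channel_id, bucket) in empty_buckets:
            bucket -= 1  # skip partitions we already know are empty
            continue
        rows = fetch_bucket(channel_id, bucket)
        if rows:
            messages.extend(rows)
        else:
            empty_buckets.add((channel_id, bucket))  # remember the empty partition
        bucket -= 1  # walk backwards in time
    return messages[:limit]
```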

Discord accomplished this impressive migration with just four talented backend engineers, proving that extraordinary feats can be achieved with the right skills and dedication. 🙌👩‍💻👨‍💻

I hope you found this journey through Discord’s database migration as interesting as I did.

If you enjoyed reading this, please check out my other articles, follow me, and feel free to spread the word about them.

50% of my readers do not follow me 😩, please follow along.

Here you can get to know more about me and my work: Portfolio, LinkedIn
