How Discord Stores Trillions of Messages

Dr Milan Milanović
2 min read · Dec 26, 2023

Discord engineers recently published a follow-up to their 2017 post on storing billions of messages, describing how the system now handles trillions. In 2017, after migrating from MongoDB, they ran 12 Cassandra nodes storing billions of messages. As the cluster grew, they ran into a range of problems, and its performance became unpredictable. One cause was uneven load: a server with a small group of friends sends far fewer messages than a server with hundreds of thousands of people. They also found that reads in Cassandra are far more expensive than writes, because a read has to query the memtable and potentially several on-disk files (SSTables), whereas a write is only appended to a commit log and written to the in-memory memtable.
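To make the read/write asymmetry concrete, here is a minimal sketch of a log-structured store. It is a toy model, not Cassandra's actual code: the class `ToyLsmStore`, its `memtable_limit` parameter, and the `lookups` counter are all invented for illustration, and plain dictionaries stand in for on-disk files.

```python
class ToyLsmStore:
    """Toy log-structured store: one in-memory memtable plus immutable flushed tables."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.sstables = []            # newest last; dicts standing in for on-disk files
        self.memtable_limit = memtable_limit
        self.lookups = 0              # number of structures consulted by reads

    def write(self, key, value):
        # A write only touches the memtable (a real system also appends to a commit log).
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.sstables.append(self.memtable)   # "flush" to an immutable table
            self.memtable = {}

    def read(self, key):
        # A read checks the memtable first, then flushed tables from newest to oldest.
        self.lookups += 1
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            self.lookups += 1
            if key in table:
                return table[key]
        return None


store = ToyLsmStore(memtable_limit=2)
for i in range(6):
    store.write(f"msg-{i}", f"hello {i}")

oldest = store.read("msg-0")
lookups_for_oldest = store.lookups
print(oldest, lookups_for_oldest)
```

Each write in this model touches a single dictionary, while reading the oldest key has to walk through the memtable and every flushed table. Real systems reduce that cost with techniques such as Bloom filters and compaction, but the basic asymmetry remains.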

Their solution was to migrate to ScyllaDB, an open-source, distributed, wide-column NoSQL data store that is compatible with Apache Cassandra. Because it is written in C++ rather than Java, it promises better performance and has no garbage collector, so it avoids garbage-collection pauses.

They started small but then decided to migrate all of their databases. Even so, hot partitions (many concurrent reads hitting the same partition, which drives up latency) persisted. To address this, they placed data services between their API monolith and the ScyllaDB cluster.

They wrote the data services in Rust, which gave them C/C++-level speed along with thread and memory safety. These data services expose one gRPC endpoint per database query, and if multiple users request the same data at the same time, the database is queried only once (a technique known as request coalescing).
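The coalescing idea can be sketched in a few lines. The example below uses Python's `asyncio` for brevity, not Rust, and it is an illustration of the technique, not Discord's implementation: `CoalescingReader`, `fake_db_query`, and the channel ID `42` are all hypothetical names made up for this sketch.

```python
import asyncio


class CoalescingReader:
    """Shares one in-flight database query among all concurrent callers for the same key."""

    def __init__(self, query_fn):
        self._query_fn = query_fn     # hypothetical async function: key -> result
        self._in_flight = {}          # key -> asyncio.Task for the running query

    async def get(self, key):
        task = self._in_flight.get(key)
        if task is None:
            # First caller for this key: start the query and register it.
            task = asyncio.create_task(self._run(key))
            self._in_flight[key] = task
        # Every caller, first or not, waits on the same task.
        return await task

    async def _run(self, key):
        try:
            return await self._query_fn(key)
        finally:
            # Forget the finished query so later requests fetch fresh data.
            self._in_flight.pop(key, None)


async def demo():
    db_calls = 0

    async def fake_db_query(channel_id):
        nonlocal db_calls
        db_calls += 1
        await asyncio.sleep(0.01)     # simulated database latency
        return f"messages for channel {channel_id}"

    reader = CoalescingReader(fake_db_query)
    # 100 users ask for the same channel at the same moment.
    results = await asyncio.gather(*(reader.get(42) for _ in range(100)))
    return db_calls, results


db_calls, results = asyncio.run(demo())
print(db_calls, len(results))
```

In this sketch, 100 concurrent requests for the same key result in a single call to the simulated database, and every caller receives the same answer. Once the query finishes, the entry is dropped, so a later request triggers a fresh query instead of reading stale data.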

Ultimately, this shielded the database from traffic spikes when someone sent a message to everyone on a large server, because the resulting burst of identical reads collapses into a single query. Their tail latencies improved as a result.
