---
title: How Discord Stores Trillions of Messages
description: Learn how Discord evolved its message storage to handle trillions of messages.
image: 'https://assets.bytebytego.com/diagrams/0174-discord-store-messages.png'
createdAt: '2024-03-12'
draft: false
categories:
- real-world-case-studies
tags:
- Databases
- Architecture
---
![](https://assets.bytebytego.com/diagrams/0174-discord-store-messages.png)
The diagram above shows the evolution of message storage at Discord:
MongoDB ➡️ Cassandra ➡️ ScyllaDB
In 2015, the first version of Discord was built on top of a single MongoDB replica set. Around November 2015, MongoDB stored 100 million messages, and the data and index could no longer fit in RAM. Latency became unpredictable, so message storage needed to move to another database. Cassandra was chosen.
In 2017, Discord had 12 Cassandra nodes and stored billions of messages.
At the beginning of 2022, it had 177 nodes with trillions of messages. At this point, latency was unpredictable, and maintenance operations became too expensive to run.
There are several reasons for the issue:
* Cassandra uses an LSM tree as its internal data structure, so reads are more expensive than writes. Many concurrent reads on a server with hundreds of users can create hotspots.
* Cluster maintenance, such as compacting SSTables, impacts performance.
* Garbage collection pauses (Cassandra runs on the JVM) cause significant latency spikes.
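
To see why reads cost more than writes in an LSM tree, and why compaction matters, here is a minimal toy sketch in Python. It is not Cassandra's implementation: the class `TinyLSM`, its methods, and the `memtable_limit` parameter are hypothetical names made up for illustration, and real systems add commit logs, bloom filters, and on-disk files.

```python
class TinyLSM:
    """Toy LSM tree: one in-memory memtable plus immutable sorted runs (SSTables)."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.sstables = []  # oldest first, newest last
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def put(self, key, value):
        # A write only touches the in-memory table, so it is cheap.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable into an immutable, sorted run.
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable = {}

    def get(self, key):
        # A read may have to probe the memtable and then every run, newest first.
        probes = 1
        if key in self.memtable:
            return self.memtable[key], probes
        for table in reversed(self.sstables):
            probes += 1
            if key in table:
                return table[key], probes
        return None, probes

    def compact(self):
        # Merge all runs into one; newer values overwrite older ones.
        merged = {}
        for table in self.sstables:
            merged.update(table)
        self.sstables = [dict(sorted(merged.items()))] if merged else []


db = TinyLSM(memtable_limit=2)
for i in range(6):
    db.put(f"msg{i}", f"hello {i}")

before = db.get("msg0")  # ("hello 0", 4): memtable plus three runs probed
db.compact()
after = db.get("msg0")   # ("hello 0", 2): memtable plus one merged run
```

In this sketch each write is a single dictionary update, while a read for an old key walks every run until it finds a match. Compaction reduces the number of runs a read must probe, but the merge itself is extra work that competes with normal traffic.
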
ScyllaDB is a Cassandra-compatible database written in C++. Discord redesigned its architecture around a monolithic API, a data service written in Rust, and ScyllaDB-based storage.
The p99 read latency in ScyllaDB is 15ms compared to 40-125ms in Cassandra. The p99 write latency is 5ms compared to 5-70ms in Cassandra.
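
For readers unfamiliar with the term, p99 latency is the value that 99% of requests complete at or below. The sketch below shows one common way to compute it, the nearest-rank method. The function name `percentile` and the sample latencies are hypothetical and are not Discord's measurements.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in ms: 98 fast requests and 2 slow ones.
latencies_ms = [5] * 98 + [120, 150]

p50 = percentile(latencies_ms, 50)  # 5
p99 = percentile(latencies_ms, 99)  # 120
```

The median here is 5 ms while the p99 is 120 ms, which is why tail percentiles are the usual way to report the latency spikes described above.
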