title: Reddit's Core Architecture
description: Overview of Reddit's architecture for serving millions of users.
image: https://assets.bytebytego.com/diagrams/0356-the-core-reddit-architecture.png
createdAt: 2024-03-06
draft: false
categories: real-world-case-studies
tags: Architecture, Social Media

A quick look at Reddit's core architecture, which helps it serve over 1 billion users every month.

This information is based on research from many Reddit engineering blogs. Since the architecture keeps evolving, some details may have changed.

The main points of Reddit's architecture are as follows:

  • Reddit uses a Content Delivery Network (CDN) from Fastly as a front for the application.
  • Reddit started using jQuery in early 2009. Later on, they started using TypeScript and have now moved to modern Node.js frameworks. Over the years, Reddit has also built mobile apps for Android and iOS.
  • Within the application stack, the load balancer sits in front and routes incoming requests to the appropriate services.
  • Reddit started as a Python-based monolithic application but has since started moving to microservices built using Go.
  • Reddit relies heavily on GraphQL for its API layer. In early 2021, they started moving to GraphQL Federation, a way to combine multiple smaller GraphQL APIs known as Domain Graph Services (DGS). In 2022, the GraphQL team at Reddit added several new Go subgraphs for core Reddit entities, thereby splitting the GraphQL monolith.
  • For data storage, Reddit relies on Postgres for its core data model. To reduce the load on the database, they use Memcached in front of Postgres. They also use Cassandra heavily for new features, mainly because of its resiliency and availability properties.
  • To support data replication and maintain cache consistency, Reddit uses Debezium to run a Change Data Capture process.
  • Expensive operations, such as a user voting or submitting a link, are deferred to an async job queue via RabbitMQ and processed by job workers. For content safety checks and moderation, they use Kafka to transfer data in real time and run rules over it.
  • Reddit uses AWS and Kubernetes as the hosting platform for its various apps and internal services.
  • For deployment and infrastructure, they use Spinnaker, Drone CI, and Terraform.
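
To make the caching points above more concrete, here is a minimal sketch of a cache-aside read path with invalidation driven by change events. It is an illustration only, not Reddit's code: `FakeDatabase` stands in for Postgres, a plain dict stands in for Memcached, and `on_change_event` is a hypothetical handler of the kind a Change Data Capture consumer might call.

```python
class FakeDatabase:
    """Stand-in for Postgres (assumption): a dict plus a read counter."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)

    def put(self, key, value):
        self.rows[key] = value


class CacheAsideStore:
    """Reads go to the cache first and fall back to the database on a miss."""

    def __init__(self, db):
        self.db = db
        self.cache = {}  # stand-in for Memcached (assumption)

    def read(self, key):
        if key in self.cache:
            return self.cache[key]      # cache hit: no database read
        value = self.db.get(key)        # cache miss: read from the database
        if value is not None:
            self.cache[key] = value     # populate the cache for later reads
        return value

    def on_change_event(self, key):
        """Hypothetical CDC handler: drop the cached copy of a changed row."""
        self.cache.pop(key, None)


db = FakeDatabase()
db.put("post:1", {"title": "Hello", "score": 1})
store = CacheAsideStore(db)

first = store.read("post:1")            # miss, served by the database
second = store.read("post:1")           # hit, served by the cache
reads_after_two_lookups = db.reads

db.put("post:1", {"title": "Hello", "score": 2})  # the row changes
store.on_change_event("post:1")                   # change event invalidates it
third = store.read("post:1")                      # miss again, fresh value
```

The second lookup never reaches the database, which is the load reduction the cache is there for. Without the change event, the cache would keep serving the stale score after the row was updated.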
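
The deferred-work pattern described above for votes and link submissions can be sketched roughly as follows. This is a simplified, single-process illustration under stated assumptions: Python's standard `queue.Queue` stands in for a RabbitMQ queue, and `handle_vote_request`, `worker`, and the `scores` dict are hypothetical names, not part of Reddit's codebase.

```python
import queue
import threading

vote_jobs = queue.Queue()        # stand-in for a RabbitMQ queue (assumption)
scores = {"post:1": 0}           # stand-in for persistent vote storage
scores_lock = threading.Lock()


def handle_vote_request(post_id, direction):
    """Request path: enqueue the job and return without doing the work."""
    vote_jobs.put({"post_id": post_id, "direction": direction})
    return "accepted"


def worker():
    """Job worker: applies queued votes until it receives a None sentinel."""
    while True:
        job = vote_jobs.get()
        if job is None:
            vote_jobs.task_done()
            break
        with scores_lock:
            scores[job["post_id"]] += job["direction"]
        vote_jobs.task_done()


worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()

# Three upvotes and one downvote; each request returns immediately.
responses = [handle_vote_request("post:1", d) for d in (1, 1, -1, 1)]

vote_jobs.join()                 # wait until every queued vote is processed
vote_jobs.put(None)              # ask the worker to stop
worker_thread.join()

final_score = scores["post:1"]
```

The request handler only pays the cost of an enqueue, while the expensive update happens later in the worker. With a real broker, the queue would also survive process restarts and could be drained by many workers in parallel.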