Reddit System Design

I A KHAN
4 min read · Jun 9, 2022


This blog is all about the high-level system design for the Reddit app.

Let’s start from the very beginning:

Step 1: Gathering requirements. (Ask yourself as many questions as you can, or ask your interviewer if you are in an interview 😉)

  • Total user base: 50 million (let’s assume)
  • Each user visits the app 5 times
  • The home page loads 25 posts at a time.
  • Let’s consider two kinds of content: “Text” and “Image”.

(For video, the approach will be similar to the image one.)

  • The latest posts should be shown each time the user launches the app

(The latest means posts which have been posted in the last 24 hours).
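The last two requirements (25 posts per load, only posts from the last 24 hours) can be sketched roughly as below. This is only an illustration: the `Post` class, the `latest_page` function, and the in-memory list are hypothetical names I am assuming here, and a real system would run this as a query against the post store.

```python
from dataclasses import dataclass

PAGE_SIZE = 25                         # posts per home page load (from the requirements)
LATEST_WINDOW_SECONDS = 24 * 60 * 60   # "latest" = posted in the last 24 hours


@dataclass
class Post:
    # Hypothetical minimal post record, used only for this sketch.
    post_id: int
    title: str
    created_at: float  # Unix timestamp in seconds


def latest_page(posts, now, page=0):
    """Return one home page worth of posts from the last 24 hours, newest first."""
    cutoff = now - LATEST_WINDOW_SECONDS
    fresh = [p for p in posts if cutoff <= p.created_at <= now]
    fresh.sort(key=lambda p: p.created_at, reverse=True)
    start = page * PAGE_SIZE
    return fresh[start:start + PAGE_SIZE]
```

Calling `latest_page(posts, now)` gives the first 25 fresh posts, and `page=1` gives the next 25, if there are any.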

That’s it... We are done with our requirement gathering (Step 1) 🙂

Step 2: Understanding the technical implications of the requirements gathered above. (Try to come up with assumptions based on your experience, or confirm them with the interviewer.)

  • 50 million (user base) * 5 visits/user = 250 million hits on the home page.
  • 250 million page hits * 25 posts = 6.25 billion posts served.
  • Let’s suppose the “title” is at most 50 chars; at 2 bytes/char it needs 100 bytes.
  • The “comments” count, stored as an integer, takes 4 bytes.
  • The timestamp of the post, stored as a float, takes 8 bytes.
  • Suppose the content body is 300 chars, so it needs 600 bytes (300 chars * 2 bytes/char, “simple math”).
  • Let’s consider the image size to be 100 KB on average.
  • The hyperlink to the post detail is 100 characters, so it needs 100 * 2 = 200 bytes.
  • Let’s suppose 50% of posts have images and the rest have text only (covering both cases).

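Let’s calculate the average space needed per post:

0.5 * (100 + 4 + 8 + 100000) bytes for posts with images + 0.5 * (100 + 4 + 8 + 600) bytes for posts without images ≈ 50 KB per post (around). The 200-byte hyperlink is small enough to leave out of this rough estimate.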

Memory for 6.25 billion posts (6.25 billion * 50 KB) ≈ 312.5 TB = TOOOO MUCH (can’t afford this)
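If you want to redo this math with your own assumptions, a small script like the one below helps. It is just a sketch that mirrors the formula above (so the hyperlink is left out here too), and it takes 1 KB = 1000 bytes to keep the numbers simple. All constant names are my own.

```python
# Back-of-the-envelope estimate using the numbers assumed in Step 2.
USERS = 50_000_000
VISITS_PER_USER = 5
POSTS_PER_PAGE = 25

TITLE_BYTES = 50 * 2         # 50 chars * 2 bytes/char
COMMENT_COUNT_BYTES = 4      # integer
TIMESTAMP_BYTES = 8          # float
BODY_BYTES = 300 * 2         # 300 chars * 2 bytes/char
IMAGE_BYTES = 100_000        # 100 KB, taking 1 KB = 1000 bytes
IMAGE_POST_SHARE = 0.5       # half of the posts carry an image

page_hits = USERS * VISITS_PER_USER
posts_served = page_hits * POSTS_PER_PAGE

# Fields counted for every post in the formula (hyperlink omitted, as above).
common_bytes = TITLE_BYTES + COMMENT_COUNT_BYTES + TIMESTAMP_BYTES
avg_post_bytes = (IMAGE_POST_SHARE * (common_bytes + IMAGE_BYTES)
                  + (1 - IMAGE_POST_SHARE) * (common_bytes + BODY_BYTES))
total_tb = posts_served * avg_post_bytes / 1e12

print(f"page hits: {page_hits:,}")
print(f"posts served: {posts_served:,}")
print(f"average post size: {avg_post_bytes / 1000:.1f} KB")
print(f"total: {total_tb:.0f} TB")
```

With these inputs the average comes out close to 50 KB per post and the total lands a little over 300 TB.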

Any optimization here? Think for a minute...

Hurray, I found one thing that can help us here.

What if, instead of sending the full image (100 KB), we send only a thumbnail (on average a thumbnail is 10% of the original size)? Only when the user explicitly asks for the image, i.e. when the user opens the full post detail, do we send the original image.

What have we saved here?

Now, using the thumbnail approach, the memory footprint is reduced drastically.

Memory for 6.25 billion posts (6.25 billion * 5 KB) ≈ 31 TB = 1/10 * (TOOOO MUCH)

I guess now we can move to the next step, as our understanding of the requirements is strong now.
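To double-check the saving, here is a rough comparison of the average feed post size with the full image versus the thumbnail. It is only a sketch under the same assumptions as before (50% image posts, 1 KB = 1000 bytes, thumbnail at 10% of the original); the function name is mine.

```python
def avg_feed_post_bytes(image_bytes):
    """Average bytes per feed post when half of the posts carry an image."""
    common = 100 + 4 + 8   # title + comments count + timestamp
    body = 600             # text body of a post without an image
    return 0.5 * (common + image_bytes) + 0.5 * (common + body)


FULL_IMAGE_BYTES = 100_000                  # 100 KB original image
THUMBNAIL_BYTES = FULL_IMAGE_BYTES // 10    # thumbnail assumed to be 10% of the original
POSTS_SERVED = 6_250_000_000

before = avg_feed_post_bytes(FULL_IMAGE_BYTES)
after = avg_feed_post_bytes(THUMBNAIL_BYTES)

print(f"before: {before / 1000:.1f} KB per post, {POSTS_SERVED * before / 1e12:.0f} TB total")
print(f"after:  {after / 1000:.1f} KB per post, {POSTS_SERVED * after / 1e12:.0f} TB total")
print(f"ratio:  {after / before:.2f}")
```

The ratio comes out slightly above 1/10 because the text fields do not shrink, only the image does.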

Step 3: List the components, services, and storage required to start designing the system.

  • App architecture: MVVM architecture will be good for this use case. You can also opt for another architecture, but you must be able to explain why you are choosing that particular architecture. (I added this specially as I have an Android background; you can skip it as we are targeting high-level system design only.)
  • Microservices architecture: To design scalable microservices you must understand whichever architecture you choose.
  • Server: This hosts all your services and responds to requests for posts and everything else.
  • Ranking service: This service is responsible for assigning ranks to posts, i.e. deciding which posts are more relevant to which user based on multiple criteria like the user’s previous activity, following, likes and dislikes, etc.
  • Feed service: This service is responsible for building feeds specific to each user’s behavior, choices, etc.
  • User service: This service keeps track of user records, etc. It also has its own DB.
  • Post service: This service keeps track of users’ posts and also has a DB to store the post data. For the post service we should opt for a NoSQL database, for multiple reasons like horizontal scalability, posts mostly having an unstructured data format, etc. Check here for more: https://www.integrate.io/blog/the-sql-vs-nosql-difference/#:~:text=SQL%20databases%20are%20vertically%20scalable,data%20like%20documents%20or%20JSON
  • Object store: To store images we need an object store, which can be AWS S3, Google Cloud Storage, etc.
  • Top post cache: To keep users always in sync with the latest posts, we maintain a cache of top posts.
  • Top post crawler/notifier: To keep the top post cache updated with the latest posts, we should have a crawler or notifier that keeps checking the Post DB for updates and, if it finds any, updates the cache.
  • Load balancer: To handle a large number of requests we should use a load balancer, whose responsibility is to divide requests evenly across all the servers we have. For example, say we have 10 servers installed and request number 12 reaches the load balancer. The load balancer applies some standard algorithm (maybe 12 % 10 = 2, so the request goes to server 2) and redirects the request to the most suitable server.
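The modulo example from the load balancer point can be sketched in a few lines. This is a toy illustration only: the class name and the server names are hypothetical, and real load balancers offer other strategies too (health checks, least connections, etc.).

```python
class ModuloLoadBalancer:
    """Toy load balancer: picks a server by request number modulo server count."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self.servers = list(servers)

    def route(self, request_number):
        # e.g. with 10 servers, request 12 -> 12 % 10 = 2 -> index 2
        return self.servers[request_number % len(self.servers)]


# Hypothetical server names, indexed from 0.
lb = ModuloLoadBalancer([f"server-{i}" for i in range(10)])
print(lb.route(12))  # request 12 lands on "server-2"
```

If request numbers keep increasing one by one, this spreads the requests evenly, which is why it behaves like simple round robin.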

NOTE: All servers have the same functionality; for simplicity I have drawn the system diagram for server 1 only.
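The top post cache and the crawler/notifier from the list above work as a pair, so here is a minimal sketch of how one refresh pass might look. Everything here is assumed for illustration: `PostDB` is an in-memory stand-in for the real Post DB, the `score` field is a hypothetical value coming from the ranking service, and in production the refresh would run on a schedule or be triggered by a notification.

```python
class PostDB:
    """In-memory stand-in for the Post DB (hypothetical, for this sketch only)."""

    def __init__(self):
        self._posts = []

    def add(self, post_id, score, created_at):
        self._posts.append({"id": post_id, "score": score, "created_at": created_at})

    def posts_since(self, cutoff):
        return [p for p in self._posts if p["created_at"] >= cutoff]


class TopPostCache:
    """Keeps at most `capacity` top posts ready to serve."""

    def __init__(self, capacity=25):
        self.capacity = capacity
        self.posts = []

    def replace(self, ranked_posts):
        self.posts = ranked_posts[:self.capacity]


def refresh_top_posts(db, cache, now, window_seconds=24 * 60 * 60):
    """One crawler pass: read recent posts from the DB and update the cache."""
    recent = db.posts_since(now - window_seconds)
    # Highest score first; newer posts win ties.
    recent.sort(key=lambda p: (p["score"], p["created_at"]), reverse=True)
    cache.replace(recent)
```

Servers can then answer home page requests from `cache.posts` without hitting the Post DB on every request.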

Image for the System design diagram

“Sorry for Bad Diagram & Writing Guys”

For any queries or correction/update requests, reach out to me on LinkedIn https://www.linkedin.com/in/imtiyaz-ahmad-khan-65843a126/ or at https://imtiyaz-khan-32866.web.app/#/


I A KHAN

Android & Flutter Expert with 6+ years of experience