Boomerang is an Indie Hackers-like application: think the Facebook homepage, the Tumblr dashboard, the Twitter timeline, or a combination of these.
- Write posts
- Comment on posts
- Upvote posts
- View homepage timeline
- Follow users and be followed by others
Facts & Numbers
- 10M active users
- 50K queries per second are spent on reads.
- Hundreds (~500) of requests per second are spent on writes.
Users consume more than they produce: more time is spent on reads than on writes.
The CDN will be used to cache static assets (images, CSS, and JS).
2. Load Balancer (+ more)
3. Web app servers (+ more)
Moving towards distributed services, split by the features we have:
- Timeline service: Knows about each user's home timeline.
- Post service: Saves and returns posts.
- Comment service: Saves and returns the comments of a post.
- User service: Returns user data.
5. Job Queues (+ more)
We rely on caching for all user data, including timelines, posts, comments, votes, etc. We'll use Redis. It offers persistence and replication, with flexibility in how we implement both.
A relational database, like MySQL, would work fine with sharding and replication.
8. Cloud storage (+ more)
Part I — Reading Pipeline
In our case, we have three major read actions: (1) the timeline, (2) a post and its comments, and (3) user data.
What’s in here?
All the data in the database and cache will be split into logical boxes. Each box is independent and self-contained: it has its own sharded database and caches, and holds all the data for a range of users.
So a box can satisfy all the operations for a given user.
Users are mapped into boxes and randomly distributed across them. All of a user's data (posts, comments, etc.) is stored in the same box, on the same shard, and the shard data is time-ordered.
Rendering a user profile or timeline doesn't require multiple cross-shard queries. It's fast. Huge advantage. No need to scatter-gather or jump around different shards. Shards don't communicate with each other. Easy to implement.
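The user-to-box assignment above can be sketched as a stable hash. This is a minimal sketch, assuming a fixed box count and hash-based placement; the names and the box count are illustrative, not the actual implementation.

```python
import hashlib

NUM_BOXES = 64  # hypothetical number of logical boxes


def box_for_user(user_id: int) -> int:
    """Return the box that owns ALL data for this user.

    A stable hash keeps the assignment effectively random but
    deterministic, so every service resolves the same user to
    the same box without any lookup table.
    """
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_BOXES
```

Because the mapping is deterministic, the timeline, post, comment, and user services all agree on where a user lives without coordinating.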
For high availability, shards run in multi-master replication mode. Each master is assigned to a different availability zone.
All tables exist on all shards. That means every shard has posts, comments, users (or any other) tables. All data in a shard is either an object (post, comment, user) or a mapping, like M-M or 1-M tables (a user has posts; a post has comments).
Other applications shard by feature instead: one set of shards for posts, another for comments, another for users, and so on.
So, what happens when a user views their timeline?
The timeline service goes to the user’s box. It then gets the user’s timeline from the cache. Every user’s home timeline is replicated across different machines. If one goes down, the other can pick up.
The timeline stored in the cache contains only the post ids, not the posts themselves (title, content, votes, etc.).
Well, we need the post titles. Given a list of post ids, go to the user's shard and get the (latest) post titles. Usually, the posts are found in the cache; that's another cache in front of the database, the posts cache.
Both the user timeline and posts caches are intentionally limited and store only the latest posts. Older entries are evicted when the cache is full.
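The capped, newest-first timeline cache can be sketched with an in-memory bound. In production this would be Redis (e.g. `LPUSH` plus `LTRIM`); the cap of 800 entries and the class name here are assumptions for illustration.

```python
from collections import deque

TIMELINE_CAP = 800  # hypothetical cap on cached post ids per user


class TimelineCache:
    """In-memory stand-in for the per-user timeline cache."""

    def __init__(self):
        self._timelines = {}  # user_id -> deque of newest-first post ids

    def push(self, user_id: int, post_id: int) -> None:
        timeline = self._timelines.setdefault(
            user_id, deque(maxlen=TIMELINE_CAP)
        )
        timeline.appendleft(post_id)  # newest first; oldest falls off when full

    def get(self, user_id: int):
        # None signals a cache miss; a list is a cache hit.
        timeline = self._timelines.get(user_id)
        return list(timeline) if timeline is not None else None
```

The `deque(maxlen=...)` gives the same "drop the oldest when full" behavior that `LTRIM` provides on a Redis list.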
What if there's a cache miss on the user timeline?
Then it's a disk hit. A timeline re-creation.
Go to the user's shard, get a list of the (latest) timeline post ids, and save them back to the timeline cache.
Next is to get the posts for the list of post ids we just queried. If the posts aren't in the cache, that's another database hit.
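The miss-then-rebuild path is a classic read-through. Here is a sketch under stated assumptions: `latest_post_ids` on the shard and the cache interface are hypothetical names, and the stand-in classes exist only to make the flow concrete.

```python
def get_timeline(user_id, cache, shard, limit=100):
    """Return the user's timeline post ids, rebuilding the cache on a miss."""
    post_ids = cache.get(user_id)
    if post_ids is None:                                  # cache miss: disk hit
        post_ids = shard.latest_post_ids(user_id, limit)  # time-ordered query
        cache.set(user_id, post_ids)                      # save back for next reader
    return post_ids


# Tiny in-memory stand-ins to exercise the path:
class SimpleCache:
    def __init__(self):
        self.d = {}

    def get(self, key):
        return self.d.get(key)

    def set(self, key, value):
        self.d[key] = value


class TimelineShardStub:
    def latest_post_ids(self, user_id, limit):
        return [3, 2, 1][:limit]  # pretend these are the newest post ids
```

The second call for the same user is served straight from the cache; the shard is only touched on the miss.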
It's important to remember that one shard stores ALL of a user's timeline data, made up of posts from the users they follow. No need to track the list of followed users at read time.
Harley Quinn, on the other hand, maintains lists of users' followers and users' tweet ids, stored in a graph database or using indexes.
How do we look up a specific user's data in a shard?
Instead of having a mapping table that maps the ids of, say, posts, comments, and users to different shards, which would require a lookup (an extra level of indirection), we can embed the location of the data in the id itself.
The data can be any user data, like posts, comments, or user meta-data.
So, if you want to look up a user whose id falls into, say, shard003:
One. Decompose the id into its 3 parts.
The first part is the shard id, the second is the type of the data (post, comment, user-meta, or any), and the last part is the local id: the row id within the table.
Two. Connect to the shard, go to the users table (or posts if the type is posts), and use the local id to find the user and return the data.
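The decomposition above can be sketched with plain bit arithmetic. The exact bit widths (16-bit shard id, 8-bit type, 40-bit local id) are an assumption; any split that fits the id space works.

```python
# Hypothetical type tags for the middle field of the id.
TYPE_USER, TYPE_POST, TYPE_COMMENT = 1, 2, 3


def make_id(shard_id: int, type_id: int, local_id: int) -> int:
    """Pack shard, type, and row id into one globally unique id."""
    return (shard_id << 48) | (type_id << 40) | local_id


def decompose_id(global_id: int):
    """Split a global id back into its 3 parts."""
    shard_id = global_id >> 48
    type_id = (global_id >> 40) & 0xFF
    local_id = global_id & ((1 << 40) - 1)
    return shard_id, type_id, local_id
```

Given any id, a service knows which shard to connect to and which table to read, with no lookup table in between.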
Harley Quinn uses a mapping table to map tweet ids to different shards.
It seems there are still chances to hit the database. Is there any way to avoid it?
Indexes on the ids in tables: user id, post id, comment id.
Indexes on the relationships: user id plus creation date for created posts or comments would make these queries much faster. This can be done within the mapping tables in the shard we talked about earlier.
Most users are interested in viewing the latest posts. Chances are, those are cached. The chances of hitting the database are low, though it's possible.
What about viewing a post's content, along with its comments?
It follows almost the same path as the timeline.
The post service goes to the post cache and asks: do you have the post with this post id? If yes, return it. The comments can also be attached to the post.
If not, then it's a database hit.
What happens if it's a database hit?
Same as before. Given the shard id, the type posts (the posts table), and the local (row) id, get the post from the database and store it back in the cache.
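The single-post path can be sketched the same way, with the shard id read straight out of the post id (using the composite-id scheme described earlier). The cache and shard interfaces below are hypothetical stand-ins, and the bit positions are illustrative.

```python
def get_post(global_post_id, cache, shards):
    """Return a post, falling back to its shard's database on a cache miss."""
    post = cache.get(global_post_id)
    if post is None:                                 # cache miss: database hit
        shard_id = global_post_id >> 48              # shard id lives in the id
        local_id = global_post_id & ((1 << 40) - 1)  # row id within the table
        post = shards[shard_id].fetch_post(local_id)
        cache.set(global_post_id, post)              # store it back in the cache
    return post


# Minimal in-memory stand-ins:
class PostCacheStub:
    def __init__(self):
        self.d = {}

    def get(self, key):
        return self.d.get(key)

    def set(self, key, value):
        self.d[key] = value


class PostShardStub:
    def __init__(self, posts):
        self.posts = posts  # local_id -> post row

    def fetch_post(self, local_id):
        return self.posts[local_id]
```

Note that no mapping table is consulted: the id alone is enough to route the query.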
Finally, the user data? The user profile.
Again. The user service will … you get the idea.
Given the shard id, the type users (the users table), and the local (row) id, get the user from the database. We can cache the user metadata if it's worth it.
Part II — Writing Pipeline
Hooray! Now, what happens when you write a new post?
A user publishes a post; the post is pushed to the user's shard, passing through the firehose, and is cached in the post cache.
We also write to all followers' timelines. In other words, all followers should have the published post in their boxes as well, so their timeline and post caches are populated too.
So: get the list of the user's followers from the user's shard, iterate over their boxes, and insert into each box's database and caches.
That's the benefit we get from doing the processing up front: spreading out published posts on write rather than on read, so read operations are very fast.
What if a user has a lot of followers? Are we going to iterate over all of them?
Well, posts from popular users (those with many followers) might not actually be published to everyone. Just select, intelligently or randomly, a subset of them.
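The fan-out step, including the subset cutoff for very popular users, can be sketched as follows. The `FANOUT_CAP`, the random subset choice, and the dict-based cache stand-in are all assumptions for illustration.

```python
import random

FANOUT_CAP = 10_000  # hypothetical cutoff for users with huge follower counts


def fan_out_post(post_id, followers, timeline_caches):
    """Push a new post id onto each follower's cached timeline.

    `followers` is the author's follower list (read from the author's shard);
    `timeline_caches` maps user_id -> newest-first list of post ids, a
    stand-in for the per-box timeline caches.
    """
    targets = followers
    if len(followers) > FANOUT_CAP:
        # Don't deliver to everyone; pick a (here: random) subset instead.
        targets = random.sample(followers, FANOUT_CAP)
    for follower_id in targets:
        timeline_caches.setdefault(follower_id, []).insert(0, post_id)
```

Each follower's box gets the post id prepended to its timeline, which is exactly what makes the later read so cheap.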
That’s a lot of duplication, isn’t it?
That's right. We accept the duplication in exchange for performance.
Other scenarios: when I comment on someone else's post, the comment should be stored in their shard, not mine.
Though it could be stored in my shard as well. That's a tradeoff between performance and disk usage.
It seems the process of writing a post takes a lot of time.
Yes. Write operations are passed to the queue for asynchronous execution in the background. We don't want to keep the user waiting until the post is sent to all their followers.
The good news is that, since we split our data sources (our database and caches) into separate logical boxes, each for a range of users, every box becomes a unit of parallelization that can be scaled as the user base grows.
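Handing the fan-out to a background worker can be sketched with the standard library. `queue.Queue` stands in for a real broker (e.g. a Redis-backed job queue); the function names are illustrative.

```python
import queue
import threading

jobs = queue.Queue()


def enqueue_fanout(author_id, post_id):
    """Called by the request handler: returns at once, user isn't kept waiting."""
    jobs.put((author_id, post_id))


def worker(handle):
    """Background worker: drains the queue and runs the slow fan-out."""
    while True:
        job = jobs.get()
        if job is None:      # sentinel: shut the worker down
            break
        handle(*job)         # e.g. push the post to every follower's box
        jobs.task_done()
```

The request path only pays for one `put`; everything expensive happens on the worker thread (or, in a real deployment, a fleet of worker processes).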
What if some users on a shard are so active that they break the balance between shards?
We can rebalance: migrate certain users manually.
Doing it automatically has complications: it might cause data corruption, or improper balancing that cannot easily be fixed.
Thank you for reading! If you enjoyed it, please clap 👏 for it.