Walking Through The System Design Interview
What if we had 1 billion users? Does your design still scale?
If you haven’t gone through a technical interview, then you might not know what a system design interview round is like.
A system design interview round will often only involve one or two big questions in which the interviewer will ask you to design a system or app that usually already exists.
These rounds are not just for software engineers. Data engineers, support engineers, and even data scientists can find themselves having to design systems that replicate TinyURL, Twitter, Instagram, Uber, or a parking lot app.
Why Is System Design Important?
The purpose of this question is to go beyond your standard data structures and algorithm questions. The interviewer is looking to make sure you can think about the big picture.
They want to see that you think about operational scenarios, edge cases, limitations, and assumptions. Not all engineers want this skill. Some programmers want to focus only on programming and less on the big picture.
But the big picture is important (contrary to the focal point of my satire on over-experienced engineers)! Just to clarify, the big picture starts at the end-user and ends at the back-end. Some may argue that the recent Boeing accidents came down to bad system design. From what it sounds like, the actual software and hardware worked well, but the end-user aspect was not handled correctly. This is something engineers have to take into account. Even if your system is built to spec, that doesn’t mean the application or hardware will be used correctly.
The desire to focus on the bigger picture and see these kinds of edge cases before they happen is how engineers go from junior to senior.
So the question becomes, how do you take on the system design interview question?
Your first step should be to list out the features you will be designing for. For example, if you are designing an Instagram copycat, then you will probably need to be able to upload images, comment/like, and follow users. Twitter is honestly not that much different, except it focuses more on words and less on images.
The point of this stage is just to lay out what areas you will be going more in-depth in. It also gives the interviewer a chance to guide you in a different direction if they have specific features they are interested in seeing.
Generally, I feel most comfortable starting with the database before discussing the application side.
For example, in Twitter’s case, we can simplify this to three tables for now. That would be users, tweets, and followers. There are a lot of other features that at some point might need to be included, including likes and trends. But you have limited time and if your interviewer doesn’t pry too far into it, then don’t get bogged down in the details. This is about high-level design, so discussing the type of database and every specific field isn’t necessary.
It can be a quick trip to over-complication if you try adding too many features and tables to support said features in one go.
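To make the three-table idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The column names (handle, body, created_at, follower_id, followee_id) are assumptions made up for illustration; the article only names the three tables, and in an interview you would not need this level of detail.

```python
import sqlite3

# Hypothetical, deliberately minimal schema: three tables and only the
# columns needed to talk through the design. None of these column names
# come from a real system.
SCHEMA = """
CREATE TABLE users (
    user_id  INTEGER PRIMARY KEY,
    handle   TEXT NOT NULL UNIQUE
);
CREATE TABLE tweets (
    tweet_id   INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    body       TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE followers (
    follower_id INTEGER NOT NULL REFERENCES users(user_id),
    followee_id INTEGER NOT NULL REFERENCES users(user_id),
    PRIMARY KEY (follower_id, followee_id)
);
"""


def create_db() -> sqlite3.Connection:
    """Create an in-memory database holding the three tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


conn = create_db()
conn.execute("INSERT INTO users (user_id, handle) VALUES (1, 'alice'), (2, 'bob')")
conn.execute("INSERT INTO followers VALUES (2, 1)")  # bob follows alice
conn.execute("INSERT INTO tweets (user_id, body) VALUES (1, 'hello world')")

# List the tables that now exist.
table_names = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

Likes and trends would each add tables of their own, which is exactly why it helps to leave them out until the interviewer asks.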
Once the tables are sketched out, you can focus on the other side: the user and how they interact with the app. We are assuming that this is an app for now, so you will have a phone making requests, more than likely as standard HTTPS requests to some sort of externally facing load balancer.
Internally, your system probably has several services that then take the requests from the load balancer.
In this case, you can break down the services. For example, there is probably a service for users, one for tweets, and one for the timeline or feed. The feed is probably one of the most common services the app pulls from, since it is what loads when you land on the homepage of the app.
This is a key point because it tells us that Twitter is a read-heavy app. Keep it in mind as you design, because it changes how you architect your back-end.
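The request path described above (phone, then load balancer, then one of several internal services) can be sketched in a few lines. This is a toy model under stated assumptions: the path prefixes, the round-robin policy, and the number of instances per service are all made up for illustration, and real load balancers do far more.

```python
from itertools import cycle


def make_instance(service: str, instance_id: int):
    """Return a fake service instance that just reports who handled the request."""
    def handle(request: dict) -> dict:
        return {"service": service, "instance": instance_id, "path": request["path"]}
    return handle


class LoadBalancer:
    """Toy load balancer: pick a pool by path prefix, then round-robin within it."""

    def __init__(self, pools: dict):
        # pools maps a path prefix to the list of instances serving it.
        self._pools = {prefix: cycle(instances) for prefix, instances in pools.items()}

    def route(self, request: dict) -> dict:
        for prefix, pool in self._pools.items():
            if request["path"].startswith(prefix):
                return next(pool)(request)
        return {"error": 404, "path": request["path"]}


# Hypothetical sizing: the read-heavy feed service gets the most instances.
lb = LoadBalancer({
    "/users": [make_instance("user", i) for i in range(2)],
    "/tweets": [make_instance("tweet", i) for i in range(2)],
    "/feed": [make_instance("feed", i) for i in range(3)],
})

responses = [lb.route({"path": "/feed/42"}) for _ in range(4)]
```

Giving the feed pool more instances than the others is one simple way to show an interviewer that you noticed the read-heavy pattern.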
Now if you recall, earlier we roughly designed a database to manage tweets.
With our current infrastructure, there is an issue.
An end-user opens the app on their phone, which sends a request to the load balancer to get the feed. The load balancer then passes the request to a feed service, which in turn has to query the database before it can return anything.
Have you ever tried to run a select statement on a very large database? Even if you set up some sort of parent-child relationship between databases or add lots of nodes, it really doesn’t scale that well. With millions of users and terabytes of data, this process would be quite slow.
Think about it: if a user wants to get their feed and they follow 300 people, you would have to either send off 300 queries, each selecting the tweets for one user_id from 1 to 300, or write one massive clause like user_id IN (1, 2, …, 300). Both are slow at this scale and would eventually bog down the server once millions of users start querying for the hundreds or thousands of accounts they follow.
How would Twitter handle this?
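Here is a rough sketch of what that read-time approach looks like, using sqlite3 and a cut-down version of the tables. The function name feed_on_read and the sample data are assumptions for illustration. On a toy database this runs instantly; the point is that every feed request repeats this work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tweets (
    tweet_id INTEGER PRIMARY KEY,
    user_id  INTEGER NOT NULL,
    body     TEXT NOT NULL
);
CREATE TABLE followers (
    follower_id INTEGER NOT NULL,
    followee_id INTEGER NOT NULL
);
""")

# Made-up data: user 1000 follows users 1..300, and each of them has one tweet.
conn.executemany(
    "INSERT INTO followers VALUES (?, ?)",
    [(1000, uid) for uid in range(1, 301)],
)
conn.executemany(
    "INSERT INTO tweets (user_id, body) VALUES (?, ?)",
    [(uid, f"tweet from {uid}") for uid in range(1, 301)],
)


def feed_on_read(conn: sqlite3.Connection, follower_id: int, limit: int = 20) -> list:
    """Build the feed at request time with one large IN (...) query."""
    followee_ids = [
        row[0]
        for row in conn.execute(
            "SELECT followee_id FROM followers WHERE follower_id = ?", (follower_id,)
        )
    ]
    if not followee_ids:
        return []
    # One placeholder per followed account: 300 follows means 300 parameters.
    placeholders = ",".join("?" * len(followee_ids))
    query = (
        "SELECT tweet_id, user_id, body FROM tweets "
        f"WHERE user_id IN ({placeholders}) "
        "ORDER BY tweet_id DESC LIMIT ?"
    )
    return conn.execute(query, (*followee_ids, limit)).fetchall()


feed = feed_on_read(conn, 1000)
```

The query has to look across the tweets of every followed account just to return the newest 20, and it does so again on every refresh.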
Generally, an interviewer at this point might start prodding your design by adding in what-ifs, such as: what if the number of users were to double overnight? This will force you to think not only about functionality but also about scale.
This is one of the key focuses of a system design interview. They don’t just want to see that you can program features; they want to see that you can design for scale.
In this case, the answer is that they pre-compute timelines before users make a request and store them in memory. In particular, they would be likely to use an in-memory database such as Redis.
First, the tweet itself needs to be stored so that it persists. Then we send the tweet to every user who follows its author. This is where the fan-out comes into play: one tweet fans out into one write per follower, and each follower’s pre-computed timeline in Redis is updated with the new tweet. This update occurs regardless of whether the follower is currently signed in. This way, the work of building the feed does not need to happen in real time, when the user asks for it.
The home timeline could essentially be represented by an array or list of tweet_ids, which can then be used to quickly pull the tweets from the tweet table when a user logs on. This can help handle the scale issue from the perspective of normal users.
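The pre-computed timeline idea can be sketched with plain Python containers standing in for Redis. Everything here is an assumption for illustration: the TimelineStore class, its method names, and the TIMELINE_LENGTH cap are made up, and a real system would use an actual in-memory store rather than dictionaries.

```python
from collections import defaultdict, deque
from itertools import count

TIMELINE_LENGTH = 800  # hypothetical cap on tweet_ids kept per timeline


class TimelineStore:
    """Toy fan-out-on-write: dicts and deques stand in for the in-memory store."""

    def __init__(self):
        self.tweets = {}                       # tweet_id -> (author_id, body)
        self.followers = defaultdict(set)      # author_id -> set of follower ids
        # user_id -> newest-first tweet_ids, trimmed to TIMELINE_LENGTH
        self.timelines = defaultdict(lambda: deque(maxlen=TIMELINE_LENGTH))
        self._ids = count(1)

    def follow(self, follower_id: int, author_id: int) -> None:
        self.followers[author_id].add(follower_id)

    def post_tweet(self, author_id: int, body: str) -> int:
        # 1. Store the tweet itself.
        tweet_id = next(self._ids)
        self.tweets[tweet_id] = (author_id, body)
        # 2. Fan out: one timeline write per follower, signed in or not.
        for follower_id in self.followers[author_id]:
            self.timelines[follower_id].appendleft(tweet_id)
        return tweet_id

    def home_timeline(self, user_id: int, limit: int = 20) -> list:
        # Reading is just a list lookup plus fetching tweets by id.
        tweet_ids = list(self.timelines[user_id])[:limit]
        return [self.tweets[tweet_id] for tweet_id in tweet_ids]


store = TimelineStore()
store.follow(follower_id=2, author_id=1)
store.follow(follower_id=3, author_id=1)
store.post_tweet(1, "first")
store.post_tweet(1, "second")
timeline = store.home_timeline(2)
```

The cost has moved from read time to write time: home_timeline does no searching at all, while post_tweet does one write for every follower of the author.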
Edge Cases and Scale
There are further concepts you will need to consider when developing this application.
For example, how will you handle edge cases where a user has millions of followers? Will it still be ideal to pre-compute timelines for all of those millions of followers?
Can you foresee any problems using the pre-compute technique?
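A quick back-of-the-envelope calculation shows why this edge case matters. The follower counts and tweets-per-day figures below are invented for illustration; the only assumption carried over from the pre-compute approach is that each tweet costs one timeline write per follower.

```python
def fanout_writes(follower_count: int, tweets_per_day: int) -> int:
    """Timeline writes per day if every tweet is copied to every follower."""
    return follower_count * tweets_per_day


# Made-up numbers, purely to compare orders of magnitude.
typical_user = fanout_writes(follower_count=300, tweets_per_day=5)
celebrity = fanout_writes(follower_count=5_000_000, tweets_per_day=5)
ratio = celebrity // typical_user
```

With these made-up figures, a single account with millions of followers generates thousands of times more timeline writes than a typical account, which is a good thread to pull on when the interviewer asks about problems with pre-computing.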
When you are doing a system design interview, answering one question will usually just lead to another.
So do make sure to think about edge cases!
System design interviews are actually a lot of fun for those of us who like thinking through an entire system. They can be challenging and make you go beyond just coding.
How will system A interact with system B? What happens when we increase users or add new countries?
What about if there is a power outage or AWS breaks?
Answering all the what-ifs and coming up with good solutions is a personal favorite of mine. So don’t let the system design interview round scare you. Think about a few apps and try designing them out!