An Efficient Framework to Approach System Design Problems

Li Lu
Published in The Startup
6 min read · Aug 10, 2020
A scenic view of the Adirondack Mountains area, upstate New York

In my experience as both a software developer and a mentor for software developers, the buzzword “system design” has popped up quite frequently. In fact, whether in real-life software development or in software engineering interviews, “system design” is right at center stage. I fully understand the frustration with so-called “system design” problems, and I have to be frank: I struggled with them, in both real-life scenarios and interviews, for quite some time. All of these experiences, some rewarding, some humbling, led me on a journey to find a good framework for approaching system design challenges in a systematic fashion. Here are some of my thoughts.

The Big Picture

To start with, let’s talk about some general ideas about designing software systems. Specifically, I have noticed three basic things that make design-related topics very different from other technical topics.

  1. There may be no ultimately perfect design, only multiple sufficiently good solutions. Different engineers may have completely different perspectives, and we need to embrace this diversity. As the saying goes, there are a thousand Hamlets in a thousand people’s eyes.
  2. How you reach the conclusion is far more important than the conclusion itself. Actions and deliverable results are crucial, but the questions raised, the inspirations, and the brainstorming during the discussion are what further improve the outcome of the design.
  3. Rome wasn’t built in a day. Gathering and understanding all the information needed for the discussion may require years of experience, so it’s not practical to build your design skills on adrenaline alone.

Given these observations, there are two types of real challenges when pursuing a design. The first is knowing too little about the field: to deliver a good design, one needs a good understanding of multiple areas of computer science, especially computer systems. The second is knowing too much: sometimes we realize there are too many problems to solve and we don’t know where to start. We may pick a starting point at random but easily get distracted. Then, hours of discussion later (or half an hour later, in an interview), we realize we got sidetracked and forgot something crucial.

The model in this post addresses the second type of challenge. For the first type, I recommend building your Rome gradually: get a great system design reading list and start reading today.

The Model to Approach a Good Design

So here comes the fun part. Even though system design problems pose significant challenges to our software development process, both in a real working environment and in interviews, I have noticed that there are general rules. After summarizing these rules, here is the model I use to approach a good design.

Specifically, we can approach a good design by considering the Use cases, the Requirements, the Commitment plans, and the System itself. To make it easy to remember, I simply call it “the URCS model”, dedicated to my Ph.D. program at the Computer Science Department, University of Rochester.

Use cases

Understanding the use cases helps us understand how the system works for our users, and how we can leverage facts about its usage to optimize the system. In this step, we convert a vague business requirement into a concrete software problem. We can start by asking ourselves:

a. Who are the users of our system?

b. What is the end-to-end workflow for each type of user?

c. Are there any specific usage patterns? For example, how many readers, writers, and admins will be using the system? Which of them are frequent users? Which are infrequent?
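One way to make question (c) concrete is to write the user types down with rough numbers and see what they imply. The sketch below is only an illustration: the `UserType` class, the user counts, and the per-day rates are all made-up assumptions, not numbers from any real system.

```python
from dataclasses import dataclass


@dataclass
class UserType:
    """One kind of user and a rough guess at their daily activity (hypothetical)."""
    name: str
    count: int             # how many users of this type
    reads_per_day: float   # reads issued by one such user per day
    writes_per_day: float  # writes issued by one such user per day


def read_write_ratio(user_types):
    """Total reads divided by total writes across all user types."""
    total_reads = sum(u.count * u.reads_per_day for u in user_types)
    total_writes = sum(u.count * u.writes_per_day for u in user_types)
    if total_writes == 0:
        return float("inf")  # a read-only workload
    return total_reads / total_writes


# Made-up usage pattern: many frequent readers, few writers, a handful of admins.
users = [
    UserType("reader", 100_000, reads_per_day=50, writes_per_day=0),
    UserType("writer", 2_000, reads_per_day=10, writes_per_day=20),
    UserType("admin", 10, reads_per_day=5, writes_per_day=1),
]
print(f"reads per write: {read_write_ratio(users):.0f}")
```

A ratio this lopsided would suggest a read-heavy system, which is exactly the kind of fact we can leverage later when optimizing the design.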

Once we have a good understanding of the use cases, we have translated our problem completely into the technical world. We can then move forward on the technical side.

Requirements

There are multiple perspectives we need to consider to define our system solidly. This is one of the main steps toward a good design, since we really need to know what we are building before we go any further. We can start by asking ourselves questions from different perspectives.

a. System Scale

i. User: How many users? How many active users? What is the expected QPS (queries per second)? How good is the quality of the requests?

ii. Data: How much data are we generating and storing when the system is running?

b. Latency requirements

i. Should we deliver an online system or an offline system? What is the expected latency of the system? Nanoseconds? Microseconds? Milliseconds? Minutes? Hours? Days?

ii. Based on the latency requirement, for disk I/O, are we limited by seeks or by throughput? What about network I/O?

iii. Should we optimize for the average, the 80th percentile, or the 99th percentile?

c. Bandwidth consumption

i. Based on the data we will process (covered in item a.ii), will disk I/O or the network become the bottleneck?

d. Semantics

i. Fault tolerance of the system: What if one component goes wrong? What state is held only in memory? What will we lose?

ii. Do we need high availability (HA)? If so, decide the HA strategy for each component.

iii. Do we maintain strong consistency? If so, how do we guarantee this even with network partitions?

iv. Keep the CAP theorem in mind: during a network partition, a system has to give up either consistency or availability.

e. Other potential limitations/upper bounds

i. System: The number of nodes? The number of active connections? The total number of network connections (capped by the fixed number of ports)?

ii. User: upper limit capped by the total population. Can we further cap this? For example, cap by organization sizes, or by their geolocation.

iii. Other requirements: Privacy? Security? Politeness/interference when running the system?
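Several of the questions above (scale, bandwidth, and which latency number to optimize for) turn into simple arithmetic once we plug in rough numbers. Here is a minimal back-of-envelope sketch. Everything in it is an assumption for illustration: the function names, the 3x peak factor, the user and request counts, the 1 Gbit/s link, and the latency samples are all made up.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def estimate_load(daily_active_users, requests_per_user_per_day,
                  bytes_per_request, peak_factor=3.0):
    """Back-of-envelope scale numbers. Every input is an assumption to replace."""
    requests_per_day = daily_active_users * requests_per_user_per_day
    average_qps = requests_per_day / SECONDS_PER_DAY
    peak_qps = average_qps * peak_factor  # assumed peak-to-average ratio
    return {
        "average_qps": average_qps,
        "peak_qps": peak_qps,
        "bytes_stored_per_day": requests_per_day * bytes_per_request,
        "peak_bytes_per_second": peak_qps * bytes_per_request,
    }


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up inputs: 1M daily active users, 20 requests each, 2 KB per request.
load = estimate_load(1_000_000, 20, 2_000)
print(f"average QPS: {load['average_qps']:.0f}, peak QPS: {load['peak_qps']:.0f}")
print(f"stored per day: {load['bytes_stored_per_day'] / 1e9:.0f} GB")

# Compare peak traffic with an assumed 1 Gbit/s link (125,000,000 bytes/s).
link_bytes_per_second = 1e9 / 8
utilization = load["peak_bytes_per_second"] / link_bytes_per_second
print(f"link utilization at peak: {utilization:.1%}")

# Made-up latency samples in milliseconds, including two slow outliers.
latencies_ms = [12, 15, 11, 14, 13, 250, 16, 12, 13, 900]
mean_ms = sum(latencies_ms) / len(latencies_ms)
print(f"mean: {mean_ms} ms, p80: {percentile(latencies_ms, 80)} ms, "
      f"p99: {percentile(latencies_ms, 99)} ms")
```

With these made-up samples the mean is 125.6 ms, while the 80th percentile is 16 ms and the 99th is 900 ms. The mean describes neither the typical request nor the worst ones, which is why it matters which of the three we commit to in our requirements.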

Commitment plan

We then need to be sure about how we can deliver the designed features. Ideally, all requirements would be satisfied in one batch. However, in the real world, we always need to adjust our plans and expectations due to time and budget limits. Therefore, before we get down to the actual system, we need to think through, from a pure engineering perspective, the steps to reach our goal.

a. What’s our overall strategy? Do we need a quick or a sustainable solution? What are our cost limitations?

b. Define the basic feature set for our MVP (minimum viable product).

c. Define our development plan in phases. Initially, we may focus on phase I, but we definitely need to keep future phases in mind.

d. Consider our development velocity and risk management strategy. How fast can we reach our goal? When should we check progress? And how do we manage the risk if something goes wrong during the development process? Is there a Plan B?

System overview

We start with the interfaces: define their semantics and their type. Are they APIs, CLIs, or UIs? If they are APIs, what kind of API should we go with? For example, RPC or REST? And why?
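To show what “define the interface and its semantics” can look like before choosing RPC or REST, here is a small sketch for a hypothetical key-value service. The `KeyValueStore` interface, its method names, and the semantics written in the docstrings are all my illustrative assumptions, not part of any particular system.

```python
from typing import Dict, Optional, Protocol


class KeyValueStore(Protocol):
    """Hypothetical interface. The docstrings are where the semantics get pinned down."""

    def put(self, key: str, value: bytes) -> None:
        """Store value under key, overwriting any previous value (last write wins)."""
        ...

    def get(self, key: str) -> Optional[bytes]:
        """Return the stored value, or None if the key is absent."""
        ...

    def delete(self, key: str) -> bool:
        """Remove the key. Return True if it existed, False otherwise."""
        ...


class InMemoryStore:
    """Trivial single-process implementation, handy for checking the semantics."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        return self._data.pop(key, None) is not None


store: KeyValueStore = InMemoryStore()
store.put("greeting", b"hello")
print(store.get("greeting"), store.get("missing"), store.delete("greeting"))
```

Writing the contract down first keeps the transport decision separate: the same three operations could later be exposed as RPC methods or as REST endpoints, and the “why” can be argued on its own.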

We can then start our design, split it into components, and draw the diagrams. One quick comment here: our initial design should be one that satisfies our requirements. I do not recommend a “land and then expand” strategy in which we gradually discover new bottlenecks that should have been covered by our initial requirements. Past engineering and interview experiences taught me hard lessons on that.

Once you have reached this step, congratulations, you’re on your way to a good design!

Final Thoughts

Thank you very much for reading this far. If you found this helpful, don’t forget to add a clap! (This is the first time I have used Medium, and hopefully this is how things work.) I understand some readers may feel disappointed that this is not THE magical article that suddenly makes you understand all the concepts mentioned above. However, I hope it serves as a good framework, and a starting point, to help you glide through your next design.

All of the content comes from my past experience, so some of it may be wrong (or may even be ridiculous). If you have any comments, critiques, or suggestions, or if you find any mistakes, please do not hesitate to comment inline or send a response. I’m more than happy to connect and discuss.

Disclaimers

  • The model discussed in this post is not intended to be the ultimate solution to system design discussions. It solely serves as an outline, or a starting point, to systematically initiate all related design discussions.
  • The points in this article only reflect my own view. This article has absolutely NO relationship with any of my previous or future employers. Here let’s limit our discussion to the pure technology side.
  • I reserve the right to add new disclaimers.

Li Lu: Apache Hadoop PMC member, Ph.D. in Computer Science