System Design 101: The magical word “Scalability”

Published in The Startup · 4 min read · May 7, 2020

Ever wondered how applications such as Facebook, Instagram, WhatsApp, and Netflix seamlessly serve billions of users every day? No matter how polished these applications look, every user request, whether it is sending a message or scrolling through a timeline, is ultimately processed by the hardware of some machine. The hardware on any given machine is finite, so at any given time a software system can serve only a limited number of user requests; beyond that limit it slows down or even crashes. So what is the secret sauce that lets modern applications handle billions of user requests every day? Scalability.

Scalability (noun) — The ability to be changed in size, power, capacity, etc.

There are two ways of scaling a system when the number of user requests increases or decreases:

Increasing/decreasing the capacity of specific system components (vertical scaling)
Increasing/decreasing the number of specific system components (horizontal scaling)

Let’s quickly explore both of them.

Increasing/decreasing the capacity of specific system components (vertical scaling)

We refer to this approach as vertical scaling or scaling up/down.

Scaling up: upgrading the capacity of a system component to handle increased load on that component. For example, in response to an increase in user requests, scaling up our server machine means adding RAM or upgrading its processor.

Scaling down: downgrading the capacity of a system component when the load on that component decreases. For example, in response to a decrease in user requests, scaling down our server machine means reducing its RAM or downgrading its processor.

This approach keeps our system design simple and requires no changes to our existing application: we only alter the capacity of components we already have, so no major design complexity is introduced. On the flip side, it scales our system only up to a point, because every system component has an upper limit on its capacity.
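To make that ceiling concrete, here is a minimal sketch of vertical scaling. The per-core throughput and the largest available machine size are made-up numbers, and `server_for_load` is a hypothetical helper, not a real API.

```python
from dataclasses import dataclass

# Hypothetical numbers chosen only for illustration.
REQUESTS_PER_CORE = 500      # requests/second one CPU core is assumed to handle
MAX_CORES_PER_MACHINE = 64   # assumed largest machine we can buy or rent


@dataclass
class Server:
    cores: int

    @property
    def capacity(self) -> int:
        """Requests per second this server handles under the assumptions above."""
        return self.cores * REQUESTS_PER_CORE


def server_for_load(load: int) -> Server:
    """Size a single server for `load`, or fail if no single machine is big enough."""
    needed_cores = max(1, -(-load // REQUESTS_PER_CORE))  # ceiling division
    if needed_cores > MAX_CORES_PER_MACHINE:
        raise ValueError("load exceeds the largest available machine")
    return Server(cores=needed_cores)
```

Scaling up or down is just a call with a new load figure; the `ValueError` branch is the upper limit that vertical scaling eventually runs into.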

One more downside of this approach is that it leaves a single point of failure in our system. For example, we would have just one server machine to scale; if it goes down for some reason, such as a power cut, our entire application goes down with it.

Increasing/decreasing the number of specific system components (horizontal scaling)

We refer to this approach as horizontal scaling or scaling in/out.

Scaling out: adding instances of a system component to handle increased load on that component. For example, in response to an increase in user requests, scaling out our server machine means adding more server machines to our system, each with the same configuration.

Scaling in: removing instances of a system component when the load on that component decreases. For example, in response to a decrease in user requests, scaling in our server machine means removing previously added machines from our system.
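A rough sketch of how a scale-out or scale-in decision could be computed is shown here. The per-instance capacity and the function names are assumptions made for illustration; real autoscalers use richer signals than a single load number.

```python
import math

# Hypothetical numbers chosen only for illustration.
INSTANCE_CAPACITY = 1000  # requests/second one instance is assumed to handle
MIN_INSTANCES = 1         # never scale in below this


def desired_instances(load: int) -> int:
    """Number of identical instances needed to carry `load`."""
    return max(MIN_INSTANCES, math.ceil(load / INSTANCE_CAPACITY))


def scaling_action(current: int, load: int) -> str:
    """Describe whether to scale out, scale in, or leave the fleet alone."""
    target = desired_instances(load)
    if target > current:
        return f"scale out: add {target - current}"
    if target < current:
        return f"scale in: remove {current - target}"
    return "no change"
```

For example, with 2 instances running and a load of 4,500 requests per second, `scaling_action(2, 4500)` returns `"scale out: add 3"`.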

At any given time, the load is divided among all the instances of the scaled component, so each instance has to cope with only a fraction of the total load.
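One simple way to divide the load is round-robin, where requests are handed to the instances in turn. The sketch below assumes three hypothetical servers and nine numbered requests; it is one possible strategy, not the only one.

```python
from collections import Counter
from itertools import cycle


def distribute(requests, instances):
    """Assign each request to an instance in round-robin order."""
    if not instances:
        raise ValueError("at least one instance is required")
    rotation = cycle(instances)
    return {request: next(rotation) for request in requests}


# Hypothetical server names and request ids, for illustration only.
assignments = distribute(range(9), ["server-a", "server-b", "server-c"])
per_instance = Counter(assignments.values())
# Each of the three servers ends up with 3 of the 9 requests.
```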

In contrast with vertical scaling, this approach adds complexity to our system design. For instance, the design must address how requests are divided among the instances. On the brighter side, it allows our system to scale almost indefinitely, as there is no hard limit on the number of instances we can add. It also removes the single point of failure for the scaled component: the failure of a single instance no longer brings down the entire application, because the remaining instances can keep serving requests.
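Here is a minimal sketch of why one failed instance need not take the application down. The `LoadBalancer` class and its methods are hypothetical, and health is tracked by hand here, whereas a real load balancer would probe its instances.

```python
class LoadBalancer:
    """Round-robin router that skips instances marked as down (illustrative only)."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._next = 0  # index of the next instance to try

    def mark_down(self, instance):
        self.healthy.discard(instance)

    def mark_up(self, instance):
        if instance in self.instances:
            self.healthy.add(instance)

    def route(self):
        """Return the next healthy instance, trying each instance at most once."""
        for _ in range(len(self.instances)):
            candidate = self.instances[self._next]
            self._next = (self._next + 1) % len(self.instances)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy instances available")
```

If `"b"` is marked down in a fleet of `"a"`, `"b"`, and `"c"`, requests keep flowing to `"a"` and `"c"`; the application fails only when every instance is down.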

So, which approach is better?

There are pros and cons to both approaches; hence, modern applications tend to use a hybrid approach to get the best of both worlds. A hybrid approach can be thought of as a horizontally scalable system component in which each instance is itself vertically scalable. There will still be trade-offs, such as design complexity, but hey, nothing’s perfect!
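As a rough illustration of the hybrid idea, the sketch below grows a single instance until it hits a size cap and only then adds more instances of that maximum size. The numbers and the policy itself are assumptions made for this example; a real system might also keep at least two instances running to avoid a single point of failure.

```python
import math

# Hypothetical numbers chosen only for illustration.
REQUESTS_PER_CORE = 500       # requests/second one CPU core is assumed to handle
MAX_CORES_PER_INSTANCE = 8    # assumed cap on how far one instance can scale up


def plan_fleet(load: int) -> tuple[int, int]:
    """Return (instance_count, cores_per_instance) for `load`.

    Policy: scale up one instance first, then scale out with full-size instances.
    """
    cores_needed = max(1, math.ceil(load / REQUESTS_PER_CORE))
    if cores_needed <= MAX_CORES_PER_INSTANCE:
        return 1, cores_needed  # vertical scaling is enough
    count = math.ceil(cores_needed / MAX_CORES_PER_INSTANCE)
    return count, MAX_CORES_PER_INSTANCE  # horizontal scaling takes over
```

With these numbers, a load of 1,500 requests per second fits on one 3-core instance, while 10,000 requests per second needs three 8-core instances.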

I hope you learned something new today. This was a very high-level overview of the vast realm of system scalability. In upcoming articles, I’ll dig deeper into the technical intricacies of designing and implementing scalable systems. If interested, make sure to hit the follow button.

If you liked this article, leave a clap, and let me know your thoughts, queries, or suggestions in the comments section below. 😃

Written by Faisal Sheikh