Day 1 of System Design

Apoorv Bedmutha
3 min read · May 31, 2023


Every large project starts with a piece of code that can be used by some set of users.

Now, to make this code accessible to consumers, we need to connect it to a network.

In most cases the consumers are at geographically different locations, and the number of consumers can also be large.

Hence, very often, the network used to make our code accessible to consumers is the internet.

Now both the consumers and the developer are connected to the internet, so how will the consumers access the piece of code that currently lives on the developer’s local machine?

To achieve this, we have to build an API that the consumers can use.

An API (Application Programming Interface) is a set of protocols that allows two pieces of software to communicate with each other.
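To make this concrete, here is a minimal sketch of such an API in Python using Flask (the endpoint name and response payload are made up for illustration):

```python
# A minimal API sketch using Flask. The /greet endpoint and its
# payload are illustrative, not part of any real service.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/greet")
def greet():
    # The "piece of code" we want to share, exposed over HTTP.
    return jsonify({"message": "Hello from the developer's code!"})

if __name__ == "__main__":
    # Bind to all interfaces so the API is reachable over the network.
    app.run(host="0.0.0.0", port=5000)
```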

Now the consumers can access the developer’s code whenever they wish, but what if the developer’s machine shuts down or crashes?

In such a scenario the whole system will come to a halt.

This is known as a single point of failure.

To solve this, we take help from a cloud service like AWS, GCP, or Azure.

These cloud providers rent us servers that are accessible through the internet.

It is sensible to rent at least two servers, so that if one server crashes the other can take over the load.
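As a rough sketch of why the second server helps, a client can retry against a backup when the primary is unreachable (the server URLs below are hypothetical):

```python
# Hypothetical failover: try each rented server in turn.
import requests

SERVERS = [
    "http://server-1.example.com",
    "http://server-2.example.com",
]

def fetch_greeting():
    for base_url in SERVERS:
        try:
            resp = requests.get(f"{base_url}/greet", timeout=2)
            resp.raise_for_status()
            return resp.json()  # first healthy server wins
        except requests.RequestException:
            continue  # this server is down or slow; try the next one
    raise RuntimeError("all servers are down")
```

Real deployments usually put a load balancer or DNS failover in front rather than retry logic in every client, but the idea is the same: no single machine should be a point of failure.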

The servers will also need some form of shared storage for the code they serve.

Where organized data storage is needed, a DBMS such as MySQL, PostgreSQL, or DB2 is attached.
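As a tiny illustration of attaching organized storage, here is a sketch using Python’s built-in sqlite3 module (the table and columns are made up; a real deployment would point at a server-grade DBMS instead):

```python
import sqlite3

# Illustrative schema: record the consumers who call our API.
conn = sqlite3.connect("app.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS consumers (id INTEGER PRIMARY KEY, name TEXT)"
)
conn.execute("INSERT INTO consumers (name) VALUES (?)", ("alice",))
conn.commit()

for row in conn.execute("SELECT id, name FROM consumers"):
    print(row)  # e.g. (1, 'alice')
conn.close()
```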

Coming back to APIs: the consumers make a request to the server to access the shared program, and based on the request the server gives an appropriate response.
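On the consumer’s side, that exchange might look like this (the URL is hypothetical):

```python
import requests

# The consumer sends a request to the server...
response = requests.get("http://server-1.example.com/greet", timeout=5)

# ...and the server gives an appropriate response.
print(response.status_code)  # e.g. 200 on success
print(response.json())       # e.g. {"message": "Hello from the developer's code!"}
```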

As the number of consumers increases, so does the number of requests made to the servers.

Since we have multiple servers, these requests are distributed among them.

The amount of request traffic hitting the servers is known as load.

If this load is not distributed properly among the servers, it can lead to problems like starvation of some servers while others sit idle.

Hence we employ load balancing: the method of distributing network traffic evenly across a pool of resources (in our case, the servers) that support an application.
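A toy round-robin balancer shows the idea (a sketch only; real systems use dedicated balancers such as NGINX or a cloud load balancer):

```python
import itertools

# Pool of backend servers (hypothetical addresses).
SERVERS = [
    "http://server-1.example.com",
    "http://server-2.example.com",
]

# Round-robin: hand requests to servers in a repeating cycle,
# so no server starves under load while another sits idle.
rotation = itertools.cycle(SERVERS)

def pick_server():
    return next(rotation)

for i in range(4):
    print(f"request {i} -> {pick_server()}")
# request 0 -> http://server-1.example.com
# request 1 -> http://server-2.example.com
# request 2 -> http://server-1.example.com
# request 3 -> http://server-2.example.com
```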

The next question that arises: what if the number of consumers, and hence the number of requests, grows to the point where the existing servers cannot handle the load even at full capacity? In networking terms this is known as congestion.

To solve this problem, there are three approaches:

  1. To upgrade our servers with better processors, more memory, etc.
  2. To increase the number of servers.
  3. To do both: upgrade the servers and increase their number.

The first approach is known as vertical scaling, and the second is called horizontal scaling.
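A back-of-the-envelope sketch makes the difference concrete (the request rates are made-up numbers): horizontal scaling asks how many servers of fixed capacity are needed, while vertical scaling asks how much capacity each of a fixed number of servers needs.

```python
import math

# Made-up numbers for illustration.
capacity_per_server = 1_000  # requests/second one server can handle
expected_load = 4_500        # requests/second from all consumers

# Horizontal scaling: add servers until the pool covers the load.
servers_needed = math.ceil(expected_load / capacity_per_server)
print(servers_needed)  # 5

# Vertical scaling: keep our 2 servers, upgrade each to handle more.
per_server_capacity_needed = math.ceil(expected_load / 2)
print(per_server_capacity_needed)  # 2250 requests/second each
```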

Which solution is right for you depends on the project’s requirements.

Certainly there are many aspects yet to be discussed.
