Distributed Systems 101

Sanatan Shrivastava
5 min read · Jun 23, 2020

What we are going to read about here is the heart and soul of today’s giant tech companies. Distributed systems are what let them win the game.

There is no doubt that without these large distributed systems, companies like Google, Amazon, and others wouldn’t be able to handle billions of requests. They can, because some of the best and most complex systems in the world are running behind the scenes.

But all these companies started with minimal resources, right? Some of the best started in “garages”, to be more precise. They had a very basic tech stack, to be honest.
But they survived. How? Well, the answer is that they didn’t need more at the time. They cared more about putting their idea into action than about iterating on the design and thinking about it even more.
Spending more time than needed on designing rather than coding can lead to failure.

Let’s see how distributed systems have become the anchor that holds today’s world firmly in place.

So, what is a distributed system?
There is a library full of definitions of distributed systems. Summing them all up, we can say that

“A distributed system is a collection of autonomous computing elements that appears to its users as a single coherent system.”

Getting various autonomous participants (be they people or applications) to work together is the very idea behind the development of a distributed system.

A distributed system is like a group of computers working together while appearing as a single computer to everyone using it. So when you are using anything on the internet, millions or even billions of others might be using it at the same time, but that doesn’t affect your experience. The machines that make up a distributed system (D.S.) work concurrently on a shared state, yet each one operates independently. We can say they are like bulbs in a parallel circuit: if one goes out, the others keep glowing. This also makes the system much easier to maintain.

A traditional Stack
A Distributed Database System

Now, to make this a distributed system, we would need the same application connected to several databases that hold the same data, in such a way that even if a query is made to DB_1, DB_2 or DB_3 must be able to send a response to it.
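The idea of several databases being able to answer the same query can be sketched in a few lines of Python. This is only a minimal illustration, assuming in-memory dictionaries in place of real databases; the class names `Database` and `ReplicatedStore` are made up for this example, and a real system would also have to deal with network delays and replicas that fall out of sync.

```python
class Database:
    """Hypothetical in-memory stand-in for one database node."""

    def __init__(self, name):
        self.name = name
        self.rows = {}


class ReplicatedStore:
    """Copies every write to all databases, so any of them can answer a read."""

    def __init__(self, databases):
        self.databases = {db.name: db for db in databases}

    def write(self, key, value):
        # Replicate the write to every database.
        for db in self.databases.values():
            db.rows[key] = value

    def read(self, key, from_db):
        # The caller may ask any database; all of them hold the data.
        return self.databases[from_db].rows[key]


store = ReplicatedStore([Database("DB_1"), Database("DB_2"), Database("DB_3")])
store.write("user:42", "Alice")

for name in ("DB_1", "DB_2", "DB_3"):
    print(name, "->", store.read("user:42", from_db=name))
```

Whichever database the query goes to, the application gets the same answer back.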

But why do we need a distributed system at all?
Let’s put it in a broader sense:
1) It enables you to scale the system horizontally, and not only vertically.
Horizontal Scaling: If you increase the number of computer systems, i.e. you do not just upgrade the hardware of a single computer but add more machines running the same application, that is called horizontal scaling. This is what we did in the example above, when we distributed a single database into three databases.
Ex: MongoDB, Cassandra, Google Cloud, etc.
Vertical Scaling: If you upgrade the hardware of a single machine (more CPU, more memory, more storage) so that it can handle more traffic, that is called vertical scaling.

Although horizontal scaling may look more expensive than vertical scaling at first, beyond a certain threshold it tends to become the cheaper option, because high-end hardware gets disproportionately expensive and a single machine can only be upgraded so far.

Vertical Scaling vs Horizontal Scaling

Now imagine we keep on increasing the number of decks on the bus: it would crumble after a certain point, whereas we can always put more buses on the road. Hence, we see how vertical scaling is less useful than horizontal scaling.

Technical Aspect of Scaling.

2) Low Latency: Did you ever wonder how Amazon and Google, though based in the USA, serve their clients across the world just as well as they serve their US clients? Distributed systems have made it possible for these companies to distribute their data and serve it to clients anywhere in the world. By placing servers in many locations and routing each request to the node closest to its source, distributed systems extend coverage and speed up the transfer of data.
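Routing a request to the closest node can be sketched as follows. This is a rough illustration only: the server names and coordinates in `SERVERS` are invented for the example, and geographic distance is used as a simple stand-in for latency. Real services typically rely on measured latency, DNS-based routing, or similar mechanisms instead.

```python
import math

# Hypothetical server locations as (latitude, longitude) in degrees.
SERVERS = {
    "us-east": (39.0, -77.5),
    "europe": (50.1, 8.7),
    "asia": (1.35, 103.8),
}


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))


def nearest_server(client_location):
    """Pick the server with the smallest distance to the client."""
    return min(SERVERS, key=lambda name: distance_km(SERVERS[name], client_location))


print(nearest_server((28.6, 77.2)))   # a client in Delhi
print(nearest_server((48.9, 2.35)))   # a client in Paris
```

With these made-up locations, the client in Delhi is sent to the Asian server and the client in Paris to the European one, so neither request has to travel all the way to the USA.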

3) Distributed systems ease maintenance and fault rectification.
As discussed, a distributed system has many nodes working concurrently, yet each is independent in its operation. A failure in one node does not cause a failure in another, which leads to fast fault tracking and correction with no loss in overall operations.
This makes the system as a whole noticeably more capable and reliable.
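To see why one failed node need not stop the whole system, here is a small sketch of a dispatcher that hands out requests in turn and simply skips a node that is down. The `Node` and `LoadBalancer` classes are hypothetical and only model the idea; they are not taken from any particular product.

```python
import itertools


class Node:
    """Hypothetical stand-in for one machine in the system."""

    def __init__(self, name):
        self.name = name
        self.healthy = True

    def handle(self, request):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


class LoadBalancer:
    """Hands requests to nodes in round-robin order, skipping failed ones."""

    def __init__(self, nodes):
        self.nodes = nodes
        self._cycle = itertools.cycle(nodes)

    def dispatch(self, request):
        # Try each node at most once per request.
        for _ in range(len(self.nodes)):
            node = next(self._cycle)
            try:
                return node.handle(request)
            except ConnectionError:
                continue  # this node failed, move on to the next one
        raise RuntimeError("all nodes are down")


nodes = [Node("node-1"), Node("node-2"), Node("node-3")]
balancer = LoadBalancer(nodes)

nodes[1].healthy = False  # simulate a failure of node-2
for i in range(4):
    print(balancer.dispatch(f"request-{i}"))
```

All four requests are still answered, by node-1 and node-3, while node-2 can be repaired in the background.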

There is another technique called sharding, which more or less means scaling the dataset itself: the data is split into parts (shards), and each shard is stored on a different node.
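A minimal sketch of sharding, assuming the simplest possible scheme: hash the key and use the result to pick one of a fixed number of shards. The shard names and the `put`/`get` helpers are invented for this example. `hashlib` is used instead of Python’s built-in `hash()`, because the built-in hash of a string can change from one run of the program to the next.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2"]  # hypothetical shard names

# One dictionary per shard stands in for one database node per shard.
storage = {name: {} for name in SHARDS}


def shard_for(key):
    """Map a key to a shard using a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]


def put(key, value):
    storage[shard_for(key)][key] = value


def get(key):
    # The same hash tells us where to look, so only one shard is queried.
    return storage[shard_for(key)][key]


for user in ("alice", "bob", "carol", "dave"):
    put(user, f"profile of {user}")

print(get("carol"))
print({name: sorted(rows) for name, rows in storage.items()})
```

Each shard holds only a part of the data, so no single node has to store or search everything. One known drawback of this simple modulo scheme is that changing the number of shards moves most keys to a different shard.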

Decentralized vs Distributed Systems:

*Server for their decentralized Internet was based inside their GARAGE*

Do you remember the episodes of Silicon Valley where Pied Piper conceived a whole decentralized Internet?
So, they took down a giant company like Hooli, just because Hooli had no decentralized system backing it?!
Well, that is not just a storyline; it can happen to any company that has no D.S. in today’s world. You can see startups pushing their IT onto AWS or GCP to sustain steady growth, because it’s a necessity.
A decentralized system is more or less a kind of distributed system. In a decentralized system, the system is not owned by a single entity (company), otherwise it would not be decentralized anymore. But we can say that it works no less well than other distributed systems, and it has the same advantages. It is, however, much harder to set up a decentralized system than a distributed one because of the resource sharing and allocation problems.

It’s fair to say that with the current rate of data generation, we need a more strategic way to handle this vast amount of data, and distributed systems are the key to it.

LACK OF A SOLID DISTRIBUTED SYSTEM, PERHAPS??

So, are you ready to build your project with more than a single database just yet?

But remember, the complexity of building a distributed system yourself is far greater than that of developing your application. So unless you have at least 200 employees working for you, just pick up the phone and rent a good cloud computing service.
