Distributed Systems — A Many-Faced God

Nimesh Khandelwal
Analytics Vidhya
Published in
6 min readJun 8, 2019

--

A brief introduction to the realm of Distributed Systems

With the ever-growing technological expansion, distributed systems are more widely used and are worshipped by all big tech companies. Most of the applications we interact with nowadays are actually served by the system of computing elements acting like one.
Various questions come in one’s mind while discussing distributed systems, like what are distributed systems, why use them and much more.
To understand the distributed system we need to first know what was before that and why was there any need for systems like that.

The time before distributed systems — the lone wolf

Multiple Client Hitting one server

Before the distributed system came into the picture, there used to be a single system that did not communicate with others and functioned on its own. It meant that when you requested a resource from the server, it would always reach a particular node/system to serve that request.

It can be easily understood that such a system won’t survive as when many requests start coming to the server, its CPU usage will increase and the system will crash. There is a famous saying — When the snows fall and the white winds blow, the lone wolf dies but the pack survives.

So people start realizing that a single system can’t be used to serve all requests coming to their servers, they need something that will scale to cater to their load on servers.

But a genuine question might come to one’s mind, that if your CPU Load is increasing or RAM is becoming too low , why not work towards increasing capacity of our machine (commonly known as vertical scaling) and take load of managing a distributed system(which is a lot tricky and comes with its own problems)?

Why prefer horizontal scaling(distributed way) over Vertical scaling(upgrading system)

It's true that managing distributed systems are a complex topic chock-full of pitfalls and landmines. It is a headache to deploy, maintain, synchronize and debug distributed systems, so why go there at all?

Scaling vertically is good while you can, yet after a specific point, you will see that even the best hardware isn’t adequate for enough traffic, not to mention impractical to host.

Scaling horizontally simply means adding more computers rather than upgrading the hardware of a single one, therefore, it theoretically has an infinite potential to scale.

Even cost is a great factor why systems can’t scale vertically after a certain threshold.

Vertical scaling become lot expensive after a certain point

Fault Tolerance- A good reason why a distributed system is more preferable. A cluster of 10 nodes is always more fault tolerant than a single machine of any configuration.

Low Latency- With systems like CDN, content get served from your nearest node, thus reducing time to get resource from servers.

How Distributed System Works

In the current world, almost all big companies are running their services through distributed systems. If you’ve ever played a multiplayer game online, booked a cab, posted on Facebook, streamed a Netflix show, or bought a book on Amazon — you relied on a distributed system to do it.

A distributed system is nothing more than multiple nodes that talk to one another in some way, while also performing their operations. A node can be anything ranging from a full-fledged system to a small sensor device as long it is autonomous and has a way to communicate with other nodes.

For a distributed system to work, though, you need the software running on those nodes to be specifically designed for running on multiple computers at the same time and handling the problems that come along with it. This turns out to be no easy feat. We will talk about these softwares in detail in other upcoming posts.

Let's understand the distributed system functioning through an example. Consider a website that uses a database system that currently consists of one node whose capacity to handle a request is X. Now load on your website double-fold, which in turn increases database load and our current single system can’t handle it. Our application would immediately start to decline in performance and this would get noticed by our users.

Now let's solve this problem through distributed systems. Initially, it might seem intuitive that, all we need to do is just add a node in our cluster and boom requests get handled by both servers and problem solved. But as easy it might seem, it's tricky and needs to handle various cases.

In a typical web application, normally reading information is much more frequent than insert or modify an old one.

There is a way in distributed systems to increase read performance and that is the so-called Master-Slave strategy. Here, you create two new database servers that sync up with the main one. These new slaves will only read from these new instances.

Write only on master and read from all

So now whenever new information is inserted, the request will be always redirected to a master database and master will asynchronously sync up with all slaves to update data.

Congratulations, now we can execute 3x as much read queries! Isn’t that great?

But wait, here is where you enter a distributed system pitfall

We immediately lost the C in our relational database’s ACID guarantees, which stands for Consistency.

When we insert a record in the master database, there is a possibility that during a time when slave syncs up, someone request read from that slave and will be informed with stale data. This is a trade-off you take while improving your performance. You can solve this problem by delaying your write operation and make users wait to entertain any request till the databases syncs up, but as stated it will impact performance. It's quite common in a distributed system to work along with such trade-offs. There is a famous theorem which talks about trade-off in distributed systems called the CAP theorem. I will try to cover it in detail in subsequent posts.

Now we have scaled our system for reads, but if we want to scale it for write operations, we can do so by using multiple masters, but that will surely create a lot of problems such as creating conflicts(e.g insert two records with the same ID).

As you might have noticed that even though in the background it seems like there are multiple systems handling client requests, but to a client, it will always seem to be handled by one and that's the beauty of distributed systems.

This is just one application of Distributed Systems, there are a lot more other ways of applying distributed systems in the real world like HDFS, Queue Systems, Map Reduce, etc. You should read about these to get more clarity on how these works.

By now, you should have a very clear idea of how a simple distributed system looks like, and why to go on with it.

I know this article is quite short, but there is a lot into the world of a distributed system to cover in just one article, thus will try to accommodate further in subsequent articles. Don’t forget to follow to get updates on further articles.

Valar Morghulis !!

Thank you for reading the article and please provide your feedback and comments.

--

--