Distributed Systems

System Design
SystemDesign.us Blog
4 min read · Aug 8, 2022

A distributed system is one in which data processing and business logic are divided among two or more computing units connected over a network, where each unit may serve a specific and different purpose.

Designing distributed systems is complicated because it requires knowledge of many different areas, including system architecture, networking, storage, security and more. Although the process is similar to designing non-distributed monolithic systems, some differences fundamentally affect distributed system design.

What exactly is distributed computing?

Distributed computing involves hardware and software components that are connected in a network and solve a given computational task by working closely together. The work is generally not redundant: the computational task is divided among the different parts of the system.

One consequence of using hardware that is connected over a network is the absence of directly shared memory and computational power. Since the pieces are physically separated, information has to be passed over the network in the form of messages to keep every piece informed of its role.
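To make the idea of message passing concrete, here is a minimal sketch in Python. It is only a simulation under stated assumptions: two threads stand in for two separate machines, a pair of `queue.Queue` objects stands in for the network, and the message fields (`type`, `id`, `numbers`, `value`) are hypothetical names chosen for this example. The point is that the coordinator never reads the worker's variables directly; it can only send a serialized request and wait for a serialized reply.

```python
import json
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Hypothetical worker node: it only knows what arrives in its inbox."""
    while True:
        raw = inbox.get()          # blocks until a message arrives
        msg = json.loads(raw)      # messages travel as serialized text
        if msg["type"] == "stop":
            break
        if msg["type"] == "sum":
            reply = {"type": "result", "id": msg["id"], "value": sum(msg["numbers"])}
            outbox.put(json.dumps(reply))


def run_demo() -> list:
    """Coordinator side: send two requests and collect the two replies."""
    to_worker: queue.Queue = queue.Queue()        # stands in for the network link
    to_coordinator: queue.Queue = queue.Queue()   # the link in the other direction
    node = threading.Thread(target=worker, args=(to_worker, to_coordinator))
    node.start()

    # No shared memory is used: every piece of information is a message.
    to_worker.put(json.dumps({"type": "sum", "id": 1, "numbers": [1, 2, 3]}))
    to_worker.put(json.dumps({"type": "sum", "id": 2, "numbers": [10, 20]}))
    results = [json.loads(to_coordinator.get(timeout=5)) for _ in range(2)]

    to_worker.put(json.dumps({"type": "stop"}))
    node.join()
    return results


if __name__ == "__main__":
    for result in run_demo():
        print(result)
```

In a real deployment the queues would be replaced by sockets, HTTP calls or a message broker, and messages could be lost, delayed or duplicated, which this in-process simulation does not model.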

Externally, a distributed system looks like a single unit, but it is made up of many different pieces working together. Compared to a monolithic system, it decreases the complexity of the individual pieces of technology but increases the overall complexity of the system.

What are the advantages of distributed systems?

The main goal of distributed systems is to make a system more scalable, available and performant.

  • Horizontal Scaling: Scaling of resources is more predictable (close to linear if designed correctly) and relatively inexpensive, because more resources can be added to bolster whichever parts of the system need them.
  • Reliability: The system as a whole becomes more reliable. Even if some part of the system fails, the rest of the system can keep serving at least some functionality, whereas when a monolith goes down, the whole operation comes to a halt.
  • Low Latency / Performance: A distributed system can be made more efficient by dividing the workload across smaller parts of the system. In addition, placing parts of the system in different geographic regions can reduce latency for users in those regions.
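One common way to divide a workload across nodes is to hash a key and use the result to pick a node. The sketch below is a minimal illustration, not a production design; the node names, the `user-N` keys and the helper names `pick_node` and `spread` are all hypothetical. It shows that adding nodes reduces the share of keys each node has to handle.

```python
import hashlib
from collections import Counter


def pick_node(key: str, nodes: list) -> str:
    """Map a key to one node using a stable hash (same key, same node)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def spread(keys: list, nodes: list) -> Counter:
    """Count how many keys land on each node."""
    return Counter(pick_node(key, nodes) for key in keys)


if __name__ == "__main__":
    keys = [f"user-{i}" for i in range(10_000)]   # made-up workload
    print(spread(keys, ["node-a", "node-b"]))
    print(spread(keys, ["node-a", "node-b", "node-c", "node-d"]))
```

A caveat on the design choice: with plain modulo hashing, changing the number of nodes moves most keys to a different node. Consistent hashing is a widely used alternative when nodes are added or removed often.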

What are the disadvantages of distributed systems?

As with anything in life, there are downsides to building the system using this method.

  • Data Consistency / Latency: A distributed system requires extra effort to keep data consistent, because memory and computing resources are spread across a network and are therefore harder to coordinate than in a monolithic system, where all the hardware sits in the same machine. A distributed system generally has to make tradeoffs between availability, consistency and latency.
  • Network Failures: Network equipment fails surprisingly often. Guarding against this kind of failure, and making sure information is delivered to the correct nodes in the expected order, takes a significant amount of effort, including work on things like disaster recovery.
  • Scheduling / Management: Deploying new changes and carrying out regular maintenance require more work than in a monolithic system, because different parts of the system have to be deployed and maintained separately, in a backward-compatible way.
  • Metrics Management: Monitoring, logging and usage metrics require separate systems of their own, which also become quite complicated to develop and maintain.
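As one small illustration of the effort behind delivering information "in the expected order", here is a sketch of a receiver that uses sequence numbers to reorder messages and drop duplicates. It assumes the sender numbers its messages 0, 1, 2, and so on; the class name `OrderedReceiver` and its fields are hypothetical, and real protocols (TCP, for example) handle far more cases than this.

```python
class OrderedReceiver:
    """Delivers payloads in sequence order, ignoring duplicates."""

    def __init__(self) -> None:
        self.next_seq = 0      # the sequence number we are waiting for
        self.pending = {}      # messages that arrived too early, keyed by seq
        self.delivered = []    # payloads handed to the application, in order

    def receive(self, seq: int, payload: str) -> None:
        # A retry after a lost acknowledgement can produce duplicates: drop them.
        if seq < self.next_seq or seq in self.pending:
            return
        self.pending[seq] = payload
        # Deliver every message that is now contiguous with what came before.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


if __name__ == "__main__":
    receiver = OrderedReceiver()
    # The network reorders and duplicates messages on the way.
    for seq, payload in [(2, "c"), (0, "a"), (0, "a"), (1, "b")]:
        receiver.receive(seq, payload)
    print(receiver.delivered)  # ['a', 'b', 'c']
```

The buffer of early messages is the price of ordering: the receiver has to hold message 2 until messages 0 and 1 have arrived, which is one concrete form of the consistency versus latency tradeoff mentioned in the list.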

What is the point of all of this?

All of the above sounds very complex, so why do so many teams choose this route? As with most things in life, the biggest benefit people have observed is lower cost. As a monolithic system grows in usage and scale, it becomes harder to build, change and maintain. In a distributed system, the smaller individual parts can be developed and updated separately, even by smaller teams that are each responsible for individual components. The system can also be scaled at a lower cost.

Let us look at the following chart.

Because of the complexity of a distributed system, its initial setup costs are slightly higher than those of a simple monolithic system. However, as the systems grow more complex, the cost of the monolith rises at a faster rate than the cost of a horizontally scaled distributed system.
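The shape of this argument can be illustrated with a toy cost model. Everything in the sketch below is an assumption made for illustration: the setup costs, the quadratic growth for the monolith and the linear growth for the distributed system are made-up numbers, not measurements. The only point is that a higher starting cost combined with a slower growth rate produces a crossover.

```python
from typing import Optional


def monolith_cost(load: int) -> float:
    # Assumption: low setup cost, but cost grows quadratically with load.
    return 10 + 0.5 * load ** 2


def distributed_cost(load: int) -> float:
    # Assumption: higher setup cost, but cost grows linearly with load.
    return 15 + 2 * load


def crossover(max_load: int = 100) -> Optional[int]:
    """Return the smallest load at which the distributed system is cheaper."""
    for load in range(1, max_load + 1):
        if distributed_cost(load) < monolith_cost(load):
            return load
    return None


if __name__ == "__main__":
    print(crossover())  # 6 with these made-up parameters
```

With different parameters the crossover moves, or may not occur at all within the range checked, so the model says nothing about where the break-even point lies for a real system.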

A large amount of internet infrastructure is now built on top of distributed systems and it is imperative for any software engineer to understand how they work.
