Introduction to System Architecture Design

Backend Army
11 min read · Jan 12, 2019


System Architecture Design, sometimes simply called System Design, is a conceptual representation of the components and subcomponents of a system that reflects its behaviour.

System architects or solution architects are the people who know which components to choose for a specific use case and how to make the right trade-offs, with awareness of the bottlenecks in the overall system.

Usually, solution architects with more years of experience tend to be better at system architecting, because system design is an open-ended problem; there is no single correct solution.

Mostly it’s trial-and-error experimentation with the right trade-offs. Experience teaches you which components to choose for the problem at hand, but to gain that experience you need to start somewhere. That’s why I am writing this article for developers who have never done system design.

Today you are going to learn the basic components necessary to get started with system architecting, and the principles to keep in mind while designing such systems.

Our new e-commerce startup.

Bonus: we will learn all of this by designing the architecture for a real use case. We are going to build an e-commerce platform called Nozama.

Before we begin, you should know about cloud providers. These days almost everyone uses the public cloud.

Cloud providers are companies that sell managed services like on-demand servers, databases and many other components. With the click of a button or by typing a command (if you fancy that), you can spin up the managed resources they offer.

Some popular cloud providers are Amazon Web Services (AWS), Google Cloud Platform and Microsoft Azure.

Basic concepts

  1. Latency
  2. Throughput
  3. Bandwidth
  4. Vertical Scaling, Horizontal Scaling, Auto Scaling

These are the basic concepts you need to know to get started with system architecture. I will explain them one by one.

Latency, Throughput, Bandwidth

A good way to understand latency and throughput is to imagine a bunch of cars moving from point A (source) to point B (destination) on an expressway.

An expressway with a capacity of 4 vehicles

The time taken for any given car to travel from source to destination is the latency experienced by that car. Latency is always measured in units of time; one can say the latency for that specific red car is 5 minutes.

The maximum number of vehicles that can travel through this expressway per unit time is called bandwidth. Let’s say a maximum of 4 vehicles can go through this expressway in an hour. Then the bandwidth of this expressway is 4 vehicles/hour.

But at any given time, due to traffic or other reasons, the number of vehicles actually travelling through the expressway is lower. Let’s say it usually drops to 3 vehicles/hour at peak hours. The actual number of vehicles that travel through the expressway under given conditions (irrespective of its maximum bandwidth) is called throughput.

One should keep in mind that increasing the bandwidth raises the possible throughput but may not necessarily reduce the latency.

Now suppose we get a requirement to accommodate more vehicles on our expressway.

We will do this by scaling our existing expressway to 3 lanes. Now it can accommodate a maximum of 8 vehicles/hour.

Adding one more lane to existing expressway

Our expressway’s bandwidth has now increased to 8 vehicles/hour, and at peak hours the throughput becomes 7 vehicles/hour.

Even though the throughput is better now, it does not necessarily reduce the time a car takes to get to its destination. Some cars will still experience latency because of their maximum cruising speed, patchy roads in their lane, or sometimes even a slow driver.

Increasing the bandwidth affects the throughput but may not necessarily help to reduce the latency.

When it comes to computer systems, you can think of the cars as data packets and the expressway as our channel or system. Bandwidth and throughput are measured in bits/s or Mbit/s.
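The relationship between these quantities is simple arithmetic. Here is a back-of-the-envelope sketch in Python; all the numbers are made up purely for illustration:

```python
# Rough model relating latency, bandwidth and throughput.
# Every number here is a hypothetical value chosen to illustrate the units.

bandwidth_mbps = 100.0      # channel capacity: 100 Mbit/s
utilization = 0.7           # congestion at peak hours
throughput_mbps = bandwidth_mbps * utilization  # actual rate achieved

latency_s = 0.05            # fixed one-way delay: 50 ms
payload_mbit = 8.0          # a 1 MB file = 8 Mbit

# Total delivery time = fixed latency + transmission time at the achieved rate.
transfer_time_s = latency_s + payload_mbit / throughput_mbps
print(round(throughput_mbps, 1))   # 70.0
print(round(transfer_time_s, 3))   # 0.164
```

Notice that doubling `bandwidth_mbps` shrinks only the transmission term; the fixed `latency_s` stays, which is exactly the expressway lesson above.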

If you notice, you just scaled the expressway by adding more lanes to the existing one. Doing this might disturb the traffic on the expressway, but it does scale in this scenario. In computer systems, this is called vertical scaling: you add more resources like CPU, RAM and SSD to the existing machine.

You can also scale the expressway without disturbing the existing traffic. This can be done by building another parallel expressway.

Scaling throughput by constructing one more expressway without disturbing the other one

This is called horizontal scaling. In system architectures, you add more machines by just replicating them to increase the throughput of the overall system.

Now imagine, you somehow have the ability to build and destroy an expressway instantly. During peak times in order to increase the throughput, you can replicate the expressways (horizontally scale) based on the traffic. Then when traffic returns to normal you can destroy the extra expressways. This is nothing but auto-scaling.

In computer systems, we auto-scale by replicating machines based on the load experienced by our system. When the load reduces, we remove the machines that are idle.
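A naive auto-scaling policy can be sketched in a few lines of Python. The thresholds and the list-of-load-fractions model are invented for illustration; real cloud auto-scalers work on the same principle with richer metrics:

```python
# Toy auto-scaling policy: add a machine when average load is high,
# remove one when it is low. Thresholds are arbitrary illustrative values.

def autoscale(machines, scale_up_at=0.8, scale_down_at=0.3, min_machines=1):
    """machines is a list of per-machine load fractions (0.0 - 1.0)."""
    avg_load = sum(machines) / len(machines)
    if avg_load > scale_up_at:
        # Spin up a fresh machine; it starts with no load.
        machines = machines + [0.0]
    elif avg_load < scale_down_at and len(machines) > min_machines:
        # Terminate the least-loaded machine.
        machines = sorted(machines)[1:]
    return machines

print(len(autoscale([0.9, 0.95])))   # 3 -> scaled up under heavy load
print(len(autoscale([0.1, 0.2])))    # 1 -> scaled down when idle
```

Running this check in a loop against live metrics is, in essence, what the cloud provider's auto-scaling service does for you.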

Now you know the very basic concepts to get started with system design. Let’s build the architecture for our e-commerce startup by going through some components offered by the cloud provider.

Basic components

System design involves assembling the right components to solve the problem at hand. There are thousands of components or stacks available from different cloud providers. But worry not, you don’t have to know all those to build a simple architecture.

We will see some basic components to get you started, and while explaining them you will also eventually build an architecture for our e-commerce website Nozama. Here are some basic components.

  1. Virtual Machine
  2. Load Balancer
  3. Database
  4. Cache

Simple, isn’t it? With these 4 components you can build a scalable e-commerce website. Let’s get started.

Virtual Machine

A Virtual Machine (VM) is the basic component of any system architecture. It’s nothing but a computer in the cloud. When you ssh to an IP from your local computer, you are logging into one of the machines in the cloud.

It is called a “virtual” machine because your cloud provider allocates resources (CPU, RAM, SSD, network bandwidth) from large hardware pools and emulates a computer with the resources you requested.

ssh-ing into a VM in cloud

You can request a bare-metal server too. Those servers are not emulated; the cloud provider assigns dedicated hardware to you. They come with raw power, but also with a price tag.

Our single VM happily living in the cloud with nozama’s code in it.

This is going to be the base component of Nozama’s architecture. We have deployed the code to this VM by simply scp-ing it over. We are launching our startup in a few minutes.

Load Balancer

A load balancer is a component used to balance the load across VMs. Obvious, duh. But let me explain how it is going to help us handle Nozama’s traffic.

A happy customer browsing products at Nozama

Remember our Nozama store has been launched… Oh, wait there’s already a customer browsing products.

When someone visits Nozama from their browser, the request goes to VM1. The customer experiences a latency of 2 seconds, which is acceptable.

As we speak, more people are visiting Nozama. It seems our startup is on fire. Like literally.

Nozama is on fire.

You realize what’s happening here? Our VM1 handled up to 3 customers, though even then their latency increased to 3 seconds. When dozens of customers started visiting Nozama, our server ran out of resources and died, because with only so much CPU and memory it can handle only so much load.

Look at the latency in the 3rd panel, where there are more than 3 users: it went up to 15 seconds. Who would wait 15 seconds for a website to load? Our users are leaving Nozama.

To handle such traffic we are going to do horizontal scaling. Sounds familiar? Scroll back up if you want to read about horizontal scaling once more.

4 servers ready to handle more traffic

We are just going to spin up more servers in the cloud. With VM1, VM2, VM3 and VM4, we will have 4 servers.

In order to distribute the traffic across our VMs, we have to use a load balancer.

Apart from distributing the traffic, the load balancer also keeps an eye on our servers.

Health check done by load balancer with 3 VMs behind it.

Load balancers do something called a health check. At a given interval, the load balancer tries to reach each of our servers; if any server does not respond, it stops sending traffic to that server.

Once we have added a load balancer to the architecture, it becomes the front-facing part of our infrastructure. This also makes our infrastructure more secure, because no one can see the IPs of our servers; we expose only the load balancer’s endpoint.
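To make the idea concrete, here is a minimal round-robin load balancer with health checks, sketched in Python. The `is_healthy` callable is a stand-in for a real HTTP health-check probe; in practice this logic lives inside a managed load balancer or software like NGINX or HAProxy:

```python
import itertools

# Toy round-robin load balancer with a pluggable health check.
class LoadBalancer:
    def __init__(self, servers, is_healthy):
        self.servers = servers
        self.is_healthy = is_healthy          # callable: server -> bool
        self._cycle = itertools.cycle(servers)

    def route(self):
        """Return the next healthy server, skipping dead ones."""
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if self.is_healthy(server):
                return server
        raise RuntimeError("no healthy servers behind the load balancer")

# VM2 is "down": its health check fails, so it never receives traffic.
lb = LoadBalancer(["VM1", "VM2", "VM3"], is_healthy=lambda s: s != "VM2")
print([lb.route() for _ in range(4)])   # ['VM1', 'VM3', 'VM1', 'VM3']
```

Traffic is spread evenly across the healthy servers, and the dead one is silently bypassed until its health check passes again.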

Our machines delivering all Nozama’s traffic under 2s


Database

We’ve been making good progress building the architecture of our e-commerce startup, but we have not yet talked about one important component: the database. Any system that needs to store and retrieve data should have some kind of database to make our life easier.

We are not going to discuss why we need a database; anyone who is into programming should know the answer (it’s absolutely fine if you don’t, there are tons of first-class articles just a Google away). Instead, we will briefly talk about what kind of database to choose based on the use case.

Primarily there are two types of databases.

  1. Non-relational or NoSQL
  2. Relational or SQL

Non-relational database

Column-oriented DB: Cassandra, ClickHouse and ScyllaDB are some columnar databases. They are primarily used to store and query high volumes of semi-structured data for OLAP workloads. Column-oriented DBs lay out data on disk column by column instead of row by row, and most of them support SQL-like queries.

Document-oriented DB: CouchDB, Couchbase and MongoDB are examples of document-oriented DBs. Here everything is stored as a document, and we can use them for semi-structured or totally unstructured data where we don’t know the schema beforehand or the schema changes very frequently.

Graph-oriented DB: Amazon Neptune, Neo4j and OrientDB are some examples of modern graph-oriented databases. When your data can be represented as nodes with many-to-many relationships between them, you should consider picking one of these.

Key-value oriented DB: these databases store data in the form of hashmaps/dictionaries. They are mostly in-memory and widely used where high-speed data retrieval is necessary. Redis is a popular KV store that supports different data structures.

There are also many simple KV stores like BoltDB and RocksDB that can be embedded in your code to maintain state, not to mention robust distributed KV stores like Memcached that are used by several distributed systems.

Relational Database

There are many good old relational databases like PostgreSQL, MySQL and Oracle DB. As the name indicates, we use a relational database when there are relationships between the data we deal with. These databases are ACID-compliant.

Any e-commerce data has tight relationships: one user has multiple orders, one order has multiple products, one user has multiple wishlists, and one wishlist again has multiple products, and so on.

Sample schema representation of Nozama
Servers establishing the connection to our new member in the architecture

It’s obvious that Nozama needs a relational database, and we have chosen PostgreSQL.
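As a sketch, the user–order–product relationships above could be captured with a schema like this. I am using SQLite here only so the snippet is self-contained; the same DDL idea applies to PostgreSQL, and the table and column names are invented for illustration:

```python
import sqlite3

# Illustrative slice of Nozama's schema: users, orders, and the
# many-to-many link between orders and products.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT, price REAL);
    CREATE TABLE orders   (id INTEGER PRIMARY KEY,
                           user_id INTEGER REFERENCES users(id));
    CREATE TABLE order_items (order_id   INTEGER REFERENCES orders(id),
                              product_id INTEGER REFERENCES products(id));
""")
conn.execute("INSERT INTO users VALUES (1, 'alice')")
conn.execute("INSERT INTO products VALUES (10, 'keyboard', 49.9)")
conn.execute("INSERT INTO orders VALUES (100, 1)")
conn.execute("INSERT INTO order_items VALUES (100, 10)")

# One user -> many orders -> many products, joined relationally.
row = conn.execute("""
    SELECT u.name, p.title
    FROM users u
    JOIN orders o      ON o.user_id = u.id
    JOIN order_items i ON i.order_id = o.id
    JOIN products p    ON p.id = i.product_id
""").fetchone()
print(row)   # ('alice', 'keyboard')
```

The `order_items` table is the classic way to model a many-to-many relationship; wishlists would get the same treatment with a `wishlist_items` table.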

P.S. One should know that at some point the database will become the bottleneck in any architecture. But we are not going to optimize it right away.

If you start over optimizing the systems even before you face the real problem, it causes more harm than good. Keep in mind that premature optimization is bad.

Premature optimization is the root of all evil. — Donald Knuth


Cache

A cache is used to store data that is accessed very frequently, with as little latency as possible. Most caches are in-memory to provide low-latency data access.

Holiday season deals on Nozama website

It’s the holiday season, and deals are running on Nozama. The stock won’t last, as people will be swarming the product pages to buy the discounted items. We need to update the product page instantly when something goes out of stock.

In this scenario, the number of in-stock items for a specific set of discounted products is the data that will be accessed very frequently. It’s okay to query the database for it during normal hours, but this is peak traffic time and we cannot afford the latency while the deals are on. Our Postgres database will already be loaded dealing with other queries.

We are going to store the frequently accessed data into a cache layer. This way we don’t have to wait for the database query which is too costly at this moment.

We are going to use Redis as a cache layer to store our stock counts. Redis supports different data structures, and we choose a hashmap so that the cost of retrieval is O(1). Our data in Redis will look something like this.

in_stock: {<product_id>: 40, <product_id>: 10, <product_id>: 0}

So when the deals page is accessed, the stock details are taken from the cache layer instead of hitting the database.

Our current architecture of Nozama

Above is our current architecture for Nozama. Our e-commerce website can now handle thousands of concurrent users. In the diagram above, if you replace the freehand sketches with proper boxes and arrows, it becomes a formal architecture diagram.

Pat yourself on the back. I know it’s a long post and you made it this far. You learnt the basic concepts and components of system design, and how to architect a simple, scalable solution for an e-commerce website.

I’ll say it again: system designs are open-ended problems. The architecture we designed might not work for all scenarios. But if you have never done any architecture design, I am sure this has given you a gist of it.

Follow Backend Army on Medium and stay tuned for the next part of this article on advanced system design concepts.

Found this post interesting?
It would mean a lot to me if you could hold the “clap” icon and give me a shoutout on Twitter. That would really make my day.

Also, you can subscribe to our newsletter to receive more blog posts and premium courses on backend engineering.
— thanks!