System Design: Database Replication (Part 1)

6 min readApr 2, 2020

This is my first blog in the system design series on which I am currently working. So in this blog, we are going to discuss the in-depth details of one of the most important system design principle that is Replication.

I am assuming that you are aware of basic system design terms like Latency, Availability, Fault tolerance, Data throughput, Scalability, Data Consistency, Durability, etc. If not then I will highly recommend reading about them from here or let me know in comments and I’ll try to cover some in my future blogs.

In this blog, we will discuss below-mentioned topics in-depth

What is Replication?
Why do we need Replication?
Types of algorithms for implementing Replication?
Types of Replication strategies with its advantages & disadvantages?
How to handle node outages in Replication?

What is Replication?

Data Replication is the process of storing data in more than one site or node. It is simply copying data from a database from one server to another server.

Why do we need it?

So the first question coming to your mind is why do we need it? So let’s create some scenario which will tell the necessity of it (as necessity is the mother of all inventions :P )

Let’s say If I am the CEO of some e-commerce startup based in the USA. Now I want to cater to orders globally like from people living in Antarctica :P But my servers are running in the USA. So when a user comes to my application and searches an item it gets a lot of time to display the final results on his browser. I don’t think increasing resources on my server is going to solve this problem. Isn’t it?

So what’s the root cause of the problem here?
Yes, You got it right. it’s a network latency issue. So by replication, we can keep data close to the user’s geographic area to reduce latency. Similar things are done by Netflix and Amazon prime video etc that’s why you people can watch HD movies without any buffering delays.

Also, you can check the actual latency numbers here.

Similarly, other problems which replication caters are :

Availability & Fault toleration issues
Data throughput issues

Let’s again consider a real-time scenario. So as a CEO of an e-commerce startup, if my application goes down (Unavailable) then my whole business could get impacted and I could have lost thousands of bucks and can lose the trust of our customers which is very bad these days. Hence we need better Fault tolerance in our system.

Even if my startup gets very much popular like Amazon etc then my server would be handling millions of transactions/queries per second (TPS/QPS). And if my system is not that much scalable then user experience would be deteriorated. The quality of service (QOS) will be very poor. No one would like to visit our application.
Hence we need to make our system more Scalable and need to have better Data throughput.

Types of algorithms for implementing Database Replication?

Single Leader Replication
Multi-Leader Replication
Leaderless Replication

Now let’s discuss the algorithms in detail.

Single Leader Replication (Active-Passive or Master-Slave Replication)

So in leader based architecture, client (application server) requests are sent to leader DB first and after that leader sends the data changes to all of its followers as a part of the replication log.
Whenever a client wants to read data from the database then it can query either leader or any of the follower (Yes there is generally more than just one follower to make the system highly available). However, writes to the database is only accepted on the leader by the client.
Now whenever a follower dies our application will not get impacted as there is not just a single node of data. Our application can read from other followers as well and hence this makes our system highly Read Scalable

Used in: PostgreSQL, MySql, SQL Server, MongoDB, Kafka, etc.

A user makes a write, followed by a reading from a stale replica

One of the crucial points you might be thinking is that what if there is a delay in replicating data from leader to follower and a client reads from a follower?
Yes, there is a high possibility for this scenario to occur but stale reads are acceptable and this is inevitable because as per the CAP Theorem given by Eric Brewer (VP of Infra at Google), you can achieve only two guarantees out of Consistency, Partition Tolerance, Availability at a time. Moreover, You can read more about the CAP theorem here.

Also here as per Martin Kleppmann, Partition tolerance is inevitable (means systems which are Consistent and highly Available aren’t practically possible) and hence we can achieve either Consistency or Availability which means for some time you are going to show old data to the user. This is known as Eventual-Consistency. Even read-heavy systems like Facebook or Instagram are eventually consistent.

Even in terms of replication strategies I have seen experts using below-mentioned strategies type :

Synchronous replication (Follower 1 in the below image)
Asynchronous replication (Follower 2 in the below image)

Leader based replication using sync and async replication strategies

Synchronous replication strategy in Master-slave architecture

In this strategy, followers are guaranteed to have an up to date copy of data (Strong Consistency) which is an advantage.
But one of the biggest disadvantages it has is that it will block the client until the leader receives the OK response from all the followers. Now if you have a very high read scalable system like Facebook with thousands of followers nodes then waiting for data to be replicated at each node is not a good user experience.

Asynchronous replication strategy in Master-slave architecture

As we can see follower 2 in the above image, this strategy will not block the user/client for the response when they want to write data to the leader but yes it comes with the caveat of eventual consistency.
One disadvantage I can think of this strategy right now is that if by some issue followers aren’t accepting changes from the leader (might be network problem) then any new writes by the client will be lost if the leader itself goes down. Hence writes are not Durable sometimes which is a trade-off in this strategy.

Don’t worry we will discuss this problem in-depth with its possible resolution in real-time.

Handling Node outages in replication

So there are two scenarios as mentioned below

Follower failure
Leader failure

Follower failure

So in the scenario of follower failure, we can use a strategy called Catchup recovery. In this strategy, the follower (which got disconnected) can connect to the leader again and request all data changes that occur when the follower was disconnected.

Leader failure

Now in the scenario of Leader failure, we can use a strategy called Failover. In this strategy, One of the followers needs to be promoted as a new leader.

Now you people might be wondering which follower to make the leader? How is it decided?
So yes there is a voting algorithm which is called Consensus Algorithm. In layman terms, in this algorithm, we have Quorum of followers and all the followers decide which follower should be made a leader.

Going further in-depth for this algorithm is overkill and above the scope of this blog.

Bibliography

https://martin.kleppmann.com/2017/03/27/designing-data-intensive-applications.html