Single leader replication: Setting up followers

Srikant Viswanath
Architectural Tapas
2 min read · Jun 9, 2020

Check out this quick overview of replication.

OK, so you have a database set up with one leader and a couple of followers using semi-synchronous replication. Your business is growing, and you want to handle all those new read requests by adding a new follower that shares the read load with the other 3 nodes.

Our goal is for the new follower to eventually hold an accurate copy of the leader's data, just like the other followers. When this happens, the new guy is said to have caught up.

In this post, let's look at a good high-level strategy for going about it.

Brute Force 1: Lock ’em up

We could certainly bring out the lock sledgehammer here, i.e., apply a write lock on the leader so no new writes enter the database (remember this is a single-leader replication scenario, so writes only come thru the leader). Obviously, no write could go through until the copy is done and the lock is released.

Digression: Imagine seeing an error message — “You can’t post on your friend’s timeline because Facebook is adding a new replica to the timelines database”, if you manage to hack enough to see the true reason behind the error that is.

We don’t want a new follower at the cost of high availability.

Brute Force 2: Cmd C + Cmd V

A simple copy of the data files from the leader wouldn’t quite work, since the data is constantly in flux. We would end up copying different parts of the database at different points in time, so the result wouldn’t necessarily match any single consistent state of the leader.
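To make the problem concrete, here is a minimal sketch of a naive copy racing with a write. The two-account "database" and the `transfer` and `naive_copy` helpers are made up for illustration; a real database copies files or pages rather than dictionary keys, but the failure mode is the same.

```python
# Hypothetical toy "database": two balances that should always sum to 100.
leader = {"alice": 50, "bob": 50}

def transfer(db, src, dst, amount):
    """A write on the leader: move money between two accounts."""
    db[src] -= amount
    db[dst] += amount

def naive_copy(db, write_during_copy):
    """Copy the keys one at a time, with a write landing midway through."""
    copied = {}
    keys = list(db)
    copied[keys[0]] = db[keys[0]]   # first part is copied before the write
    write_during_copy()             # the leader keeps accepting writes
    copied[keys[1]] = db[keys[1]]   # second part is copied after the write
    return copied

follower = naive_copy(leader, lambda: transfer(leader, "alice", "bob", 30))

print(sum(leader.values()))    # 100: the leader is consistent
print(sum(follower.values()))  # 130: the copy mixes two points in time
```

The copy holds alice's balance from before the transfer and bob's balance from after it, a state the leader was never actually in.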

There is a chance you might feel a bit frustrated

Snapshot isolation to the rescue

  • We could start by taking a consistent snapshot of the leader’s database so we don’t face the scenario above. This relies on snapshot isolation, which relates to the “I” (isolation) in “ACID”.

Digression: In a nutshell, this takes an immutable snapshot of the database so the reader sees a consistent view. Rather than overwriting data in place, the database keeps multiple versions of each object, each tagged with the id of the transaction that wrote it, so a snapshot only sees the versions written up to the point it was taken. This is called multi-version concurrency control (MVCC). Version vectors are a related idea, used to determine the order of writes (or detect concurrency) across multiple replicas.
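Here is a minimal sketch of the multi-version idea, assuming a single always-increasing txid and treating every write as its own committed transaction. `MVCCStore` and its methods are hypothetical names for illustration, not any real database's API.

```python
from dataclasses import dataclass

@dataclass
class Version:
    txid: int    # id of the transaction that wrote this version
    value: str

class MVCCStore:
    """Toy store that keeps every version of a key instead of overwriting."""

    def __init__(self):
        self.versions = {}   # key -> list of Version, oldest first
        self.next_txid = 1

    def write(self, key, value):
        txid = self.next_txid
        self.next_txid += 1
        self.versions.setdefault(key, []).append(Version(txid, value))
        return txid

    def snapshot_txid(self):
        # A snapshot is identified by the latest txid written so far.
        return self.next_txid - 1

    def read(self, key, as_of_txid):
        # Only versions written at or before the snapshot are visible.
        visible = [v for v in self.versions.get(key, []) if v.txid <= as_of_txid]
        return visible[-1].value if visible else None

store = MVCCStore()
store.write("timeline", "hello")          # txid 1
snap = store.snapshot_txid()              # snapshot taken at txid 1
store.write("timeline", "hello again")    # txid 2, after the snapshot

print(store.read("timeline", snap))                    # hello
print(store.read("timeline", store.snapshot_txid()))   # hello again
```

Reads against the old snapshot keep returning the old value, even though the store has moved on, which is what lets a long-running copy see one consistent view.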

  • Copy the snapshot over to the new follower. A snapshot as implemented by MVCC is characterized by an always-increasing transaction id, txid.
  • Once the snapshot is copied over, it is time for the follower to catch up on everything that has changed since the snapshot was taken. Based on the txid present in the snapshot, the follower can connect to the leader and ask for all writes after that txid from the leader’s replication log.
  • Once it has caught up, it can behave like the other followers, i.e., receive replication logs from the leader in “real time”.
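The steps above can be sketched end to end as follows. This is a toy in-memory model under stated assumptions: the `Leader` and `Follower` classes, the `(txid, key, value)` log format, and the method names are all hypothetical, and a real system would ship the snapshot and log over the network.

```python
import copy

class Leader:
    def __init__(self):
        self.data = {}
        self.log = []    # replication log: list of (txid, key, value)
        self.txid = 0    # always-increasing transaction id

    def write(self, key, value):
        self.txid += 1
        self.data[key] = value
        self.log.append((self.txid, key, value))

    def take_snapshot(self):
        # Consistent copy of the data, tagged with the txid it reflects.
        return self.txid, copy.deepcopy(self.data)

    def writes_since(self, txid):
        return [entry for entry in self.log if entry[0] > txid]

class Follower:
    def __init__(self):
        self.data = {}
        self.applied_txid = 0

    def load_snapshot(self, txid, data):
        self.data = dict(data)
        self.applied_txid = txid

    def catch_up(self, leader):
        # Ask the leader for every write made after our snapshot's txid.
        for txid, key, value in leader.writes_since(self.applied_txid):
            self.data[key] = value
            self.applied_txid = txid

leader = Leader()
leader.write("a", 1)
leader.write("b", 2)
snap_txid, snap_data = leader.take_snapshot()   # snapshot at txid 2

# Writes keep arriving while the snapshot is being copied over.
leader.write("a", 10)
leader.write("c", 3)

follower = Follower()
follower.load_snapshot(snap_txid, snap_data)
follower.catch_up(leader)

print(follower.data == leader.data)   # True: the follower has caught up
```

Note that the leader never stops accepting writes here; the txid stored with the snapshot is what tells the follower where in the replication log to resume.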
