Synchronous Multi-Region Replication With Raft and Shared-Nothing Distributed Architecture

Denis Magda
Nov 11, 2023

How have consensus algorithms and shared-nothing distributed architecture made synchronous replication across multiple regions a viable option?

Imagine deploying a distributed shared-nothing database across three US regions: West, Central, and East. You set the replication factor (RF) to 3, so each region holds a consistent copy of your data. Importantly, you designate West as the preferred region, meaning the primary copies of the data (the Raft leaders) live on nodes in this region.

The round-trip latency between West and Central is 45 ms, and between West and East is 65 ms.

Now, consider a client in the West sending an update to a database node in the same region. The key question is how long it takes to commit this change across regions.

The round-trip latency between the client and the West node is approximately 5 ms.

The West node uses the Raft consensus algorithm to commit the update. With RF=3, a majority of the replicas, that is, two nodes in two different regions, must agree on the change. The West node accepts the change itself and sends the update to Central and East in parallel. A node in Central acknowledges the update within 45 ms.

Once this happens, the majority is reached and the update is committed. The West node does not wait for the response from East, which still receives the update and applies it shortly afterward.
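The majority rule above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular database implements Raft: the function names `majority` and `commit_point` are hypothetical, and the sketch assumes the leader counts itself at time zero and that follower acknowledgments arrive after one round trip.

```python
def majority(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def commit_point(ack_times_ms, replication_factor=3, leader="West"):
    """Return (commit_time_ms, regions_in_quorum) for a single update.

    ack_times_ms maps each reachable follower region to the time (in ms)
    at which its acknowledgment reaches the leader. Illustrative only.
    """
    needed = majority(replication_factor)
    acked = [leader]   # the leader already holds the update
    elapsed = 0.0
    # Process acknowledgments in arrival order, stopping at the majority.
    for region, arrival in sorted(ack_times_ms.items(), key=lambda kv: kv[1]):
        if len(acked) >= needed:
            break      # slower replicas are not waited for
        acked.append(region)
        elapsed = arrival
    if len(acked) < needed:
        raise RuntimeError("no majority reachable, update cannot commit")
    return elapsed, acked


if __name__ == "__main__":
    print(commit_point({"Central": 45, "East": 65}))
```

With the latencies from this example, the update commits as soon as Central answers, and East never sits on the critical path.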

Overall, the total latency is around 50 ms: 5 ms (client to West node) plus 45 ms (West node to Central node). If the Central region suffers an outage, the quorum must form between West and East instead, and the total rises to about 70 ms: 5 ms plus 65 ms (West node to East node).
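The arithmetic can be checked with a small Python sketch. The constants come from the numbers in this example; the function name `write_latency_ms` is hypothetical, and the model deliberately ignores processing and disk time, counting network round trips only.

```python
CLIENT_RTT_MS = 5                                # client <-> West node
FOLLOWER_RTT_MS = {"Central": 45, "East": 65}    # West <-> other regions


def write_latency_ms(available_followers, replication_factor=3):
    """Estimate write latency seen by a client in the West (network only)."""
    # The leader counts toward the majority, so it needs RF // 2 follower acks.
    acks_needed = replication_factor // 2
    rtts = sorted(FOLLOWER_RTT_MS[region] for region in available_followers)
    if len(rtts) < acks_needed:
        raise RuntimeError("not enough regions available to form a quorum")
    if acks_needed == 0:
        return CLIENT_RTT_MS
    # The commit waits for the slowest acknowledgment inside the quorum.
    return CLIENT_RTT_MS + rtts[acks_needed - 1]


if __name__ == "__main__":
    print("All regions up:", write_latency_ms(["Central", "East"]), "ms")
    print("Central down:  ", write_latency_ms(["East"]), "ms")
    print("East down:     ", write_latency_ms(["Central"]), "ms")
```

Under these assumptions, an East outage leaves the write latency unchanged, because East was never part of the fastest quorum.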

The crux is the preferred region: it minimizes cross-region traffic and keeps the total write latency within a 50 ms SLA while all regions are healthy. Is this write latency worth the multi-region setup? It depends. It’s certainly beneficial if:

  • You require a consistent copy of data at all times in all locations.
  • You aim for an RPO (Recovery Point Objective) of 0 (no data loss) with an RTO (Recovery Time Objective) measured in seconds for all kinds of outages.
  • You need low-latency reads (5 ms or less) across all regions.
