Tech Wrench

Tech tales, anecdotes and musings…

Consistency & Consensus for System Design Interview (3): broadcast order

--

PREV | HOME | NEXT | System Design Resource List

Don’t forget to get your copy of Designing Data Intensive Applications, the single most important book to read for system design interview prep!

Check out ByteByteGo’s popular System Design Interview Course

Introduction

Consider the following pseudocode snippet


int x = rand() // returns a random integer

if ( x mod 17 == 0) {
x = 25;
}
Grokking Modern System Design for Software Engineers and Managers

The snippet demonstrates two writes to the variable x, the first one assigns a random value to x and the second one executes if x is a multiple of 17. The following is the sequence of reads and writes that occur for the above snippet in case x is a multiple of 17.

We can say that the second write is causally related to the first write. The second write only happens after the first write. In other words, the second write is aware of the first write and the two aren’t concurrent. There is a cause and effect relationship between the two writes. In general, given any two operations A and B, the following relationships between the two are possible:

  1. A happens before B
  2. B happens before A
  3. A and B are concurrent

In distributed systems it is essential to capture the ordering and causal relationship among operations.

Interviewing? buy one month access to our number#1 course for Java Multithreading Interviews !

If the above snippet stores the value of x in a distributed database, then every replica should preserve the order of the writes otherwise some replicas can end up having the value of x set to a multiple of 17 instead of 25. Even if operations don’t have any causal relationship, the order imposed by conflict resolution must be preserved by all the replicas. In this example we stored the value of a single variable x, in practise x can be a key in a key/value store, a row in a relational database or a document in a document database such as MongoDB. In the context of distributed systems, what we are storing, e.g. x in the previous example, is referred to as a register.

Land a higher salary with Grokking Comp Negotiation in Tech.

In this course, we have already seen some examples where causal relationships are preserved. For instance, the single-leader replication preserves the order of writes in the replication log. The concept of serializability that we discussed in the context of transactions ensures that the impact of running transactions concurrently is as if they were executed serially, one by one. Similarly, the use of timestamps and clocks in distributed systems allows us to order multiple operations happening simultaneously. Another example is of serializable snapshot isolation (SSI) that prevents write-skew by tracking causal dependencies among transactions. In fact when we refer to a consistent snapshot that transactions can read from to avoid read skew or non-repeatable reads, the consistent refers to causal consistency. The snapshot in time of the database consists of all the effects of operations that were executed before the point in time when the snapshot is taken and no operations thereafter.

Ace the machine learning engineer interview with Grokking the Machine Learning Interview.

Total Order

Before we delve further into causal consistency, we’ll learn about total order. Consider the set of natural numbers { 1, 2, 3, 4, … }. If we pick any two numbers out of the set, we can compare them. We can say one is smaller or larger than the other. This is known as total order where the constituent elements can be compared against each other. Each element is either greater or smaller than any other element in the set of natural numbers (Totality). If 2 < 4 and 4 < 7 then we know that 2 < 7 necessarily (Transitivity). And finally if 3 < 5 then 5 can’t be less than 3 (Asymmetry).

Check out the course Coderust: Hacking the Coding Interview for Facebook and Google coding interviews.

Partial Order

Elements of a set will exhibit partial ordering when they possess transitivity and asymmetry but not totality. As an example think about your family tree. Your father is your ancestor, your grandfather is your father’s ancestor. By transitivity, your grandfather is also your ancestor. However, your father or grandfather aren’t ancestors of your mother and in a sense they are incomparable.

Linearizability vs Causality

A linerizable system imposes total order on all the operations that execute on it. Any two operations on a linerzable system have an order defined between them, where one occurs before the other. This is because conceptually a linerizable system behaves like a single copy of data where operations are executed atomically.

In contrast a system adhering to causal consistency imposes partial order on the reads and writes that execute against the system. Causally related operations have an order between them but concurrent operations are incomparable to each other and neither can be said to occur before/after the other.

Ace Multithreading and Concurrency questions with the number#1 resource for interview prep.

Another way to visualize the difference between a linearizable system and a causality consistent system is to think in terms of a timeline. For a linerizable system all operations occur on a single timeline at a single point in time as shown below:

The system may receive multiple requests at the same time but all of them are handled in a total order along a single timeline without concurrency. In contrast, in a causally consistent system, the timeline branches into multiple lines that merge later. The operations executed on these parallel lines are incomparable and concurrent.

Get a leg up on your competition with the Grokking the Advanced System Design Interview course and land that dream job!

If you are familiar with Github, the above depiction is much like branches that are taken out from the main (or master) branch by different folks to work concurrently and then eventually merged back to the main (or master) branch.

Grokking the Coding Interview: Patterns for Coding Questions

It should also be noted that linearizability implies causality, i.e. a linearizable system also preserves causal relationships and thus is causally consistent. Linearizability comes at a cost which increases as network delays in a system increase. Most applications can do away with linearizability anyway and only require causal consistency, which can be achieved by a distributed system even in the face of network delays. In fact, causal consistency is the strongest form of consistency that isn’t affected by network delays and remains available with network failures.

If you are interviewing, consider buying our number#1 course for Python Multithreading Interviews.

Your Comprehensive Interview Kit for Big Tech Jobs

0. Grokking the Machine Learning Interview
This course helps you build that skill, and goes over some of the most popularly asked interview problems at big tech companies.

1. Grokking the System Design Interview
Learn how to prepare for system design interviews and practice common system design interview questions.

2. Grokking Dynamic Programming Patterns for Coding Interviews
Faster preparation for coding interviews.

3. Grokking the Advanced System Design Interview
Learn system design through architectural review of real systems.

4. Grokking the Coding Interview: Patterns for Coding Questions
Faster preparation for coding interviews.

5. Grokking the Object Oriented Design Interview
Learn how to prepare for object oriented design interviews and practice common object oriented design interview questions

6. Machine Learning System Design

7. System Design Course Bundle

8. Coding Interviews Bundle

9. Tech Design Bundle

10. All Courses Bundle

--

--

Tech Wrench
Tech Wrench
SystemDesign
SystemDesign

Written by SystemDesign

The ultimate Poor man’s system design interview prep guide -- https://systemdesign.medium.com/membership

No responses yet