Vector Clocks

Sriram R
2 min readFeb 27, 2023

--

The main limitation of Lamport Clocks was that they were unable to assist us in determining the causal relationship between two events.

If you’re not sure what a causal relationship is, read my article on Message Ordering first.

The reason for this is that we had a single timestamp that was being updated globally, which did not provide us with a complete picture of the sequence of events that occurred.

Vector Clocks

We don't keep a single timestamp for each node. Instead, we keep a list of timestamps, one for each node in each node.

For example, if we have a three-node system, the vector timestamp would look like [ T1, T2, T3 ], where T1 is the logical time of Node 1, T2 is the logical time of Node 2, and T3 is the logical time of Node 3.

The entire list forms a single logical clock, known as vector clocks.

How do Vector Clocks work?

  1. When a node boots up, it determines the number of nodes in the cluster (N) and fills an array of size N with 0's.
  2. Before executing an event, the node increments the clock of that node’s time in the list. N[i] = N[i] + 1
  3. When a message is sent, the time for that node is incremented, and the vector clock is attached to the message.
  4. When a message is received,
    1. It performs the action
    2. Update each list element by taking the maximum of that element’s own list and the list received.

We say Event A caused Event B if A’s vector timestamp is strictly less than B’s vector timestamp.
We can also say that Event A and Event B are concurrent if their timestamps cannot be compared.

Pseudocode

on initialisation at node N[i] do
T := [0,0,..0] //One element for each node
end on

on any event occuring at node N[i] do
T[i] := T[i] + 1
end on

on request to send message m from node N[i] do
T[i] := T[i] + 1
send_message(m, T)
end on

on receiving message (m, T') at node N[i] do
T[j] := max(T[j], T'[j]) for j in (1..n)
T[i] := T[i] + 1;
process_message(m)
end on

Issues with Vector Clocks

The main problem with vector clocks is that they require a large amount of storage because each node must store N timestamps based on the number of nodes.

Furthermore, as the number of nodes increases, we will be sending a significant amount of data because the entire vector clock with all node timestamps must be sent even if only two nodes are interacting.

--

--

Sriram R

A software engineer on the way to a jack-of-all-trades.