# Global Snapshot, Chandy Lamport Algorithm & Consistent Cut

A Global Snapshot or a global state consists of local states of each process in the distributed system along with the in-transit messages on communication channels. The snapshot problem in a distributed system tries to determine the “current” snapshot of the global state without stopping the system. A Global snapshot is important, in many scenarios such as:

• determine safety points for a distributed database
• termination of an algorithm
• garbage collection of objects
• to find out the current load of a distributed system

Two major problems in determining a global snapshot are:

1. One cannot catch all processes at the same time
2. Messages that are on the way cannot be seen

The classical algorithm that is used to determine a global snapshot in a distributed system is the Chandy-Lamport Global Snapshot Algorithm, 1985.

The assumptions of the algorithm are as follows:

• Neither channels nor processes fail, communication is reliable
• Channels are unidirectional and provide FIFO-ordered delivery
• There is a communication path between any two processes in the system
• Any process may initiate the snapshot algorithm
• The snapshot algorithm does not interfere with the normal execution of the processes
• Each process in the system records its local state and the state of its incoming channels
• Global state is collected in a distributed manner

The snapshot algorithm works using marker messages. The marker message is a special control message and not an application message. It does not interfere with application messages. The algorithm has three phases: Initiation, Propagation and Termination and works as follows:

## 1. Initiating a Snapshot

• Process Pi initiates the snapshot.
• The initiator, Pi records its own state and creates the marker message.
• Pi then forwards the marker message to all other processes using the using N-1 outbound channels Cij(j = 1 to N & j != i).
• Pi starts recording all incoming messages from channels Cji (j = 1 to N & j != i).

## 2. Propagating a Snapshot

If a process Pj receives a Marker message on an incoming channel Ckj, and if this is the first Marker that Pj is seeing:

• Pj records its own state and mark Ckj as empty.
• Pj then forwards the marker message to all other processes using the N-1 outbound channels.
• Pj starts recording all incoming messages from channels Clj for l not equal to j or k.

If the process Pj has already seen a Marker message:

• Mark the state of channel Ckj as all the messages that have arrived on it since the recording was turned on for Ckj.

## 3. Terminating a Snapshot

The algorithm terminates when:

• All processes have received a marker (which implies that process state has been recorded)
• All processes have received a marker on all the N-1 incoming channels (which implies that state of all channels has been recorded)
• Later, if needed, the central server collects all these partial states to build the global snapshot.

## Consistent Cut

A cut in a space-time diagram is a line joining an arbitrary point on each process line that slices the space‐time diagram into a PAST and a FUTURE. A consistent global state corresponds to a cut in which every message received in the PAST of the cut was sent in the PAST of that cut. Such a cut is known as a consistent cut. A consistent cut obeys causality and any run of the Chandy-Lamport Global Snapshot algorithm creates a consistent cut.

References:

--

--