Understanding the Concept of Virtual Time Using the Time Warp Algorithm
What is virtual time and why do we need it?
As distributed systems have progressed and been adopted over the last decade, there have been numerous technologies in different segments like databases, caches, message queues, etc which are built on top of other frameworks which abstract away the difficulty of managing distributed systems. One of the most important and difficult things to manage in distributed systems is managing synchronicity using time.
Some common forms of synchronization techniques in distributed systems are block-resume, abortion-retry, lookahead-rollback. This post will cover the lookahead-rollback used by the Time Warp algorithm since, though not intuitive, it is leads to elegant and efficient solutions when the cons are compared for each of the techniques.
A virtual time system is a distributed system executing in coordination with an imaginary virtual clock that ticks virtual time. Virtual time itself is a global, one-dimensional, temporal coordinate system imposed on a distributed computation; it is used to measure computational progress and to define synchronization.
The main purpose is to have a single virtual time (which may also be in sync with the real time) across the system so that processes can always operate in an opaque manner when in actuality, it is an unpredictable entity.
All processes communicate with each other (either locally or remotely) via messages which mainly consist of 4 primary fields: sender, virtual send time, receiver, virtual receive time.
Some fundamental rules which must be observed for virtual time:
The virtual send time of each message must be less than its virtual receive time.
The virtual time of each event in a process must be less than the virtual time of the next event at that process.
It should also be noted that the virtual times of any event A and B should follow the above rules only if there are events which directly or indirectly have causality amongst A and B.
Lamport’s logical Clock
Lamport was one of the first to show that real time temporal order, causality between events had a strong connection to the concepts of relativity. He provides an algorithm which assigns ordered clock values to events once the execution of a distributed system starts. The Time Warp algorithm is an inverse of the Lamport alogrithm where they mainly have the time assigned to the event and rollback if any discrepancy is found
Reed came up with the concept of pseudotime which seems similar to virtual time but is different in the sense that pseudotime where events are assigned multi-version timestamps which are used for concurrency control to understand the atomic time of the event in distributed systems where virtual time is more relative in nature such that it mainly deals with events which have happened before or after in comparison to others. Reed uses abortion-retry for his algorithm where there may be starvation, unlimited retries, deadlocks, etc.
Schneider’s algorithm mainly consists of broadcasting also synchronized messages to all processes and not proceeding till acknowledgements are received from each of them. Keeping all synchronized events in their local memory, processes are able to make decisions locally about the order of events. However, the algorithm doesn’t scale with the requirements of today’s scenario where the broadcast messages and acknowledgements needs to be performed for tens of thousands of servers for millions of messages.
Time Warp algorithm
Each process maintains its own virtual clock which is changed only between events. Each process has 3 queues; the input message queue, output message queue and the state queue. Keeping in mind the virtual time rules mentioned in the previous section, the each event in the local queue of the process should always be in increasing order.
There are extremely probable cases where events with virtual time lesser than the current time in the process may arrive, which violates the rules of virtual time. In such cases, the time warp algorithm follows the lookahead-rollback algorithm, which results in rolling back to the point where the incoming message fits appropriately with the processes’ virtual time and remaining messages have to be replayed in the correct sequence.
The rollback of messages cannot be narrowed down to a single processor alone since the messages sent in the wrong sequence also affects other processes which maintain their own queues. Hence, there has to be a rollback for the error messages in other processes as well which are connected to each other. Other processes which don’t have any direct or indirect connection to the message with error sequence won’t be affected, thus leading to a lesser network rework effect.
For rollbacks, there is a concept of antimessages in the system which can also be used for other purposes. All fields of messages and antimessages are the same except for one field, which is the sign of the message. All messages which have been sent to other processes have a (+) sign and their antimessages have a (-) sign. Whenever a message is sent, the message is stored in the receiver’s input queue and the antimessage is stored in the sender’s output queue.
Whenever a message and antimessage exist in the same queue, both the messages cancel each other out and are therefore removed.
The messages and antimessages are created together and can exist in different queues.
Coming back to the rollback case where we had to rollback the sent messages to other processes to maintain the new correct order, the process generates and sends antimessages for all the incorrect messages sent to other processes which do the same locally as well, thus leading to ripple effect of efficient rollbacks. Even in cases where the antimessage arrives before the actual error message in another process, both the messages will be annihilated since the same queue cannot have both the messages ultimately bringing all processes to the correct order.
As we go through the above mechanism, we need to understand that all the steps listed above are happening in the local context of a process, which doesn’t have knowledge of a global virtual clock yet. Not having a global value doesn’t allow efficient memory management of the queues since we see that messages have to be kept in their queues indefinitely which can be a problem at scale.
We introduce the concent of Global Virtual Time (GVT), which at real time r, is the minimum of the following
- All virtual times in all virtual clocks at r
- Virtual send times which have been sent but have not been processed/received at time r.
From the above definition, it becomes evident that at any point in time in entire system, the GVT will also be the lowest value or the floor value in the system. The GVT symbolizes that the messages below the GVT have been processed in the correct order and can be forgotten about, which the local processes can use to clean up their queues.
The delay in calculating the GVT is the total delay in sending a broadcast message to the processes.
A common feedback of the Time Warp algorithm contains the argument that rolling back across thousands of processes may not be feasible in the real life use cases. However, there is a point to be made that the rollbacks are termed as exceptions in real life use cases and not the norm. Since processes folllow the temporal locality principle, it makes more sense that events would arrive in the actual order and the events which arrive in the past arrive in the recent past which means the rollbacks are lesser.
The only alternative to lookahead/rollback is for the process to be blocked (i.e., doing nothing) for the same length of real time as the lookahead computation, which is just as much of a “waste.”
Virtual time is strongly analogous to Virtual Memory, the same concept of memory management where the most optimal pages are kept in the main memory. There are numerous efficient algorithms which try to determine the pages which are kept in the main memory, where lookahead section of data files prefetch a few pages in blocks and keep as it’s the usual trend of the user. In cases where a required page is not found in the main memory, there is a page fault which swaps out the least optimal pages from main memory and replaces it with the required page which is again similar to the rollback mechanism. The analogy can be extended in various ways to suggest that virtual time can be implemented in an elegant and efficient manner in distributed systems.
I hope you liked the article. Please let me know if you have any queries regarding the article. Happy reading!!