Acing Asynchrony

Gary Blair · Published in CodeX · 10 min read · Jul 15, 2023

In Werner Vogels’ 2022 AWS re:Invent keynote, he used a Matrix parody to demonstrate how the world is asynchronous and that synchrony is an illusion.

This is a recurring topic in software development. Such as Gregor Hohpe’s classic article “Your Coffee Shop Doesn’t Use Two-Phase Commit”.

So what is this all about? Why is our world asynchronous and what does asynchrony really mean? And why do we persist with this illusion of synchrony?

The Meaning of Asynchrony

The word asynchronous, from its Greek and Latin roots, means not happening at the same time. Another definition is something that is not synchronised. Now in some contexts, this is incredibly simple to explain.

A brass band plays in synchrony. A flock of birds moves in synchrony. Or take Queen at Live Aid. Where everyone is singing and clapping in unison. This is synchrony.

Asynchrony is the opposite. Clapping out of sync. Singing the wrong verse in the song. Birds that collide. Or fly and disperse across the sky.

In computing, there are a number of uses of the term asynchrony. Such as asynchronous processing or tasks, asynchronous programming (e.g. NodeJS), and asynchronous communication. All of which are a source of confusion.

Let’s clarify this by analysing what synchronisation each achieves. Using our Queen analogy: which two or more things do they bring into alignment at a point in time?

Let’s start with something we have grown to harness more and more.

Time

Tick! Tock! Rhythm and regularity. Central to the development of our civilisations has been our use of time.

The year – our orbit of the sun. The day – the rotation of the earth. From the sensing of the cycle of the seasons. To regulating our activities around the rising and setting of the sun.

The first measurement of the day came in ancient times with the development of sundials. Mechanical clocks arrived in medieval times to track the hours for daily prayers. And of course the use of bells to signal those passing hours.

Then precision timekeeping with pendulums.

Followed by more widespread adoption in the Industrial Age.

Coordinating the masses that came to work in factories. Where workers faced strict fines for lateness. So they hired “knocker uppers” who would wake them up in time for their work shift.

The growth of railways and the need to schedule. To avoid accidents. To get people where they needed to be, when they needed to be there.

Military use of the wristwatch in the First World War. To coordinate artillery with mass infantry assaults.

And now in the era of the global information age. The humble alarm clock. The Alexa timer. The Outlook meeting reminder. And the myriad of push notifications on electronic devices throughout our homes. Which constantly interrupt us and cause continual distraction. At the heart of each and every one of these devices is a computer which relies on its clock.

Time has become more critical as our world has become more complex. As we have become more interdependent on each other. Creating a growing need to coordinate, prioritise and schedule things that we do relative to each other. To synchronise our activities. But if synchronising is so important, why all the fuss over asynchrony?

The Illusion of Synchrony

One of the fundamental underpinnings of the way our world works and our ability to understand it is causality. Cause and effect. Basically, certain things need to happen before others if we want things to function.

Think of performing a simple task. Something we reduce to a set of steps. A process or a sequence if you will.

Take my breakfast routine. I get out a bowl. Then the porridge. The milk. I measure the porridge into the bowl. Then the milk. Heat it in the microwave. Now I wait. Finally, I can remove the heated porridge from the microwave. Get my spoon and start eating.

All very simple.

Or is it?

Clearly, I needed the porridge in the bowl before I heated it in the microwave. And I will not be happy eating a breakfast which is just an empty bowl. So there are some causal links in order for breakfast to function. But I could have gotten the porridge out before the bowl. I could have put the milk in the bowl before the porridge. Or started it all off by getting out the spoon. So there are potentially many subtly different sequences that would all result in the same outcome.

Another point is that there is a difference between thinking of a task and executing that task in the real world. When we think of a task our brains work with limited working memory. So in order to limit the cognitive load it is easiest to think of a task as isolated and as a very simple sequence. Do step 1. Think of nothing else. Then step 2. Think of nothing else. And so on.

But by exposing the task to the real world, it is no longer isolated. Taking my breakfast example: whilst my porridge heats in the microwave, I may read the news on my phone. We have started another task. Then halfway through reading the news I am interrupted by the microwave pinging to let me know my porridge is ready. Then as I start to eat my breakfast, my daughter appears and demands porridge too. Another competing task! Our breakfast task is still a sequence relative to itself. But it is interwoven unpredictably, through interruptions, with other tasks running at the same time.
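To make this concrete, here is a minimal sketch in TypeScript of the morning as two interleaved tasks. All the names (makeBreakfast, readNews) and timings are invented for illustration; the microwave is simulated with a timer.

```typescript
// A minimal sketch: two tasks, each a sequence relative to itself,
// interleaving unpredictably on the event loop.
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function makeBreakfast(): Promise<void> {
  console.log("porridge and milk into the bowl, into the microwave");
  await sleep(2000); // the microwave runs; we are free to do other things
  console.log("ping! porridge is ready");
}

async function readNews(): Promise<void> {
  for (let i = 1; i <= 3; i++) {
    console.log(`reading news story ${i}`);
    await sleep(700); // yield between stories, so interruptions can land
  }
}

// Both tasks are "running at the same time"; their steps interleave.
Promise.all([makeBreakfast(), readNews()]).then(() => console.log("breakfast eaten"));
```

Run it and the news stories print while the porridge heats: each task keeps its own internal order, but the overall order of output is interwoven.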

When you first learned to code it was probably basic logic. A small block of sequentially executed code.

But then you extended to I/O. A time-consuming write to a database. An HTTP request to an external system. Waiting for a user to press a button on the UI. Now you encounter multitasking. Interprocess communications. Or distributed systems. It becomes complicated and unpredictable.

Or maybe you started with JavaScript and asynchronous programming. But then found callbacks difficult to follow. So now you use promises and async-await so that your code reads synchronously again.
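As a minimal sketch of that journey (fetchUser and fetchOrders are hypothetical stand-ins for real I/O, simulated here with timers):

```typescript
// 1. Callback style: the causal chain is real, but nesting deepens
//    with every dependent step.
function fetchUser(id: string, done: (user: { id: string }) => void): void {
  setTimeout(() => done({ id }), 100); // simulate I/O latency
}
function fetchOrders(user: { id: string }, done: (orders: string[]) => void): void {
  setTimeout(() => done([`order-for-${user.id}`]), 100);
}

fetchUser("42", (user) => {
  fetchOrders(user, (orders) => {
    console.log("callbacks:", orders);
  });
});

// 2. async-await style: still asynchronous underneath, but it reads
//    like the synchronous sequence we are naturally drawn to.
const fetchUserP = (id: string) =>
  new Promise<{ id: string }>((resolve) => setTimeout(() => resolve({ id }), 100));
const fetchOrdersP = (user: { id: string }) =>
  new Promise<string[]>((resolve) => setTimeout(() => resolve([`order-for-${user.id}`]), 100));

async function main(): Promise<void> {
  const user = await fetchUserP("42");
  const orders = await fetchOrdersP(user);
  console.log("async-await:", orders);
}
void main();
```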

We will always be drawn to sequence because of its concise representation of causal links. Its simplicity in defining one true way. This leads us to the illusion of synchrony. Thinking of anything which is remotely complex in a pre-determined non-interruptible order. Where order comes for free.

The real world is a lot messier than that. We cannot abstract away the idiosyncrasies of asynchrony. We cannot always block. We often need to prioritise. We must accept this means we need to think constantly about how to order and synchronise. To maintain the causal links of our tasks so they serve the function they intended.

Asynchronous Processing or Tasks

The first confusion about asynchrony is that it is defined by what it is not. So first we must understand synchrony.

In the context of asynchronous processing or tasks, synchrony means that the specific actions of a task are executed immediately after each other from start to finish without interruption. There is continuity to the execution.

Asynchrony in contrast has interruptions. To avoid blocking or allow reprioritisation. So it does not have the same continuity in time.

So what multiple things does it synchronise at the same time?

A task is a sequence of instructions so there is no intention to synchronise any of these — they need to occur successively. There is definitely coordination but no obvious synchronisation. Pedantically we could say it does not quite meet the definition.

Nevertheless, this is the synchrony that Werner Vogels parodies.

In asynchrony, tasks stop-start. They recommence as a consequence of an event. We are notified of this event. This is the event-driven system.

Where the causality of the task is not represented as a simple sequence. But instead by always being in a current context. Where the incoming event causes an effect, the task evolves and potentially transitions to the next step. The state machine.
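A minimal sketch of that idea, with invented states and events for the breakfast task: the task holds a current state, and each incoming event causes an effect and possibly a transition.

```typescript
// A task as a state machine: no fixed sequence, just a current state
// that evolves as events arrive. States and events are illustrative.
type State = "waiting" | "heating" | "eating" | "done";
type Evt = "start" | "microwavePing" | "bowlEmpty";

const transitions: Record<State, Partial<Record<Evt, State>>> = {
  waiting: { start: "heating" },
  heating: { microwavePing: "eating" },
  eating: { bowlEmpty: "done" },
  done: {},
};

function onEvent(current: State, event: Evt): State {
  const next = transitions[current][event];
  if (!next) return current; // event is irrelevant in this state
  console.log(`${current} --${event}--> ${next}`);
  return next;
}

let state: State = "waiting";
state = onEvent(state, "start");         // waiting --start--> heating
state = onEvent(state, "bowlEmpty");     // ignored: not valid while heating
state = onEvent(state, "microwavePing"); // heating --microwavePing--> eating
state = onEvent(state, "bowlEmpty");     // eating --bowlEmpty--> done
```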

This is more representative of the reality we live in. We are not a closed system. We are open to the wider world. We constantly interact with it.

I can fulfil the roles of father, brother, son, friend, etc. I fulfil each of these roles discontinuously depending on who I interact with. We are a scarce resource — constrained by time. So in order to deal with our complex world we must multitask, interrupt, and prioritise.

This is why the world is asynchronous.

But that does not mean it is easy. It means constantly context-switching between tasks. A continuous inefficiency. It means managing tasks that are in flight. Ensuring that interrupted tasks are restarted. This creates challenges in designing software systems. But it also creates challenges for managing ourselves.

Take traditional project management. Where a project manager wields an action list with assignees and expected dates. With regular status meetings. Synchronising the asynchronous.

Or Lean/Agile. Where WIP or in-flight tasks are reduced. Because it realises investment earlier. Reduces management cognitive load, context switching and handoff delays. So remove the sources of interruptions. Make external dependencies internal by structuring for a cross-functional stream-aligned team. Alleviate priority interrupts by proactively prioritising. Eliminate a competing failure-demand work stream by building in quality. To all intents and purposes, aspiring to synchrony of a sequence of small focussed tasks.

This highlights a trade-off. Synchrony focuses on the task. Because the task cannot be interrupted, the elapsed time to complete the task is optimised. Asynchrony focuses on the resource. Because we never block resources, resource utilisation is optimised.

The right choice depends on context.

We can favour the utilisation of critical compute resources by performing asynchronous I/O.
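For instance, a sketch with two simulated queries: fired concurrently, the process is never blocked waiting on either one, and the total elapsed time is roughly the slowest latency rather than the sum.

```typescript
// Simulated I/O call: resolves after a delay, as a database query might.
const query = (name: string, ms: number) =>
  new Promise<string>((resolve) => setTimeout(() => resolve(name), ms));

async function report(): Promise<void> {
  // Both queries are in flight at once; we only wait for the slowest.
  const [users, orders] = await Promise.all([
    query("users", 100),
    query("orders", 150),
  ]);
  console.log(users, orders); // total wait ~150ms, not ~250ms
}
void report();
```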

Or we can question the logic of pull requests for code review, which are clearly asynchronous. They may appear convenient as they allow developers to keep busy. But do they optimise the delivery? Would a synchronous alternative better serve the greater good? Such as pair programming or simply a synchronous commit.

The illusion of synchrony is not that synchrony is always bad. But we overuse it when our world is asynchronous. Both have their pros and cons.

Asynchronous Communications

Another use of asynchrony in computing is with communications.

So what multiple things does it synchronise at the same time?

In the classic telecommunications definition, synchronous comms would include sending a clock signal, which allows the transmitter and receiver to synchronise their rates in relation to a common clock.

But in modern software development, it is less about synchronisation and more about coordinating the request and response to happen contiguously.

So no synchronisation again? Not unless we look past that exchange. Then we find a very important synchronisation.

State synchronisation

In the context of a task, communication is a way for multiple entities to synchronise their shared understanding of the progress of the task and its current state.

Back to Queen at Live Aid.

The key to their ability to synchronise with the crowd is their proximity. They are all within earshot. That instantaneous and continuous feedback allows them to clap in rhythm and sing along to the song.

This is a form of synchronous communication. Like a face-to-face conversation. Or a mob programming software team.

So is a FaceTime call synchronous? We certainly class it as such. Ever FaceTimed someone on the other side of the planet? There is a lag. That lag can lead to multiple strands of conversation and confusion. This brings us back to proximity.

To synchronise in the purest sense we must consider not just time. But space-time. In Einstein’s theory of special relativity, this is called the relativity of simultaneity. Because light takes a finite time to cover a distance in space, it is not possible to define simultaneity with respect to a universal clock shared by all observers.

We see echoes of this in the CAP theorem. If you do not partition, you can be available and consistent (i.e. you are a single task scheduler and get state synchrony by default). If you do partition, you can be always available but not always consistent (the distribution creates a lag in consistency — you are eventually consistent). Or you can be always consistent but not always available (because you are waiting on state consistency to be synchronised between schedulers).

So the ultimate proximity is for a task to be scheduled and executed by a single entity. In this case, state synchronisation comes for free.

The next best thing is to be close enough in space so that you can synchronise in near real-time. Just like Queen and the crowd at Live Aid.

In reality, a dynamic environment can never be perfectly in sync. A bit like a gardener saying there will never be a weed in their garden. Even at Live Aid, as good as it sounds, there would be more than one person out there singing and clapping completely out of sync. Rhythm does not come to us all.

Software solutions are no different. Partly because we utilise multiple processes to solve problems.

But also because we represent pieces of data in multiple forms. Working memory versus persistence. Caching. Replication.

We expect some level of asynchrony and data inconsistency. The split-brain issue.

At certain levels, you can design to eliminate it. Such as the DDD aggregate. Where you design for proximity (i.e. cohesion) by identifying a consistency boundary.
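A toy sketch of such a boundary (the Order aggregate and its invariant are invented for illustration): everything inside it changes together, so it can never be observed half-updated.

```typescript
// A toy DDD aggregate: the order lines and the running total live
// inside one consistency boundary and are only changed together.
class Order {
  private lines: { sku: string; price: number }[] = [];
  private total = 0;

  addLine(sku: string, price: number): void {
    if (price < 0) throw new Error("price must be non-negative"); // invariant guard
    this.lines.push({ sku, price });
    this.total += price; // state inside the boundary stays consistent
  }

  getTotal(): number {
    return this.total;
  }
}
```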

But in many cases, it is more about recognising that perfect is not possible. Inconsistency will happen. What are the potential consequences? How can we mitigate this? Ensuring idempotence. Implementing compensating actions.
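For example, a message handler can be made idempotent by remembering which message IDs it has already processed, so a redelivered duplicate has no further effect. A sketch (the message shape and in-memory store are invented; a real system would persist the IDs):

```typescript
// Idempotent handler: replaying the same message has no further effect.
const processed = new Set<string>();

function handlePayment(message: { id: string; amount: number }): void {
  if (processed.has(message.id)) {
    console.log(`duplicate ${message.id} ignored`); // safe to redeliver
    return;
  }
  processed.add(message.id);
  console.log(`charging ${message.amount}`); // the real effect happens once
}

const msg = { id: "payment-1", amount: 9.99 };
handlePayment(msg);
handlePayment(msg); // redelivered after a timeout: ignored
```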

Take supermarkets. When you shop in-store, the customer and product stock are synchronised because the customer physically puts the item in their basket. But when you order online, it could be scheduled a couple of weeks in the future. Stock is incredibly dynamic. So instead of aiming for the unrealistic goal of guaranteed delivery of everything, mitigate with a substitution policy.

With asynchronous communication, you are assuming eventual consistency. Such as with pub/sub messaging.
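In Node, the shape of this is familiar. An in-process sketch with EventEmitter (a real system would use a broker, which delivers with a lag; that lag is exactly where eventual consistency comes from):

```typescript
import { EventEmitter } from "node:events";

// An in-process stand-in for a message broker.
const bus = new EventEmitter();

// The subscriber refreshes its own view of the state when the event arrives.
bus.on("stock-changed", (item: string, qty: number) => {
  console.log(`subscriber's view: ${item} = ${qty}`);
});

// The publisher fires and moves on; it does not wait for subscribers.
bus.emit("stock-changed", "porridge", 12);
```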

Or looking at how we as people communicate. Email, Slack, or even post.

In real life, I could agree to meet a friend at our usual meeting place. But then it is raining so I message them to meet elsewhere. Until that message is delivered and read, they may be sitting around waiting in the rain. This is the danger of eventual consistency.

Recognising that inconsistency worsens over time, it is often about assessing the risk to decide on the cadence of synchronisation.

Such as time-to-live in cache invalidation.
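A minimal sketch of a TTL cache: each entry carries an expiry time, and the risk assessment is precisely the choice of ttlMs.

```typescript
// A minimal time-to-live cache: the TTL is the chosen cadence of
// re-synchronisation with the source of truth.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(private ttlMs: number) {}

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.entries.delete(key); // stale: force a refresh from the source
      return undefined;
    }
    return entry.value;
  }
}

const cache = new TtlCache<number>(5000); // accept up to 5 seconds of staleness
cache.set("stock:porridge", 12);
console.log(cache.get("stock:porridge")); // 12, until the TTL elapses
```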

Or in software development the daily standup.

Or the tradeoff between excessive branching and continuous integration. We do like to keep this illusion of being in our own little worlds. But in reality, there is only one. One we all need to synchronise upon.

Summary

The world is asynchronous, and if we try to frame it all as synchronous then that is an illusion.

That is not to say we should not use synchrony where we can. For it achieves things quicker, with better coordination, and in a way that is more intuitive for us to understand.

But ultimately in a world of scarce resources and complex demand, we often need to accept asynchrony. Think asynchronously. Despite the confusion, inefficiency and inconsistency it creates.
