The Problem Of Async Programming, And A Crazy Idea For Solving It

[Image: breaking some basic rules of traditional programming in order to move forward to the next generation]

The Problem

If you have done any non-trivial asynchronous programming recently, for example some backend development with NodeJS, you have quickly come face to face with a serious problem infesting this beloved programming paradigm: your code quickly turns into a hard-to-read and even harder-to-debug tangle of code pieces (usually wrapped inside anonymous functions) that connect to each other in mind-boggling ways. In other words, you have most definitely either produced, or been forced to work on, code infested with callback hell:

Promise.all([
  doA().then(aOut =>
    new Promise(resolve => {
      doC(aOut).then(cOut => {
        resolve([aOut, cOut]);
      });
    })),
  doB().then(bOut =>
    new Promise(resolve => {
      doD(bOut);
      resolve(bOut);
    }))
]).then(values => {
  doE(values[0][1], values[0][0], values[1]);
});

Callback hell is pretty easy to spot, yet it is rampant and often seems unavoidable. No one seems to be fond of it, so it is safe to assume that no one would create it on purpose, which implies that it is merely a manifestation of an underlying cause. But if that assumption is true, what would that cause be?

Well, callback hell does not infest good old synchronous code at all, so we can try to pinpoint its cause by taking a deeper look at the core difference between sync and async code:

let aOut = doA();
let bOut = doB();
let cOut = doC(aOut);
doD(bOut);
doE(cOut, aOut, bOut);

This is a synchronous counterpart of the aforementioned async code. It yields the same results; the key difference lies in its execution order (and, consequently, its performance):

In the sync code, doC() is executed after both doA() and doB() are complete, although it only needs doA() to complete and not doB(). Similarly, doD() waits unnecessarily for doA() and doC() to finish, and doE() waits for doD() although it does not need to.

The reason for these undesired “waits” in the sync code is simple: sync code is executed sequentially. Statements in a piece of sync code are executed one after another, exactly in the order they are written. If we were to describe our sync code in plain English, we would say something like “doA() then doB() then doC() then doD() then doE()”, and if we were to represent this flow using a box for each statement and an arrow for each “then”, we would get something like this:

Since our code is text-based, it inherently comes with an “order” for the statements, i.e. which statement is written after which, so the “arrows” become implicit and redundant, and we omit them from our code.

Similarly, our async code in English would be along the lines of “doA() then doC(); simultaneously doB() then doD(); and after doA(), doB() and doC() are done, doE()”, which in terms of “boxes” and “arrows” would translate to:

Looking at this representation, the problem with our async code instantly becomes evident: there is an inherent order to the statements (simply because they are written down, and each one has to follow another) that, in the case of async code, is no longer the execution order. So we need extra indicators (like Promise.then callbacks in the code, or the “then” arrows in the figure above) to explicitly indicate that separate order. In other words, an essential aspect of our “code” (that it is an ordered sequence of statements) loses its meaning, and we need to communicate that meaning explicitly by other means, which, as you can see, results in a lot of extra complexity and hardship.

Note that it is not just that the order of statements is misaligned with the execution order; if that were the case, a simple reshuffling of our “statement” boxes would have solved the problem. The problem is that our execution flow is no longer a “sequence”, which means we cannot effectively represent it using any “sequence”, in particular the sequence in which the statements are written down.

So does this mean that async code is inherently more complex, and that the callback hell problem can therefore never be solved? By no means. Async code is inherently more complex, but that does not imply the problem is unsolvable. In fact, a lot of solutions have already been proposed and implemented in various languages and frameworks, so it’s worth taking a look at the most prominent ones to see whether they truly tackle the problem.


Promises

Promises were the first solution introduced to the JavaScript community for the aforementioned problem. They were initially offered as part of some packages, and were so cherished by the community that they were adopted into the language itself. Before Promises, async code used to look like this:

doA((err, aOut) => {
  // ...
});

This structure resulted in terribly messy code, as a statement that simply invokes an async function would include the whole callback chain, i.e. you could not single out function invocations without reading through the entire flow. Additionally, there are many async flows that are very hard to describe with this structure alone. For example, you can try to rewrite our initial async code in terms of callback arguments and see how it works out.
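For the curious, here is roughly what such a rewrite could look like. It is only a sketch: the doA() to doE() bodies below are made-up stand-ins that follow the Node-style error-first callback convention, and the `later`, `run`, `pending` and `branchDone` names are hypothetical. Notice that the point where both branches must meet before doE() has no direct expression in callback arguments, so it has to be tracked by hand with a counter.

```javascript
// Hypothetical error-first callback stand-ins for doA() .. doE().
const later = (value, cb) => setTimeout(() => cb(null, value), 10);
const doA = cb => later('a', cb);
const doB = cb => later('b', cb);
const doC = (aOut, cb) => later(aOut + 'c', cb);
const doD = (bOut, cb) => later(bOut + 'd', cb);
const doE = (cOut, aOut, bOut, cb) => later([cOut, aOut, bOut], cb);

function run(done) {
  let aOut, bOut, cOut;
  let pending = 2;      // two branches must finish before doE() may run
  let failed = false;

  // Report only the first error to the caller.
  const fail = err => {
    if (!failed) {
      failed = true;
      done(err);
    }
  };

  // Called at the end of each branch; the last one to arrive triggers doE().
  const branchDone = () => {
    if (failed || --pending > 0) return;
    doE(cOut, aOut, bOut, done);
  };

  // Branch 1: doA() then doC().
  doA((err, a) => {
    if (err) return fail(err);
    aOut = a;
    doC(aOut, (err, c) => {
      if (err) return fail(err);
      cOut = c;
      branchDone();
    });
  });

  // Branch 2: doB(), then doD() fired without waiting for it,
  // mirroring the Promise-based version.
  doB((err, b) => {
    if (err) return fail(err);
    bOut = b;
    doD(bOut, () => {});
    branchDone();
  });
}

run((err, result) => console.log(err || result));
```

Even for five calls, most of the code is bookkeeping for the flow rather than the flow itself.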

However, Promises cannot truly solve the problem, as they do not even truly tackle it. Our initial example of callback hell was not written in terms of callback arguments but in terms of Promises. Compared to our graphs, Promises are basically a much neater way to represent the “arrows”, so they naturally cannot do anything about the complex tangle we have in the graph itself.


Async/Await

async/await is the newest attempt to solve this problem (at least for JavaScript). The way it works is pretty simple: await makes the rest of the function body “wait” for the awaited statement to finish, with the catch that the whole function must now be marked with async, meaning that it is an asynchronous function.

In practice, this gives us flow graphs extremely similar to those of fully sync code, since, just like in sync code, the statements following an await need to “await” its completion. These are the very same unnecessary waits that distinguished our sync code from its async counterpart. The only difference is that the whole context now becomes async, which allows the underlying engine to do other things in the meantime.
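To make that concrete, here is a minimal sketch of the “await every statement” style. The doA() to doE() bodies are hypothetical Promise-returning stand-ins that each take about 50 ms (`task` and `sequential` are made-up names too), so the numbers only illustrate the shape of the flow.

```javascript
// Hypothetical stand-ins: each task resolves with a value after `ms` milliseconds.
const task = (value, ms) => new Promise(resolve => setTimeout(() => resolve(value), ms));
const doA = () => task('a', 50);
const doB = () => task('b', 50);
const doC = aOut => task(aOut + 'c', 50);
const doD = bOut => task(bOut + 'd', 50);
const doE = (cOut, aOut, bOut) => [cOut, aOut, bOut];

async function sequential() {
  const start = Date.now();
  const aOut = await doA();
  const bOut = await doB();      // starts only after doA() has finished
  const cOut = await doC(aOut);  // also waited for doB(), which it does not need
  await doD(bOut);               // waited for doC(), which it does not need
  const result = doE(cOut, aOut, bOut);
  return { result, elapsed: Date.now() - start };
}

sequential().then(({ result, elapsed }) => console.log(result, `${elapsed} ms`));
```

With these stand-ins the elapsed time comes out at roughly the sum of all four delays, exactly as it would for blocking sync code, even though only two of the steps depend on a previous result.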

In other words, async/await is also not designed to tackle the problem; it’s just a tool that alleviates it when the execution flow itself is actually a “sequence”. So while it does help in simpler situations, it is again of little use wherever a truly async flow is to be developed. To see that for yourself, try to rewrite our async code snippet using only async/await statements and see how that pans out (EDIT: which is actually possible, thanks to “jmull” for a pretty neat solution for that).
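For reference, here is one way such a rewrite could go. This is only a sketch and not necessarily the solution credited above: the doA() to doE() bodies are hypothetical Promise-returning stand-ins, `flow`, `branchAC` and `branchB` are made-up names, and Promise.all is used for the final join so that a failure in either branch is reported.

```javascript
// Hypothetical stand-ins: each task resolves with a value after `ms` milliseconds.
const task = (value, ms) => new Promise(resolve => setTimeout(() => resolve(value), ms));
const doA = () => task('a', 50);
const doB = () => task('b', 50);
const doC = aOut => task(aOut + 'c', 50);
const doD = bOut => task(bOut + 'd', 50);
const doE = (cOut, aOut, bOut) => [cOut, aOut, bOut];

async function flow() {
  // Each branch is an async function that is invoked right away,
  // so both branches are already running before anything is awaited.
  const branchAC = (async () => {
    const aOut = await doA();
    const cOut = await doC(aOut);
    return [aOut, cOut];
  })();

  const branchB = (async () => {
    const bOut = await doB();
    doD(bOut).catch(console.error); // fired without waiting, as in the Promise version
    return bOut;
  })();

  // Join: doE() runs once both branches are done.
  const [[aOut, cOut], bOut] = await Promise.all([branchAC, branchB]);
  return doE(cOut, aOut, bOut);
}

flow().then(result => console.log(result));
```

The trick is that the flow had to be cut into sequential pieces by hand first: each branch is a sequence, and the join is still expressed outside of the statement order.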


The Crazy Idea

OK, since none of the aforementioned solutions target the core of the problem with async code, let’s get to the crazy idea that might solve it. Just to reiterate, our problem was that an async flow graph is a tangle on its own (which naturally manifests in the form of callback hell tangles, promise-then tangles, etc.):

Look closely at this problematic graph, and you will notice that the tangle of “arrow” connections is not an inherent problem of this graph, but a phenomenon emerging from our representation of choice. In other words, it is not that the graph we are trying to represent is inherently super complicated; it’s just that the way we insist on representing it makes it complicated. We can simply throw out the requirement for the boxes to be drawn in a sequential manner, and rearrange the graph like this:

Simple, isn’t it? :)

Well, by this point you might be thinking: OK, that looks like a simpler representation of our asynchronous execution flow, but it was the result of throwing the “sequentiality” of our representation out of the window, which is not something you can do in practice, since, well, any code is a sequence of statements in the end. And you would be right.

And this is why it is a crazy idea: what if we didn’t code in the form of a sequence of statements, but in the form of a graph like this one? That is, what if we replaced sequential code with graphical code for inherently asynchronous logic?

Well, I asked myself that very question about a year ago, and since the inherent problems of asynchronous programming, at least the way we currently do it, were constantly bugging me and slowing me down on a daily basis, I started slowly developing a prototype in the late hours of maybe one or two nights each week. Over the course of a year, and with the help and encouragement of some friends, that prototype grew into CONNECT, a platform built precisely to solve the aforementioned problem, which basically looks like this:

Of course, CONNECT is not built as a platform for general-purpose asynchronous coding, but as one specifically targeting backend development, and even more specifically logical micro-services in the backend. Why backend? Because front-end async code is much lighter on asynchronicity, and also involves a lot of rendering logic, which makes it an essentially different problem. Why logical? Because, again, computational backend code is generally not that asynchronous, and treating it as such would actually slow you down further. And why micro-services? Well, basically because they are the future of backend development.

Additionally, while doing this we realized that backend logic is not purely asynchronous on all levels; more often it looks like synchronous blocks connected to each other in an asynchronous manner. Fortunately, we could easily factor this into CONNECT, as it was already built on top of NodeJS/ExpressJS and supported JavaScript expressions seamlessly within a graph. We simply allowed each node to become a complete synchronous block on its own.
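To give a rough feel for the idea in plain JavaScript, here is a toy sketch in which the flow is described as data (nodes plus their dependencies) instead of as a sequence of statements. To be clear, this is not CONNECT’s actual format or API: the `graph` shape, the `runGraph` function and the node bodies are all made up for illustration, and the runner assumes the graph has no cycles.

```javascript
// Hypothetical helper: resolves with `value` after `ms` milliseconds.
const delay = (value, ms = 20) => new Promise(resolve => setTimeout(() => resolve(value), ms));

// Each node lists the nodes it depends on and a `run` function that receives
// their outputs in the same order. A node may be async (a to d) or a plain
// synchronous block (e).
const graph = {
  a: { deps: [],              run: () => delay('a') },
  b: { deps: [],              run: () => delay('b') },
  c: { deps: ['a'],           run: a => delay(a + 'c') },
  d: { deps: ['b'],           run: b => delay(b + 'd') },
  e: { deps: ['c', 'a', 'b'], run: (c, a, b) => [c, a, b] },
};

// Runs every node as soon as all of its dependencies have produced a value.
// Assumes the graph is acyclic.
function runGraph(graph) {
  const started = {};
  const start = name => {
    if (!started[name]) {
      const { deps, run } = graph[name];
      started[name] = Promise.all(deps.map(start)).then(inputs => run(...inputs));
    }
    return started[name];
  };
  const names = Object.keys(graph);
  return Promise.all(names.map(start)).then(outputs =>
    Object.fromEntries(names.map((name, i) => [name, outputs[i]])));
}

runGraph(graph).then(outputs => console.log(outputs.e));
```

The order in which the nodes are written down no longer matters; only the `deps` arrows do. Unlike the earlier snippets, this toy runner also waits for node d before reporting the outputs.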


Concerns, Pros & Cons

Concern #1: Visual Programming is not programming

I do realize that programmers (including myself) are traditionally against “visual” programming, as it has almost always targeted “less-knowledgeable” people, and has generally been the result of trading “fine control” for “simplicity”, a trade-off that, to put it mildly, annoys any serious enough programmer.

However, that is not the case for CONNECT (or, more generally, for the graph-based approach to coding async flows), as it actually offers finer control than callback arguments or async/await statements, and at least as much control as Promises do. In fact, if anything, such solutions enable us to do much more with the asynchronous flows we develop. This is not giving away control, but simply dropping a constraint that is holding us back, even though that constraint has been a property of all the code we have written for a long time, and so we may be really used to it.

Concern #2: The application of such a tool would be limited

Well, since this approach does not take control away from developers and merely offers a better-suited representation, what you can build with such a tool (such as CONNECT) is not more limited. On the contrary, you become able to build much more complex async logic much more easily.

To actually verify that, we even built CONNECT’s Platform as a Service using CONNECT itself. Not only did it not impede us in any way, but we were able to finish it in a surprisingly short time span (4 days to a week, from the moment we registered the domain and decided on the names of the various micro-services involved until it was ready to launch CONNECT instances for any registered user, put them to sleep after minutes of inactivity, and wake them back up in response to requests).

Of course, it is worth noting that this is an approach to asynchronous programming. There are other contexts of programming that will definitely not benefit from it at all, just as they would not benefit from Promises or async/await statements.

Pro #1: Efficiency

Since this approach is designed to make the flow representation much more efficient and hurdle-free, it should naturally result in a lower cognitive load when reading or writing that representation, which in turn should result in much faster development (and hence much lower development costs). In other words, writing or reading this graph

would obviously be much easier and hence much less time consuming than this one:

Pro #2: Maintainability

This one is a natural result of the former. As it becomes much easier to read and follow the flow of the code, it also becomes much easier to debug it or make changes to it, which is one of the main hurdles of our current approaches to async programming.

In fact, the process becomes so seamless that we realized we could actually easily record the execution of graphs in CONNECT and replay them in real-time, not just helping greatly with testing and debugging, but also with performance inspection:

Pro #3: Performance

Since this approach to async programming actually enables much finer control over your asynchronous flow, it allows much better orchestration of tasks that can run in parallel to reduce response time. While you can do this with traditional approaches as well, managing any non-trivial asynchronous flow increases development and maintenance costs, so in many cases you would sacrifice some performance to keep those costs low. With this graph-based approach, however, you can delve into such optimizations without fear of increasing development or maintenance costs.

Con #1: Async Flows are inherently different

Since one of our first projects with CONNECT was a Platform as a Service, we quickly stumbled upon this one: now that the async flows we would usually write had become much easier to manage, we found ourselves quickly building much more complex flows and graphs (which we would definitely have avoided otherwise, as their text-based representations would have been too much to handle), and these were correspondingly challenging for us to understand and work with.

This also came up in the pretty early user feedback we collected. Most people are accustomed to thinking synchronously about their code and services, and for good reason: in a synchronous context, you know exactly which statement is being executed at each step, a degree of control that you naturally lose by going the async route. And while these kinds of challenges are usually hidden behind the headache of callback hell, once we resolved that headache they surfaced, and we quickly realized that more complex async graphs generally call for a different mental approach than sequential code does.

However, according to our admittedly pretty limited tests, it does not take very long for people to adopt the newly required cognitive approach, although there definitely is a learning curve involved.

Con #2: Tooling

For CONNECT, we tried to reuse as many of the tools and technologies already available to us as possible, and to add just the graph-based representation that we intended. It is based on NodeJS, which means all NPM packages (one of the largest package repositories out there) are available for it. We even created a human-friendly JSON-based format for the logical graphs, so that you can easily version-control them (for example using git).

Despite all this, one pretty early piece of feedback from one of our pilot users was a request to remove some of the graph data so that he could read the git diffs more easily. In this case, although the human-friendly format helps a lot, reading a graph from a JSON file is still about as efficient as forcing an asynchronous flow into a sequential format: it will inevitably be more challenging than usual. Although we are planning to add a lot of nice git-focused features (including pretty diffs) for CONNECT flow graphs, the fact remains that you would lose access to development tools that have been developed for years on the assumption that they aid with text-based code, and some of them would need to be developed from scratch for graph-based code.


Conclusion

Async programming is infested with code tangles and callback hell, simply because being async means having a non-linear, non-sequential execution flow, which is at odds with the sequential nature of text-based code. The solution suggested here is to go for graph-based code, in which you explicitly outline the execution flow, which may well run between blocks of synchronous code.

Do you think this radical approach would work? Well, there is no need to speculate, as we have built a platform that lets you test it out for yourself. Give it a try, and share your thoughts and feedback.