Callbacks vs Coroutines

A look at callbacks vs generators vs coroutines

TJ Holowaychuk

--

There’s been a lot of arguing lately regarding a somewhat recent Google V8 patch providing ES6 generators, sparked by “A Study on Solving Callbacks with JavaScript Generators”. While generators still sit behind the --harmony or --harmony-generators flags, it’s enough to get your feet wet! In this post I want to go through my experiences with coroutines and why I personally think they’re a great tool.

Callbacks vs generators

Before getting into the difference between generators and coroutines, let’s look at what makes generators useful in environments like Node.js or the browser, where callbacks dominate the ecosystem.

First off, generators are complementary to callbacks: some form of callback is still required to “feed” the generator. These “futures”, “thunks”, or “promises” (whatever you prefer to call them) allow deferred execution of some logic; this is what you yield, letting the caller do the work and resume the generator with the result.

Once one of these values is yielded to the caller, the caller waits around for the callback and then resumes the generator. Depending on how you look at it, generators used this way are effectively the same mechanism as a callback, but with some added benefits that we’ll look at soon.

If you’re still not quite sure how generators come into the picture, here’s a simple implementation of a flow control library built on top of this facility:

var fs = require('fs');

function thread(fn) {
  var gen = fn();

  function next(err, res) {
    // resume the generator with the callback's result
    var ret = gen.next(res);
    if (ret.done) return;
    // ret.value is a thunk; invoke it with next as its callback
    ret.value(next);
  }

  next();
}

thread(function *(){
  var a = yield read('Readme.md');
  var b = yield read('package.json');
  console.log(a);
  console.log(b);
});

// wrap fs.readFile into a thunk: a function expecting only a callback
function read(path) {
  return function(done){
    fs.readFile(path, 'utf8', done);
  }
}

Why coroutines make code more robust

Contrasting the typical browser or Node.js environment, coroutines run each “light-weight thread” with its own stack. The implementations of these threads vary, but typically they have a relatively small initial stack size (~4kb), growing when required.

Why is this so great? Error handling! If you’ve worked with Node.js, where exceptions are quite prevalent even compared to the browser, you’ll know that error handling is no simple feat. Sometimes you get multiple callback invocations with undefined side-effects, or a callback is forgotten altogether and the exception is never properly handled or reported. Or perhaps you forgot to listen for an “error” event, in which case it becomes an uncaught exception and brings down the process.
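Here’s a minimal sketch of that last case (a bare EventEmitter stands in for any stream or request object):

var EventEmitter = require('events').EventEmitter;

var stream = new EventEmitter();

// with no "error" listener registered, emitting "error" throws,
// becoming an uncaught exception that brings down the process
stream.emit('error', new Error('boom'));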

Some people are happy with this process and that’s fine, but as someone who has used Node.js since its infancy, it’s my opinion that much more can be done to improve this flow. Node.js is great in so many other ways, but this is its Achilles heel.

Let’s take a simple example of reading and writing to and from the same file with callbacks:

function read(path, fn) {
  fs.readFile(path, 'utf8', fn);
}

function write(path, str, fn) {
  fs.writeFile(path, str, fn);
}

function readAndWrite(fn) {
  read('Readme.md', function(err, str){
    if (err) return fn(err);
    str = str.replace('Something', 'Else');
    write('Readme.md', str, fn);
  });
}

You might think this isn’t so bad; you see code like this all the time! Well, it’s broken :) Why? Because most core node functions, and most third-party libraries in the wild, do not try/catch their callback invocations.

The following code will throw an uncaught exception, and there is no way to catch it. Even if core were to delegate this exception back to the callback, that would be error-prone, as multiple callback invocations have undefined behaviours.

function readAndWrite(fn) {
  read('Readme.md', function(err, str){
    // this throws before fn is ever invoked, and nothing catches it
    throw new Error('oh no, reference error etc');
    if (err) return fn(err);
    str = str.replace('Something', 'Else');
    write('Readme.md', str, fn);
  });
}

So how do generators improve this? The following snippet introduces the same logic using generators and the Co library. You might think “this is just stupid syntax sugar”, but you would be incorrect. Since we pass the generator to the `co()` function, and all yields delegate to the caller, extremely robust error handling can be delegated to Co.

co(function *(){
  var str = yield read('Readme.md')
  str = str.replace('Something', 'Else')
  yield write('Readme.md', str)
})

Libraries like Co can “throw” exceptions back to their origin as shown below, meaning you can use try/catch as the language intended, or leave them out and utilize the final Co callback to handle the error.

co(function *(){
  try {
    var str = yield read('Readme.md')
  } catch (err) {
    // whatever
  }
  str = str.replace('Something', 'Else')
  yield write('Readme.md', str)
})

At the time of writing Co seems to be the only such library that implements robust error handling, but if you look at the Co source you’ll notice all of the try/catch blocks. If you do not use generators, you effectively have to inline all of these try/catch blocks into every library you’ve ever written to make it truly robust. This is what makes writing robust code with Node.js almost impossible as it is today.
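To get a feel for the mechanism, here’s a minimal sketch of such a runner (not Co’s actual source), extending the thread() example from earlier with gen.throw() so errors surface inside the generator:

function run(fn) {
  var gen = fn();

  function next(err, res) {
    var ret;
    try {
      // gen.throw() re-raises the error inside the generator,
      // right at the yield expression that produced it
      ret = err ? gen.throw(err) : gen.next(res);
    } catch (e) {
      // the generator didn't catch it; report it here instead
      console.error(e);
      return;
    }
    if (ret.done) return;
    ret.value(next);
  }

  next();
}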

Generators vs coroutines

Generators are sometimes referred to as “semicoroutines”, a more limited form of coroutine that may only yield to its caller. This makes the use of generators more explicit than coroutines, as only yielded values may suspend the “thread”.
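A short example of that restriction: execution can only pause at an explicit yield, and only the caller may resume it.

function *counter() {
  var x = yield 1; // suspends here until the caller resumes us
  console.log(x);
}

var gen = counter();
console.log(gen.next());  // { value: 1, done: false }
gen.next(2);              // logs 2, and the generator completes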

Coroutines are more flexible in this respect, and look just like regular blocking code, as no yield is required:

var str = read('Readme.md')
str = str.replace('Something', 'Else')
write('Readme.md', str)
console.log('all done!')

Some people view full coroutines as “dangerous”, as it’s unclear which function does or does not suspend the thread. Personally I find this argument silly: most functions that suspend are pretty obvious. Things like reading and writing files or sockets, HTTP requests, sleeps, and so on will not surprise anyone when they suspend execution of that thread.

If this is undesirable, you can “fork” off and force the task to become asynchronous, much like you would in Go.

In my opinion generators are potentially more dangerous than coroutines (while still less so than callbacks), as simply forgetting a yield expression could leave you puzzled, or cause undefined behaviour when it executes after the rest of your code, as the sketch below shows. Either way, both semi- and full coroutines have different pros and cons; I’m happy that we at least have one.
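For example, reusing the read() thunk from earlier, a sketch of the forgotten yield:

co(function *(){
  // oops, no yield: str is the thunk itself, not the file
  // contents, so .replace() throws a TypeError
  var str = read('Readme.md')
  str = str.replace('Something', 'Else')
})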

Let’s look at how you can employ this new construct using generators.

Simplifying async flow control with coroutines

You’ve already seen how a simple read / write operation looks more elegant than the callback variant, but let’s look at some more patterns.

Assuming all operations execute sequentially by default simplifies the mental model, and while some people claim generators or coroutines complicate state, this is misinformation: working with state is identical to working with callbacks. Global variables remain global, local variables remain local, and closures remain closures.
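A quick sketch to demonstrate: a local variable accumulating across yields behaves exactly as it would across nested callbacks.

co(function *(){
  var total = 0 // plain local, scoped to this function as usual
  total += (yield read('Readme.md')).length
  total += (yield read('package.json')).length
  console.log(total)
})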

To illustrate flow, let’s say as an example you wanted to request a web page’s HTML, parse the links, then request all of those bodies in parallel and output their content-types.

Here’s how that might look using regular callbacks, without the use of any third-party flow control library.

function showTypes(fn) {
  get('http://cloudup.com', function(err, res){
    if (err) return fn(err);
    var done;
    var urls = links(res.text);
    var pending = urls.length;
    var results = new Array(pending);
    urls.forEach(function(url, i){
      get(url, function(err, res){
        // guard against invoking fn more than once
        if (done) return;
        if (err) return done = true, fn(err);
        results[i] = res.header['content-type'];
        --pending || fn(null, results);
      });
    });
  });
}

showTypes(function(err, types){
  if (err) throw err;
  console.log(types);
});

Such a simple task quickly loses all meaning with callbacks. After adding in error delegation, double-callback prevention, storing of the results, and the callbacks themselves, you really don’t even get an idea of what the intent of this function is. Oh, and if you wanted to make this more robust you’d have to try/catch the final fn(null, results) invocation as well, something like the sketch below.
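That final guard, inlined where fn(null, results) is invoked, might look like this:

// wrap the final invocation so a throwing callback is still reported
try {
  fn(null, results);
} catch (err) {
  // fn itself threw; delegating back to fn is itself error-prone,
  // which is exactly the problem
  fn(err);
}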

Now here is the same showTypes() function implemented with generators. As you can see the resulting signature is identical to that of the callback implementation, so both of these concepts may co-exist. In this example Co handles all of the mundane error handling and result stuffing that we had to do manually above. The array of thunks yielded via links(res.text).map(get) is executed in parallel, yet the responses retain their order.

function header(field) {
  return function(res){
    return res.headers[field]
  }
}

function showTypes(fn) {
  co(function *(){
    var res = yield get('http://cloudup.com')
    var responses = yield links(res.text).map(get)
    return responses.map(header('content-type'))
  })(fn)
}
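Usage is unchanged from the callback version:

showTypes(function(err, types){
  if (err) throw err;
  console.log(types);
});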

I’m not suggesting every npm module should start using generators and forcing dependencies like Co on people; I would still suggest the opposite. But at the application level I highly recommend it.

I hope this helps illustrate that coroutines are a powerful tool in the non-blocking world of programming.
