Some observations about Clojure core.async

© Erik Johansson: http://www.erikjohanssonphoto.com/#/dreamwalking/

I’ve been using core.async for over three years now, and I keep finding better ways to use it, along with a few challenges.

Challenges

When things go well with core.async, they go very well. However, when things go wrong …

First off, when a failure occurs in a CSP (communicating sequential process: a go or thread block), you’ll see an uncaught exception whose stack trace often includes only a single frame attributed to your code. So much is happening around your code, and in properly written CSPs, very little inside them.

Worse, a failure often leads to a zombie channel: not closed, but never to be written to again. Other CSPs that are parked or blocked on that channel simply never resume.
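A minimal sketch of how a zombie channel arises. The names load-value, out, and result are hypothetical, and the timeout exists only so the example finishes instead of hanging:

```clojure
(require '[clojure.core.async :as a])

;; Hypothetical step that fails, standing in for real work inside a CSP.
(defn load-value []
  (throw (ex-info "simulated failure" {})))

(def out (a/chan))

;; The producer dies before it can put to out or close it. The exception
;; surfaces on a core.async pool thread, far away from any consumer.
(a/go (a/>! out (load-value)))

;; A consumer that guards its take with a timeout. A plain (a/<!! out)
;; here would block forever, because out is still open but abandoned.
(def result
  (a/alt!!
    out ([v] [:value v])
    (a/timeout 200) :timed-out))
```

Here result ends up as :timed-out: nothing is ever put on out, and nothing closes it.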

Clojure already has a challenge with respect to lazy evaluation: what code gets executed when is often all but impossible to predict, and is guided by what data is being referenced; the consumers of lazy sequences are setting the agenda here, not the creators of those lazy sequences. From the point of view of a typical, strict language, it can appear as if cause and effect are out of sync.

Adding CSPs and channels to this mix is throwing a bit of gasoline on the fire. With so much going on at any one time, all of it driven by specific data that never appears in a stack trace, it becomes anyone’s guess what the cause of a failure really is.

Fortunately, in my experience, these things get resolved very early in the development process. You go directly from a frustrating “it’s just hung” situation, to one where all the CSPs just line up and quietly get everything done.

Document Your Channels

You really want to document what information passes through a channel. Even more than in traditional Clojure, it can be a challenge to quickly scan CSP code to see what the return value truly represents.

I’ve adopted the verb conveys for this purpose. E.g., “returns a channel that conveys a map of customer data” or “returns a channel that conveys a series of numeric data points.”

You especially want to be careful to distinguish between a channel that conveys a series of individual values, and a channel that conveys a single collection of values. Note the careful use of the term series rather than the somewhat ambiguous sequence.
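To make the distinction concrete, here is a small sketch with two hypothetical functions, points-series and points-collection. It assumes a core.async release recent enough to include to-chan! (older releases call it to-chan):

```clojure
(require '[clojure.core.async :as a])

(defn points-series
  "Returns a channel that conveys a series of numeric data points."
  []
  (a/to-chan! [1 2 3]))   ; three separate values, then the channel closes

(defn points-collection
  "Returns a channel that conveys a single vector of numeric data points."
  []
  (a/go [1 2 3]))         ; one value: the whole vector

(def first-from-series (a/<!! (points-series)))
(def first-from-collection (a/<!! (points-collection)))
```

The first take from the series yields a single number, while the first take from the other channel yields the whole vector. The signatures are otherwise identical, which is why the docstring wording matters.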

Put Only Values

Here’s a general guideline: only pass values through a channel. Literally, only things that can go through an EDN print-and-parse cycle unchanged. Passing functions, channels, or mutable values such as Vars, Atoms, or Java objects (though sometimes unavoidable) will come back to haunt you.
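One way to check the guideline during development is a round-trip predicate. This is only a sketch, and edn-round-trips? is a hypothetical helper name, not part of any library:

```clojure
(require '[clojure.edn :as edn])

(defn edn-round-trips?
  "True when x survives being printed and parsed back as EDN unchanged."
  [x]
  (try
    (= x (edn/read-string (pr-str x)))
    ;; Functions, atoms, and most Java objects print as #object[...],
    ;; which the EDN reader rejects.
    (catch Exception _ false)))

(def value-ok? (edn-round-trips? {:id 42 :tags ["a" "b"]}))
(def fn-ok?    (edn-round-trips? inc))
(def atom-ok?  (edn-round-trips? (atom 1)))
```

Plain maps, vectors, numbers, strings, and keywords pass; the function and the atom do not.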

This is essentially about containing mutability: a channel is already a special kind of mutable container, and it is never desirable to “nest” mutability in Clojure.

More practically, you may want to log, save, or queue the contents of a channel. Values are loggable (using pretty-printing), can be streamed to a file (as EDN), or even passed through an external message broker (again, as EDN).

Functions can be a challenge here: it may be desirable to pass a function through a channel, perhaps as a callback to some other code that may not be built in terms of core.async. Certainly, a Clojure function does not print well, and will not survive a print-and-parse cycle. Try to keep these kinds of uses internal to a single CSP or subsystem, rather than passing functions between them.

Use Looping State

Often a CSP needs to keep some kind of internal state. Here’s a CSP that copies values from an input channel to an output channel, keeping count of the number of values transferred:
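A minimal sketch of such a CSP, assuming the count is held in an atom. The names copy-counting-with-atom, in, and out are illustrative, and closing out once in is exhausted is an assumption:

```clojure
(require '[clojure.core.async :as a])

(defn copy-counting-with-atom
  "Copies values from in to out. Returns a channel that conveys the
  number of values copied."
  [in out]
  (let [n (atom 0)]              ; mutable counter living outside the loop
    (a/go-loop []
      (if-some [v (a/<! in)]
        (do
          (a/>! out v)
          (swap! n inc)
          (recur))
        (do
          (a/close! out)         ; in is closed and drained
          @n)))))
```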

The CSP returns a channel that conveys the count of copied values.

This can be implemented more concisely, clearly, and efficiently as:
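A sketch of the looping-state version, under the same assumptions (illustrative names; out is closed once in is exhausted). The count travels through recur as a loop binding, so no atom is needed:

```clojure
(require '[clojure.core.async :as a])

(defn copy-counting
  "Copies values from in to out. Returns a channel that conveys the
  number of values copied."
  [in out]
  (a/go-loop [n 0]               ; the count is ordinary loop state
    (if-some [v (a/<! in)]
      (do
        (a/>! out v)
        (recur (inc n)))
      (do
        (a/close! out)
        n))))
```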

Let Consumers Define Channels

I’ve often been in a situation where a CSP operates on a feed of values from one or more channels, merges or otherwise manipulates those values, and puts them onto a new channel.

Too often, I’ve defined a function for this purpose following this outline:
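A sketch of that outline, with hypothetical names: customer-feed takes a channel of customer ids and a lookup function, assumed to return a non-nil map for every id:

```clojure
(require '[clojure.core.async :as a])

(defn customer-feed
  "Returns a channel that conveys a series of customer maps."
  [ids-ch lookup]
  (let [out (a/chan)]            ; buffering is decided here, by the producer
    (a/go-loop []
      (if-some [id (a/<! ids-ch)]
        (do
          (a/>! out (lookup id))
          (recur))
        (a/close! out)))
    out))                        ; the go-loop's own channel is discarded
```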

The problem with this code is that the output channel is defined by the function. This prevents the consumer, the CSP that takes values from the channel, from setting a buffer size, or even using a lossy channel. Sure, the returned channel could be copied into such a channel, but that’s more code and more noise.

Instead, the function should be written to accept a consumer-defined output channel:
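A sketch of the consumer-defined variant, again with hypothetical names (customer-feed-into, ids-ch, lookup) and the assumption that lookup never returns nil:

```clojure
(require '[clojure.core.async :as a])

(defn customer-feed-into
  "Puts a series of customer maps onto out, closing it after the last one.
  Returns a channel that conveys nothing and closes when the feed is done."
  [ids-ch lookup out]
  (a/go-loop []
    (if-some [id (a/<! ids-ch)]
      (do
        (a/>! out (lookup id))
        (recur))
      (a/close! out))))

;; The consumer picks the channel: here a lossy one that keeps only
;; the 10 most recent customers.
(def customers (a/chan (a/sliding-buffer 10)))
(def done (customer-feed-into (a/to-chan! [1 2 3])
                              (fn [id] {:id id})
                              customers))
```

A take from done returns nil once the last customer has been put and customers has been closed.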

Again, clearer, simpler code. The consumer is in control of the size and characteristics of the channel that conveys the customer data.

A further side benefit is that the channel returned from the go-loop is no longer lost. This channel doesn’t convey a value but is closed after the last customer has been conveyed … and that can be useful information as well.

Closing Thoughts

There’s something almost magical about core.async; I’m somewhat infatuated with the idea of all those CPU cores firing up with useful work at all times, no thread wasted, and throughput effectively self-balancing. Likewise, channels solve some otherwise difficult issues with mutable state and in-process thread coordination. When core.async works, it is wonderful.

But, as with any powerful technology, you have to be careful to not go off the rails. The above recommendations may save you some trouble, and I’m sure I’ll continue to collect and share more tips and tricks.