Javascript Transducers 2: Stateful & Gateful

Creating more complex mapFilters with composable reducers

Drew Tipson
9 min read · Feb 5, 2016

With the basics of transducers covered, it’s time to explore some of the neat things you can do with them.

There’s no need to be humble about it: we can now boil any ordered collection of data down into either a single value (or a smaller, filtered collection) or pipe each element through any complex composition chain of filters or transformations that we can think up. And we can do it by composing together appropriately tiny and independent parts!

In practice, what that means is that if you check out the lodash docs page, you can start to see that most of the methods listed under the Array and Collection sections could be implemented by using just our mapping or filtering transducers. That’s because nearly anything you can possibly do with an Array or Collection (deduping, partitioning, etc.) involves just combining some flavor of mapping and/or filtering combined with some basic core reduce operation (summing, concat-ing, flat-concat-ing, grouping things into an object, etc.)

To get a feel for how, let’s look at a particularly interesting subset: stateful transducers. Now, by “stateful” I don’t mean stateful in the sense that everyone is so hopped-up about in the context of a large javascript application. The states I mean here are contained entirely within the internal logic of a particular reduce operation.

Let’s say, for instance, that we’d like to filter out items in an Array not by testing their values, but instead simply by their positions in the list.

That is: we’d like to toss out the first 3 items.

Well, at least as we’ve designed them, none of our transducers have any idea what the index of the item they’re working on is. Nor do they have access to the full list. Some libraries do use reducing functions with 3 or 4 arguments: (acc, item, index, list) and we quite easily could build our reduce that way, but that’s too complex an architectural debate for our purposes here. And, as we’ll see, it’s not really necessary.

So how can we achieve an index-aware reducing function when it’s not being passed an index to work with? It’s not too hard: we can simply use a closure within our custom filtering function to count the number of items that have passed through it so far. There’s something very appropriate about that solution as well: the responsibility of keeping track of how many items have passed through a filter is placed squarely on the filtering component itself. It’s not baked into or shoehorned onto any other part of the process that doesn’t ordinarily care or need to know.

To achieve this effect, we can either just create a “dropGate” testing function with that “tally-counter” logic baked in and then pass it to filtering… or we can just write a new, dedicated transducer-producer (or, could we say: transformer?) for expressing the concept of dropping from scratch. Both amount to the same thing:
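Here's a sketch of both approaches, assuming the `filtering` transformer (and the curried, arrow-function style) from the previous article — the names `dropGate` and `dropping` are just what I'm calling things here:

```javascript
// the filtering transformer from the last article
const filtering = predicate => reducer => (acc, item) =>
  predicate(item) ? reducer(acc, item) : acc;

// Option 1: a stateful "dropGate" predicate handed to filtering
const dropGate = n => {
  let count = n; // the state: a tally-counter hidden in a closure
  return () => count-- <= 0; // false for the first n calls, true after
};
const droppingViaFilter = n => filtering(dropGate(n));

// Option 2: a dedicated transducer-producer with the countdown baked in
const dropping = n => {
  let count = n;
  return reducer => (acc, item) =>
    count-- > 0
      ? acc // "skip" = return the accumulator unaltered
      : reducer(acc, item);
};
```

Used with a concat-style reducer, `dropping(3)(concat)` ignores the first three items and then behaves exactly like plain `concat`.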

Pretty easy, right? The “state” here is simply a set number of “skips” that are re-used as a countdown, and “skipping” is simply returning the unaltered accumulator value. When the countdown hits zero, it stops ignoring items and starts allowing them to end up in the final result (that is, it starts actually running the subsequent reducing functions that define how each item will modify the accumulator). No wasted work.

What about the opposite of dropping: taking (that is, taking only the first N items out of a list)?

Here we have a bit of a problem. If we want to TAKE the first n items, that means that we shouldn’t be wasting any time running anything afterwards: not only the remaining reducing functions in each iteration, but the iteration process itself. We need to break out of reduce entirely.

So far, however, we’ve been leveraging the native Array.prototype.reduce method to do our reducing for us. And it simply doesn’t have a way to break out of a loop. Plus, we’ve been saying all along that anything Iterable should be reducible, in theory. Let’s make both wishes a reality by just building our own version of reduce.

How? You can probably guess the basic form already: in ES2015, Iterables can be looped over with for…of , so… we’ll simply be using that:
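A minimal sketch of such a reduce, which will happily consume any Iterable (Arrays, Strings, Maps, Sets, generator objects…):

```javascript
// A minimal reduce built on for...of, so it works on ANY Iterable,
// not just Arrays
const reduce = (reducer, seed, iterable) => {
  let acc = seed;
  for (const item of iterable) {
    acc = reducer(acc, item);
  }
  return acc;
};
```

For example, `reduce((acc, x) => acc + x, 0, new Set([1, 2, 3]))` works just as well as passing in a plain Array.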

Again, realllllly simple stuff! And that’s great. Better yet, for…of loops can be stopped simply by using the “break” keyword.

Unnnnnfortunately for us, what we actually want is for our reducing functions to ultimately be in control of if and when the loop stops. And the only thing that changes on each loop through is the output of the reducing function… which could be anything: an Array, an Object, or a simple primitive value. How on earth are we going to reliably pass a “break” signal through that sort of single-argument output channel when we need to pass through real data as well?

Now, I mentioned in the last article that there are some powerful, full-featured transducer libraries out there: ones that implement everything we’ve been talking about here, but in an even more robust fashion. How do they pull this off? By adhering to a particular transformer protocol that involves an entire extra layer of functional wrapping (as well as unwrapping) to pass around information about each step. This gives their transducers the capability to send a sort of “meta-data” along with values. And one of those “meta” signals is simply the signal to stop looping.

But we simply don’t have that luxury: if one of our reducing functions returns the number 5 to the simple reduce operation, there’s no flavor of “5” that can also encode a secret signal to stop. Are we out of luck?

Nope! But what we still need to solve this problem is a way to pass along information in a way such that it’s literally impossible for ANY possible calculation to come up with by accident.

It obviously can’t just be a particular number (0.53450006066), because then there would be a number, out of all possible numbers, that our reduce would break on unexpectedly and prematurely.

It can’t be a String for the same reason: no matter how long or randomly generated! Even complex types like Objects and Arrays (with the final value contained at a known index or key) wouldn’t really do by themselves: any set of Array items or object keys might someday, at some point, just happen to match the syntax we’ve defined as the stop command. It’s not likely… but ideally we’d make it impossible.

Well, as it happens, there’s now a native API for generating impossible values: ES2015's Symbols! Symbols are truly unique when generated, which means that they are equal only to themselves, ever. Unless you have a reference to one that’s been created, it’s flatly impossible for any computation to inadvertently generate THE one Symbol reference that we’d be using as a “break” signal.

Newly empowered, all we’ll need to do is figure out a way to give anyone writing a transducer a reference to what that stop signal will be. And since we’re having to write the reduce function ourselves such that it recognizes this stop/break Symbol, it probably makes sense for the function to just store the reference for that magic value on itself. And thus:

Now, when we want to send a final value and stop reducing all in one go, we can just send an object with a single Symbol key: [reduce.stopper]. The value of that key will be the final accumulated value. No reducing function could ever accidentally return a result that’s an object with that key on it. That particular key is only accessible by a very deliberate property lookup on the actual reduce function itself!

So with that system in place, we can now write a working transducer producer for “taking”:
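A sketch of that transducer-producer, assuming the `reduce`/`reduce.stopper` pair from above (repeated here so the snippet stands alone):

```javascript
// reduce with the Symbol-keyed escape hatch, as sketched above
const reduce = (reducer, seed, iterable) => {
  let acc = seed;
  for (const item of iterable) {
    acc = reducer(acc, item);
    if (acc !== null && typeof acc === 'object' && reduce.stopper in acc) {
      return acc[reduce.stopper];
    }
  }
  return acc;
};
reduce.stopper = Symbol('stop reducing');

// taking: let the first n items through, then send the stop signal
const taking = n => {
  let count = n;
  return reducer => (acc, item) =>
    count-- > 0
      ? reducer(acc, item) // still taking: run the rest of the chain
      : {[reduce.stopper]: acc}; // done: wrap the final acc & signal stop
};
```

So `reduce(taking(2)(concat), [], [1, 2, 3, 4])` yields `[1, 2]`: the 3 is only visited long enough to trigger the stop, and the 4 is never visited at all.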

Again, we’re using a simple countdown: if we’ve taken fewer than the allowed number of items, we keep going; if not, we return the final result wrapped in a reduce.stopper object. That way the final state of the accumulator can pass through to the final output, but nothing else will be attempted.

It might seem weird that I’m calling all these transducer-y things “X-ing,” but there is a logic to it. Imagine that we want to combine our dropping and taking filters:
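With a standard right-to-left `compose` helper (and the pieces sketched earlier, repeated so this runs on its own), that might read:

```javascript
const compose = (...fns) => x => fns.reduceRight((acc, fn) => fn(acc), x);
const concat = (acc, item) => acc.concat([item]);

// reduce with the stop-signal escape hatch, as sketched earlier
const reduce = (reducer, seed, iterable) => {
  let acc = seed;
  for (const item of iterable) {
    acc = reducer(acc, item);
    if (acc !== null && typeof acc === 'object' && reduce.stopper in acc) {
      return acc[reduce.stopper];
    }
  }
  return acc;
};
reduce.stopper = Symbol('stop reducing');

const dropping = n => {
  let count = n;
  return reducer => (acc, item) => (count-- > 0 ? acc : reducer(acc, item));
};
const taking = n => {
  let count = n;
  return reducer => (acc, item) =>
    count-- > 0 ? reducer(acc, item) : {[reduce.stopper]: acc};
};

// "reduce by concat-ing the items of this array, dropping 2, then taking 3"
const result = reduce(
  compose(dropping(2), taking(3))(concat),
  [],
  [1, 2, 3, 4, 5, 6, 7, 8]
);
// result is [3, 4, 5]: the two dropped items never reach taking's countdown
```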

Right? It’s almost a sentence: “reduce by concat-ing the items of this array, dropping 2 items, then taking 3.”

There’s also something to herald here about the fact that our positional filters keep track of things themselves: because of our short-circuiting, when the dropping gate is closed, we don’t even run taking… so the drops don’t count towards the number of items that taking will take. If we’d defined dropping and taking each purely by the objective, original array index, instead of by the relative, “view from the reducer” point of view… well then we could easily end up saying things like “drop the first 2 items and keep the first 3 items” which wouldn’t make a lot of sense. And we want to build things and name things in ways that make it easy for coders to make sense!

One final caveat on the way we’re managing “state” here: the simple “countdown” operation we’re employing has a single flaw: re-use. That is, once the dropping/taking operation is defined (that is, passed its allowed number of skips), it can only be used once. That means that doing var drop3 = compose(dropping(3),mapping(x=>x+1)) and then running reduce(drop3(concat),[],[1,2,3,4]) will work as expected the first time, but then give strange results the second time it’s used because the countdown within the closure will have already “completed.” There are many ways to deal with this, or to code dropping/taking to account for it. One way is to store and reset the count at the point in which the reducing function is passed through:
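One such fix, sketched: create the countdown inside the stage where the reducing function is passed in, rather than when n is supplied:

```javascript
// dropping, revised: the countdown is created (and thus reset) each time
// a reducing function is passed through, not once when n is supplied
const dropping = n => reducer => {
  let count = n; // fresh state per reducer
  return (acc, item) => (count-- > 0 ? acc : reducer(acc, item));
};
```

Now `dropping(3)` can be stored and applied to reducers repeatedly; each application gets its own independent countdown.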

But of course, if you still wanted to create and store a complete reducing function like const drop2_concat = dropping(2)(concat) you’ll again end up with weird one-time behavior if you tried to use reduce(drop2_concat, [], [1,2,3,4]) repeatedly. You’ll have to decide how complex to make things to deal with the particular use cases you expect. IMO, it’s always best to store a composed transducer and run a core reducing function through it at runtime so that it’s clear what sort of operation is intended, but ymmv. Transducing libraries using the transformer protocol deal with this via defining a special “init” meta-step. We could do something similar by repeating our reduce.stopper trick, but let’s move on for now.

Anyhow, now that we’ve mastered dropping and taking, let’s think about making this stateful “gate” concept even more complex. Let’s say that we want to have TWO functions working our filtering gate: one that determines whether to “open” the filter gate, and another that determines whether to “close” it again. Easy enough:
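Here's one way that might look (a sketch: whether the opening and closing items themselves pass through is a design choice — here the opener is included and the closer is not):

```javascript
// gating: `opens` flips the gate on, `closes` flips it off again
const gating = (opens, closes) => reducer => {
  let isOpen = false; // the gate's state
  return (acc, item) => {
    if (!isOpen && opens(item)) isOpen = true; // opener passes through
    else if (isOpen && closes(item)) isOpen = false; // closer does not
    return isOpen ? reducer(acc, item) : acc;
  };
};
```

For instance, `[1, 7, 3, 1, 6, 2].reduce(gating(x => x > 5, x => x < 2)(concat), [])` opens on the 7, closes on the second 1, reopens on the 6, and so yields `[7, 3, 6, 2]`.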

More complex than that? Ok: let’s consider a case where the “list” we’re working on is actually a sort of stream of data that’s somewhat noisy: maybe it represents frames from a security camera where each frame has a measure of “motion” detected. When the motion detection measure goes above a certain level, we want to start including all frames in the final output.

But because we want smooth streams of images we don’t want to just throw out any interim frame that has little to no motion detected: we want to stop “listening” to the frames only after the motion value has dropped below the threshold and stayed there for a particular number of frames. What we really want in this case is for our filtering transducer to be able to recognize patterns in our noisy data. Here’s one such pattern:

The data we’d be working on would probably be a Collection (an Array of objects) that might look something like this:

[{frame: FRAMEDATA, motion: 1},{frame: FRAMEDATA, motion: 3},{frame: FRAMEDATA, motion: 4}, … etc]

That means that the specifics of all our various reducing functions would have to handle extracting particular bits of each item (in this case, the motion property). But, just to keep things short & simple so we can see how this sort of filter works, let’s work over an Array of JUST the motion values:

[1,3,4,2,7,8,9,10,2,2,1,1,0,0,0,0,0,7,3,4,2,2,1,6]

Now let’s try to define the pattern we want as “when the motion value goes above 2, include all frames in the result until they drop to 2 or below for more than 3 frames.”
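A sketch of one such `noiseFiltering` transformer — note that the interim low-motion frames stay in the output right up until the gate actually closes, which is exactly the smoothing we wanted:

```javascript
// noiseFiltering(test, allowedFails): the gate opens whenever `test`
// passes, and only closes again once `test` has failed on more than
// `allowedFails` consecutive items
const noiseFiltering = (test, allowedFails) => reducer => {
  let isOpen = false;
  let fails = 0; // consecutive failures since the last passing item
  return (acc, item) => {
    if (test(item)) {
      isOpen = true;
      fails = 0;
    } else if (isOpen && ++fails > allowedFails) {
      isOpen = false;
    }
    return isOpen ? reducer(acc, item) : acc;
  };
};

const concat = (acc, item) => acc.concat([item]);
const frames = [1,3,4,2,7,8,9,10,2,2,1,1,0,0,0,0,0,7,3,4,2,2,1,6];
const smoothed = frames.reduce(noiseFiltering(x => x > 2, 3)(concat), []);
// smoothed is [3,4,2,7,8,9,10,2,2,1,7,3,4,2,2,1,6]: the leading 1, the
// fourth consecutive low frame, and the run of 0s are all filtered out
```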

Aside: it’s worth noting that when used, noiseFiltering has a very confusing signature here: it takes a filter function (for testing a value) and then also an integer value (for number of fails allowed before the filter switches off). If it were curried, however, we could have named each step intelligibly if we wanted to: another potential advantage of curried implementations.

Right? Anything over a 2 starts allowing frames through into the final output, but only consistently staying at 2 or below will close the gate again. There are all sorts of smoothing algorithms we could define to get the output we want (for instance, requiring a certain number of high-values before opening the gate as well). All we have to do is keep track of just the bits of reducing history that matter to recognizing the pattern: no more, no less.

The closing point here is simply that when we can easily separate out the pattern definition into its own simple, single-purpose component, we can reason more clearly about exactly what it is and should do.

And, hopefully, after all this rambling, you’ll agree that transducers are way easier to understand and implement than you probably imagined, and that even our very parsimonious implementation proves its worth in terms of both efficient code (fewer loops!) and intelligibility (simpler, purer components).

