Wary Functor & the Freedom From Failure

Mischief Managed

Drew Tipson
13 min read · Jun 16, 2016

It’s time to tackle a functional fundamental: Functors.

Back when we were making some Monads, I glibly asserted that Arrays are Functors, and told you not to worry about it. Well, it’s time to worry: why do we know that Arrays are Functors? Because they’re a “Type” that implements a legitimate .map() method of course!

Ok: that isn’t a particularly helpful description, especially in JavaScript. After all, anyone can create a method on any dang object and just name it “.map” regardless of what crazy thing it does. Likewise, anyone can define a perfectly legitimate Functor method and name it something wacky like “.then()” instead of .map(). So the literal string property name of the method is not actually important. What is?

We have to get specific: what makes Array.prototype.map a legit Functor implementation, and thus makes Arrays legitimate Functors, is that they satisfy 2 simple laws:

1. Something.map( x => x ) == the exact same Something

2. Something.map(f).map(g) == Something.map( x => g(f(x)) )

If that syntax is confusing, just think of “Something” as an Array that only contains a single item (though, remember: we’re trying to define something that’s much larger and more generalized than Arrays).
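To make those two laws concrete, here is a small check against plain Arrays. It’s only a sketch: sameItems is a helper I’ve made up for illustration, and “the exact same” here means “same contents in the same order,” since .map() always hands back a fresh Array.

```javascript
// Hypothetical helper (not a built-in): shallow, item-by-item comparison.
const sameItems = (a, b) =>
  a.length === b.length && a.every((item, i) => item === b[i]);

const something = [1, 2, 3];
const f = x => x + 1;
const g = x => x * 2;

// Law 1 (identity): mapping x => x changes nothing.
const law1 = sameItems(something.map(x => x), something);

// Law 2 (composition): two maps equal one map of the composed function.
const law2 = sameItems(
  something.map(f).map(g),
  something.map(x => g(f(x)))
);

console.log(law1, law2); // true true
```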

So what do these laws mean? Let’s delve into the first one:

Something.map( x => x ) == the exact same Something

Here, the bog-standard Identity function, x=>x (also known as the I Combinator), doesn’t actually do anything interesting. And that’s sort of the point. It just takes the value inside our Something and returns it back, totally unaltered, still inside a Something. You know the saying “garbage in, garbage out”? The I Combinator just generalizes that idea: X in, X out.

And yet, by doing so, by doing nothing of note, it immediately establishes that a .map() operation, no matter what other nutty thing it might do or not do with its callback function in the meantime, MUST ultimately return the same Type (well, the same Type, and in the case of Identity here, even the same inner value, unaltered).

Likewise, if nothing ever happens to the inner x value when you map using the I “echo that back” Combinator, that also establishes that .map() itself isn’t really permitted to do anything transformative or weird with the value inside of the Functor: all the value-transformation-y bits have to be done by the particular callback function that map is passed.

So what’s a version of map that can pass the 1st law? Simple: it’s just a matter of applying the function to Something’s value and making sure it returns the Something Type:
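Here is a minimal sketch of what that could look like. The factory name Something and the value property are just my illustrative choices, not any official spelling:

```javascript
// A minimal "Something": a wrapper around exactly one value.
// The names Something and `value` are illustrative assumptions.
const Something = value => ({
  value,
  // Apply fn to the inner value, then re-wrap the result in a Something.
  map: fn => Something(fn(value)),
});

const original = Something(5);
const mapped = original.map(x => x);

console.log(mapped.value === original.value); // true
console.log(typeof mapped.map);               // "function"
```

Mapping the identity function hands back a new wrapper of the same kind holding the same inner value, which is all the first law asks for.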

So, there we have it, the first Functor law. Stupid simple. Thing is, though, you only really learn the value of laws when you try to break them, so let’s try to do that now! We already know what a “right” implementation looks like, so let’s try to create implementations that will…

  1. trivially obey the 1st law, but
  2. not mirror that super-basic implementation, but instead do something obviously crazy
  3. prevent .map() itself from being the cause of any TypeErrors: it’s fine if the execution of the function itself causes them: that’s its problem. But .map()’s implementation shouldn’t, in any way, rely on or be vulnerable to the type of value it’s working with (implying a principle called “parametric polymorphism”)

Ug. It’s hard being a reprobate. Can we ever beat the system? Is there really only one right way to implement a Functor? …well wait, I’ve just thought of one totally devious trick:
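Something along these lines, sketched with the same made-up Something wrapper (again, the names are assumptions for illustration):

```javascript
// A deliberately silly "Something" whose map never touches fn.
const Something = value => ({
  value,
  map: fn => Something(value), // fn is accepted, then completely ignored
});

const result = Something(5).map(x => x);
const alsoResult = Something(5).map(x => x * 1000);

// Even a callback that would blow up is harmless, because it never runs.
const safe = Something(null).map(x => x.does.not.exist);

console.log(result.value, alsoResult.value, safe.value); // 5 5 null
```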

Right? With something that silly, using x=>x as the function would only trivially return the same Type and value because every supplied callback function …every imaginable function would do so. After all, the dang function isn’t executed or even mentioned at all!

This is obviously a huge problem, right? Perhaps the Second Law will remedy this absurd loophole we’ve discovered?

Nope! In fact, there is a completely legitimate Functor that works exactly like this: Constant (let’s call it Const for short):
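A minimal sketch of Const, assuming the same factory-function style (the value property is only there so we can peek inside afterwards):

```javascript
// Const: holds a value and refuses to let map change it.
const Const = value => ({
  value,
  map: fn => Const(value), // always a fresh Const around the same value
});

const shout = s => s.toUpperCase();
const exclaim = s => s + '!';

const chained = Const('astronaut').map(shout).map(exclaim);
const composed = Const('astronaut').map(x => exclaim(shout(x)));

console.log(chained.value);  // "astronaut"
console.log(composed.value); // "astronaut"
```

Both sides of the second law end up as a Const around the untouched value, so the law holds, trivially but honestly.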

What’s the point of having a Functor with a .map() method that doesn’t actually do anything other than asexually clone itself, impervious to any attempts to alter its contents?

Well, if you worked through the Lenses article, you’ll see exactly why it can be useful: it’s a fantastic way to conditionally wrap particular values in impenetrable armor when passing them into complex higher-order functions. If that higher-order operation takes a Functor and then works by delegating to the .map() method of whatever Functor it’s passed… Const can neatly bypass whatever else it does without actually breaking anything. Nothing happens and we just get the Const Functor back. Like astronauts sent into space inside a well-shielded capsule, we can open up the capsule when they return and be 100% assured of finding them unharmed.

If, however, we want to pass a value into that very same operation and have the .map() operation actually do something, we could instead wrap the value in an Identity Functor (a rather different thing from the Identity Combinator that we used to test map). The Identity Functor is a lot like an Array, except that it can only contain a single value. It basically reproduces the exact implementation of Something.map() we spelled out originally:
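As a rough sketch (same assumed factory style, same illustrative value property):

```javascript
// Identity: holds one value and lets map transform it.
const Identity = value => ({
  value,
  map: fn => Identity(fn(value)), // run fn, then re-wrap the result
});

const exposed = Identity('astronaut').map(s => s.toUpperCase());

console.log(exposed.value); // "ASTRONAUT"
```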

Now empowered with our badass Identity Functor wrapper (simplified down to just what we need for example purposes), we can send our astronauts into space totally unshielded, deliberately exposing them to radiation and, ug, this metaphor got away from me. Awful. Condolences.

Anyhow, the point is: having these two different flavors of Functors at our disposal was exactly the trick that made Lenses capable of shepherding primitive values through a delegated mapping operation (primitive values, remember, can’t sensibly implement methods, let alone a .map() method, so they had to get wrapped in something that could).

When we wanted the Lens “getter” (retrieval) operation to return something that would ignore the Lens “setter” (mutation) operation, we wrapped a value inside the Const Functor. And when we wanted the result of the Lens getter to be exposed for mutation, we just wrapped it inside the Identity Functor. Once everything was done, we just unwrapped the values again. It’s conditional Functor wrapping as a logic gate. Ingenious.

Stepping back from Lenses specifically, this also highlights what’s so useful about having all Functors act the same in certain basic ways: it means that we can create generic operations that implement a delegated .map() operation without those operations having to know anything about how any particular Functor-flavor of .map() actually works. As long as we know that it will always return something of not only the same basic outer type, but also something of the same basic structure (e.g. arrays never change length while being mapped) we can abstract out larger behaviors. And we can just assume that things will work in a predictable way.
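Here’s a tiny sketch of that idea. The increment operation is hypothetical; the only thing it knows about its argument is that it has a .map():

```javascript
// Minimal versions of the two Functors (names and shape are assumptions).
const Identity = value => ({ value, map: fn => Identity(fn(value)) });
const Const = value => ({ value, map: fn => Const(value) });

// Hypothetical generic operation: it just delegates to .map().
const increment = functor => functor.map(n => n + 1);

console.log(increment(Identity(1)).value); // 2
console.log(increment(Const(1)).value);    // 1
console.log(increment([1, 2, 3]));         // [ 2, 3, 4 ]
```

The caller picks the behavior by picking the wrapper; increment itself never has to change.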

Yes yes, I know: to “assume” makes an ass out of “u” and “me.” But that’s great! Because that just means that if we can get “u” and “me” out of the business of “ass,” we can just let “ass” take care of itself and you and me can move on to cooler things.

Anyhow, shoot, all this rambling nonsense, and we haven’t even gotten to discussing the 2nd law!

The 1st law looked primitive and obvious, but revealed depths. The second law looks complex, but is really much simpler: if you run the map operation twice, using a different function each time, then the result will always be the same thing as composing those two functions together and passing that to a single .map() interface.

Er, that was still too much of a mouthful: it merely means that repeated, chained usages of .map() using various functions in sequence are identical to a single map operation which just uses all of those functions composed together (in the same sequence). In other words, how you group the operations that a .map() performs shouldn’t matter.

Ok: if that’s still unclear, let’s just look at things from the perspective of the inner value that’s undergoing all these operations:
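A sketch of three routes to the same place, assuming a bare-bones Identity and a two-function compose helper (both written here just for illustration):

```javascript
const Identity = value => ({ value, map: fn => Identity(fn(value)) });

// Two-function compose, applied right to left: compose(g, f)(x) is g(f(x)).
const compose = (g, f) => x => g(f(x));

const minusOne = x => x - 1;
const timesTwenty = x => x * 20;

// 1. Two separate maps.
const chained = Identity(1).map(minusOne).map(timesTwenty);

// 2. One map using the composed function.
const fused = Identity(1).map(compose(timesTwenty, minusOne));

// 3. No Functor at all, just the composed function on a bare value.
const bare = compose(timesTwenty, minusOne)(1);

console.log(chained.value, fused.value, bare); // 0 0 0
```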

Note that I slipped compose in there: that’s because with the base Identity Functor, mapping twice is really just doing a form of functional composition!

In each case, our value, 1, is being transformed in exactly the same way and in exactly the same order. The value might end up still buried inside a Functor Type or instead exposed as a primitive value to all the world: the point is that the hand-wavy cruft of how we made this happen is irrelevant to what the value went through. It got - 1'd and * 20'd and that’s that. Those are the core operations we’re ultimately really interested in seeing happening, not all the meaningless scaffolding that surrounds them.

Knowing that these two operations (.map(f).map(g) and .map( x => g(f(x)) )) are (…that they MUST be) exactly equivalent allows us to seamlessly transform one into the other whenever it’s convenient. In the case of mapping over an Array, that’s useful precisely because the second format can be more efficient than the first. Why?

When you .map(f) over an Array, it logically must loop once through the entire Array, applying the function f to each item. That looping operation has a cost (especially for large datasets) and when you .map(f).map(g) you’re incurring the cost twice: two complete loops. But if you know that you can safely transform .map(f).map(g) into .map( x=>g( f(x) ) ) then you know that the same operation can be done in a single loop. That’s one of the tricks that our transducers utilized, and it’s also the basis for lodash’s loop fusion.
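In plain Array terms, the two shapes look like this. A small sketch: it assumes f and g are pure functions that only look at their first argument, since Array.prototype.map also passes an index and the array itself:

```javascript
const f = x => x - 1;
const g = x => x * 20;

const numbers = [1, 2, 3];

// Two passes over the data, plus an intermediate Array in between.
const twoLoops = numbers.map(f).map(g);

// One pass, no intermediate Array.
const oneLoop = numbers.map(x => g(f(x)));

console.log(twoLoops); // [ 0, 20, 40 ]
console.log(oneLoop);  // [ 0, 20, 40 ]
```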

So, those are the Functor laws. What’s so great about Laws? Hopefully I’ve already conveyed the idea that one great thing about them is what they don’t say. Certain things are rock-solid and predictable… other things are not; knowing what must remain dependable frees us to get creative. Because Const and Identity were both Functors, we could work with them using the same interface, .map(), even though they behaved in radically different ways.

But at this point you’re probably wondering: isn’t it time for a terrible metaphor? Indeed! So… what if Const and Identity were to get married, move in together, and have a baby?

Even after extracting ourselves from that terrible, pointless metaphor, that proposal might still seem like a crazy idea. After all, we already ruled out the possibility of .map() operations inquiring into the nature of their inner value. So while we could easily build it, we don’t really want a Functor that .maps one way for one set of values and then another way for a different set of values. Attempts to do things like that can lead to errors, after all: errors that .map() itself would be responsible for, rather than the function being applied. That would undermine the lawful dependability of .map().

As we saw with Lenses, one way we can avoid such inquiries is to bake critical information into an operation just by our choice of Functor types. Imagine an api that, instead of returning bare values, returned each value wrapped inside either a Const or an Identity.

That does mean that we can only deal with that response using the methods/interfaces that Const & Identity happen to share, which at the moment is only .map() itself. Is that super-limiting? Perhaps, but .map() essentially generalizes the idea of running a function on a value, any unary function. So while it’s definitely a limit on the syntax we can use, in some sense, it’s really no limit at all!

Why would we want to do this? Let’s look at an api-like example (the “api” here is entirely synchronous, to avoid extra, irrelevant complications):
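Here’s a sketch of what such an api might look like. Everything in it is made up for illustration: the users store, the getUser function, and a displayUser that simply builds a string rather than touching a page:

```javascript
const Identity = value => ({ value, map: fn => Identity(fn(value)) });
const Const = value => ({ value, map: fn => Const(value) });

// Stand-in data store; the contents are invented for this example.
const users = new Map([[1, { id: 1, name: 'Drew' }]]);

// Hypothetical synchronous "api": Identity on a hit, Const(null) on a miss.
const getUser = id =>
  users.has(id) ? Identity(users.get(id)) : Const(null);

// displayUser only displays. It never checks for null.
const displayUser = user => `User: ${user.name}`;

const found = getUser(1).map(displayUser);
const missing = getUser(2).map(displayUser); // displayUser never runs here

console.log(found.value);   // "User: Drew"
console.log(missing.value); // null
```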

By enhancing the api function with our two Functor Types, we’ve done two things:

  1. enforced the use of .map() if we want to do anything with an api result
  2. prevented null responses from the api from ever causing errors

If you were instead doing this all imperatively, you’d surely want to handle null results sensibly and avoid errors. And maybe you’d tend to handle that in the displayUser function (i.e. have it check for nulls before doing anything that might cause a nasty error). But that complicates both the purpose and the ability to clearly name that otherwise simple function. It’s called displayUser, after all, not checkForResultAndPerhapsDisplayUser!

This is really important stuff, so let’s restate the above concern in a more abstract way: we have a function P that creates a value (well, most of the time) and a function C that does something with a value (a “Producer” and a “Consumer”). We know that, to avoid errors, something will have to figure out, for any given instance, if there’s actually a value from P worth running with C in the first place. In the world of imperative code, we’d probably stick a line like if (okvalue) { C(okvalue); } somewhere.

But if we’re holding to the principle that every function should have a single, intelligible, self-contained realm of responsibility, where should that conditional code live?

It’s not really appropriate for it to live in either P or C, right? P is concerned with producing values, as best it can, not deciding what might happen to them next. C is concerned with consuming and directly acting on values it’s passed. C might have its own idea about, and some protective logic around, what an “acceptable” value is. But fundamentally, C doesn’t need to understand what the lack of a value implies: it might just assume that the value it’s passed is of the right type and execute code using it immediately. After all, if you don’t have any appropriate value in the first place, why did you run C at all?

So the only other possible place to put that logic, if we know it shouldn’t go in P or C, is whatever application code glues them together in practice: whatever composes the execution of one with the execution of the other.

Here’s one familiar functional way to glue functions together: literal composition, i.e. compose(C, P)(x) or C( P(x) )

That would generally work if P always returned a reasonable value, but we know that sometimes it might not, in which case C would likely end up throwing an error, or at the very least running needlessly.

Where, in the structure of a composition, can we stick the equivalent of our imperative if{} conditional? There’s no sensible place I can see.

Even if we gave up on the “single-responsibility” principle and just had P do the check, we’d still have a big problem: in a composition of P and C, running P necessarily means that then C will run. If P chooses not to return anything in order to signal failure, C will still run… just with undefined as its value.

That problem exists even if we introduce an intermediary function checkP that contains the if conditional somehow. What would it return in the case of no value? In the new composition, compose(C, checkP, P), the function C is still certain to run no matter what checkP does or doesn’t do: composition just works out to C( checkP( P(x) ) ) after all. Nothing checkP could return could prevent C from doing something. Which means that C would yet again have to know how to validate the received value to avoid a bad result… thus defeating the whole purpose of checkP.
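To watch that failure happen, here’s a small sketch. P, checkP, and C are toy stand-ins I’ve invented for the producer, the check, and the consumer, and compose is a hand-written right-to-left helper:

```javascript
// Right-to-left compose over any number of unary functions.
const compose = (...fns) => x => fns.reduceRight((acc, fn) => fn(acc), x);

// Toy producer: only id 1 yields a value.
const P = id => (id === 1 ? { name: 'Drew' } : null);

// Toy check: it can notice the null, but it still has to return something.
const checkP = result => (result !== null ? result : undefined);

// Toy consumer: counts its own runs so we can see it being called.
let cRuns = 0;
const C = user => {
  cRuns += 1;
  return user.name;
};

const pipeline = compose(C, checkP, P);

const ok = pipeline(1);
let error = null;
try {
  pipeline(2); // C runs anyway, this time with undefined
} catch (e) {
  error = e;
}

console.log(ok);                         // "Drew"
console.log(cRuns);                      // 2
console.log(error instanceof TypeError); // true
```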

It’s worse than that though: even if there was some way to make all that work, what should if(???){} check? That might seem obvious: it would roughly check for a “sensible” value: perhaps if(_.isObject(x)){ }, right?

But note that doing this is actually stealing a concern that more properly rests entirely with P: whether or not it produced a value at all, and what even qualifies as a legitimate value, from P’s specific perspective. If P was later swapped out for something that returned a Number instead of a plain Object, our checkP function wouldn’t be valid any longer: we’d have to remember to re-factor it in order to handle Numbers as well. What a headache!

Keep in mind that 0, false, or even null and undefined are all “values” of a sort: it’s really up to P, not some later intermediary, to determine whether or not they are what it means to pass along and considers a viable result. If C ends up with a different idea about what a legitimate value for C is, that’s its problem to handle. But by the same light, the needs of C shouldn’t pollute, restrict, or complicate P. Round and round we go.

Our little game of Const and Identity neatly solves this seemingly impossible problem. Because the internal composition is sequenced through Functors, P can now properly control whether (and perhaps… when) any subsequent operations should/can be run on an inner value. Now there’s no extra check to complicate C, but there’s also no assumption in P about what might need to be done with its returned value (or lack of value). It’s instead expressing the meaning of value/no value through the choice of Functor Type.

Note that this approach is radically different from any variation on the idea of having P secretly return some sort of hidden “meta-data” (e.g. returning {id:5, invalid:true} to signify a failed result) along with its value. While clever, that would still force a consumer like C to know about and then look for that extra data. With the Functor interface, we’re doing something on an entirely different level: instead of passing “meta-data” along to the next step in the compositional chain to have to interpret, P is essentially pushing information back out to the outer scope of the computation itself. That computation, after all, is what decided P and C deserved some compositional connection with each other in the first place. So it’s that computation, and not C, which needs to be clued-in when it turns out that they don’t.

You’ve probably been told that functions can only return a single value: even if they don’t return anything, they’ll still “return” undefined. But by orchestrating all of this through Functors and thus delegating things to their common interface, map, our functions can in fact capture not just values but also a “computational context” as well. One of which is… the context of not having any value at all.

Const is actually too complex for its task: it contains a value we never use. We can simplify it further to something we’ll call… Nothing:
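A minimal sketch, in the same assumed factory style as before:

```javascript
// Nothing: no inner value at all, and map never runs its callback.
const Nothing = () => ({
  map: fn => Nothing(),
});

let calls = 0;
const result = Nothing().map(user => {
  calls += 1;
  return user.name;
});

console.log(calls);             // 0
console.log('value' in result); // false
```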

Now we don’t need to pretend that there’s any value (even null) captured in a Const response.

The more refined flavor of this idea, which ties together two types very similar to Identity and Nothing (often called Just/Nothing, and implemented as special subtypes of Maybe), is called the Maybe Functor. Yeah, yeah, you’ve probably heard of the Maybe Monad instead, but I already told you that all Monads are Functors, and we’re just talking Functors today.

When does a “Maybe” do a Nothing.map() and when does it Identity.map()? Well, we’ll cover Superdeterministic Functors eventually (they’re cool because they logically already know ahead of time what operations you’re planning to run on them, or else they wouldn’t work!), but Maybe is actually pretty simple: you just pick one of its two inner types in the first place (or use a utility interface like Maybe.fromNullable() to have it pick one for you, given a value of unknown quality) and those types, with their delegated behaviors, are passed along in the computation.
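Here’s a hand-rolled sketch of that, not any particular library’s API. The getOrElse method is my own addition so we have a way to get a value back out at the end, and this fromNullable treats only null and undefined as “no value”:

```javascript
// Just: behaves like Identity.
const Just = value => ({
  isNothing: false,
  map: fn => Just(fn(value)),
  getOrElse: fallback => value, // assumed helper for unwrapping
});

// Nothing: ignores every callback.
const Nothing = () => ({
  isNothing: true,
  map: fn => Nothing(),
  getOrElse: fallback => fallback,
});

const Maybe = {
  Just,
  Nothing,
  // Picks the inner type for you: null/undefined become Nothing.
  fromNullable: value =>
    value === null || value === undefined ? Nothing() : Just(value),
};

const getName = user => user.name;

const named = Maybe.fromNullable({ name: 'Drew' }).map(getName).getOrElse('nobody');
const unnamed = Maybe.fromNullable(null).map(getName).getOrElse('nobody');
const zero = Maybe.fromNullable(0).map(n => n + 1).getOrElse('nobody');

console.log(named, unnamed, zero); // Drew nobody 1
```

Note that 0 still counts as a value here: deciding what qualifies is a choice made where the Maybe is created, not by whatever consumes it later.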

We’re now seeing the true face of Functors. The one most people are used to, Array, is relatively dull (encapsulating the concept of multiple possible values over which you might want to perform operations), and probably restricts the imagination a bit. Different flavors of Functors can encapsulate and sensibly handle anything from values that might not actually exist, to values that might not exist yet (Promises), or even values that are functions whose inputs or outputs we want to transform.
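As one last taste, here’s a rough sketch of that final flavor: a Functor wrapped around a function, where mapping transforms the function’s output. The Fn name and its run property are assumptions made up for this example:

```javascript
// Fn: wraps a function; map post-composes another function onto its output.
const Fn = run => ({
  run,
  map: fn => Fn(x => fn(run(x))),
});

const lengthOf = Fn(s => s.length);
const doubledLength = lengthOf.map(n => n * 2);

console.log(doubledLength.run('abc')); // 6
console.log(lengthOf.map(x => x).run('abc')); // 3
```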
