Hylomorphisms

Composing catamorphisms and anamorphisms

This is part of my series on recursion schemes.

In the last two posts in this series we looked at catamorphisms, which collapse a nested data structure into a value, and anamorphisms, which build a nested data structure up from an initial seed value. If we combine the two, transforming one value into another by building up a recursive structure and then collapsing it back down, we have a hylomorphism. In this post we’ll look at the implementation of hylo, the function which creates hylomorphisms, and see that it is more memory efficient than composing catamorphisms and anamorphisms directly. Then we’ll look at a few example hylomorphisms, see how they relate to the call graph of recursive functions, and the tradeoffs that come with using them. And we’ll look at a whole lot of diagrams along the way.

The choice of the prefix “hylo” for this morphism is not clear to me, at least when compared to “ana” and “cata”. “hylo” is from the Greek word for “matter”. (In fact, “hylomorphism” is the name of an Aristotelian concept in which all being is seen as a combination of form and matter. In the philosophical case, “morphism” is used in the sense of “shape”, whereas in the software case it’s more like “transformation”.) As we’ll see, hylomorphisms make the call graph of a recursive function concrete, so I think of hylomorphisms as allowing us to provide a data type which will be the basic matter out of which a transformation is will be built. (Yeah, it’s a stretch.)

If you had a function buildUp, which was an anamorphism, and a function tearDown, which was a catamorphism, then the function x => tearDown(buildUp) would be a hylomorphism. However, this function would be very memory inefficient, because it would store the entire intermediate structure (the output of buildUp) in memory before starting to collapse it. If buildUp is defined by passing a coalgebra coalg to ana, and tearDown is expressed by passing an algebra alg to cata, we can instead create a hylomorphism by passing alg and coalg to hylo. hylo is implemented as so:

export function hylo<F, A, B>(functor: Functor<F>,
algebra: Algebra<F, B>,
coalgebra: Coalgebra<F, A>,
seed: A) : B {
const builtUp = coalgebra(seed),
mapped = functor.map(
x => hylo(functor, algebra, coalgebra, x), builtUp),
flattened = algebra(mapped);
return flattened;
}

Proofs exist which demonstrate that a hylomorphism implemented in terms of an algebra, a coalgebra and map are equivalent to the composition of an anamorphism made from that coalgebra and a catamorphism made from that algebra, but they are beyond my ability. Instead, I’ll try to demonstrate that the two are equivalent by looking at some example hylomorphisms and stepping through their execution.

Let’s see hylo in action operating on our running expression AST example. We’ll use an algebra and coalgebra from my previous catamorphism and anamorphism posts:

const evalAlg = expression => {
const ex = prj(expression);
return (
ex instanceof Plus ? ex.left + ex.right
: ex instanceof Times ? ex.left * ex.right
: ex instanceof Paren ? ex.contents
: /* ex is a Num */ ex.value);
};
const factorialCoalg =
x => x instanceof Num ? x
: x === 1 ? new Num(1)
: /* x > 1 */ new Times(new Num(x), x - 1);

To find the factorial of a number, we’ll generate an expression form of the factorial, and then evaluate that expression. All we need to do is compose our algebra and coalgebra using hylo:

function factorial(x: number) : number {
return hylo(exprFunctor, evalAlg, factorialCoalg, x);
}
console.log(factorial(4)); // Prints "24"

We can use a similar technique to compute numbers from the fibonacci sequence:

const fibonacciCoalg = x =>
x <= 2 ? new Num(1) : new Plus(x - 1, x - 2);
function fibonacci(x: number) : number {
return hylo(exprFunctor, evalAlg, fibonacciCoalg, x);
}
console.log([5, 6, 7, 8].map(fibonacci)); // Prints [ 5, 8, 13, 21 ]

Let’s look at the implementation of hylo again; we’ll step through this using our fibonacci example above to find the fourth fibonacci number. At each step I’ll show a diagram of the data structures we’ve operated on; structures that are currently held in memory will be outlined in red.

export function hylo<F, A, B>(functor: Functor<F>,
algebra: Algebra<F, B>,
coalgebra: Coalgebra<F, A>,
seed: A) : B {
const builtUp = coalgebra(seed),
mapped = functor.map(
x => hylo(functor, algebra, coalgebra, x), builtUp),
flattened = algebra(mapped);
return flattened;
}

The first thing we’ll do is pass our argument 4 to our coalgebra, fibonacciCoalg. Our coalgebra returns Plus { left: 3, right: 2 }.

Now we step into this with map. map will operate on the contents of left first, passing 3 to a recursive call to hylo. hylo passes this to our coalgebra, giving us Plus { left: 2, right: 1 }.

Again, map takes us into the left side first, making a recursive call to hylo with the seed 2. And again, hylo passes the seed to our coalgebra, which returns Num { value: 1 }.

We call map on this, but mapping over a Num just returns the Num unchanged, without executing the passed function (by the definition of map for our AST here.) We thus continue past the map and pass the Num to our algebra, which evaluates it and returns 1.

We’re in the middle of call to map in the second node, which is halfway through the construction of a Plus node that’s going to be passed to our algebra. Now map continues down the right side of the Plus, calling hylo with the argument 1. This is the same as our previous step; we get back Num { value: 1 }.

The same as before, our algebra evaluates this node to 1 and returns it. This goes into the right side of the Plus node that map was creating. Now map returns our new Plus node:

Now we’re back in the second node. The Plus node that was returned by map gets passed into our algebra, which evaluates it to 2, and returns to the map call from the first node.

map now works down the right side of this Plus, calling hylo with a seed value of 2. As we saw above, this results in our coalgebra producing a Num node holding a 1:

Our coalgebra resolves this to 1 and the first node’s map call resolves:

Finally, our algebra resolves the result of map at the top level, giving us the final result of 3.

We can see how this is more efficient than a straightforward composition of a catamorphism and anamorphism. At most we had three objects in memory at any given time, but if we were to pass our coalgebra to ana and invoke it with a seed of 4, we’d end up with the following tree:

This full tree has five objects. Conceptually, hylo runs on this tree, except that it doesn’t generate the right side of the Plus nodes until it’s done operating on the left. This is where we get our memory savings, and it results in our hylomorphism effectively doing a depth-first search over this tree.

This is cool, because we’re able to think about our algorithm in terms of full data structures, while our library functions take care of executing them in a more efficient way. This is par for the course for recursion schemes; we’ve brought ourself closer to declarative programming by thinking in terms of specific operations over a large “virtual” data structure, while the schemes themselves take care of figuring out how to create and traverse over this structure.

There’s another important thing to note about the diagram above. Consider a standard, naive implementation of a fibonacci function:

function recursiveFibonacci(x: number) : number {
return x <= 2 ?
1 :
recursiveFibonacci(x - 1) + recursiveFibonacci(x - 2);
}

Let’s draw a diagram of the call graph for recursiveFibonacci(4):

Each node here is an invocation of recursiveFibonacci. Evaluation here moves through this call graph like a depth-first search, drilling down and moving from left to right. This is identical to the structure and execution of our fibonacci hylomorphism!

This shows the true nature of hylomorphisms. A hylomorphism describes a recursive algorithm by making that algorithm’s call graph into a concrete data structure, and then executes the algorithm by traversing over this call graph. Hylomorphisms thus make the divide-and-conquer nature of recursive algorithms explicit. A coalgebra is responsible for the “divide” step, and expresses how a problem should be broken into subproblems at each step of the algorithm’s execution. Based on this view, an anamorphism created from this coalgebra would generate the call graph of this recursive algorithm. Meanwhile, algebras are responsible for the “conquer” step, and express how the results of subproblems should be combined. A catamorphism created from this algebra evaluates a call graph by collapsing it down to a single result.

The correspondence between the intermediate data type used by fibonnaci above and the call graph of its naive implementation is somewhat obscured by our usage of the AST type. Let’s bring this correspondence to the fore by modeling the call graph directly.

Looking again at the definition of recursiveFibonnacci given above, we can see that it makes two calls at each recursive step, and terminates in a step with no recursive calls. This structure corresponds to a tree. We can thus rewrite our fibonacci function as a hylomorphism using a binary tree of numbers as its intermediate type.

function fibonacci(n: number) : number {
return hylo(
numberTreeFunctor,
tree => {
const t = prj(tree);
return t instanceof Node ? t.left + t.right : t.contents;
},
n => n <= 2 ? new Leaf(1) : new Node(n - 1, n - 2),
n);
}

We can see how the algebra and coalgebra make the structure of our algorithm explicit here: when breaking our problem into subproblems, our coalgebra produces either a subproblem with two parts or a base case with the value of 1. Our algebra makes the handling of the possible cases explicit as well: for our base case, Leaf, the contained value is returned unchanged. For our Node cases, we unify the solutions to each of our subproblems by adding the result of each side of the Node.

Using a binary tree as our intermediate data structure, we can represent any recursive algorithm that makes two calls at each step as a hylomorphism. For an algorithm that made one call at each step we could use a linked list, for one that made four at each step we could use a quadtree, etc. It turns out that any well-behaved recursive function can be transformed into a hylomorphism, with a few restrictions. (Though the proof of this requires some serious math.)

It was easy to make a case for when catamorphisms and anamorphisms ought to be used, because they have a specific structure that fits certain algorithms. Hylomorphisms, though, can express any straightforwardly recursive function, and therefore can also easily be replaced by an explicitly recursive function as well. So when should we use each?

We’ve established that hylomorphisms have the benefit of making the structure of an algorithm explicit. In something like the fibonacci function, this structure is straightforward enough that this benefit doesn’t make a large difference, but if an algorithm has a complex and irregular structure, modeling this explicitly as a data type might be beneficial. On the other hand, there’s a substantial amount of boilerplate involved in creating the intermediate data structures used by hylomorphisms.

In Haskell, which is properly the home of recursion schemes, the optimizing compiler can turn hylomorphisms into incredibly efficient code, including eliminating any intermediate data structures. This means that implementing an algorithm in terms of a hylomorphism can actually improve its performance over a plain recursive implementation. Unfortunately we don’t have such a benefit in JavaScript. In our case, hylomorphisms not only fail to enable any additional optimizations; they also add a not-insubstantial overhead.

For these reasons, hylomorphisms are probably best saved for complex cases where their additional structure is beneficial, for cases where fixed-point forms of datatypes are already available, or for times when we already have algebras or coalgebras which express the transformation we’re interested in.

There are some more benefits in niche cases, as well. For example, our fibonacci function performs a huge number of duplicate calls, and has exponential complexity. In my earlier post on variations on the Y-combinator, I showed that memoization can convert an exponential-time fibonacci function into a linear-time one. We saw that by expressing a recursive function in terms of the Y-combinator, we could easily switch to an alternative Y-combinator which memoized any recursive function. Expressing recursive functions as hylomorphisms would let us do the same thing, if we wrote a function with a slightly modified signature:

function memoizedHylo<F, A, B>(functor: Functor<F>,
tearDown: Algebra<F, B>,
buildUp: Coalgebra<F, A>,
getHashCode: B => string,
seed: A) : B;

As with the Y-combinator, additional variations are possible.

Hylomorphisms are a general way of expressing recursive computations, and are equivalent to a catamorphism composed with an anamorphism. They are also equivalent to basic recursive functions, and make these function’s decomposition step, recombination logic, and call stack structure explicit. Their equivalence with recursive functions means that oftentimes a normal recursive function will be a more direct and efficient solution than a hylomorphism, but they can still be a useful and powerful tool for dealing with recursive problems.

All code samples from this post may be found here.

Thanks to Lizz Katsnelson for editing this post.

JavaScript developer with a focus on typed functional programming. He/him. https://jnkr.tech