Divide and Conquer with Algebraic Structures

Brian Lonsdorf
5 min readJun 1, 2016

--

Recommended listening during read:

https://open.spotify.com/track/29h3NmtIybnBQDJiTM8TuB

Tl;dr

Pure functional data structures allow us to focus on the composition of complex, effectful workflows through reasoning and interchangeable instances. We can stub functions to return these types, get the composition right, then circle back to implement the functions. Also, traverse kicks ass.

Prereq

To get a feel for algebraic structures see https://github.com/fantasyland/fantasy-land.

Fix me a Fixture

It’s no secret that I enjoy functional programming. Writing code this way typically involves two steps:

  1. Make some tiny functions
  2. Compose those tiny functions

Besides the reuse, simplicity, etc, one reason I enjoy this approach is that it enables me to focus on one part at a time — composition or the function implementation. This comes in handy when dealing with a complex function or workflow.

Consider this problem: We’d like to load these fixtures into a db and return a matching data structure of created records. So, looking at the data below, dino_jr will be our fixture name that points to an instance of the Band model and so on.

Data for our fixtures

What’s more, we want to replace the “x_id” references to other fixtures (e.g. band_id and venue_id in the dino_stubbs event) with the actual database ids from the surrounding fixtures — by doing so, our ORM will take care of relationship methods.

If all goes well, we should be able to write something like:

Our goal api

Think for a minute on how you might approach this issue without pure functional data structures. Can you imagine the mutation? Callbacks firing off without warning. And the loops, oh goodness, the nested loops! I suspect it’d feel a little like tuning up a moving car.

Composing with algebraic reasoning

To tackle this issue, let’s focus on the composition rather than the functions here. We’ll need to traverse (and preserve) our data structure while inserting database records, which deals with multiple asynchronous actions. Then we’ll need to coordinate a second pass once all is created to add the relationship ids — another set of async actions.

There are a few composition hints here: We’ll need to traverse the data structure as well as coordinate (i.e. chain) a second pass. We’ll need an effect capturing type to model async.

We can swap out any algebraic data structure with another and preserve compositional reasoning.

To handle the async coordination, I would normally jump straight in and work with Task, which is a lazy promise (see https://github.com/folktale/data.task). However, in this case, we have a complex composition and Task is a bit opaque. Not to worry, there’s a useful trick: We can swap out any algebraic data structure with another and preserve compositional reasoning.

Since we can swap types, let’s choose the benign Id type to capture effects since it is easy to inspect, then trade it in for Task once we have our composition down.

I’ll use Id from https://www.npmjs.com/package/fantasy-identities and Immutable.Map from https://www.npmjs.com/package/immutable-ext.

Mapping Maps

How do we surgically replace nested objects while maintaining structure? Let’s try map:

Using Map

That alters each value, but the result type: Map({a: Id(2), b: Id(3)}) is undesirable. The values are all bottled up in Id and hard to get at.

traverse is a general function that’s like map, but it flips two container-like data structures around while running the transform function. So if I have, say, a Maybe(IO(x)) I can get a IO(Maybe(x)) or if I have a List(Task(x)) i can get Task(List(x)).

Let’s look at how the function traverse works on Map

Using Traverse

By using traverse we ended up with Id(Map({a: 2, b: 3})) which is much nicer. We have one Id on the outside and all the values are left unbottled, as it were.

Keep in mind that if we replaced Id with Task, every place that’s inside the type would be async. Therefore, in this example, we’d have 1 0utter Task that would tell us when all the async work inside those objects had finished.

But this sketch, doesn’t quite match our needs. For our fixtures, we’d like to dig two levels deep and change the values while maintaining the structure.

Can we just traverse twice?

Double traverse workflow

Yes we can! We can place our inner objects into an effect capturing type and have it bubble all the way out all whilst maintaining our object structure.

Plugging it together

So following our sketch above, we want to traverse our tables, then traverse the inner instances. Let’s write our main function, makeFixtures.

Traverse our fixture data

I’ve faked a createRecord function. Since I know createRecord is going to have an effect, I stubbed it with Id. I can then circle back and implement it later. This is what that whole focusing on one thing business was about.

So at this point, we have our first pass (record is a placeholder for the db instance):

Intermediate data structure of created records

Next, we have to do the whole double traversal thing again to update each record with its associations. We use chain (aka flatMap) to sequence effects. Again, I’ve faked an associateRels function for now so we can focus on composing, but it doesn’t matter what it does as long as it returns our Id type.

Sequence our traversals

Now that we’ve got the composition down, all that’s left to do is implement the functions for real and swap out Id for Task. Since they are just db writes, I won’t bore the reader with implementation.

Conclusions

The common apis like map, traverse, chain give us a framework of reasoning and help us compose the nasty parts like async, error handling, and loops. As we saw with Id, they even allow us to swap out one type for another.

We were able to divide and conquer in a completely different way than one might expect coming from an imperative background. Now, time to use these fixtures to write some tests!

*Art by Alevtina Khabibova

--

--