Alphabet Soup: Type-level transformations

8 min readApr 1, 2019

The library alphabet-soup can be found here.

Alphabet-soup is a scala library to provide faster compiler-time type-mixing, to reduce boilerplate and the cognitive overhead of reading and writing canonical code.

If there’s only one way to write a piece of code, why do we have to write it? The compiler should do it for us.

Caveat

One major difference about alphabet-soup’s approach to the usual type-transformation libraries is that it is entirely type based and not name based. If you map a case class to a case class, it does not care what the
field names are, only the types. (As a nice side-effect, it is faster to compile than these other libraries as a result).

For this reason, it works with tuples where other libraries may not. Also for this reason, everything you care about must have a unique type. You can’t
have duplication (just like you wouldn’t have duplication in field names). This seems like a lot of overhead, but in reality it’s actually very rare that you combine values from different concept-spaces. When was the last time you added a user’s first name to their username?

Be warned there’s a whole concept of Atom I have omitted below which is important to get the library up and running in your own project. It is detailed in the README of the library linked above.

The concept

There were a couple of use-cases which drove the development of alphabet-soup, one to do with working on a very abstract level and one to do with reducing needless boiler-plate scattered around projects.

#1 Reducing needless boilerplate

In several of my projects, there is a distinction between what I call ‘resource’ types that are exposed in an API, and ‘model’ types, which represent database structures or business-related code structures.

This is done for many reasons, chief among which is that if you want to change the database you shouldn’t automatically update your API to reflect this. Doing that should be hard, and a conscious choice. So there exists a mapping between these types, which the user (i.e. me) has to implement.

A really dull example is:

The from method’s definition is boring. At no point is there a choice to make, so how could I write it incorrectly? Yet I seemingly still need to write it.

Things get even worse when your resource is made up of multiple models, since different datatypes are stored in different tables. Imagine:

You can see how these from methods may grow and grow, and yet still contain nothing interesting. I’m bored just thinking about them.

I want to get rid of them.

#2 Automatic type projections

Imagine for a moment you have an aa: ApplicativeAsk[F, (PaymentId, Config, UserId)], where F can be whatever effect you like that supports the reader, say ReaderT[IO, ?].

If I wanted to extract an F[UserId] from this, I would have to write aa.ask.map(_._3) (or a more verbose variant).

Not too bad, but if I wanted to extract (PaymentId, UserId) the example would become:

Much worse. Why should we need to do that? There’s only one way to extract (PaymentId, UserId) from (PaymentId, Config, UserId): we shouldn’t have to write that code.

It gets even worse if you imagine the case where we have an implicit val aa: ApplicativeAsk[F, (PaymentId, Config, UserId)] and we need an implicit val aa2: ApplicativeAsk[F, (PaymentId, UserId)]. What do we need to do now? Well:

That just builds a new ApplicativeAsk for us. A monkey with a compiler could do it, since at no point is there a choice to make.

Why is this stuff not automatic? I want to remove the need for the monkey and just get the compiler to do it.

Implementation

Below is a rough-and-ready naïve implementation of our algorithm — the first version I wrote. Several bugs were uncovered caused by limitations in the compiler, and more structure added later as requirements came in from testing. But the underlying algorithm did not change from the below.

Mixer

So, we want something that takes a value of some type, A, and produces a value of some type, B. If it’s impossible to produce a B canonically from A, we want compilation to fail.

This is a pretty simple interface we want to implement. To keep with the alphabet-soup theme, I’ll call it Mixer:

Generic

Immediately when thinking about how to implement the above, we run into a problem. A is completely generic, and we can do nothing with it. Ideally it would have some structure we could recurse over.

Enter Generic, from shapeless. Generic will turn any type A into an HList, if it can. If it can’t, our entire Mixer concept is impossible for the type A. Generic works on tuples, hlists and case classes so it seems a pretty
good place to start.

An example:

This is very promising! Each of the three types on the left all contain exactly the same information, and Generic turns them into the same generic structure, allowing us to potentially rebuild them as we see fit later on.

One major problem though, Generic is only skin-deep:

Note the inner tuple on the right hand side — only the outer-most layer is transformed.

This won’t do. We need something that is completely de-structured on the right. We need to write our own thing which recursively applies Generic all the way down.

Atomiser

I called this new concept Atomiser; it takes types apart into its constituent components.

The structure is very, very similar to Generic:

We’re following the Aux pattern here, like most of shapeless does. If you’re not familiar with it, there will be a blog post in the future about why it exists and what problems it solves.

Note that Atomiser, like Generic, provides a way to go to the generic structure as well as from it, back to our real structure.

So, how do we implement Atomiser? Our structure is now potentially very complicated, as we basically have a tree of types to traverse.

Well the first thing we have to do is use shapeless’ Generic to transform our outer-most layer for us. Once we’ve done that, we search the tree recursively. I leave the implementations out because they’re boring, and canonical.

The first:

The recursive step:

And the termination condition:

The eagle-eyed amongst you may note that the compiler won’t know which to apply first, genericCase or headFirstSearchCase.

Because of this conflict in the domains of our implicits, genericCase must be made low priority. In pseudo-code, it will end up looking like this:

And we’re done. Those three provide a rough implementation of Atomiser. There’s actually a lot more going on under the hood, and several bugs will arise with the above in testing, but they’re just implementation details. There will be a future blog post on the super technical details and problems that you run into when tightening up the above.

The important thing is that algebraically, the above algorithm is correct.

AtomSelector

So now we have a completely generic tree structure of any applicable type. We can imagine that our first step in implementing Mixer[A, B] will be to apply Atomiser to both type arguments.

What do we then do?

We need to produce a B from A. So it makes sense to iterate through B’s tree structure, and for each element try to find the same element present in A’s tree structure. If there is such an element, take it and use it to build a run-time value of type B. If there is no such element, our compilation should fail.

So, we need a type class that will select a type from a generic structure, or fail compilation. Enter AtomSelector:

The actual implementations of the implicits have been left out below, since they’re canonical and can’t be written incorrectly.

Let’s start the implementation with an easy case where we just select the head of our list. Remember, we’re assuming L has been atomised:

That says, if we’re trying to select H and our type has H as the head element, just take it.

Next, we have to do some recursive step. We’re working on a tree so we actually need two recursive steps, one recursing on the head and one on the tail:

The first handles the tail case, the second handles the head. Both just assume that our desired element U is within the structure they recursively call.

And… we’re done. We don’t need a termination case, since you can’t select any type from HNil. If we don’t terminate, i.e. we recurse through the entire structure and cannot find a U, then we just fail compilation as desired.

Again, the reality is more complex and the reasons why and what extra layers need to be put where will be detailed in a future post.

Algebraically and naïvely though, the above is correct.

Stitching it together

Now we’re nearly ready to stitch it all together and implement our Mixer type class!

We’re actually going to implement something else first, which does the actual algorithm. It’s called MixerImplFromAtomised:

It’s exactly the same as Mixer, but assumes A and B have been atomised. This is purely for efficiency purposes, so we don’t have to re-atomise A and B in every implicit.

Let’s start with an easy case, the termination:

This says, “we can mix any type into HNil”. Which is true: we can always produce no structure from any structure, just by forgetting about the structure.

Next up is the case where the head of our list is something we can select (ie, not an HList):

This says, if we can select BH from A, and mix A into BT, then we can mix A into BH :: BT.

And the last case, low priority again, is where we must recurse on the head:

This says, if we can mix A into BHH::BHT, and we can mix A into BT, then we can mix A into (BHH::BHT) :: BT. We need to produce the entirety of the RHS, so the mixer algorithm must be invoked twice, once for the head structure and once for the tail structure.

And we have now covered all cases. Now…

The final step

We’re finally going to implement our top-level algorithm:

It’s thankfully very simple, since MixerImplFromAtomised did all the heavy lifting for us. We only need to atomise our types, and then invoke our algorithm:

And there’s one very specific case where we can shortcut the entire charade, the equality case:

Again, there are details omitted for clarity. The algorithm is correct, but a myriad of technical details arise when dealing with collections, or defaults, and specific problems with the scala compiler to get around, which will be detailed in the future.

Using it

So we’re done!

Now let’s use it in our two boring examples from above.

Example #1, Resources

See the use of Mixer in def from. One really nice thing about this approach is that we can forget about the field names, so if we want to produce some type we just need some structure that can satisfy all the constituent types. So we can just bung Payment and User into a tuple, and let the compiler figure it all out under the hood for us.

We could alternatively use Mixer[(Payment, User), PaymentResource] everywhere we needed it, rather than scoping within def from. That would come with slower compile times, since we’d be recalculating the structures every time, but it’s definitely doable to reduce the boilerplate even further.

Example #2, ApplicativeAsk

Just a note that, of course, the approach below will obviously work for other structures such as MonadState, etc.

The problem was we had an aa: ApplicativeAsk[F, (PaymentId, Config, UserId)] and we wanted an aa2: ApplicativeAsk[F, (PaymentId, UserId)].

Well:

Now,

Which is pretty cool!