Crushing boilerplate with Scala macros

‘Since their release as an experimental feature of Scala 2.10, macros have brought previously impossible or prohibitively complex things to the realm of possible’ — Eugene Burmako

One day we took on the problem of where our developers were losing time to mindless, recurring boilerplate they were forced to write over and over in each and every project. We went through various Scala code-bases we had been writing or maintaining, trying to identify those repetitive patterns. We were bent on freeing everyone from having to write them again and again. Ideally, we wanted to push as much of this as possible into a common library that we could reuse throughout existing and future projects. Another interesting question is what Scala, as a language, lacks that makes it susceptible to boilerplate slipping in.

I’m presenting our findings below. Of course, these are highly correlated with the code you’re typically working on. That said, I’ll be delighted to hear of your sources of boilerplate. Please shoot me a comment with your observations. As you might imagine, we ended up writing and publishing a library dealing with our boilerplate (but presumably useful to other people). It enables us to replace swaths of code like:

with one-liners like:

We’d love to extend it further, so if you have an itch to scratch, we could add it to the lib.

Slick mappings for case-class wrappers

More often than not you want to add some type safety to your domain. So, instead of case class Person(userId: Long, emailAddress: String, fullName: String), you write case class Person(userId: UserId, emailAddress: EmailAddress, fullName: FullName).

Source of boilerplate

If you want to persist these to a database, you have to write Slick mappings by hand, not only for Person but also for each of these small wrappers.

That’s a lot of code for essentially stating that a 1-element tuple is isomorphic to its only element!
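To make the shape of this boilerplate concrete without pulling in Slick, here is a minimal sketch. ColumnType and its mapped helper are hypothetical stand-ins for a database column-type mechanism, not Slick's actual API, and the value field name in the wrappers is an assumption; only the wrapper names come from the Person example.

```scala
// Hypothetical stand-in for a database column type; NOT Slick's real API.
trait ColumnType[A] {
  def write(a: A): String // value as it would be stored
  def read(s: String): A  // value as it would be loaded
}

object ColumnType {
  implicit val longColumn: ColumnType[Long] = new ColumnType[Long] {
    def write(a: Long): String = a.toString
    def read(s: String): Long  = s.toLong
  }
  implicit val stringColumn: ColumnType[String] = new ColumnType[String] {
    def write(a: String): String = a
    def read(s: String): String  = s
  }

  // Derives a column type for a wrapper from the one for its underlying value.
  def mapped[A, B](unwrap: A => B, wrap: B => A)(implicit base: ColumnType[B]): ColumnType[A] =
    new ColumnType[A] {
      def write(a: A): String = base.write(unwrap(a))
      def read(s: String): A  = wrap(base.read(s))
    }
}

case class UserId(value: Long)
case class EmailAddress(value: String)
case class FullName(value: String)

object Mappings {
  // One near-identical definition per wrapper: this is the boilerplate.
  implicit val userIdColumn: ColumnType[UserId] =
    ColumnType.mapped[UserId, Long](_.value, UserId(_))
  implicit val emailAddressColumn: ColumnType[EmailAddress] =
    ColumnType.mapped[EmailAddress, String](_.value, EmailAddress(_))
  implicit val fullNameColumn: ColumnType[FullName] =
    ColumnType.mapped[FullName, String](_.value, FullName(_))
}
```

Every new wrapper type adds another definition that differs only in the type names, which is exactly the kind of code one would like to have derived automatically.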

Possible solutions

You might be tempted to solve it with some generic mapper that maps a 1-element case class to its only field. Unfortunately, you cannot do it in vanilla Scala in a perfectly type-safe manner! The best we could come up with is:

This is not really type-safe (a cast is involved), and you cannot restrict the types to cover only 1-element case-classes, as any case-class will inherit Product.
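As an illustration of both weaknesses, here is a rough sketch of what a Product-based generic mapper could look like. It is not the authors' original code; GenericWrapper, iso and the Pair class are made-up names, and Slick is left out so that the snippet stays self-contained.

```scala
object GenericWrapper {
  // Builds (unwrap, wrap) functions for a case class W wrapping a value of type A.
  // The bound W <: Product accepts ANY case class, not just 1-element ones.
  def iso[W <: Product, A](construct: A => W): (W => A, A => W) = {
    // Unchecked cast: nothing proves that the first field really has type A.
    val unwrap: W => A = w => w.productElement(0).asInstanceOf[A]
    (unwrap, construct)
  }
}

case class UserId(value: Long)
case class Pair(first: String, second: Int) // clearly not a 1-element wrapper

object GenericWrapperDemo {
  // Intended use: works as expected.
  val (unwrapId, wrapId) = GenericWrapper.iso[UserId, Long](UserId(_))
  val id: Long = unwrapId(wrapId(42L))

  // Misuse that still compiles: Pair has two fields and its first one is a String.
  val (unwrapPair, wrapPair) = GenericWrapper.iso[Pair, Long](n => Pair(n.toString, 0))
}
```

Using unwrapPair and treating the result as a Long only blows up at run-time with a ClassCastException, which is the kind of error the compiler was supposed to rule out.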
 
The other possibilities include:

  • mixing in MappedTo — but some people rightly believe that this solution is bad, because it couples domain with storage too tightly.
  • using shapeless

Shapeless offers truly wonderful generic product encodings under the name of Generic. So the above example could be written in a completely type-safe manner:

The shapeless approach seems like a flawless solution, although you might argue that adding shapeless as a dependency is a bit too much for such a simple thing, not to mention that you need at least one person on board who is familiar with all its magic. That is, until you try to map one-column tables.

will fail with:

type mismatch;
[error] found : slick.lifted.MappedProjection[Name,String]
[error] required: slick.lifted.ProvenShape[Name]
[error] override def * = name <> (Name.apply, Name.unapply)
[error]

You may be scratching your head for a long time trying to decipher what’s going on here. The reason becomes obvious once you turn on logging of implicit resolution (for example, with the Scala 2 compiler’s -Xlog-implicits flag).

lifted.this.ProvenShape.proveShapeOf is not a valid implicit value for slick.lifted.MappedProjection[Name,String] => slick.lifted.ProvenShape[Name] because:
[info] ambiguous implicit values:
[info] both method mappedProjectionShape in object MappedProjection of type [Level >: slick.lifted.FlatShapeLevel <: slick.lifted.ShapeLevel, T, P]=> slick.lifted.Shape[Level,slick.lifted.MappedProjection[T,P],T,slick.lifted.MappedProjection[T,P]]
[info] and method repColumnShape in trait RepShapeImplicits of type [T, Level <: slick.lifted.ShapeLevel](implicit evidence$1: slick.ast.BaseTypedType[T])slick.lifted.Shape[Level,slick.lifted.Rep[T],T,slick.lifted.Rep[T]]
[info] match expected type slick.lifted.Shape[_ <: slick.lifted.FlatShapeLevel, slick.lifted.MappedProjection[Name,String], Name, _]

Turns out you have unwittingly provided two ambiguous mappings for Name. One, repColumnShape, is materialized with the help of our catch-all implicit; the other, mappedProjectionShape, you almost wrote yourself by providing the table mapping *.

So a truly perfect solution could disable automatic derivation if some other derivation already existed.

Our solution

How are both MappedTo and Generic implemented in spite of Scala’s inability to synthesize useful generic case-classes? After all, what they provide is the very thing we miss from Scala: generic access to a case-class representation. Well, Scala has one powerful escape hatch that lets programmers do things the language otherwise doesn’t let them do, even circumventing the language rules. You have probably heard of it before. ’Tis macros!

That’s how MappedTo and Generic are implemented under the hood. Just because there is no way to do it in plain Scala doesn’t mean that it’s not possible by manipulating source code programmatically.
 
Macros, among other useful things, can be used to materialize implicits. There is even a special flavor of macros, called white-box macros, that, when chosen by the implicit resolver to provide an implicit, can decide whether to materialize it or to bail out and not provide it at all. This ability makes them suitable for meeting our extra condition: disabling automatic derivation based on context!

Here is the implementation. Using it, we were able to go from this:

to:

We also added support for slick_pg and enumeratum, and for mixing in a trait instead of using an import. And don’t forget about the support for one-column tables. :-)

Spray JSON formats

spray-json is commonly used as a JSON library and I have to say it loud: it’s goddamn awful at its job. :-) It was one of the biggest sources of boilerplate I could find. Partially, this can be attributed to Scala’s case-class handling, but a lot of it is down to its own shortcomings.

Source of boilerplate

Just take a look at this example:

The most striking thing is that you have to manually count the arity of each case-class to pick the appropriate jsonFormatN method. Awful! Why didn’t they provide something like play-json’s generic Json.format? Also, if you want a flat format because a case-class is just a wrapper, you need to write it manually (and you cannot do it correctly because Scala).

spray-json has two possible base formats to choose from, JsonFormat and RootJsonFormat, so if you do not want the flat style for some classes, you need to indicate it manually with jsonFormat1. This design makes it hard to use jsonFormat implicitly, because having implicits for both RootJsonFormat and JsonFormat in the same scope would be ambiguous (RootJsonFormat is a subtype of JsonFormat). You could tinker with the prioritization of implicits, but it’s not as great as you might imagine. While you can make RootJsonFormat (the more specific one) the higher-priority implicit, you currently cannot do it the other way around (try it if you don’t believe me!). In other words, if you preferred the flat JsonFormat instead, you would be screwed.

Possible solutions

Switch to play-json :-)

Other than that, you could use shapeless, but it can get a bit hairy. See this for some excellent code. But you still could not easily mix flat and non-flat formats in one scope.

Our solution

Using white-box macros we:

  • automatically generate RootJsonFormat for any case-class or flat JsonFormat for 1-element case-classes
  • switch between formats depending on context

Because white-box macros can decide whether they materialize an implicit or not, we are able to simultaneously bring two implicits into scope:

and make the macro implementation smart about which one to pick. It prefers flat format when it comes across 1-element case-classes. So in cases like this:

case class ThingId(uuid: UUID)
case class ThingName(name: String)

case class Thing(id: ThingId, name: ThingName, …)

it’ll do what you expect: {"id": "uuid", "name": "str"}. But it also takes into account whether you want a RootJsonFormat or not. For instance, case class Error(message: String) in Conflict -> Error("Already exists") will be formatted as {"message": "Already exists"} in JSON.

This enables us to replace all manual formatting with a one-liner:
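To spell out the difference between the two styles, here is a small hand-written sketch. The Json model and the WrapperFormats object are made-up stand-ins for illustration, not spray-json's API and not what the macro generates; Thing is reduced to the two fields shown above, and string escaping is deliberately ignored.

```scala
import java.util.UUID

// Tiny stand-in JSON model for illustration; NOT spray-json's API.
sealed trait Json { def render: String }
case class JsonString(value: String) extends Json {
  def render: String = "\"" + value + "\"" // no escaping, sketch only
}
case class JsonObject(fields: (String, Json)*) extends Json {
  def render: String =
    fields.map { case (k, v) => "\"" + k + "\":" + v.render }.mkString("{", ",", "}")
}

case class ThingId(uuid: UUID)
case class ThingName(name: String)
case class Thing(id: ThingId, name: ThingName)

object WrapperFormats {
  // Flat: the wrapper disappears and only its single field is written.
  def flat(id: ThingId): Json     = JsonString(id.uuid.toString)
  def flat(name: ThingName): Json = JsonString(name.name)

  // Non-flat (root style): the wrapper becomes a JSON object of its own.
  def nested(name: ThingName): Json = JsonObject("name" -> JsonString(name.name))

  // A Thing is written as an object whose wrapper fields use the flat style.
  def thing(t: Thing): Json = JsonObject("id" -> flat(t.id), "name" -> flat(t.name))
}
```

Used as a field, ThingName("str") renders as the bare string "str"; used as a top-level value in the root style, it renders as {"name":"str"}, which mirrors the Error example.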

and everything works as expected.

Sometimes you might also want field names to be transformed in some way (e.g., snakified). That’s where macros really shine. The transformation can be done completely at compile time, leaving only string constants in the compiled code, whereas when writing it manually you’d usually compute the names at run-time and pay some price in performance. This use-case is supported by the library.
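For comparison, this is roughly what the hand-written, run-time variant tends to look like. The snakify helper below is an assumed, simplified implementation (it only handles plain camelCase names), not the one used by the library.

```scala
object Snakify {
  // Inserts '_' between a lower-case letter or digit and the upper-case letter
  // that follows it, then lower-cases the whole name.
  def snakify(name: String): String =
    name.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase
}

object RuntimeNames {
  // Hand-written approach: the regex runs when this object is initialised,
  // i.e. at run-time, instead of the names being baked in as literals.
  val fieldNames: Map[String, String] =
    Seq("userId", "emailAddress", "fullName").map(n => n -> Snakify.snakify(n)).toMap
  // Map(userId -> user_id, emailAddress -> email_address, fullName -> full_name)
}
```

A macro can run the same transformation while compiling and emit the literal "user_id" directly, so no work is left for run-time.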

Flat JSON formats in play-json

To be honest, play-json has never been a source of extensive boilerplate for me, thanks to the Json.format macro. Only flat formats are a minor pain, as they have to be written by hand over and over. So if you find yourself writing lots of code similar to:

…you can delegate it to the macro we wrote for this.

Is it Scala’s fault?

I’d say that pretty much, yes. When you think about it, all this redundant code wouldn’t have been necessary if you were able to:

  • treat case-classes as instances of respective ProductN
  • access case-class companion’s apply method in a generic way

Had UserId automatically extended Product1[Long] (instead of just Product), you could have written a generic Slick mapper for all Product1[T] subtypes, provided that T has a TypedType instance. You would also need access to the companion’s apply to construct an instance, and this should have been baked into the ProductN types, pretty much like collection classes have access to GenericCompanion[Repr] via the companion method. I can’t find any logical explanation why the compiler should not provide it for us.
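A rough sketch of what such a signature would buy us, using Tuple1 as a stand-in because it already extends Product1 today. Product1Iso and its iso method are made-up names, and the constructor still has to be passed in explicitly, since generic access to a companion's apply is precisely what is missing.

```scala
object Product1Iso {
  // Type-safe: P is statically known to hold exactly one element of type T,
  // so the element can be read with _1 and no cast is needed.
  def iso[T, P <: Product1[T]](wrap: T => P): (P => T, T => P) =
    ((p: P) => p._1, wrap)
}

object Product1Demo {
  // Tuple1 plays the role of a 1-element case class here.
  val (unwrap, wrap) = Product1Iso.iso[Long, Tuple1[Long]](Tuple1(_))
  val roundTrip: Long = unwrap(wrap(42L)) // 42
}
```

Unlike a version bounded by Product, this one rejects multi-field classes at compile time, because they are not subtypes of Product1.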

Apart from case-class woes, one might want better implicit prioritization. The current implementation (implicits defined in a subtype are preferred over the ones inherited from its supertype) is a one-way street. You can only request that the more specific implicit be prioritized over the less specific one. You want it the other way around? No luck.

So this one works correctly:

but this one fails to compile:
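A self-contained sketch of both arrangements, assuming Scala 2 resolution rules. Format and RootFormat are made-up stand-ins for the JsonFormat and RootJsonFormat hierarchy, not spray-json's types, and all other names are illustrative as well. The working arrangement is live code; the failing one is left in comments so that the snippet still compiles.

```scala
// Stand-ins for a JsonFormat / RootJsonFormat style hierarchy; NOT the real types.
trait Format[T] { def label: String }
trait RootFormat[T] extends Format[T]

case class Name(value: String)

// Works: the more specific type (RootFormat) is defined in the derived object,
// so it wins both on "defined in a subtype" and on "more specific type".
trait LowPriorityFormats {
  implicit val plainName: Format[Name] = new Format[Name] { val label = "plain" }
}
object PrioritizedFormats extends LowPriorityFormats {
  implicit val rootName: RootFormat[Name] = new RootFormat[Name] { val label = "root" }
}

object PriorityDemo {
  import PrioritizedFormats._
  val chosen: String = implicitly[Format[Name]].label // "root"
}

// Swapped arrangement, which per the article does not compile: the plain format
// scores for living in the derived object, the root format scores for its more
// specific type, so neither wins and the lookup is ambiguous.
//
// trait LowPriorityRoot {
//   implicit val rootName: RootFormat[Name] = new RootFormat[Name] { val label = "root" }
// }
// object Swapped extends LowPriorityRoot {
//   implicit val plainName: Format[Name] = new Format[Name] { val label = "plain" }
// }
// object SwappedDemo { import Swapped._; implicitly[Format[Name]] }
```

In other words, placing an implicit in a lower-priority trait can only demote the less specific candidate; it cannot make the less specific one win.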

Summary

Based on examining a few code-bases, we found that dealing with case-classes is a major source of boilerplate. Case-classes are the bread and butter of programming in Scala, so it would be beneficial to have better support for generic case-class manipulation. Fortunately, this drawback can be overcome by macros. On the other hand, macro programming in Scala is not as easy as it should be:

  • API is not clearly documented.
  • API allows you to generate wrong code. It’s not unlike managing memory in C: correct semantics are not enforced and you’re on your own discovering what you did wrong. Search for owner chain corruption to see what I’m complaining about.
  • Debugging is a pain.

Getting it right is not an easy task. There are also some disadvantages to using macros that should be pointed out:

  • Because macro programming is very fragile in its current state, you need to test it a lot to be sure it does not break legitimate code.
  • Compilation time will suffer (we found roughly 10% longer compilation in projects with about 40 serialized DTOs). That’s especially true if you use a lot of white-box macros.
  • The nature of white-box macros makes them materialize the implicit at every expansion site. Figuratively speaking, it is like ‘pasting’ the generated code wherever it is used. This can cause more allocations, although our benchmarks showed no performance drop, probably because all these allocations are short-lived closures. It will certainly increase the size of the compiled program (this will be mitigated by the Scala 2.12 delambdafy allocation scheme).
  • You have to keep up with changes to Scala’s reflection API, and it may change considerably when Dotty arrives (but, no, macros are not going away).

But there are also obvious advantages:

  • You do not have to write it!
  • No slow run-time reflection, e.g., conversion to snake_case in JSON format.
  • You can do amazing things that bend language rules, e.g., smart JSON formats.

If you’re interested in a detailed post about implementation, please shout — I’ll write one.

Please check out the library!
