How to get the best out of ppx_deriving

Étienne Millon
Cryptosense tech blog
4 min read · Nov 21, 2017

How to print a value, for example for debugging? The OCaml language does not have a magic function that can print any value, nor a way to define how a particular type should “print” its values (like toString in Java or __str__ in Python).

But many libraries follow a naming convention: to print values of a type named t, one can rely on a function val show : t -> string. For types with another name, say type user, the function is val show_user : user -> string.

Writing these functions by hand is tedious and they are difficult to keep up to date when the type definition changes. So, a common solution is to have the compiler generate them using ppx_deriving. It adds a compilation pass that generates functions based on type declarations.

See this example with a variant type:
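A minimal sketch of such a type, assuming the ppx_deriving.show preprocessor is enabled in the build (with dune, something like (preprocess (pps ppx_deriving.show))). The shape type is made up for illustration; the with_path = false option asks the printer to leave out the module prefix it otherwise adds to constructor names.

```ocaml
(* Hypothetical type, for illustration only. *)
type shape =
  | Point
  | Circle of float
  | Rectangle of float * float
[@@deriving show { with_path = false }]

(* The plugin generates, among others:
     val pp_shape : Format.formatter -> shape -> unit
     val show_shape : shape -> string *)
let () =
  print_endline (show_shape Point);
  print_endline (show_shape (Circle 1.5));
  print_endline (show_shape (Rectangle (2., 3.)))
```

Adding a constructor to shape later does not require touching the printer: show_shape is regenerated at the next build.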

Quick tour of the common plugins

Ppx_deriving supports more than just pretty-printers. Here are the most important functions that can be generated:

  • “show” generates pretty-printers for types. This is useful for debugging, some error messages, or generally speaking serializing in a developer-friendly format.
  • “eq” generates monomorphic equality. It is usually a bad idea to use the polymorphic equality operator (=), because it can raise exceptions at runtime if it encounters functions, and because it cannot be customized. This is why having a monomorphic val equal : t -> t -> bool is a good idea.
  • “ord” generates a monomorphic comparison function. The polymorphic compare function has the same drawbacks as equality. In addition, some functors like Set.Make depend on a val compare : t -> t -> int function in a module, which is what this plugin generates.
  • “yojson” (not included in ppx_deriving itself, but in a separate ppx_deriving_yojson package) generates functions to serialize and deserialize a data type to/from JSON.

At Cryptosense, our style guide recommends defining eq, ord and show for all exposed types.
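As a sketch of what this looks like in practice, here is a small module deriving all three, assuming the eq, ord and show plugins of ppx_deriving are enabled. The Version module and its fields are hypothetical. Because the derived compare has the signature Set.Make expects, the module can be passed to the functor directly.

```ocaml
(* Hypothetical module, for illustration only. *)
module Version = struct
  type t =
    { major : int
    ; minor : int
    }
  [@@deriving eq, ord, show]
  (* Because the type is named t, the generated functions are
     equal, compare, pp and show (no suffix). *)
end

(* Set.Make only needs type t and val compare : t -> t -> int. *)
module Version_set = Set.Make (Version)

let () =
  let v1_2 = { Version.major = 1; minor = 2 } in
  let v1_3 = { Version.major = 1; minor = 3 } in
  assert (Version.equal v1_2 v1_2);
  assert (Version.compare v1_2 v1_3 <> 0);
  let set = Version_set.of_list [ v1_2; v1_3; v1_2 ] in
  Printf.printf "%d distinct versions\n" (Version_set.cardinal set)
```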

Customizing generated code

In most cases, the generated code just works and does not need any tweaking. But sometimes, it is necessary to use a bit of magic to change the behavior of generated functions:

  • types in third-party libraries often do not have the correct functions defined. Example: you have a date in a record type, but there is no show_date function.
  • sometimes, logical equality is different from equality of representations. Example: you’re building a set from a tree of values, and you don’t care about the shape of the tree. This also applies to comparison.
  • there are external constraints. Example: you are interacting with a JSON API that omits null fields from records, whereas ppx_deriving_yojson expects a field with a null value.

Here are some ways to get the generated functions just right even in these cases.

1. Change the conversion functions

The first way to customize the generated code is to put attributes on the type declaration. Attributes (written “[@name payload]”) are pieces of syntax that the compiler ignores by default, but that are available to PPX rewriters.

When an [@equal] attribute is attached to a type, the function it names is used instead of the generated equality. It can even be an anonymous function.
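A minimal sketch, assuming the ppx_deriving eq plugin is enabled. The account type is hypothetical: its email field is compared case-insensitively through an anonymous function, while id keeps the default equality. The parentheses attach the attribute to the type of the field.

```ocaml
(* Hypothetical type, for illustration only. *)
type account =
  { id : int
  ; email :
      (string
      [@equal
        fun a b ->
          String.equal (String.lowercase_ascii a) (String.lowercase_ascii b)])
  }
[@@deriving eq]

(* Generated: val equal_account : account -> account -> bool *)
let () =
  let a = { id = 1; email = "Alice@example.org" } in
  let b = { id = 1; email = "alice@example.org" } in
  Printf.printf "same account: %b\n" (equal_account a b)
```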

2. Use custom type aliases

Another way is to leverage the fact that PPX is just about code generation: each declaration does not “know” about other declarations. If a record type contains a field of type user, a call to show_user will be emitted, whether or not that function actually exists (if it does not exist, the compilation will fail later, but PPX rewriting by itself rarely fails).

So, we can define a type alias and the show function with the correct name, and it will be called by the generated code.
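This can be sketched as follows, assuming the ppx_deriving show plugin is enabled. The Date module stands in for a third-party library that ships no printer, and event is a hypothetical record. As far as we know, the printers generated by show call the pp_ variant (pp_date here), so the sketch defines both pp_date and show_date.

```ocaml
(* Stand-in for a third-party module that has no printer. *)
module Date = struct
  type t =
    { year : int
    ; month : int
    ; day : int
    }
end

(* Local alias, plus hand-written functions with the names the
   generated code will look for. *)
type date = Date.t

let pp_date fmt { Date.year; month; day } =
  Format.fprintf fmt "%04d-%02d-%02d" year month day

let show_date d = Format.asprintf "%a" pp_date d

(* The derived printer for event refers to the date printer by name. *)
type event =
  { title : string
  ; on : date
  }
[@@deriving show]

let release = { title = "Release"; on = { Date.year = 2017; month = 11; day = 21 } }

let () = print_endline (show_event release)
```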

It is good practice to define a wrapper module for such types, so the types that need this trick do not have to reinvent it. For example, we use Zarith in our applications, and it lacks some of these conversion functions. So we defined a module named Serializable.Z, and we can just use Serializable.Z.t in type declarations and everything “just works”.

3. Define an alternate key for JSON

Sometimes in record types the fields are correctly converted, but the object itself is not quite right.

Let’s take an example. We are often dealing with cryptographic objects serialized as JSON. Some of these cryptographic objects are “key pairs”, a record composed of a public key and a private key.

The JSON representation is

{
"public": ...,
"private": ...
}

The natural way would be to define a type:

type pair =
{ public : key
; private : key
}

But this is not possible, because private is a keyword in OCaml.

With ppx_deriving_yojson, the [@key] attribute overrides the JSON name of a field, so the OCaml field can be called something else (for example private_) while the JSON key stays "private".
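A minimal sketch, assuming ppx_deriving_yojson and the yojson library are available. The key type is a placeholder (a plain string here), since the real representation of a key does not matter for the example.

```ocaml
(* Placeholder: the real key type is not shown in this post. *)
type key = string [@@deriving yojson]

type pair =
  { public : key
  ; private_ : key [@key "private"]
  }
[@@deriving yojson]

(* Generated: pair_to_yojson and pair_of_yojson. *)
let pair = { public = "pub-bytes"; private_ = "priv-bytes" }

let () = print_endline (Yojson.Safe.to_string (pair_to_yojson pair))
```

The trailing underscore only exists on the OCaml side; both the encoder and the decoder use the "private" key.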

4. The derive-and-convert pattern

Sometimes the JSON representation we have to work with is not quite what we expect. It is always possible to write the conversion functions by hand, but manipulating JSON values directly is pretty painful.

An alternative is to have two types: one type that fits the representation closely, and one that the rest of the codebase uses. Then it is possible to write conversion functions between the two types, without manipulating JSON.

As an example, let’s consider an API that can return two kinds of messages:

{ "error": "Invalid API Key" }
{ "data": [4,8,15,16,23,42] }

It is possible to map it to a variant type by checking what fields are set.
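Here is a sketch of the pattern, assuming ppx_deriving_yojson, the yojson library and OCaml 4.08 or later (for Result.bind). The names wire, message, Api_error and Api_data are hypothetical. The wire type mirrors the JSON, with both fields optional thanks to [@default None], and a plain OCaml function turns it into the variant used by the rest of the code.

```ocaml
(* Type that follows the JSON representation closely:
   each field may be absent. *)
type wire =
  { error : (string option [@default None])
  ; data : (int list option [@default None])
  }
[@@deriving yojson]

(* Type used by the rest of the codebase. *)
type message =
  | Api_error of string
  | Api_data of int list

(* Conversion between the two types: no JSON in sight. *)
let message_of_wire (w : wire) : (message, string) result =
  match w.error, w.data with
  | Some e, None -> Ok (Api_error e)
  | None, Some d -> Ok (Api_data d)
  | Some _, Some _ -> Error "both \"error\" and \"data\" are set"
  | None, None -> Error "neither \"error\" nor \"data\" is set"

(* Generated decoder followed by the hand-written conversion. *)
let message_of_yojson (json : Yojson.Safe.t) : (message, string) result =
  Result.bind (wire_of_yojson json) message_of_wire

let () =
  let json = Yojson.Safe.from_string {|{ "data": [4,8,15,16,23,42] }|} in
  match message_of_yojson json with
  | Ok (Api_data l) -> Printf.printf "got %d values\n" (List.length l)
  | Ok (Api_error e) -> Printf.printf "API error: %s\n" e
  | Error e -> Printf.printf "bad message: %s\n" e
```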

The nice part about this solution is that we never have to manipulate the JSON type directly, so we don’t have to find data in objects, look for duplicate keys, etc. This part is entirely separate and dealt with by the generated code itself.

Conclusion

OCaml is a powerful language, but it keeps very little type information at run-time. Fortunately, it is also capable of just enough meta-programming to generate functions like printers or conversion functions from type definitions. For programmers, this is an opportunity to focus on what matters: business logic. For us, this means detecting and rating uses of cryptography in applications. If you’re interested in these topics, let’s talk!
