The Five Stages of JSON Decoding in Elm

Introduction

During my first six months of Elm, I focused my attention primarily on learning the language’s basics and, most importantly, learning to translate my object-oriented patterns for scalable, maintainable architecture to Elm’s paradigm. In order to focus on those things, I intentionally ignored certain Elm domains, such as server communication, ports, and JSON decoders.

In October 2017, I decided it was time to take on a more ambitious project. I was going to build out an Elm Test visualization in Atom, which I dubbed Elm Test Runner. For communicating between Atom and my Elm application, I needed ports, and in order to work with Elm Test output, I needed JSON decoders.

JSON decoders are a notorious pain point for newcomers to Elm. As I built Elm Test Runner, I noticed myself going through various stages of understanding. In this article, I’ll explore each of those stages, in the hopes that following my path will help others reach a better understanding as well.

Stage 1: Everything’s a Record!

The most natural-seeming thing to do is to decode all of the JSON blob’s fields “as is,” paying no regard to how your application will use the data. Below is an abbreviated version of my first pass at decoding the “run start” event from Elm Test:

This strategy can only loosely be called “decoding,” since Json.Decode is not involved. Instead, I relied on port functions to map the JSON data directly to a concrete record.

The fields testCount, fuzzRuns, and initialSeed are strings of integers, which enter the system as strings in RawData. The parse function (which is not really parsing) transforms the RawData into a RunStart Parsed record, which is the shape of the data my Elm application actually needed.
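A minimal sketch of this stage (not the original listing), assuming Elm 0.19. The field names come from the text above; the module name, the port name runStart, the flat Parsed alias, and the fall-back-to-zero behaviour are hypothetical choices made for illustration.

```elm
port module RunStartPort exposing (Parsed, RawData, parse, runStart)

-- Assumes Elm 0.19. Only the three field names are taken from the article.


-- The JSON object exactly as it arrives: the numbers are still strings.
type alias RawData =
    { testCount : String
    , fuzzRuns : String
    , initialSeed : String
    }


-- The shape the application wants to work with (hypothetical name).
type alias Parsed =
    { testCount : Int
    , fuzzRuns : Int
    , initialSeed : Int
    }


-- A port typed with a record: the runtime converts the JS object for us,
-- so Json.Decode never appears.
port runStart : (RawData -> msg) -> Sub msg


-- Second pass over data that has already been converted once.
parse : RawData -> Parsed
parse raw =
    { testCount = toIntOrZero raw.testCount
    , fuzzRuns = toIntOrZero raw.fuzzRuns
    , initialSeed = toIntOrZero raw.initialSeed
    }


-- Falling back to 0 is an arbitrary choice for this sketch.
toIntOrZero : String -> Int
toIntOrZero text =
    String.toInt text |> Maybe.withDefault 0
```

The two-step shape is the point: the port produces one record, and parse immediately rebuilds it as another.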

This strategy minimizes friction initially, but trying to create a large system with only this strategy in your toolbox would create a number of problems. The biggest problem is the inherent inefficiency of decoding into one shape of data and then performing various transformations to get that data into another form. Leveling up your JSON decoding ability is all about learning strategies for decoding a blob of JSON data of one shape into an Elm type of another.

Stage 2: Dealing with Variant/Missing Fields

The first forcing function that pushed me out of manual pseudo-decoding was the discovery of variable fields. It turned out that, deep within Elm Test’s TestComplete event (which happens not once per run, but once for every test in the suite!), the fields actual, expected, first, and second could be missing. It seemed that if you had an actual and expected, you wouldn’t have a first and second, and vice versa.

The flat record syntax could not accommodate this situation. In order to represent a field that only might exist, I had to slap a Maybe type on the affected fields, and that Maybe type prevented me from using that record on my port. That meant decoders.

I decided to encode my JSON data into a string, send that string over the port, and then decode on the other side. The end result looked like this:

This strategy is sub-optimal. It means I was taking a JSON blob on the JS side, encoding it to a string, passing it through a port, and decoding it again on the Elm side. And I still had that extra translation layer in Elm: my code had to check which two of the four fields existed and build a Parsed of the appropriate shape out of the RawData.
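A minimal sketch of this stage (not the original listing), assuming Elm 0.19 and assuming that all four fields hold strings when present. The names RawComparison, rawDecoder, and decodeRaw are hypothetical; the string handed to decodeRaw is assumed to be what the JS side produced by serializing the event before sending it through the port.

```elm
module ComparisonRaw exposing (RawComparison, decodeRaw, rawDecoder)

import Json.Decode as Decode exposing (Decoder)


-- Every field that might be absent becomes a Maybe.
type alias RawComparison =
    { actual : Maybe String
    , expected : Maybe String
    , first : Maybe String
    , second : Maybe String
    }


-- Decode.maybe turns a failed field lookup into Nothing
-- instead of failing the whole decode.
rawDecoder : Decoder RawComparison
rawDecoder =
    Decode.map4 RawComparison
        (Decode.maybe (Decode.field "actual" Decode.string))
        (Decode.maybe (Decode.field "expected" Decode.string))
        (Decode.maybe (Decode.field "first" Decode.string))
        (Decode.maybe (Decode.field "second" Decode.string))


-- The port delivers a JSON string, so decodeString is the entry point.
decodeRaw : String -> Result Decode.Error RawComparison
decodeRaw json =
    Decode.decodeString rawDecoder json
```

Note that nothing in RawComparison rules out nonsensical combinations, such as all four fields being Nothing; the code that consumes it still has to work out which pair is present.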

Stage 3: Trying Multiple Variants with oneOf

The variations continued to pile up. Elm Test has a robust API for expectations, filled with all different kinds of checks you can perform against your system’s data. It turned out that many of these checks varied in terms of the shape of the data their respective TestComplete events would return.

This was where I discovered and learned Json.Decode.oneOf. Maybe only allows for saying, “this field is either here or it’s not.” The oneOf function allows you to say, “first try configuration A, then B, then C, and only fail the decode if none of them match.” In other words, oneOf is an essential structure for allowing you to decode into a union type.

My first time using this was at the discovery that instead of having actual / expected or first / second, a comparison could also simply be a string. My first implementation of oneOf looked like this:
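A minimal sketch of that idea (not the original listing), assuming Elm 0.19 and string-valued fields. The type name Comparison and its constructor names are hypothetical.

```elm
module Comparison exposing (Comparison(..), comparisonDecoder)

import Json.Decode as Decode exposing (Decoder)


-- One constructor per shape the JSON can take (names are illustrative).
type Comparison
    = ActualExpected String String
    | FirstSecond String String
    | Plain String


-- oneOf tries each decoder in order and fails only if all of them fail.
comparisonDecoder : Decoder Comparison
comparisonDecoder =
    Decode.oneOf
        [ Decode.map2 ActualExpected
            (Decode.field "actual" Decode.string)
            (Decode.field "expected" Decode.string)
        , Decode.map2 FirstSecond
            (Decode.field "first" Decode.string)
            (Decode.field "second" Decode.string)
        , Decode.map Plain Decode.string
        ]
```

Because each branch demands both of its fields, a half-populated object fails to decode rather than producing a record full of Nothing values.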

But, as I went to implement failure output for more and more Elm Test expectations, the variations just kept piling up, and I had to get really sophisticated with my oneOf variants:

The really nice thing that happened at this stage was that all the Maybe types went away! Instead of saying, “try looking for this field and plopping it in if it exists,” I was defining the many possible configurations the data could be in. I now had a concrete union type to work with for each possible case, which drastically reduced the complexity of the helper functions that marshaled the data out to the rest of the system.

There was still the case, though, of that annoying JSON string encoding I was doing in the JS layer, and on every single event! If there were two hundred tests in a suite, that meant two hundred unnecessary encodes. What to do?

Stage 4: Introducing the Value Type

The answer was Json.Encode.Value. This is a type wrapper for a JSON value (any JSON value), and it works on a port! I started passing my TestComplete events through the port as Values and running decodeValue on the other side instead of decodeString:
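A minimal sketch of a Value-typed port (not the original listing), assuming Elm 0.19. The port name testComplete, the Msg type, and the "status" field read by statusDecoder are hypothetical stand-ins for the real event decoder.

```elm
port module TestCompletePort exposing (Msg(..), statusDecoder, subscriptions)

import Json.Decode as Decode exposing (Decoder)
import Json.Encode as Encode


type Msg
    = TestCompleted (Result Decode.Error String)


-- The port accepts any JSON value, so the JS side can send the
-- event object as-is, with no string encoding step.
port testComplete : (Encode.Value -> msg) -> Sub msg


-- Hypothetical decoder: reads a single "status" field.
statusDecoder : Decoder String
statusDecoder =
    Decode.field "status" Decode.string


-- decodeValue replaces decodeString; the decoder itself is unchanged.
subscriptions : Sub Msg
subscriptions =
    testComplete (Decode.decodeValue statusDecoder >> TestCompleted)
```

The same Decoder works with both decodeString and decodeValue, so switching the port type does not require rewriting any decoders.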

This made the JS code and the Elm code both a lot nicer.

Stage 5: JSON Value/Elm Type Independence

Armed with all this knowledge, I was able to make the final leap into full JSON value/Elm type independence.

I started with those annoying integers that were strings. Instead of parsing out the integers after the fact, I made a decoder that was capable of producing a decoded integer directly from a numeric string in a JSON field. RawData disappeared as a record type alias, and I could now decode the JSON of the various events directly into the Elm records that were already optimized for my system to work with:
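A minimal sketch of such a decoder (not the original listing), assuming Elm 0.19, where String.toInt returns a Maybe. The names intFromString, RunStart, and runStartDecoder are hypothetical; the field names come from the run start event described earlier.

```elm
module IntString exposing (RunStart, intFromString, runStartDecoder)

import Json.Decode as Decode exposing (Decoder)


type alias RunStart =
    { testCount : Int
    , fuzzRuns : Int
    , initialSeed : Int
    }


-- Decode a JSON string such as "42" straight into an Int,
-- failing the decode if the text is not an integer.
intFromString : Decoder Int
intFromString =
    Decode.string
        |> Decode.andThen
            (\text ->
                case String.toInt text of
                    Just number ->
                        Decode.succeed number

                    Nothing ->
                        Decode.fail ("Expected an integer string, got: " ++ text)
            )


-- No intermediate RawData record: the JSON goes directly
-- into the record the application uses.
runStartDecoder : Decoder RunStart
runStartDecoder =
    Decode.map3 RunStart
        (Decode.field "testCount" intFromString)
        (Decode.field "fuzzRuns" intFromString)
        (Decode.field "initialSeed" intFromString)
```

A malformed number now surfaces as a decode error at the boundary instead of being silently defaulted later.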

My RunStart and RunComplete events, whose ports had looked weird with their RawData signatures, sitting alongside TestComplete with its Value signature, could now also be run with Values.

In an as-yet-unpublished project I’m calling “Elm Lens,” I expanded this same strategy in order to perform even more complex operations.

Here’s how a Set can be both encoded to and decoded from a Value:
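A minimal sketch (not the original listing), assuming Elm 0.19 with elm/json and a set of strings. The function names are hypothetical. JSON has no set type, so the set travels as an array.

```elm
module SetJson exposing (decodeSet, encodeSet)

import Json.Decode as Decode exposing (Decoder)
import Json.Encode as Encode
import Set exposing (Set)


-- Encode the set as a JSON array of strings.
encodeSet : Set String -> Encode.Value
encodeSet set =
    Encode.set Encode.string set


-- Decode a JSON array of strings and collapse it into a set.
-- Duplicate entries in the array are silently merged.
decodeSet : Decoder (Set String)
decodeSet =
    Decode.map Set.fromList (Decode.list Decode.string)
```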

Here’s how a custom union type can be encoded to a string for JSON, and then back to a custom type for Elm:
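A minimal sketch (not the original listing), assuming Elm 0.19. The Status type and its string labels are hypothetical stand-ins for whatever custom type the project actually uses.

```elm
module StatusJson exposing (Status(..), decodeStatus, encodeStatus)

import Json.Decode as Decode exposing (Decoder)
import Json.Encode as Encode


-- Hypothetical custom type used only for illustration.
type Status
    = Passed
    | Failed
    | Skipped


-- Each constructor maps to a fixed string label.
encodeStatus : Status -> Encode.Value
encodeStatus status =
    case status of
        Passed ->
            Encode.string "passed"

        Failed ->
            Encode.string "failed"

        Skipped ->
            Encode.string "skipped"


-- The reverse mapping; an unknown label fails the decode.
decodeStatus : Decoder Status
decodeStatus =
    Decode.string
        |> Decode.andThen
            (\label ->
                case label of
                    "passed" ->
                        Decode.succeed Passed

                    "failed" ->
                        Decode.succeed Failed

                    "skipped" ->
                        Decode.succeed Skipped

                    _ ->
                        Decode.fail ("Unknown status: " ++ label)
            )
```

The compiler checks that encodeStatus covers every constructor, but the decoder's string labels must be kept in sync by hand.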

And here’s a more complex operation. I’ve got a Dict (List String) (List CustomType) that needs to pass between JavaScript and Elm. In order to get the dictionary keys to be valid JSON, I need to “hash” them into strings. In this case, I’m just joining up the list of strings into a single string with vertical bars between each list item:
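A minimal sketch of that round trip (not the original listing), assuming Elm 0.19 with elm/json. The Color type stands in for CustomType, and all function names are hypothetical.

```elm
module DictJson exposing (Color(..), decodeDict, encodeDict)

import Dict exposing (Dict)
import Json.Decode as Decode exposing (Decoder)
import Json.Encode as Encode


-- Hypothetical stand-in for CustomType.
type Color
    = Red
    | Blue


encodeColor : Color -> Encode.Value
encodeColor color =
    case color of
        Red ->
            Encode.string "red"

        Blue ->
            Encode.string "blue"


decodeColor : Decoder Color
decodeColor =
    Decode.string
        |> Decode.andThen
            (\label ->
                case label of
                    "red" ->
                        Decode.succeed Red

                    "blue" ->
                        Decode.succeed Blue

                    _ ->
                        Decode.fail ("Unknown color: " ++ label)
            )


-- JSON object keys must be strings, so each List String key
-- is joined with vertical bars.
encodeDict : Dict (List String) (List Color) -> Encode.Value
encodeDict dict =
    Encode.dict (String.join "|") (Encode.list encodeColor) dict


-- Decode into a Dict String first, then split each key back into a list.
decodeDict : Decoder (Dict (List String) (List Color))
decodeDict =
    Decode.dict (Decode.list decodeColor)
        |> Decode.map
            (Dict.toList
                >> List.map (Tuple.mapFirst (String.split "|"))
                >> Dict.fromList
            )
```

This only round-trips cleanly when no key segment contains a vertical bar and no key is the empty list (an empty list joins to "" and splits back to a list holding one empty string), so the separator should be chosen with the actual key data in mind.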

JSON is now truly a transport mechanism, and my Elm code is serving its own needs rather than the needs of external systems.

Conclusion

If you find yourself struggling with JSON decoders, try to start with the simplest thing that could possibly work. Decode into a record type that matches the JSON structure exactly as it is. Once you’re good at that, move on to processing JSON data with variations in its structure, which will help you learn to adjust for discrepancies between JSON data fields and Elm record types. And once you’re fully comfortable with that, work on decoding the JSON data, in a single pass, directly into the types that are convenient for your Elm application to work with. This will ultimately reduce the complexity of the systems that marshal the data out to the rest of your Elm application.

Further Exploration