The Five Stages of JSON Decoding in Elm
Introduction
During my first six months of Elm, I focused my attention primarily on learning the language’s basics and, most importantly, learning to translate my object-oriented patterns for scalable, maintainable architecture to Elm’s paradigm. In order to focus on those things, I intentionally ignored certain Elm domains, such as server communication, ports, and JSON decoders.
In October 2017, I decided it was time to take on a more ambitious project. I was going to build out an Elm Test visualization in Atom, which I dubbed Elm Test Runner. For communicating between Atom and my Elm application, I needed ports, and in order to work with Elm Test output, I needed JSON decoders.
JSON decoders are a notorious pain point for newcomers to Elm. As I built Elm Test Runner, I noticed myself going through various stages of understanding. In this article, I’ll explore each of those stages, in the hopes that following my path will help others reach a better understanding as well.
Stage 1: Everything’s a Record!
The most natural-seeming thing to do is to decode all of the JSON blob’s fields “as is,” paying no regard to how your application will use the data. Below is an abbreviated version of my first pass at decoding the “run start” event from Elm Test:
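A minimal sketch of this approach (not the original listing), assuming Elm 0.19. The module name, the port name, and the toIntOrZero helper are hypothetical; only the three field names and the names RawData, Parsed, and parse come from the text:

```elm
port module RunStart exposing (Parsed, RawData, parse, runStart)

-- Hypothetical sketch for Elm 0.19; module, port, and helper names are assumptions.


-- Mirrors the JSON blob field for field, so the port can convert it
-- without any Json.Decode code.
type alias RawData =
    { testCount : String
    , fuzzRuns : String
    , initialSeed : String
    }


-- The shape the rest of the application wants to work with.
type alias Parsed =
    { testCount : Int
    , fuzzRuns : Int
    , initialSeed : Int
    }


-- The port maps the incoming JavaScript object straight onto RawData.
port runStart : (RawData -> msg) -> Sub msg


-- Second pass: convert each numeric string to an Int.
parse : RawData -> Parsed
parse raw =
    { testCount = toIntOrZero raw.testCount
    , fuzzRuns = toIntOrZero raw.fuzzRuns
    , initialSeed = toIntOrZero raw.initialSeed
    }


-- Falls back to 0 when the string is not a valid integer (an assumption
-- made here to keep the sketch short).
toIntOrZero : String -> Int
toIntOrZero =
    String.toInt >> Maybe.withDefault 0
```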
This strategy can only loosely be called “decoding,” since Json.Decode is not involved. Instead, I relied on port functions to map the JSON data directly to a concrete record.
The fields testCount, fuzzRuns, and initialSeed are strings of integers, which enter the system as strings in RawData. The parse function (which is not really parsing) is transforming the RawData into a RunStart Parsed record, which is the shape of the data as my Elm application needed to use it.
This strategy minimizes friction initially, but trying to create a large system with only this strategy in your toolbox would create a number of problems. The biggest problem is the inherent inefficiency of decoding into one shape of data and then performing various transformations to get that data into another form. Leveling up your JSON decoding ability is all about learning strategies for decoding a blob of JSON data of one shape into an Elm type of another.
Stage 2: Dealing with Variant/Missing Fields
The first forcing function that pushed me out of manual pseudo-decoding was the discovery of variable fields. It turned out that, deep within Elm Test’s TestComplete event (which happens not once per run, but once for every test in the suite!), the fields actual, expected, first, and second could be missing. It seemed that if you had an actual and expected, you wouldn’t have a first and second, and vice versa.
The flat record syntax could not accommodate this situation. In order to represent a field that only might exist, I had to slap a Maybe type on the affected fields, and that Maybe type prevented me from using that record on my port. That meant decoders.
I decided to encode my JSON data into a string, send that string over the port, and then decode on the other side. The end result looked like this:
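A rough sketch of what the Elm side of this might look like (not the original listing), assuming Elm 0.19 and assuming, for brevity, that the four fields sit at the top level of the JSON rather than deep inside the event. The optional helper and the fromString function are hypothetical names:

```elm
module TestComplete exposing (RawData, decoder, fromString)

import Json.Decode as Decode exposing (Decoder)


-- Every field that might be absent becomes a Maybe.
type alias RawData =
    { actual : Maybe String
    , expected : Maybe String
    , first : Maybe String
    , second : Maybe String
    }


decoder : Decoder RawData
decoder =
    Decode.map4 RawData
        (optional "actual")
        (optional "expected")
        (optional "first")
        (optional "second")


-- Yields Nothing instead of failing when the field is missing.
optional : String -> Decoder (Maybe String)
optional name =
    Decode.maybe (Decode.field name Decode.string)


-- Decodes the JSON string that arrived over the port.
fromString : String -> Result Decode.Error RawData
fromString =
    Decode.decodeString decoder
```

The caller still has to inspect which of the four Maybe fields are populated before it can build the final value.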
This strategy is sub-optimal. It means I was taking a JSON blob on the JS side, encoding it, passing it through a port, and decoding it. And I still had that extra translation layer on the Elm side: my code had to check which two of the four fields existed and build a Parsed of the appropriate shape out of the RawData.
Stage 3: Trying Multiple Variants with oneOf
The variations continued to pile up. Elm Test has a robust API for expectations, filled with all different kinds of checks you can perform against your system’s data. It turned out that many of these checks varied in terms of the shape of the data their respective TestComplete events would return.
This was where I discovered and learned Json.Decode.oneOf. Maybe only allows for saying, “this field is either here or it’s not.” The oneOf function allows you to say, “first try configuration A, then B, then C, and only fail the decode if none of them match.” In other words, oneOf is an essential structure for allowing you to decode into a union type.
I first used it when I discovered that, instead of having actual/expected or first/second, a comparison could also simply be a string. My first implementation of oneOf looked like this:
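As an illustration only (not the original listing), a oneOf decoder covering those three shapes could be sketched like this, assuming Elm 0.19. The Comparison type and its constructor names are hypothetical:

```elm
module Comparison exposing (Comparison(..), decoder)

import Json.Decode as Decode exposing (Decoder)


-- One constructor per shape the JSON can take.
type Comparison
    = ActualExpected String String
    | FirstSecond String String
    | Simple String


-- Tries each shape in order and fails only if none of them match.
decoder : Decoder Comparison
decoder =
    Decode.oneOf
        [ Decode.map2 ActualExpected
            (Decode.field "actual" Decode.string)
            (Decode.field "expected" Decode.string)
        , Decode.map2 FirstSecond
            (Decode.field "first" Decode.string)
            (Decode.field "second" Decode.string)
        , Decode.map Simple Decode.string
        ]
```

With this sketch, Decode.decodeString decoder "\"plain message\"" produces Ok (Simple "plain message"), while an object with first and second fields produces a FirstSecond value.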
But, as I went to implement failure output for more and more Elm Test expectations, the variations just kept piling up, and I had to get really sophisticated with my oneOf variants:
The really nice thing that happened at this stage was that all the Maybe types went away! Instead of saying, “try looking for this field and plopping it in if it exists,” I was defining the many possible configurations the data could be in, and I now had a concrete union type to work with for each possible case, which drastically reduced the complexity of the helper functions that marshaled the data out to the rest of the system.
There was still the case, though, of that annoying JSON string encoding I was doing in the JS layer, and on every single event! If there were two hundred tests in a suite, that meant two hundred unnecessary encodes. What to do?
Stage 4: Introducing the Value Type
The answer was Json.Encode.Value. This is a type wrapper for a JSON value (any JSON value), and it works on a port! I started passing my TestComplete events through the port as Values and running decodeValue on the other side instead of decodeString:
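A minimal sketch of that wiring (not the original listing), assuming Elm 0.19. The port name, the Msg type, and the "status" field read by the placeholder decoder are assumptions. Json.Decode.Value is an alias for Json.Encode.Value, so either name works in the port signature:

```elm
port module TestEvents exposing (Msg(..), subscriptions)

import Json.Decode as Decode exposing (Decoder, Value)


-- Hypothetical port: JavaScript sends the event object itself,
-- with no string encoding step.
port testCompleted : (Value -> msg) -> Sub msg


type Msg
    = TestCompleted (Result Decode.Error String)


-- Placeholder decoder that assumes the event carries a "status" string;
-- a real decoder would read the whole event.
statusDecoder : Decoder String
statusDecoder =
    Decode.field "status" Decode.string


-- decodeValue runs the decoder directly on the Value from the port.
subscriptions : Sub Msg
subscriptions =
    testCompleted (Decode.decodeValue statusDecoder >> TestCompleted)
```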
This made the JS code and the Elm code both a lot nicer.
Stage 5: JSON Value/Elm Type Independence
Armed with all this knowledge, I was able to make the final leap into full JSON value/Elm type independence.
I started with those annoying integers that were strings. Instead of parsing out the integers after the fact, I made a decoder that was capable of producing a decoded integer directly from a numeric string in a JSON field. RawData disappeared as a record type alias, and I could now decode the JSON of the various events directly into the Elm records that were already optimized for my system to work with:
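One way such a decoder might be written (a sketch, not the original listing), assuming Elm 0.19, where String.toInt returns a Maybe. The name intFromString and the RunStart record layout are assumptions:

```elm
module RunStartDecoder exposing (RunStart, decoder, intFromString)

import Json.Decode as Decode exposing (Decoder)


-- The record the application actually wants; no RawData step in between.
type alias RunStart =
    { testCount : Int
    , fuzzRuns : Int
    , initialSeed : Int
    }


-- Decodes a JSON string such as "42" straight into an Int, and fails
-- the decode when the string is not numeric.
intFromString : Decoder Int
intFromString =
    Decode.string
        |> Decode.andThen
            (\raw ->
                case String.toInt raw of
                    Just number ->
                        Decode.succeed number

                    Nothing ->
                        Decode.fail ("Expected a numeric string, got: " ++ raw)
            )


decoder : Decoder RunStart
decoder =
    Decode.map3 RunStart
        (Decode.field "testCount" intFromString)
        (Decode.field "fuzzRuns" intFromString)
        (Decode.field "initialSeed" intFromString)
```

With this sketch, Decode.decodeString intFromString "\"42\"" produces Ok 42, and a non-numeric string produces an Err instead of a silent default.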
My RunStart and RunComplete events, whose ports had looked weird with their RawData signatures, sitting alongside TestComplete with its Value signature, could now also be run with Values.
In a yet unpublished project I’m calling “Elm Lens,” I expanded this same strategy in order to perform even more complex operations.
Here’s how a Set can be both encoded to and decoded from a Value:
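A sketch of one possible approach (not the original listing), assuming Elm 0.19 and a Set of strings represented as a JSON array. The function names are hypothetical:

```elm
module SetJson exposing (decoder, encode)

import Json.Decode as Decode exposing (Decoder)
import Json.Encode as Encode
import Set exposing (Set)


-- A Set has no JSON equivalent, so it travels as an array.
encode : Set String -> Encode.Value
encode set =
    Encode.list Encode.string (Set.toList set)


-- Reads the array back and rebuilds the Set, dropping any duplicates.
decoder : Decoder (Set String)
decoder =
    Decode.map Set.fromList (Decode.list Decode.string)
```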
Here’s how a custom union type can be encoded to a string for JSON, and then back to a custom type for Elm:
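A sketch of the idea (not the original listing), assuming Elm 0.19. The Status type and its string spellings are hypothetical stand-ins for whatever custom type the application uses:

```elm
module StatusJson exposing (Status(..), decoder, encode)

import Json.Decode as Decode exposing (Decoder)
import Json.Encode as Encode


-- Hypothetical custom type used for illustration.
type Status
    = Passed
    | Failed
    | Todo


-- Each constructor maps to a fixed string on the JSON side.
encode : Status -> Encode.Value
encode status =
    case status of
        Passed ->
            Encode.string "passed"

        Failed ->
            Encode.string "failed"

        Todo ->
            Encode.string "todo"


-- Maps the string back to a constructor and fails on anything unknown.
decoder : Decoder Status
decoder =
    Decode.string
        |> Decode.andThen
            (\raw ->
                case raw of
                    "passed" ->
                        Decode.succeed Passed

                    "failed" ->
                        Decode.succeed Failed

                    "todo" ->
                        Decode.succeed Todo

                    _ ->
                        Decode.fail ("Unknown status: " ++ raw)
            )
```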
And here’s a more complex operation. I’ve got a Dict (List String) (List CustomType) that needs to pass between JavaScript and Elm. In order to get the dictionary keys to be valid JSON, I need to “hash” them into strings. In this case, I’m just joining up the list of strings into a single string with vertical bars between each list item:
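A sketch of how this could work (not the original listing), assuming Elm 0.19. The functions take the encoder and decoder for the custom type as arguments, so the sketch stays independent of any particular type. It also assumes that no key is an empty list and that no key segment contains a vertical bar, since either would not survive the round trip:

```elm
module DictJson exposing (decoder, encode)

import Dict exposing (Dict)
import Json.Decode as Decode exposing (Decoder)
import Json.Encode as Encode


-- Joins each list key into a single "a|b|c" string so it can serve
-- as a JSON object key.
encode : (v -> Encode.Value) -> Dict (List String) (List v) -> Encode.Value
encode encodeItem dict =
    dict
        |> Dict.toList
        |> List.map
            (\( key, items ) ->
                ( String.join "|" key, Encode.list encodeItem items )
            )
        |> Encode.object


-- Splits each object key on "|" to recover the original list key.
decoder : Decoder v -> Decoder (Dict (List String) (List v))
decoder itemDecoder =
    Decode.keyValuePairs (Decode.list itemDecoder)
        |> Decode.map
            (List.map (Tuple.mapFirst (String.split "|")) >> Dict.fromList)
```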
JSON is now truly a transport mechanism, and my Elm code is serving its own needs rather than the needs of external systems.
Conclusion
If you find yourself struggling with JSON decoders, try to start with the simplest thing that could possibly work. Decode into a record type that matches the JSON structure exactly as it is. Once you’re good at that, move on to processing JSON data with variations in its structure, which will help you learn to adjust for discrepancies between JSON data fields and Elm record types. And once you’re fully comfortable with that, work on decoding the JSON data directly into the type of data that is convenient for your Elm application to work with in a single go. This will ultimately reduce the complexity of the systems that marshal the data out to the rest of your Elm application.
Further Exploration
- Elm Demystify Decoders is a series of JSON decoder programming exercises by Ilias Van Peer.