Typesafe, Monadic Parsing with Mapped Types in Typescript
The process of string parsing comes up pretty frequently in development, but rarely do I stumble upon a solution I like as much as the one which I found today. My terms of success were straightforward: I wanted type safe parsing with the ability to compose Parsers while keeping the library’s api as simple as possible.
What is a Parser?
The first step was to define what a Parser is, which turns out to be quite simple.
So a Parser
is any object with a run
function that takes an input string
and returns a ParserResult
. The ParserResult
then holds the parsed result and the rest of the string. (This sequential processing of a file is where we the monadic bit comes in). Both the Parser
and the ParserResult
are polymorphic in their output because they return the A
at some point.
Our first Parser
With the definition in place it’s time to create our first parser. Here’s an example of a fixed width parser for string values.
In my case the file was a fixed width file so I knew the primary input I’d need was the length
of characters to parse. A CSV parser might look for the next occurrence of a comma rather than simply doing a substring
on the length
for instance.
Typesafety in our parser
Now that we have the ability to handle our fixed width fields it would be nice to leverage that ability to parse specific types as well. There are various ways of writing type specific parsers, but I always prefer reusing code I already have to duplicating existing code.
Remember how I said that Parser<A>
was polymorphic in its output? Turns out that’s a common sign that your type is a Functor
. It also turns out that we can pretty easily define a Functor
instance for our Parser<A>
which makes all of the other type specific Parser
definitions very easy to create.
Look at that! We simply defined map
for our Parser<A>
and were able to use that to modify the output of our Parser
from string
to number
or Date
! The “hard work” was in the map
function itself, but even that isn’t too bad. We have a Parser
that can generate an A
and we have a function from A -> B
, so the only logical thing to do is run the parser
to get an A
and then run the mapper
function on the A
to get the B
.
Parsing Records
Ok so up until now we’ve done some parsing of primitive values, but how are we going to combine parsers to create records? Once I defined the Functor
instance and had seen the sequential nature of the parsing in the ParseResult<A>
I was certain that what I wanted was a Monad
instance, but it turns out that the calling api for that approach is… less than elegant. Just look at how nasty for a record with only 4 properties:
If we had some general ability to use do notation within Typescript this code would have been much cleaner. https://github.com/russellmcc/fantasydo is an interesting example of using generator functions to simulate do notation and is the route I was taking when I found Mapped Types.
Mapped Types to the rescue!
A Mapped Type is a way to take one type and transform it into another type. Common examples include making all properties of a type readonly or optional. In our parsing example our Mapped Type is a one liner:
export type ParserDef<A> = { [P in keyof A]: Parser<A[P]> };
That signature says that a ParserDef
is a type that maps all properties of another type [P in keyof A]
. It then states that the value of each property will be a Parser
that returns the same type that the mapped property is Parser<A[P]>
. That may look and sound awkward, but let’s see an example to make sense of it:
Our FileHeader
type has a site
property that is a number
. Our ParserDef<FileHeader>
has a site
property that is a Parser<number>
. This approach ensures all properties have been defined and that each property is paired with a Parser that produces the same type of value that the property is.
Of course this is a ParserDef<A>
now and not a Parser<A>
, so we need a function to convert a ParserDef<A>
to a Parser<A>
.
We iterate through each property in the ParserDef
and match the computation of running the Parser
with the associated property. When we’re through with all of the properties we simply return our ParserResult
and we can now create a Parser<A>
from any ParserDef<A>
.
Summary
I’m really pleased with how clean the end result is. It allows us to take small bits of code and recompose them to solve larger problems which is almost always the sign of success. We’ve also managed to reduce any chance that a type or property mismatch happens between interface definition and parser definition. Thanks for reading!
Here’s the full code in a single gist (complete with the chain
implementation).