Typesafe, Monadic Parsing with Mapped Types in Typescript

“person plotting in map” by rawpixel on Unsplash

The process of string parsing comes up pretty frequently in development, but rarely do I stumble upon a solution I like as much as the one which I found today. My terms of success were straightforward: I wanted type safe parsing with the ability to compose Parsers while keeping the library’s api as simple as possible.

What is a Parser?

The first step was to define what a Parser is, which turns out to be quite simple.

So a Parser is any object with a run function that takes an input string and returns a ParserResult. The ParserResult then holds the parsed result and the rest of the string. (This sequential processing of a file is where we the monadic bit comes in). Both the Parser and the ParserResult are polymorphic in their output because they return the A at some point.

Our first Parser

With the definition in place it’s time to create our first parser. Here’s an example of a fixed width parser for string values.

In my case the file was a fixed width file so I knew the primary input I’d need was the length of characters to parse. A CSV parser might look for the next occurrence of a comma rather than simply doing a substring on the length for instance.

Typesafety in our parser

Now that we have the ability to handle our fixed width fields it would be nice to leverage that ability to parse specific types as well. There are various ways of writing type specific parsers, but I always prefer reusing code I already have to duplicating existing code.

Remember how I said that Parser<A> was polymorphic in its output? Turns out that’s a common sign that your type is a Functor. It also turns out that we can pretty easily define a Functor instance for our Parser<A> which makes all of the other type specific Parser definitions very easy to create.

Look at that! We simply defined map for our Parser<A> and were able to use that to modify the output of our Parser from string to number or Date! The “hard work” was in the map function itself, but even that isn’t too bad. We have a Parser that can generate an A and we have a function from A -> B, so the only logical thing to do is run the parser to get an A and then run the mapper function on the A to get the B.

Parsing Records

Ok so up until now we’ve done some parsing of primitive values, but how are we going to combine parsers to create records? Once I defined the Functor instance and had seen the sequential nature of the parsing in the ParseResult<A> I was certain that what I wanted was a Monad instance, but it turns out that the calling api for that approach is… less than elegant. Just look at how nasty for a record with only 4 properties:

If we had some general ability to use do notation within Typescript this code would have been much cleaner. https://github.com/russellmcc/fantasydo is an interesting example of using generator functions to simulate do notation and is the route I was taking when I found Mapped Types.

Mapped Types to the rescue!

A Mapped Type is a way to take one type and transform it into another type. Common examples include making all properties of a type readonly or optional. In our parsing example our Mapped Type is a one liner:

export type ParserDef<A> = { [P in keyof A]: Parser<A[P]> };

That signature says that a ParserDef is a type that maps all properties of another type [P in keyof A]. It then states that the value of each property will be a Parser that returns the same type that the mapped property is Parser<A[P]>. That may look and sound awkward, but let’s see an example to make sense of it:

Our FileHeader type has a site property that is a number. Our ParserDef<FileHeader> has a site property that is a Parser<number>. This approach ensures all properties have been defined and that each property is paired with a Parser that produces the same type of value that the property is.

Of course this is a ParserDef<A> now and not a Parser<A>, so we need a function to convert a ParserDef<A> to a Parser<A>.

We iterate through each property in the ParserDef and match the computation of running the Parser with the associated property. When we’re through with all of the properties we simply return our ParserResult and we can now create a Parser<A> from any ParserDef<A>.

Summary

I’m really pleased with how clean the end result is. It allows us to take small bits of code and recompose them to solve larger problems which is almost always the sign of success. We’ve also managed to reduce any chance that a type or property mismatch happens between interface definition and parser definition. Thanks for reading!


Here’s the full code in a single gist (complete with the chain implementation).