Static code analysis in Javascript: verifying correctness of pattern matching

Table of Contents

  • Sum Types in Functional Languages
  • Tagmeme: An Implementation of Sum Types in Javascript
  • Problems with using Tagmeme
  • Tagmeme-analyzer: static type-checker and analyzer for tagmeme
  • How the analyzer works
  • A word about the awesome Javascript ecosystem

Sum Types in Functional Languages

Many functional programming languages, especially statically typed ones like F#, Elm and Haskell, have the concept of unions, also known as sum types: type definitions used to model values that can be one of a disjoint set of things. For example, you can model a Shape to be either a circle, a triangle or a rectangle, or you can define a Color that can only be red, green or blue. In a sense, they are like enumerations (enum) in C-style languages, but there is a big difference: they can be parameterized with data. A circle has a radius while a rectangle has both height and width. The exact type of data can be encoded in the definition. This is what it looks like in F#:

The different shapes that a shape can be in this example (the circle, the rectangle and the triangle) are called the “union cases” of the Shape type. Each union case is considered a constructor of Shape:

Once you construct a value of a union, you can pattern match against it. For example, to determine whether a value is a circle, you match against all the cases the value can be:

One of the best features of modern compilers for functional languages is that they check pattern matching for exhaustiveness: whether your code actually takes all possible cases into account. If it doesn’t, the compiler gives you a warning, which is pretty amazing.

Tagmeme: An Implementation of Sum Types in Javascript

Of course, I don’t have the luxury of using either F# or Elm at my workplace, as we primarily use Javascript for front-end development. Don’t get me wrong, I think Javascript is OK and I kind of like it too, but I still miss the ability to model a domain using an expressive type system, or at least with sum types.

Luckily though, there are Javascript libraries that remedy the situation. One I recently discovered is a hidden gem called tagmeme: a library for simple tagged unions by Chris Andrejewski. I really like this library because of its simplicity. Let us see how the example shown earlier is written in Javascript with tagmeme:

As you can see, we define the union “type” of Shape as a list of union cases. Every case then becomes a constructor for Shape, and after that we can pattern match against values of Shape.
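To make the idea concrete without depending on the library, here is a minimal sketch of the pattern. The `union` helper below is a hypothetical stand-in written for illustration, not tagmeme’s actual source, and the triangle’s `base` and `height` fields are an assumption (the text above only specifies the data for circles and rectangles).

```javascript
// Hypothetical stand-in for a tagged-union helper; NOT tagmeme's source code.
function union(caseNames) {
  const type = {};
  for (const name of caseNames) {
    // Each case name becomes a constructor that tags the data it wraps.
    type[name] = (data) => ({ tag: name, data });
  }
  // match dispatches on the tag and passes the wrapped data to the handler.
  type.match = (value, handlers) => {
    const handler = handlers[value.tag];
    if (typeof handler !== 'function') {
      throw new Error(`Unhandled case: ${value.tag}`);
    }
    return handler(value.data);
  };
  return type;
}

// The union "type" is just a list of case names.
const Shape = union(['Circle', 'Rectangle', 'Triangle']);

// Pattern matching: one handler per union case.
const area = (shape) =>
  Shape.match(shape, {
    Circle: ({ radius }) => Math.PI * radius * radius,
    Rectangle: ({ width, height }) => width * height,
    Triangle: ({ base, height }) => (base * height) / 2, // assumed fields
  });

const rectangleArea = area(Shape.Rectangle({ width: 3, height: 4 })); // 12
```

Note that nothing stops a caller from leaving out a handler: the mistake only surfaces as an exception when a value with that tag is actually matched.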

To me, this looks just beautiful and it is very similar to what one would write in a language that implements union cases like F# or Elm.

Problems with using Tagmeme

Because this is Javascript we are talking about, using the library can be very error prone: we are bound to make one of the following mistakes, each of which causes tagmeme to throw an exception at runtime:

  • When forgetting to handle a case (or misspelling the case name)
  • When handling a case that wasn’t declared (handling too many cases)
  • When handling all cases and still providing a redundant wildcard argument that will never match (see docs)
  • When using the word “match” as a union case
  • When duplicating union case declarations
  • When misspelling a union case name in value constructors

Even if we had a linter that checks whether the grammar of the written Javascript code is correct, we could still hit runtime errors if we don’t write correct, exhaustive pattern matching against the sum types. Static analysis to the rescue!

Tagmeme-analyzer: static type-checker and analyzer for tagmeme

Because these are known places where things could go wrong at “compile” time, I asked myself whether it was possible to write a program that checks if we made a silly mistake before actually running the application. This was my little experiment for the last week, as I delved into the unknown lands of Babel parsers to detect where mistakes could occur and to suggest meaningful alternatives when they do: introducing tagmeme-analyzer!

Implemented as a CLI tool, this analyzer runs against a Javascript file, parses it into an AST and detects where the library is used incorrectly! Let us see this in action by writing some code that would fail at runtime. First we will define the union “types” in a separate file:

Option and Result are declared correctly, but Numbers and Duplicates are not: the former has “match” as a union case and the latter, well, has duplicate union cases. Here is the consuming code:

In the comments you see what is incorrect. Now we need to run tagmeme-analyzer against app.js, so let us install it first (for now as a global tool):

npm install -g tagmeme-analyzer

Then run the analyzer with the file as an argument and you will get:

As you can see, the analyzer detects all the problems we talked about earlier, and if you run it from inside Visual Studio Code, you can navigate to the line where the problem occurs.

How the analyzer works

Roughly speaking, the analyzer goes through the following steps:

  • Parse the code and get an abstract syntax tree (AST) representation of the code
  • Find union type declarations from current and imported files
  • Traverse the tree and find places where the match function is used with one of the declared union types
  • Start matching declared union cases with cases used in the match function
  • Create a list of errors and fuzzy-search for possible alternatives from the declaration
  • Log the errors to the console
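The comparison and suggestion steps above can be sketched roughly as follows. This is an illustration under assumptions, not the analyzer’s actual code: the function names (`levenshtein`, `closestCase`, `checkMatch`) and the error messages are made up for this example, and edit distance is just one simple way to do the fuzzy search.

```javascript
// Classic dynamic-programming edit distance between two strings.
function levenshtein(a, b) {
  const rows = a.length + 1;
  const cols = b.length + 1;
  const dist = Array.from({ length: rows }, (_, i) => {
    const row = new Array(cols).fill(0);
    row[0] = i; // deleting i characters from a
    return row;
  });
  for (let j = 0; j < cols; j++) dist[0][j] = j; // inserting j characters
  for (let i = 1; i < rows; i++) {
    for (let j = 1; j < cols; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      dist[i][j] = Math.min(
        dist[i - 1][j] + 1, // deletion
        dist[i][j - 1] + 1, // insertion
        dist[i - 1][j - 1] + cost // substitution
      );
    }
  }
  return dist[a.length][b.length];
}

// Pick the declared case that is closest to a misspelled name.
function closestCase(name, declaredCases) {
  let best = null;
  let bestDistance = Infinity;
  for (const candidate of declaredCases) {
    const distance = levenshtein(name, candidate);
    if (distance < bestDistance) {
      best = candidate;
      bestDistance = distance;
    }
  }
  return best;
}

// Compare declared union cases with the cases handled in one match call.
function checkMatch(declaredCases, handledCases, hasCatchAll) {
  const errors = [];
  for (const name of handledCases) {
    if (!declaredCases.includes(name)) {
      const suggestion = closestCase(name, declaredCases);
      const hint = suggestion ? `, did you mean '${suggestion}'?` : '';
      errors.push(`Case '${name}' is not declared${hint}`);
    }
  }
  const missing = declaredCases.filter((name) => !handledCases.includes(name));
  if (missing.length > 0 && !hasCatchAll) {
    errors.push(`Unhandled cases: ${missing.join(', ')}`);
  }
  if (missing.length === 0 && hasCatchAll) {
    errors.push('All cases are handled, so the catch-all argument is redundant');
  }
  return errors;
}

// 'Non' is a misspelling of 'None', so two problems are reported.
const errors = checkMatch(['Some', 'None'], ['Some', 'Non'], false);
```

In this sketch a single typo shows up twice, once as an undeclared case and once as an unhandled one. A real tool might merge the two into a single message.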

Two components of Babel are used: the parser that generates the AST, and the tree visitor (a.k.a. babel/traverse) that goes through the AST to find usages of the match function.

The parser takes in the code that has to be analyzed and returns a tree structure of the code with tagged nodes. For example, from the code:

the part Option.Some('value') is converted to

Notice that the code became just data. The analyzer goes through this JSON tree. For example, this is a snippet from the analyzer that detects where the match function is being used:

This code looks for “call expressions” whose callee is a member of an object (a member expression), where the property’s name is “match” and the call has either two or three arguments. The rest of the program continues in the same fashion; see all the code of analyzer.js here.
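The check described above can be sketched on plain objects, without Babel installed. The node below is a hand-written, simplified approximation of what a Babel-style parser produces for a call like `Option.match(value, { ... })` (real nodes carry more fields, such as source locations). The names `isMatchCall` and `findMatchCalls` are hypothetical, and the generic walk stands in for the tree visitor; this is not the analyzer’s actual code.

```javascript
// Simplified, hand-written node for `Option.match(value, { ... })`.
const callNode = {
  type: 'CallExpression',
  callee: {
    type: 'MemberExpression',
    object: { type: 'Identifier', name: 'Option' },
    property: { type: 'Identifier', name: 'match' },
  },
  arguments: [
    { type: 'Identifier', name: 'value' },
    { type: 'ObjectExpression', properties: [] },
  ],
};

// True for calls of the form `<object>.match(a, b)` or `<object>.match(a, b, c)`.
function isMatchCall(node) {
  if (node.type !== 'CallExpression') return false;
  const callee = node.callee;
  if (!callee || callee.type !== 'MemberExpression') return false;
  const property = callee.property;
  if (!property || property.type !== 'Identifier' || property.name !== 'match') {
    return false;
  }
  const argumentCount = node.arguments.length;
  return argumentCount === 2 || argumentCount === 3;
}

// Depth-first walk over every object and array in the tree,
// collecting the nodes that look like match calls.
function findMatchCalls(node, found = []) {
  if (node === null || typeof node !== 'object') return found;
  if (Array.isArray(node)) {
    node.forEach((child) => findMatchCalls(child, found));
    return found;
  }
  if (isMatchCall(node)) found.push(node);
  for (const key of Object.keys(node)) {
    findMatchCalls(node[key], found);
  }
  return found;
}

// A tiny program containing only the call above.
const program = {
  type: 'Program',
  body: [{ type: 'ExpressionStatement', expression: callNode }],
};
const matchCalls = findMatchCalls(program);
```

A check like this is purely syntactic: it also fires for any unrelated object that happens to have a `match` method, which is why the analyzer additionally ties the call to one of the declared union types.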

A word about the awesome Javascript ecosystem

Even though I am not the biggest fan of Javascript, the language, when it comes to the tools available in the ecosystem, nothing can beat it! Building the analyzer was a pleasant experience, with very low friction when writing, testing and publishing the code as a CLI tool. Adding code coverage was as simple as putting nyc before the test script. Not to forget how easy debugging was from VS Code.

The End

I probably won’t be building any more analyzers for the time being, but it was a great experience, and it is nice to see that it is all approachable when it comes to Javascript and its ecosystem.