Creating a type-safe DSL for filtering in Typescript

Reid Evans
5 min readFeb 25, 2020

--

Photo by Devin Avery on Unsplash

It is common to think of a filter as a function from A -> boolean. Given a value of some type A, return a boolean. The filter function on the Array prototype works exactly this way. It takes a function and applies it to each item returning a new array with the items where the function returned true [1,2,3].filter(x => x < 2) returns [1] because 1 is the only value to satisfy the predicate (function that returns a boolean).

For in memory filtering this approach works well, but what would happen if we wanted to pass a complex filter description to another computer? What if we wanted to store them somewhere? What if we wanted to log every filter that was used? We wouldn’t be able to do any of those things very well because while functions are values, they don’t serialize well.

In this post we’ll look at creating a Domain Specific Language within Typescript which will allow us to separate our intent from our implementation.

Defining our recursive data structure

It all starts with a recursive data structure defining each of our core operations as data. Typescript supports sum types and they’re a perfect fit because we want to have a small core set of operations that can be combined to create more complex filtering capabilities.

In this case we’ve defined 5 core capabilities we’ll need. The Filter type has a type parameter which means that a given Filter can be associated with a single type. Filter<{ firstname: string}> is distinct from Filter<{ age: number}>, and this definition allows us to keep them distinct as well as verify that the val property on Equals has the type expected of the property passed for field.

The first 3 capabilities deal with comparison, while the final two are recursive definitions (they produce a new Filter<A> by combining 2 other Filter<A> values).

Providing helper functions

By default, typescript doesn’t automatically create constructor functions for each case in a sum type that some other languages (Haskell, Purescript, F#, Elm, etc) do. Creating these is a bit tedious, but will make creating Filter<A> values much easier.

Each case of our Filter<A> sum type has a related function, and each function returns a Filter<A> value.

Creating our first filter

With our functions in place we’re ready to create our first filter.

In this case our filter describes that we want any JobPosting where the manager is Bob Slydell and the salary is greater than 50,000.

Note that because we’ve defined Filter with a type parameter A and forced the val parameter to be A[keyof A] that we are able to hand create a Filter where the val is any of the types of any of the properties from the type A (string or number in our JobPosting type because manager is a string and salary is a number). In our created constructor functions though we’re constraining that the type of the values must match the type of the property defined because of the <A, K extends keyof A> bit (this is often referred to as a Smart Constructor). If we were to place that type constraint up at the Filter level we wouldn’t be able to combine filters that referenced different properties on our A type.. (thanks to gcanti for pointing out my mistake in an earlier version of this article)

If we were to log the value of our first filter, we’d see that it’s just a json object.

Interpreting our DSL

Because we have a recursive data structure, we need to write a recursive function to deal with interpreting its values. In this example we’ll create a simple recursive function that generates TSql code from the data structure.

One nice thing about Typescript is that it has a compiler option --strictNullChecks which when used with a function that returns a value forces us to provide exhaustive checking on switch statements.

For our first three cases we can simply use the field and val properties to return a string. In our final 2 cases though we could have recursively nested code again so we must recursively call our interpretToSql function, passing both parameters (a and b).

If we interpret our filter from earlier with this function we get the following:

We could also modify our interpreter to generate parameters which are then applied to remove sql injection concerns. We could parse the structure and then log the properties we’re using in the filter so that someone could make better decisions about indexing opportunities. We could do whatever we want in an interpret function because we’re not changing our underlying DSL or any of the values created by it. Further, our interpreting functions never have to change as long as we keep our core dsl the same.

Extending our language without breaking our interpreters

While the 5 instructions our DSL allows are powerful, it will be quite verbose to write some filters with just those 5. What if we wanted something like Sql’s in function? We’d have to chain a bunch of nested or functions together which would get ugly quickly. While we could add a new case to our Filter<A> dsl, doing so would force us to rewrite all of our interpreters which also isn’t ideal.

Because we defined Filter<A> as a recursive data structure we can compose Filter<A> values into new Filter<A> values and so we can easily create new functions that simply compose existing language features.

Because we haven’t altered the core Filter<A> every interpreter we’ve written will automatically work for all of these because they’re simple combinations of existing features.

Wrapping up

I really like this approach to solving problems because it separates the intent from the implementation and it allows the composition of simple things to make larger things. That said, there are plenty of times where this approach would be overkill so make sure to weigh the pros and cons before blindly implementing this approach.

Pros

  • New behavior can be defined anywhere by composing core language features
  • All data is immutable
  • Any program written in our DSL is just data that can be interpreted in a variety of ways
  • The interpretations can use additional effects (we could feasibly write an interpreter that had to issue async requests, could fail, or require logging at each step in our recursive tree without having to change our DSL or any program written in it)

Cons

  • It takes some code to get to this approach. More code can mean more bugs
  • JS doesn’t support tail call optimization so with a large enough structure you could run into stack overflow situations.

--

--