Better JS Cases with Sum Types

Gabriel Lebec

Published in Fullstack Academy

28 min read · Feb 20, 2018

Improving Semantics and Correctness · Illustrated via Redux State

Abstract

JavaScript has built-in atomic scalar types, such as numbers and booleans. It can also represent composite product types via arrays or record types via objects. However, it lacks an immediate solution for disjoint sum types. Sum types (a.k.a. tagged or discriminated unions) are a common tool in other languages; they allow a value to be one of a set of explicit cases, with easy and safe identification and data extraction. This article uses Redux state design to demonstrate common difficulties JS developers face when modeling a domain, shows how sum types mitigate those difficulties, and reviews a few libraries aiming to port sum types into the language.

Background: Redux State

Redux.js is “a predictable state container” inspired by the Elm Architecture. A developer can represent the canonical stateful data of their application in whatever form they wish, from a sophisticated Immutable.js Record to a straightforward POJO (Plain Old JavaScript Object). The state for an app which fetches and displays a list of adorable kittens might be as simple as:

const initialState = {
  kittens: [] // no kittens fetched yet 😿
}

Developers specify the state logic of a Redux-based app in a “reducer” function with the signature (oldState, action) -> newState.

function reducer (oldState, action) {
  if (action.type === 'GOT_KITTENS') {
    return { kittens: action.kittens } // replaces the kittens
  }
  return oldState // default, do nothing
}

Given an “action” object representing an application event, the reducer determines how to produce the new state.

const newState = reducer(initialState, {
  type: 'GOT_KITTENS',
  kittens: ['Snuggles', 'Mittens']
})

console.log(newState) // { kittens: ['Snuggles', 'Mittens'] } 😺

User interface code (e.g. React components) can subsequently read the state, creating a list of kittens.

const currentState = reduxStore.getState()
const listItems = currentState.kittens.map(kitten =>
  <li>{ kitten }</li>
)

Aside: if you have never used JSX before, the above might appear unsettling. This domain-specific language compiles to vanilla JS:

const currentState = reduxStore.getState()
const listItems = currentState.kittens.map(kitten =>
  React.createElement('li', null, kitten)
)

Motivating Example: Tackling Complexity

Initially, this direct representation of state works as expected. The application starts with no kittens. Mapping over the empty array state.kittens produces no list items, and our UI shows nothing. Later, when kitten data is fetched and the state is updated, our list will pop into view (assuming the rest of our AJAX / Redux / React code is wired correctly).

In practice, however, the user may be confused upon being shown a blank page. We really ought to let them know that the kittens are on their way:

const kittens = reduxStore.getState().kittens

if (!kittens.length) { // kittens are loading?
  return <p>Calling the kittens!</p>
} else { // kittens received, show them
  return (<ul>{
    kittens.map(kitten =>
      <li>{ kitten }</li>
    )
  }</ul>)
}

Now when the user first loads the page, they see a lovely paragraph informing them of impending kittens (how exciting!). Later, the paragraph is replaced with a list of kitten names.

Signs of Trouble

One day, however, a user submits a help ticket. “When I visit the kittens page, it says they are loading forever.” What went wrong?

Well, on that day, the kittens data from the server was empty. That is, [] is a valid value representing the kittens in our database. We’ve hijacked the empty array to mean that the kittens are still loading, but that’s not necessarily true. We started with an initial state of 0 kittens:

{
  kittens: [] // intent: not yet fetched
}

And then ended up with a final state of 0 kittens:

{
  kittens: [] // intent: fetched (empty list from db)
}

These two states are indistinguishable, though we intended them to be distinct. Our mistake was conflating data (the array) with metadata (information regarding the array).
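The ambiguity fits in a few lines. In this sketch, isLoading is a hypothetical helper standing in for the UI's length check; it gives the same answer for both states.

```javascript
// Hypothetical helper mirroring the UI's `!kittens.length` check.
const isLoading = state => state.kittens.length === 0

const notYetFetched = { kittens: [] } // intent: not yet fetched
const fetchedButEmpty = { kittens: [] } // intent: fetched (empty list from db)

// Both states produce the same answer, so the UI keeps saying "loading"
// even after the server has replied with zero kittens.
console.log(isLoading(notYetFetched)) // true
console.log(isLoading(fetchedButEmpty)) // true
```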

Falling Down the Rabbit Hole

Alice in Wonderland, first edition. Bodleian Libraries, University of Oxford. Lewis Carroll was an accomplished mathematician as well as author.

As intrepid JS developers, we may next try distinguishing status based not on length, but on type. What if we use null to indicate unloaded kitties?

const initialState = {
  kittens: null
}

Unbidden, a wild error appears.

Error: cannot read property 'length' of null

In a way, we got lucky this time — the code failed noisily and immediately. The problem, of course, is that null values cannot have properties, so our old UI code checking kittens.length is broken. The fix isn’t especially difficult:

const kittens = reduxStore.getState().kittens

if (!kittens) { // kittens are loading
  return <p>Calling the kittens!</p>
} else if (!kittens.length) { // kittens are loaded but empty
  return <p>Sorry, no kittens available.</p>
} else { // kittens are loaded and can be shown
  return (<ul>{
    kittens.map(kitten =>
      <li>{ kitten }</li>
    )
  }</ul>)
}

That we have annotated the meaning of each case above should be considered a code smell; it reveals that our solution is not very semantic. Regardless, the unit tests pass, the app is deployed, and all seems well for a few days. Until…

When Zombies Attack

Error: cannot read property 'join' of null

Now what? Didn’t we solve this already? Ah, but this error is coming from another component:

const kittens = reduxStore.getState().kittens

return <p>{ 'Known kittens include: ' + kittens.join(' & ') }</p>

Oops. Someone forgot, or was never informed, that kittens might sometimes be null. The oversight escaped attention for a while because nobody ever ran this part of the application before the kittens were fetched. It was a trap waiting for the right edge case to come along.

Over time, multiple failures like this one are added and/or discovered. As long as developers think of state.kittens semantically as a collection of kittens, they keep trying to use it as an array — even if sometimes it isn’t one.

Out of the Frying Pan…

Fast-forward, and as product requirements have multiplied, so has the variety of ad-hoc solutions attempting to wrangle our state. The user stories now specify that kittens need four visible representations: unloaded, loading, fetched (with data), and failed (with an error). Across the app, similar needs are being dealt with for state.puppies and state.bunnies. Some leaves of the state tree use the strings 'unloaded' and 'loading' instead of the single value null, but that breaks some falsy checks: if (!state.bunnies) is now a mistake, because non-empty strings are truthy. The state.puppies leaf is replaced with an Error object in the case of HTTP failure, forcing the verbose and unreliable check if (state.puppies instanceof Error) in the consuming code. Meanwhile, state.kittens is an object with a .collection array and an .isError boolean, forcing a different handling pattern: if (!state.kittens.isError) return state.kittens.collection[0]. Or maybe someone decided that some value might be false or null, and each stands for something different… good luck keeping them straight. With a deluge of ad-hoc, reinvented cases to keep track of, developers frequently forget to handle some of them, especially as the representations are inconsistent and inexpressive.

“Many languages have trouble expressing data with weird shapes. They give you a small set of built-in types, and you have to represent everything with them. So you often find yourself using null or booleans or strings to encode details in a way that is quite error prone.” — Elm Language, Union Types Documentation

JavaScript developers often consider tackling these burdens to be a normal part of writing code. There is a common assumption that one simply must be more careful not to make errors like those detailed above. Losing type safety is the tradeoff one accepts in return for JS’s flexibility, right? We see examples of this all the time, especially when searching for data:

Array.prototype.indexOf returns -1 for “not found.” Not exactly clear.
Worse, Array.prototype.find returns undefined for “not found;” how can we distinguish between “not found” and “found undefined?”
Sequelize’s findById method returns a promise for null when a database row is not found. It’s important to remember that this might be the case, or else .then(user => user.name) could one day throw an error.
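The first two behaviors are easy to confirm with built-in array methods alone. The values array below is made up for illustration.

```javascript
const values = ['a', undefined, 'c']

// indexOf signals "not found" with the in-band number -1.
console.log(values.indexOf('z')) // -1

// find returns undefined when nothing matches...
const missing = values.find(v => v === 'z')
// ...and also when the matching element is itself undefined.
const foundUndefined = values.find(v => v === undefined)

console.log(missing === foundUndefined) // true: the two outcomes look identical

// findIndex can tell them apart, but only by going back to the -1 convention.
console.log(values.findIndex(v => v === undefined)) // 1
console.log(values.findIndex(v => v === 'z')) // -1
```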

Programming paradigms that put the onus on human beings to be perfect inevitably result in confusion and mistakes. There must be a better way.

Sum Types, Defined in (Informal) Theory

The problems examined so far can be boiled down to two essential categories:

No consistent and expressive way to represent separate data status cases. “Not found” ought to be clearly different from “found, with this data.”
A high likelihood of error when consuming data that could be in one of several forms (i.e. polymorphic). Humans focus on the ordinary case and use variables assuming they will have certain properties or methods, forgetting that the actual value may sometimes be an exceptional case.

Sum types can improve both of the above. To understand sums, it helps to first examine scalar and product types.

Scalars

Two example scalar types. There are three scalar values of type A and two scalar values of type B.

A scalar type contains atomic values: single items which cannot be decomposed into parts. Examples of scalars in JavaScript include built-in types like Boolean, Number, and Null. For instance, the number 42 in JavaScript is a single value inhabiting the Number scalar.

A scalar type has an intrinsic size (a.k.a. cardinality), which is just the number of values inhabiting that type.

The Boolean type has a size of 2 as it contains two values, true & false.
The Number type is of “infinite” size (if we ignore the limits of IEEE 754).
The Null type has a size of 1 as it contains only one value, null.

Products

A × B is a product of two different scalar types A and B. The size of A × B is the size of A times the size of B. Each value in A × B has both an A component and a B component.

Product types are composite — they consist of multiple (possibly different, possibly identical) types, grouped together in an “and” fashion. For example, a type whose values consist of two Booleans (one Bool and one Bool) is a product. In JavaScript, we can represent custom product types using arrays as tuples, with position serving to distinguish each member type.

// FoodFacts are tuples of 2 bools: [isYummy, isHealthy]
const saladFacts   = [true, true]
const burgerFacts  = [true, false]
const vitaminFacts = [false, true]
const chalkFacts   = [false, false]

Why “product?” Because the size of this new composite FoodFacts type is determined by multiplying the sizes of its constituent types. We can easily see above that there are only four different values in the type, obtained by multiplying 2 options for the first Bool × 2 options for the second Bool.
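The multiplication can be checked mechanically. Below is a small sketch in which product is a hypothetical helper (not a built-in) that pairs every value of one type with every value of another.

```javascript
const booleans = [true, false]

// Hypothetical helper: pair every value of the first type
// with every value of the second.
const product = (as, bs) => as.flatMap(a => bs.map(b => [a, b]))

const foodFacts = product(booleans, booleans)

// Size of the product = size of the first type × size of the second.
console.log(foodFacts.length) // 4
console.log(foodFacts.length === booleans.length * booleans.length) // true
```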

Positional notation (e.g. chalkFacts[0]) is not very clear with respect to meaning. A more expressive way to represent multiple values grouped together in JS is with objects, which can label each member value. Technically the labels make objects record types rather than products, but we will overlook that in this article, in the interest of making it easier to write examples:

// Person  = { name: String, age: Number, employed: Boolean }
const mark = { name: 'Mark', age: 67, employed: false }
const jin  = { name: 'Jin',  age: 34, employed: true }
const ford = { name: 'Ford', age: 19, employed: true }
const sian = { name: 'Sian', age: 24, employed: false }
...

The size of the Person type, ignoring the labels, is Infinity × Infinity × 2. That is, the infinite number of Strings, times the infinite number of Numbers, times the two possible Booleans. Clearly, we will not be able to list every possible value in the Person type.

Sums

A + B is a sum of two different scalar types A and B. The size of A + B is the size of A plus the size of B. Each value in A + B is either an A value or a B value.

If scalar type size is an intrinsic number, and the size of a product type is the product of its constituent type sizes, you will probably not be surprised to hear that the size of a sum type is the sum of its constituent type sizes. Sum types are composite like product types, but in an “or” fashion; a single value in the type is only ever one of the constituent types, not a grouping of them all.

// FinitePrimitive = Boolean | Null | Undefined
const finitePrimitive1 = true
const finitePrimitive2 = false
const finitePrimitive3 = null
const finitePrimitive4 = undefined

The FinitePrimitive type we define above has a size of 2 + 1 + 1 = 4. That is, the two Booleans, plus the single Null value, plus the single Undefined value. We can easily list out all four values, which we have done above. Notice that a value in this type belongs to only one of the constituent types.

As another example, consider a sum type composed of some larger types:

// InfinitePrimitive = String | Number | Symbol
const infinitePrimitive1 = 'hello'
const infinitePrimitive2 = 'goodbye'
const infinitePrimitive3 = 42
const infinitePrimitive4 = Symbol('hmm')
const infinitePrimitive5 = 314159
const infinitePrimitive6 = 'ok we get it, there are a lot of these'
...

The InfinitePrimitive type has a size of Infinity + Infinity + Infinity. It can be any one of the infinite strings, or one of the infinite numbers, or one of the infinite symbols.

A Sum of Products

A + (B × B) is a sum of a scalar type and a product type. The size of A + (B × B) is the size of A, plus the size of B times the size of B. Each value in A + (B × B) is either an A value or a B × B value. B × B values have both a B component and another B component.

Products and sums can consist of scalars, but we didn’t define them as needing to consist of scalar types — on the contrary, the constituent types may themselves be products and/or sums. Here is a (non-JavaScript) sum of both scalar and product types:

scalar type Ghost (contains one value, `ghost`)

product type Character { (contains 2 × 2 = 4 possible values)
  afraidOfNoGhosts: Boolean,
  ghostbuster: Boolean
}

sum type Entity = Ghost | Character

How many values does the Entity type have? Well, it’s 1 possible Ghost value + (2 × 2) possible Character values = 5 different Entity values. Let’s enumerate them:

entity1 = Ghost ghost
entity2 = Character { afraidOfNoGhosts: true, ghostbuster: true }
entity3 = Character { afraidOfNoGhosts: false, ghostbuster: true }
entity4 = Character { afraidOfNoGhosts: true, ghostbuster: false }
entity5 = Character { afraidOfNoGhosts: false, ghostbuster: false }
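The same count can be reproduced in JavaScript. This is only a sketch: the plain objects and the kind field are assumptions made here so the values can be told apart, not an established encoding.

```javascript
const booleans = [true, false]

// Ghost contains exactly one value.
const ghosts = [{ kind: 'Ghost' }]

// Character is a product of two Booleans: 2 × 2 = 4 values.
const characters = booleans.flatMap(afraidOfNoGhosts =>
  booleans.map(ghostbuster => ({ kind: 'Character', afraidOfNoGhosts, ghostbuster }))
)

// Entity is the sum: every Ghost value plus every Character value.
const entities = [...ghosts, ...characters]

console.log(ghosts.length, characters.length, entities.length) // 1 4 5
```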

Tag, You’re It

We’ve almost finished defining sum types, but we’re missing a crucial characteristic which distinguishes them from the (very similar) union type. Suppose we define a sum type for name parts as being either a first name or a last name, where each is a string:

// NamePart can be a first name string OR a last name string
sum type NamePart = String | String

namePart1 = 'Wilson' // is this a first or last name?
namePart2 = 'Ashley' // is this a first or last name?

When we encounter a value in the wild, the fact that we know its type is String isn’t quite enough to know whether it was supposed to be from the first choice of NamePart strings, or the second choice.

For that, we need to somehow label the value — with a “tag.” The value of interest will not consist of just the string on its own, but also be accompanied by a symbolic identifier that allows the developer to know unambiguously which of the constituent types it belongs to:

namePart1 = <LastName 'Wilson'>
namePart2 = <FirstName 'Ashley'>

Ah, now we know exactly what roles 'Ashley' and 'Wilson' each play.

“This overall operation is called disjoint union. Basically, it can be summarized as ‘a union, but each element remembers what set it came from’.” — Waleed Khan, Union vs Sum Types

If you think about it, the combination of a tag and some data is itself a product, meaning we can reframe our example sum type as a sum of products, where every product includes a tag:

sum type NamePart = (FirstName & String) | (LastName & String)

Each tag is a unit type (a type with exactly one value), so it doesn’t affect the size of the sum type. NamePart now has a size of (1 × Infinity) + (1 × Infinity), equivalent to its earlier size of Infinity + Infinity.
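In JavaScript terms, a tagged value might be sketched as a plain object pairing a tag with its data. The FirstName and LastName constructors, the tag and value field names, and the describe function below are all hypothetical choices for illustration.

```javascript
// Hypothetical constructors: each pairs a tag with a plain string.
const FirstName = value => ({ tag: 'FirstName', value })
const LastName = value => ({ tag: 'LastName', value })

const namePart1 = LastName('Wilson')
const namePart2 = FirstName('Ashley')

// Hypothetical consumer: the tag, not the string, decides the role.
const describe = part =>
  part.tag === 'FirstName'
    ? `first name: ${part.value}`
    : `last name: ${part.value}`

console.log(describe(namePart1)) // last name: Wilson
console.log(describe(namePart2)) // first name: Ashley
```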

With tags, we can now discriminate between otherwise identical values of a given type; the tag is a minimal form of metadata. There is a preponderance of synonyms for the concept of sum types: discriminated unions, tagged unions, disjoint unions, choice types, variants, etc.

Sum types and product types, both being composite, are also known as algebraic data types.

Sum Types, Defined in Practice

We now have a loose theoretical understanding of what a sum type is, but what does one look like in actual code? For that we can turn to myriad typed languages which implement sum types as a native feature. Though they have been supported at least as far back as ALGOL 68, using sum types as building blocks is especially important to many functional programming languages including Haskell, ReasonML, F#, Elm, and Rust.

Let’s compare and contrast an identical sum type across a few different languages. The stock example, used in resources like Exploring ReasonML and Functional Programming and Learn You a Haskell for Great Good, is Shape:

A Shape is a sum type, consisting of either a Circle or a Rectangle.
Circle tags a product type, consisting of a Point (the center location) and a Float (the radius).
Rectangle also tags a product type, consisting of a Point (one corner location) and another Point (the other corner).
A Point is a product of two Floats (x and y coordinates).
A Float is a scalar type; we will use IEEE 754 double precision floats as supplied by each language.

Shapes in ReasonML

ReasonML is an impure functional, eagerly evaluated, strongly typed language with type inference. It is essentially a JavaScript-like syntax for OCaml, which can compile to native code, JavaScript or even back to OCaml. It’s a great way for a JS native to start moving towards coding in a typed and more functional style. Sum types are referred to in ReasonML as variants.

“Behold, the crown jewel of Reason data structures!” —ReasonML Language, Variant Documentation

It should be emphasized that we are defining shape, Circle, Rectangle, and point here. The only built-in datatype we used was float. The sum type is shape, which can be a Circle or a Rectangle. We also construct a couple of example shape instances using each member tag.

Perhaps confusingly, Circle can be considered a tag, a constructor function, and a type; it has properties of all three. It is used to identify a value as being from one of the shape types; that makes it a tag. It is used to create values (like circ1) in the language; that makes it a constructor. And those values consist of two other types point and float grouped together; that is a product type. We can either use Circle as the name of that product of two types, or considering the implementation details, we could say that Circle is a value of the unit type within a product of three types. Whew!

On a side note, you might have noticed at the bottom of the code snippet that ReasonML ends up representing product types (like Point) as simple arrays in JavaScript, exactly as we showed in our earlier examination of products.

Shapes in OCaml

ReasonML can not only compile to JS but to OCaml, making it easy to see how we would define the same entities in that language.

OCaml’s type definition syntax, freed from the burden of seeming familiar to JavaScript developers, hews much closer to the purely theoretical concept of sum and product types — even to the point of using *.

The subsequent construction of instances is quite noisy with parens, on the other hand, making it clear why OCaml is sometimes called “LISP with types”:

Shapes in Haskell

Haskell is a purely functional, lazily evaluated, strongly typed language with type inference. Haskell has a heavy theoretical focus, though it is also used for practical purposes; it is very powerful yet exceptionally terse.

Again, the only built-in datatype we used was Float. Using :type in the GHCi REPL dutifully reports that yes, each example value has the type Shape. Had we added deriving (Show) to Shape, we could also log out circ1 in the REPL, which would display Circle (Point 2.0 3.0) 6.5. In short, our values “know” what types they come from and which tags they have.

Shapes in Rust

OK, one last example. Many of the languages that emphasize the use of sum types are dialects of, or were heavily influenced by, the ML family of languages: ReasonML / OCaml, Standard ML, F# etc. It’s difficult to find dramatically different syntaxes for sums, which makes evolutionary sense.

Rust is a nice example as it allows the developer to define product types (as structs) and sum types (as enums) using positional or record-style notation. This isn’t unique to Rust — Haskell can do the same, for instance — but Rust’s syntax demonstrates yet more names for types. Note also how constructing instances requires using the sum type as a namespace (Shape::Circle):

Don’t mistake the Rust enum for an ANSI C-style enum. The classic enumerated type is just a set of source code aliases for integers. True sum types allow the “arms” of the union to hold varied product types. Tagged unions can be implemented in C via a union of structs with an enum of tags and manual tag-checking code.

Shapes in JavaScript, Perhaps?

Having seen the same pattern encoded in a few languages, one might begin wondering how we can emulate it in JavaScript. The essence of sum types is that they consist of disjoint tagged types; the essence of tagging a type is to create a product of some identifier with the type. In JavaScript, a symbolic identifier with no other meaning should sound familiar; that’s a perfect use case for Symbol. And we already know some ways (arrays, objects) to loosely represent products in JavaScript; even if it cheats the theory a bit (again, records are technically not products), let’s use objects for clarity.
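Here is a minimal first-pass sketch along these lines, assuming Symbol tags and plain objects as products. The helper names (PointTag, CircleTag, RectangleTag, assertNumber, assertPoint) and the exact checks are illustrative assumptions, not a library API.

```javascript
// Unique, otherwise meaningless identifiers: one tag per case.
const PointTag = Symbol('Point')
const CircleTag = Symbol('Circle')
const RectangleTag = Symbol('Rectangle')

// Runtime type checks for constructor arguments.
const assertNumber = n => {
  if (typeof n !== 'number') throw new TypeError('expected a number')
}
const assertPoint = p => {
  if (!p || p.tag !== PointTag) throw new TypeError('expected a Point')
}

// Product type: a tag plus two numbers.
const Point = (x, y) => {
  assertNumber(x)
  assertNumber(y)
  return { tag: PointTag, x, y }
}

// Namespaced factory functions, one per case of the would-be sum type.
const Shape = {
  Circle (center, radius) {
    assertPoint(center)
    assertNumber(radius)
    return { tag: CircleTag, center, radius }
  },
  Rectangle (corner1, corner2) {
    assertPoint(corner1)
    assertPoint(corner2)
    return { tag: RectangleTag, corner1, corner2 }
  }
}

const circ1 = Shape.Circle(Point(2, 3), 6.5)
const rect1 = Shape.Rectangle(Point(0, 0), Point(4, 2))

// Values can be told apart by tag alone.
console.log(circ1.tag === CircleTag) // true
console.log(rect1.tag === CircleTag) // false
```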

Oof. This concrete albeit unsophisticated first-pass attempt is not only much more laborious than the previous examples we saw, it is also incomplete. We do get some of the benefits correct:

Expressive construction using namespaced and named factory functions
Runtime type checks against constructor arguments
Ability to discriminate unambiguously between values based on tag

On the other hand, we are missing some features:

We have failed to truly create a sum type, as our rectangle and circle instances do not know they are shapes! This makes it difficult to specify that a function should receive any shape. We could manually write if (arg.tag !== CircleTag && arg.tag !== RectangleTag) in such a function, but that is brittle; what if we added a triangle case later?
This code pattern is itself difficult, lengthy, and error-prone to replicate in a manual, ad-hoc fashion. It is also still abusable by human developers who may try to read the center of a rectangle or corner2 of a circle.
Since JavaScript is dynamically typed, we only get failures at runtime. If a given function attempting to use a Shape is not immediately executed, we might not be aware that it contains a latent mistake.

Some of these points can be mitigated. Making values “remember” their types and tags together can be accomplished using a bit more code, and making it easier to define types in this fashion can be accomplished by abstracting the definition process to a function. Publish it as a library with a small API, and we’d be well on our way to capturing many of the benefits of sum types.
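As a rough idea of what that abstraction could look like, here is a sketch of a hypothetical defineSumType helper. Its name, its signature, and the type, tag, and is properties are all assumptions made for illustration; the real libraries reviewed later differ in their details.

```javascript
// Hypothetical helper: builds tagged constructors from a type name and
// a map of case names to field names.
const defineSumType = (typeName, cases) => {
  const type = {
    typeName,
    // Any value built by one of this type's constructors passes this check.
    is: value => Boolean(value) && value.type === type
  }
  for (const [tag, fields] of Object.entries(cases)) {
    type[tag] = (...args) => {
      if (args.length !== fields.length) {
        throw new TypeError(`${typeName}.${tag} expects ${fields.length} argument(s)`)
      }
      // Each value remembers both its sum type and its tag.
      const value = { type, tag }
      fields.forEach((field, i) => { value[field] = args[i] })
      return Object.freeze(value)
    }
  }
  return Object.freeze(type)
}

const Shape = defineSumType('Shape', {
  Circle: ['center', 'radius'],
  Rectangle: ['corner1', 'corner2']
})

const circ1 = Shape.Circle({ x: 2, y: 3 }, 6.5)

console.log(Shape.is(circ1)) // true: the value knows it is a Shape
console.log(circ1.tag) // Circle
console.log(Shape.is({ radius: 6.5 })) // false
```

This sketch does not guard against case names that collide with typeName or is, and it performs no checks on argument types.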

However, let’s not get ahead of ourselves. Before we see some practical real-world versions of JS sum types, we need to address the other half of the equation; not only constructing sum type instances, but consuming them.

The Payoff: Pattern Matching

By now you may be impatiently wondering what the point of all this theoretical messing about is. Recall the two problems we identified earlier: lack of a semantic way to separate cases, and human errors when using polymorphic data.

Sum types definitely solve the first problem. With the addition of a constructor tag, now every value comes with its own metadata identifier explaining just what the data’s case is.

How do we address the second problem? Ah, this is where sum types shine: pattern matching.

Pattern matching is a language-supported syntax for doing two things at once: identifying which case a value represents, and extracting the data from such a value. It lets the user of a sum type easily, declaratively, and safely consume values in the type. It prevents the user from mishandling the values by inverting control; the user no longer preemptively tries to inspect (potentially nonexistent) properties of an unknown type, but instead provides all possible type-handling cases in order to produce a result.

Let’s see pattern matching on our shapes via an example area function, with the signature Shape -> Float.

Pattern Matching in ReasonML

A ReasonML function can directly specify that it takes a shape. How do we use the shape? By matching all the tags using switch:

When we call area at the end, passing in a circle or a rectangle, the switch will match the passed-in shape to the correct case. Note that there is no JS-style fall-through — no more than one case will ever match. We should provide cases for every tag; if we omit one, the compiler will actually warn us we’ve forgotten something!

File "", line 14, characters 2-90:
Warning 8: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched:
Rectangle (_, _)

Exhaustiveness means that the function will be able to handle every value it could accept. An exhaustive function is also known as a total function. Non-total (i.e. partial) functions throw runtime errors if applied to a value they cannot handle.

Going in the other direction and attempting something nonsensical, like adding Point to our cases, stops the compiler outright:

Line 22, 6: This variant pattern is expected to have type shape
The constructor Point does not belong to type shape

“But wait, there’s more!” Not only do we get to declaratively match to a specific case, but we can then destructure the data from that case using whatever variable names we want. The second argument of Circle is a float used for a radius; in our switch we pattern match to Circle and bind the second argument as the variable radius. We can also use _ to specify a value that we want to ignore, e.g. the center point for the circle; the area of a circle only depends on the radius after all.

Each case is a function from the matched pattern (with bindings from destructuring) to a result. In a very compact way, our code above states:

if the passed-in shape is a circle,
consisting of <nobody cares> and a radius,
then return pi * radius²

Similar logic for the Rectangle case lets us destructure the x and y coordinates of each Point in the rectangle, and do the proper math.

Pattern matching in ReasonML can get more advanced; a common tool is a fallback case if no other pattern matches, which can be enumerated as _ => (your return value here). See here for other capabilities.

Pattern Matching in Haskell

Haskell also has a switch syntax, called case:

Calling area circ1 etc. in the REPL gives the expected results. Again, note that adding a nonsense case will cause the compiler to fail. Haskell can be even stricter than ReasonML: omitting a case can also cause the compiler to fail, provided that you opt in to such behavior with the pragma {-# OPTIONS_GHC -Wincomplete-patterns -Werror #-}.

shape.hs:25:8: warning: [-Wincomplete-patterns]
    Pattern match(es) are non-exhaustive
    In a case alternative: Patterns not matched: (Circle _ _)

The popular LambdaCase language extension lets us skip binding a name for the shape argument:

Haskell has other ways of defining area, including guards or simply writing each case as a separate equation:

Finally, Haskell allows a fallback case: the wildcard pattern _ in a case expression, or otherwise as the final guard.

Pattern Matching in Rust

Rust uses the somewhat more verb-oriented keyword match to perform pattern matching. As when constructing values, Rust requires the developer to cite which sum type the tag comes from. When destructuring arguments from a record-style product type, it uses .. to ignore unused fields. Although not shown here, it can also use _ to ignore positional fields and to define a fallback case.

Again, omitting or exceeding the exhaustive cases causes the compiler to fail, informing the developer that something is wrong before any code is run.

error[E0004]: non-exhaustive patterns: `Rectangle(_, _)` not covered
  --> src/main.rs:10:18
   |
10 |   return match shape_arg {
   |                ^^^^^^^^^ pattern `Rectangle(_, _)` not covered

Sum Type Feature Wishlist

“Modelling data is important for a range of reasons. From performance to correctness to safety. Tagged unions give you a way of modelling choices that forces the correct handling of them, unlike predicate-based branching, such as the one used by if statements and other common control flow structures.” — Folktale Library, ADT Union Documentation

We’ve now seen sum types defined in theory, constructed in practice, and consumed in practice. If we were to list requirements for the ideal features of a language-supported sum type, they might read as follows.

Overall

📖 Feature a clean, expressive API
👮 Prevent naive direct inspection of data from a sum type value
✅ Enable easy type checking with an is or has function
💎 Luxury: enable extending sum types with methods
💎 Luxury: easy serialization / deserialization

Definition

🖋️ Define the name of the sum type
🗄️ Define the disjoint components of a sum type
📦 Allow each component to itself be another product, sum, or scalar type
🏷️ Label each component with a tag
👷 Have tags act as constructor functions, producing values from the type
📋 Enable tags to receive arguments, positionally and/or in record style
💎 Luxury: include integrated tools for working with product types

Matching

🔍 Enable matching a value to case by tag name
📜 Destructure component data during the match
➡️ Map destructured data to a return value for the match expression
❓ Allow for a catch-all fallback case in the match
🚨 Ideal: fail statically if possible (during compile-time) when omitting a required case or adding an incorrect case
🚨 Compromise: barring the above, fail immediately during runtime when omitting or adding a case
🚨 Worst: barring the above, fail eventually during runtime, upon discovering that a case cannot be matched
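To make the “compromise” tier concrete, here is a sketch of a matcher that validates its handlers as soon as they are supplied, before any value is matched. matcherFor is a hypothetical helper, and the tag field and the _ fallback key are assumed conventions, not any particular library's API.

```javascript
// Hypothetical helper. `cases` lists every tag of the sum type; the
// handlers are validated once, up front, before any value is matched.
const matcherFor = cases => handlers => {
  const provided = Object.keys(handlers)
  for (const name of provided) {
    if (name !== '_' && !cases.includes(name)) {
      throw new TypeError(`extra case: ${name}`)
    }
    if (typeof handlers[name] !== 'function') {
      throw new TypeError(`case ${name} must be a function`)
    }
  }
  const missing = cases.filter(name => !provided.includes(name))
  if (missing.length > 0 && !provided.includes('_')) {
    throw new TypeError(`missing case(s): ${missing.join(', ')}`)
  }
  // Dispatch on the tag, handing the value to its handler for destructuring.
  return value => {
    if (!cases.includes(value.tag)) {
      throw new TypeError(`not a value of this type: ${String(value.tag)}`)
    }
    const handler = provided.includes(value.tag) ? handlers[value.tag] : handlers._
    return handler(value)
  }
}

const matchShape = matcherFor(['Circle', 'Rectangle'])

// area : Shape -> Number
const area = matchShape({
  Circle: ({ radius }) => Math.PI * radius ** 2,
  Rectangle: ({ corner1, corner2 }) =>
    Math.abs(corner2.x - corner1.x) * Math.abs(corner2.y - corner1.y)
})

console.log(area({ tag: 'Rectangle', corner1: { x: 0, y: 0 }, corner2: { x: 4, y: 2 } })) // 8

// Omitting Rectangle fails when the function is defined, not when it is called.
try {
  matchShape({ Circle: ({ radius }) => Math.PI * radius ** 2 })
} catch (err) {
  console.log(err.message) // missing case(s): Rectangle
}
```

Because the check runs when area is defined, a forgotten case surfaces as soon as the module loads rather than when an unlucky value finally arrives. It is still a runtime failure, though, so it falls short of the static ideal.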

State of the Art: JavaScript Sum Type Libraries

Porting sum types into JS is hardly a new idea. A number of projects, some fairly prominent, have made efforts in this direction. Each implementation differs somewhat in terms of interface, capabilities, and guarantees. Below is a selection (far from exhaustive) of found examples, sorted by npm downloads per month:

Shapes in Folktale

Folktale is a general-purpose functional library with other tools besides adt/union. That inflates its numbers somewhat in the above table, but it also features one of the nicest (IMHO) discussions of tagged unions, so we’ll take a look at it.

Biggest knock: developers can directly access .radius off of a circle. This could tempt them to try and grab .radius from an unknown shape.
Lack of an integrated product type is slightly awkward but not a big deal.
Lack of mandatory declarative argument types is potentially dangerous. We can manually do the checking using static hasInstance, but that is a chore and relies on developers to dot their i’s and cross their t’s.
Using vanilla JS destructuring with aliasing makes the pattern matching quite similar to the ideal.
There is no exhaustiveness check or failure on extra cases yet, though it is an active issue (Folktale #138).
There is no fallback case yet, though it is an active issue (Folktale #139).

Folktale has other features not shown, such as extension via shared derivations (presumably inspired by Haskell). Overall, it definitely gets us much of the way towards the ideal, but could stand to have better type safety.

Shapes in Daggy

Daggy is part of the wider Fantasy-Land ecosystem of functional JS specs and tools. The documentation is minimal, just two barely-explained example snippets; nothing we cannot understand, however.

Biggest knock: developers can still directly access .radius off of a circle. This could tempt them to try and grab .radius from an unknown shape.
Including an integrated product type is a nice touch.
Only declaring field names is minimalist, but prevents us from even being able to perform type checking.
Using positional fields is more powerful than forcing destructuring, as you could always use a single field and destructure it if you wanted.
There is no exhaustiveness check or failure on extra cases, nor obvious plans to add either.
There is no fallback case, nor obvious plans to add one.

Daggy allows extension through idiomatic prototypal inheritance — adding a method to Shape.prototype will allow Circles and Rectangles to delegate to that method.

Shapes in Union-Type

Conveniently, shape is one of the demonstrated examples from this library’s documentation (adapted slightly here).

Biggest knock: developers can still directly access .radius off of a circle. This could tempt them to try and grab .radius from an unknown shape.
We are back to the slightly awkward Point.Point as there is no distinct product type.
We get all the benefits of declarative field names, plus validations, including declarative validations using both built-in and predefined types. Union-type does the type-checking part quite well.
Relatively seamless handling of arrays or objects in defining and constructing values for the type, albeit only positional arguments during a match.
There is no exhaustiveness check yet, though there is an issue for this (Union-Type #52).
There is no failure on extra cases, nor obvious plans to add one.
There is a fallback case, _, though it doesn’t help for shape per se.

Union-type is one of the most fully-featured examples at first glance. It includes prototypal inheritance, some fancy tools (e.g. both instance and curried static case functions), variations like caseOn, recursive types, a fallback syntax, and more.

Of note, one of the other libraries found (JAForbes/sum-type) was based on union-type, but with enhancements related to the sanctuary-def project.

So… Which Shall We Use?

This is just a small selection of libraries, and the comments above are not intended to make a final judgment as to viability, implementation details, and other concerns. Rather, the intent is to see how people have attempted to solve this issue to date, and consider how we would want such a library to work. We haven’t touched on potentially important features like serialization, for example. In short, given JavaScript’s dynamically-typed nature and variety of existing edge cases, any library attempting to implement sum types is likely to have some opinions baked in, and some issues left to iron out. For instance, none of the three libraries examined prevents developers from directly attempting to grab a property from a sum type value, even if not every member of the type has that property.
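
To see why that last point matters, consider the sketch below. The plain tagged objects are stand-ins for library-built values (an assumption for illustration; each library's real representation differs), and the closure-based `Circle` / `Rectangle` constructors are one hypothetical way to close the door, not something any of the three libraries does.

```javascript
// Stand-ins for library-built shape values (illustrative assumption).
const circle = { tag: 'Circle', center: { x: 0, y: 0 }, radius: 2 }
const rectangle = { tag: 'Rectangle', corner1: { x: 0, y: 0 }, corner2: { x: 2, y: 3 } }

// Naive code assumes every shape has a radius.
const naiveArea = shape => Math.PI * shape.radius ** 2

console.log(naiveArea(circle))    // 4π, roughly 12.57
console.log(naiveArea(rectangle)) // NaN, and no error is raised

// Hypothetical alternative: keep the data in a closure and expose only
// `match`, so there is no .radius to grab in the first place.
const Circle = (center, radius) => Object.freeze({
  match: cases => cases.Circle(center, radius)
})
const Rectangle = (corner1, corner2) => Object.freeze({
  match: cases => cases.Rectangle(corner1, corner2)
})

const safeArea = shape => shape.match({
  Circle: (center, radius) => Math.PI * radius ** 2,
  Rectangle: (c1, c2) => Math.abs(c2.x - c1.x) * Math.abs(c2.y - c1.y)
})

console.log(safeArea(Rectangle({ x: 0, y: 0 }, { x: 2, y: 3 }))) // 6
```

The closure version trades away easy serialization and debugging (the data is invisible to `console.log`), which is likely part of why real libraries do not go this route.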

For the remainder of this article, I’ll be using union-type, mostly because it includes declarative type checks and a fallback case syntax.

Back to Redux

An age and a half ago, this article opened with an example JavaScript use case: a Redux state tree. If you recall, we were struggling to encode various states like “unloaded,” “loading,” “loaded with data,” and “failed with an error.” It should be clear that we could represent those states as a sum type.

If we had additional leaves in our state tree — state.bunnies, state.puppies, etc. — they would also be one of the four Leaf types. Our consuming code would use union-type’s case function to determine what UI to display.
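
As a rough, library-free approximation of that idea, the sketch below models a state leaf with four cases and a `caseOf` function that insists on a handler for each. The `Leaf` object, its `caseOf` helper, and the `FETCHING_KITTENS` / `FETCH_FAILED` action types are assumptions invented for this example; this is not union-type's API, only the same shape of solution.

```javascript
// Library-free stand-in for a Leaf sum type (names are illustrative).
const LEAF_TAGS = ['Unloaded', 'Loading', 'Loaded', 'Failed']

const Leaf = {
  Unloaded: () => ({ tag: 'Unloaded', args: [] }),
  Loading: () => ({ tag: 'Loading', args: [] }),
  Loaded: data => ({ tag: 'Loaded', args: [data] }),
  Failed: error => ({ tag: 'Failed', args: [error] }),
  // Requires a handler for every case before dispatching on the tag.
  caseOf: (cases, leaf) => {
    for (const tag of LEAF_TAGS) {
      if (typeof cases[tag] !== 'function') throw new TypeError(`Missing case: ${tag}`)
    }
    return cases[leaf.tag](...leaf.args)
  }
}

const initialState = { kittens: Leaf.Unloaded() }

// Only GOT_KITTENS appears earlier in the article; the other two action
// types are hypothetical.
function reducer (oldState, action) {
  switch (action.type) {
    case 'FETCHING_KITTENS': return { ...oldState, kittens: Leaf.Loading() }
    case 'GOT_KITTENS': return { ...oldState, kittens: Leaf.Loaded(action.kittens) }
    case 'FETCH_FAILED': return { ...oldState, kittens: Leaf.Failed(action.error) }
    default: return oldState
  }
}

// Consuming code must say what to show in every one of the four states.
const describeKittens = state => Leaf.caseOf({
  Unloaded: () => 'No kittens requested yet.',
  Loading: () => 'Fetching kittens…',
  Loaded: kittens => `Kittens: ${kittens.join(', ')}`,
  Failed: error => `Could not fetch kittens: ${error.message}`
}, state.kittens)

const loaded = reducer(initialState, { type: 'GOT_KITTENS', kittens: ['Snuggles', 'Mittens'] })
console.log(describeKittens(initialState)) // 'No kittens requested yet.'
console.log(describeKittens(loaded))       // 'Kittens: Snuggles, Mittens'
```

There is no longer any way to be "loaded" and "failed" at once, because a leaf holds exactly one case at a time.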

It’s very likely that you’ve noticed something familiar about Redux’s action objects. Why… they’re just a tagged union themselves! An action can be one of a set number of objects, each with a type (tag), and each with potentially any number of other properties (making it a product type). In the reducer, we switch on each case based on type — #mindblown. Why not represent actions as a bona-fide sum type, then?

Unfortunately, Redux currently insists that action objects be POJOs, with no fancy library constructs. But the idea is sound, and in fact is precisely what inspired Redux in the first place. Redux is based on Elm, which includes tagged unions as a language feature. Recognizing that actions are members of a sum type, we have come full circle.
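
Even with plain-object actions, a reducer can still be written in a match-like style. The sketch below treats the `type` string as the tag and looks up a handler, with `_` as the fallback; `matchAction` is a hypothetical helper written for this example, not part of Redux.

```javascript
// Hypothetical helper: dispatch on a POJO action's `type` as if it were a tag.
const matchAction = (action, cases) => {
  const handler = Object.prototype.hasOwnProperty.call(cases, action.type)
    ? cases[action.type]
    : cases._
  if (typeof handler !== 'function') {
    throw new TypeError(`Unhandled action type: ${action.type}`)
  }
  return handler(action) // handlers destructure whatever data they need
}

const initialState = { kittens: [] }

const reducer = (oldState = initialState, action) => matchAction(action, {
  GOT_KITTENS: ({ kittens }) => ({ ...oldState, kittens }),
  _: () => oldState // default, do nothing
})

const newState = reducer(initialState, {
  type: 'GOT_KITTENS',
  kittens: ['Snuggles', 'Mittens']
})
console.log(newState) // { kittens: [ 'Snuggles', 'Mittens' ] }
```

The fallback is doing real work here: Redux dispatches its own internal actions, so a reducer written this way needs `_` to pass those through untouched.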

Redux Conclusion

So what have we gained? We are no longer reinventing the wheel, using primitive types to encode various mutually exclusive cases. And when we want to extract the data, we are reminded to handle every possible case. It’s not perfect: the type checks are limited, there are undoubtedly edge cases to consider, etc. And yet, the gains in expressiveness, cleanliness, and some degree of safety ought to be pretty appealing. Modeling your domain explicitly like this is a natural part of working with typed languages, and with all of JavaScript’s vaunted flexibility, it makes sense to adopt some of those benefits.

Case Studies (Pun Intended)

Redux state was a motivating example for this article, but sum types are so generally useful that it would be a shame not to go over some of their “greatest hits.” These are constructs so fundamental that they are often included in other languages’ standard libraries. The following examples will be in pseudocode… see if you can implement them in a real language.

Maybe / Option

The Maybe type is either some data, or nothing at all.

sum type Maybe = Nothing | Just anything

firstOfList = list =>
    if (!list.length) return Nothing
    else return Just(list[0])

res = match firstOfList([])
    Nothing          => 'Sorry, there was nothing there.'
    Just (something) => 'Ah, found something:' + something

log(res) // 'Sorry, there was nothing there.'

No more -1, undefined, or null generating confusion. Methods like findIndex, find, and findById can now return a Maybe value; consuming code will then pattern match to decide what to do with the actual Nothing or Just something as the case may be. Try writing a safeDivide function which returns Nothing in case of divide by zero.
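
For comparison, here is one way the Maybe pseudocode might look in plain JavaScript. The `{ tag, value }` representation and the `matchMaybe` helper are assumptions chosen for brevity, not a standard API, and safeDivide is deliberately left as the exercise.

```javascript
// One possible encoding: frozen tagged objects plus a small match helper.
const Nothing = Object.freeze({ tag: 'Nothing' })
const Just = value => Object.freeze({ tag: 'Just', value })

const matchMaybe = (maybe, cases) => {
  // Insist on both cases up front, whichever one the value turns out to be.
  if (typeof cases.Nothing !== 'function' || typeof cases.Just !== 'function') {
    throw new TypeError('matchMaybe requires both a Nothing and a Just case')
  }
  return maybe.tag === 'Just' ? cases.Just(maybe.value) : cases.Nothing()
}

const firstOfList = list => list.length === 0 ? Nothing : Just(list[0])

const describe = list => matchMaybe(firstOfList(list), {
  Nothing: () => 'Sorry, there was nothing there.',
  Just: something => 'Ah, found something: ' + something
})

console.log(describe([]))          // 'Sorry, there was nothing there.'
console.log(describe(['Mittens'])) // 'Ah, found something: Mittens'
```

Unlike returning `list[0]` directly, this distinguishes an empty list from a list whose first element happens to be `undefined`: `firstOfList([undefined])` is a Just, not Nothing.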

Cons List

This classic, recursive, closure-based, functional linked list is a workhorse of functional programming languages. Basically, a list is either the empty list Nil, or it is constructed from an element and a following list: Cons x xs.

sum type List = Nil | Cons anything List

myList = Cons(1, Cons(2, Cons(3, Nil)))

addList = someList =>
    match someList
        Nil         => 0
        Cons num xs => num + addList(xs)

log(addList(myList)) // 6
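
A plain-JavaScript rendering of the same list might look like the sketch below. It uses tagged objects rather than closures, and `matchList` is a hypothetical helper introduced only for this example.

```javascript
// A list is either Nil or a Cons cell holding a head and a tail list.
const Nil = Object.freeze({ tag: 'Nil' })
const Cons = (head, tail) => Object.freeze({ tag: 'Cons', head, tail })

// Hypothetical helper: dispatch on the tag and destructure positionally.
const matchList = (list, cases) =>
  list.tag === 'Cons' ? cases.Cons(list.head, list.tail) : cases.Nil()

const addList = someList => matchList(someList, {
  Nil: () => 0,
  Cons: (num, xs) => num + addList(xs)
})

const myList = Cons(1, Cons(2, Cons(3, Nil)))
console.log(addList(myList)) // 6
```

Because `addList` recurses once per element, a very long list could exhaust the call stack in JavaScript; an iterative loop over the cells would avoid that.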

Binary Tree

Sum types excel at processing tree structures, which is part of why they are very handy in writing compilers.

sum type Tree = Leaf | Tree anything Tree Tree

myTree = Tree(1,
    Tree(0, Leaf, Leaf),
    Tree(2, Tree(1.5, Leaf, Leaf), Tree(2.5, Leaf, Leaf)))

addTree = someTree =>
    match someTree
        Leaf                => 0
        Tree num left right => num + addTree(left) + addTree(right)

addTree(myTree) // 7
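
The tree translates to plain JavaScript in much the same way. As before, the tagged-object encoding and the `matchTree` helper are assumptions for illustration rather than any library's API.

```javascript
// A tree is either an empty Leaf or a Tree node with a value and two subtrees.
const Leaf = Object.freeze({ tag: 'Leaf' })
const Tree = (value, left, right) => Object.freeze({ tag: 'Tree', value, left, right })

// Hypothetical helper: dispatch on the tag and destructure positionally.
const matchTree = (tree, cases) =>
  tree.tag === 'Tree' ? cases.Tree(tree.value, tree.left, tree.right) : cases.Leaf()

const addTree = someTree => matchTree(someTree, {
  Leaf: () => 0,
  Tree: (num, left, right) => num + addTree(left) + addTree(right)
})

// The same match shape supports other folds, such as measuring depth.
const depth = someTree => matchTree(someTree, {
  Leaf: () => 0,
  Tree: (_, left, right) => 1 + Math.max(depth(left), depth(right))
})

const myTree = Tree(1,
  Tree(0, Leaf, Leaf),
  Tree(2, Tree(1.5, Leaf, Leaf), Tree(2.5, Leaf, Leaf)))

console.log(addTree(myTree)) // 7
console.log(depth(myTree))   // 3
```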

Conclusion

When I started taking notes for this article, I told some friends that it would be “a quick blog post.” As it turns out, there was a lot more I wanted to say about sum types than I first realized. If you’ve made it this far, congratulations; I hope you enjoyed it and are interested in trying sum types in your next JavaScript project. The nature of the language may make the attempt imperfect, but an imperfect improvement is still an improvement. And with the possible future inclusion of native pattern matching, sum types in vanilla JS may become even more viable.

Resources

Here is a partial list of resources I found helpful in researching the subject, and can recommend for further reading.

Language Documentation

ReasonML: Variant! [sic]. See also ReasonMLHub: Variant
Haskell: Algebraic Data Type
Elm: Union Types
F#: Discriminated Unions
Rust: Enums

JS Library Documentation

Folktale: Union (particularly good doc-article hybrid)
Union-Type
Flow: Unions

Articles & Article Series

Waleed Khan, Union vs Sum Types. Excellent article, really helped me understand the difference between these concepts.
Gabriel Gonzalez, Sum Types. Examples in Haskell.
Scott Wlaschin, F# for Fun and Profit: Designing with Types. Goes much further into how types can underpin the entire logic of an application. Also check out his talk on the same subject.
Joel Burget, The Algebra (and Calculus!) of Algebraic Data Types. Goes more deeply into the theoretical and mathematical aspects of ADTs.

Wikipedia

Words of wisdom.

Written by Gabriel Lebec