Private Type Brands in TypeScript

Tanner Engbretson
redox-techblog
8 min read · Jun 28, 2019

In my last post I explored the details of TypeScript’s structural type system: the flexibility it lets us enjoy, and its shortcomings in ensuring the overall safety of our code. What these shortcomings fundamentally boiled down to was a matter of control. Who controls what may or may not be considered a value of a given type? With structural typing, types are defined in terms of primitives (strings, numbers, etc.) and the way they are structured or combined (arrays, objects, etc.). A programmer who wishes to create an instance of a given type need only supply and arrange primitives into the shape specified by that type. Since the programmer in this scenario has unrestricted access to creating and arranging these primitives, they are free to create instances of these types any way they see fit. Control over the primitives and their arrangement belongs to the designer of the type, but control over their values belongs to the implementer. Want your Vehicle type to guarantee an even number of wheels? Where’s your “even number” primitive? Want your User type to guarantee that the email field ends in '@gmail.com'? Where’s your “Gmail” primitive? In a structural type system, the designer of the type is left with no control over value-level constraints beyond what is already guaranteed by the primitives.

Let’s compare all of this to how nominal typing typically works. Languages with nominal type systems also have primitive values like numbers, strings, etc. However, they often restrict the combination of these primitives into richer types to classes or structs. As with structural types, classes are defined by the arrangement of their component primitives. However, they have a very important differentiator: the constructor.

The constructor is a function that acts as the ultimate gate for what can or cannot be considered a value of a given type. Most languages require that, for any class, you define one or more constructor functions that take some set of primitives as parameters and, depending on the wishes of the class’s designer, assert the necessary conditions for their values. Let’s implement the aforementioned User type with its restrictions on the Email field in C# to illustrate how this works.
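A minimal sketch of such a constructor gate, written here as a TypeScript class rather than C#. The `Name` and `Email` fields follow the description in the next paragraph; the error message and the sample values are assumptions. Keep in mind that TypeScript still compares classes structurally, so this only illustrates the shape of a validating constructor, not a true nominal guarantee.

```typescript
// Sketch of a validating constructor. In a nominally typed language such as
// C#, this constructor would be the only way to obtain a User.
class User {
  readonly Name: string;
  readonly Email: string;

  constructor(name: string, email: string) {
    // The designer's value-level rule: only Gmail addresses are accepted.
    if (!email.endsWith("@gmail.com")) {
      throw new Error(`Not a Gmail address: ${email}`);
    }
    this.Name = name;
    this.Email = email;
  }
}

// A valid address passes the gate and an instance is created.
const alice = new User("Alice", "alice@gmail.com");

// An invalid address throws before any instance exists.
let rejected = false;
try {
  new User("Bob", "bob@example.com");
} catch {
  rejected = true;
}
```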

Here you can see that our User type has defined an arrangement for our primitives: it has a string called Name, and it has a string called Email. However, there is more to it than that! Not only must it have that arrangement, but value-level guarantees, provided by the designer of this type, must be true in order for our value to be considered valid or even for it to exist at all. Control over the values that comprise our type belongs to the designer of that type.

The motivations for this level of control are not arbitrary or ego-driven. What we are trying to do is allow developers to make the strongest of guarantees about structured data so that consumers of that data are free to do useful things without constantly re-asserting its validity. How tiring and downright inefficient would it be to have to assert that our string is a valid Gmail address every time we want to do something useful with the Email field? We would have to do all of this extra work just because the only knowledge we can convey across boundaries in code is that the value is a string. Nominal typing allows us to attach stricter guarantees, and therefore richer meaning, to the values that make up our types.

That all sounds great, so what’s the downside? Why would we ever use anything but nominal typing? Well, it all comes down to flexibility. My last post described how structural typing lets us build behaviors that specify only the bare minimum of constraints required for their input data. This leaves users of this functionality free to shape their data however they see fit, provided it satisfies the structural requirements set out by the creators of the behaviors they want to use. Languages with nominal type systems typically fall flat in this area. Often, the only tool developers are given to mix and match existing behaviors for their data is inheritance.

Inheritance is the ability for one class to take on and extend the structure and abilities of an existing base class. If a Chihuahua class is said to inherit from the Dog class we can be certain that all of the structural properties and behaviors for a Dog are present for our Chihuahua. That seems to get us pretty close to what we want, but in practice it ends up being restrictive or confusing to the point of frustration. In languages with single inheritance you are limited to having only one base class. When you’re defining a new type and you wish to enable the behaviors of two different base classes for it you have to either hope that one of these base classes is already a subclass of the other, or choose just one of them and miss out on some of the functionality you are after. In languages with multiple inheritance you do get the option to choose multiple base classes to inherit from. However, inheritance is a fundamentally hierarchical pattern, and with that comes the complication of combining behaviors from base classes that exist as different branches on the same hierarchical tree. This problem can be so ugly that it is referred to as the deadly diamond of death. I’ll pass on that, thanks.

This shouldn’t really come as much of a surprise. If you’ve ever gone beyond the canned examples like shapes or animals in a CS 102 class you’ve probably seen inheritance fail to model your domain in a near show-stopping manner. This is such a common issue with inheritance that experts mostly tell us not to even bother with it. And they’re right.

So where does that leave us? Do we have to make a choice between stricter, value-level guarantees and the ability to freely reuse functionality among related data structures? That’s not a trade-off I’m willing to make. If we want to find out how we might have our cake and eat it too we have to dig in and understand the foundational features of nominally typed languages that enable these behaviors and see if we can replicate them in a structural type system.

Let’s start by looking back at the constructor to see what’s going on. As we discussed before, a constructor takes some combination of values or primitives, optionally asserts that the values meet some set of conditions, and returns an instance of its type. These three things can easily be accomplished in a structurally typed language by using a function and a type definition. Let’s see what that would look like with our Gmail example from earlier:
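A minimal sketch of that function-plus-type pairing. The name `makeUser` comes from the discussion that follows; the lowercase field names and the choice to throw on failure are assumptions made for illustration.

```typescript
// The "arrangement" half of a constructor: a plain structural type.
type User = { name: string; email: string };

// The "gate" half: take primitives, assert the conditions, return an instance.
function makeUser(name: string, email: string): User {
  if (!email.endsWith("@gmail.com")) {
    throw new Error(`Not a Gmail address: ${email}`);
  }
  return { name, email };
}

// Going through the gate works as intended.
const alice = makeUser("Alice", "alice@gmail.com");
```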

This obviously doesn’t get us what we’re looking for, but why is that? Our makeUser function checks all the boxes, but when it comes to actually guaranteeing an instance of the type we defined has met our criteria it falls flat. Since the type and our gate function are not bound together in a class definition, our type is able to be referenced and created without the approval of our gate. What makes matters worse is that even if we were able to obscure our type definition by keeping it private in a module or namespace, we still would not get the kind of safety we are after. We must remember that in a structurally typed language the structure is the type, which means that even without access to the named definition, a user can construct a compliant instance by arranging primitives into the requisite shape without regard for our gate function. What we have is a gate with no wall.
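The "gate with no wall" problem can be sketched as follows. The `sendEmail` consumer and the sample values are hypothetical; the point is only that the compiler accepts an object that never passed through `makeUser`.

```typescript
type User = { name: string; email: string };

function makeUser(name: string, email: string): User {
  if (!email.endsWith("@gmail.com")) {
    throw new Error(`Not a Gmail address: ${email}`);
  }
  return { name, email };
}

// A consumer that trusts the type to mean "already validated".
function sendEmail(user: User): string {
  return `Sending to ${user.email}`;
}

// Nothing forces callers through makeUser: any object of the right shape
// is a User as far as the type checker is concerned.
const forged: User = { name: "Mallory", email: "mallory@example.com" };

// Hiding the name `User` would not help either. A bare object literal with
// the same structure is accepted wherever a User is expected.
const message = sendEmail({ name: "Mallory", email: "mallory@example.com" });
```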

The line seems to be pretty clear — we cannot place value-level guarantees on our types as long as two things remain true:

1. Types are equivalent if their structures are equivalent.

2. The designer of a type and the user of a type have symmetric access to primitives and their means of combination.

Challenging fact #1 is a non-starter if we want to keep our structural type system, so let’s focus our attention on breaking the symmetry outlined in #2.

The ECMAScript 2015 standard introduced a new type of primitive: the Symbol. There are many nuances to how Symbols work and how to use them, but for the purpose of this article we’ll focus on two specific aspects. First, every Symbol created with Symbol() is unique, meaning if I create a Symbol, it is impossible for anyone else to ever create one that is equal to mine. Second, similar to strings, Symbols can be used as keys for assigning values onto an object. If we combine these two attributes we tap into some powerful use cases that Symbols enable.
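Both properties can be seen in a short sketch. The variable names and the `sameSymbol` helper are illustrative assumptions; the helper exists because TypeScript treats the two constants as distinct `unique symbol` types and would typically flag a direct comparison between them.

```typescript
// Two symbols created with the same description are still distinct values.
const mine = Symbol("gmail");
const theirs = Symbol("gmail");

// Compare through parameters typed with the general `symbol` type.
function sameSymbol(a: symbol, b: symbol): boolean {
  return a === b;
}
const collides = sameSymbol(mine, theirs); // false

// A symbol can key a property, just as a string can.
const record = { label: "visible", [mine]: "only reachable with the symbol" };

const viaSymbol = record[mine];          // read back with the symbol itself
const stringKeys = Object.keys(record);  // ["label"]
const asJson = JSON.stringify(record);   // '{"label":"visible"}'
```

One caveat: reflection helpers such as `Object.getOwnPropertySymbols` can still list symbol keys at runtime, so the privacy that matters for the rest of this post comes from the type checker and module scope rather than from runtime secrecy.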

Unlike strings, Symbols provide us a mechanism for reading and writing data to and from objects in locations only we know to look. With this additional primitive, we have the ability to define a value that adheres to a structure that only we know how to create. This is half of the equation. In order to get the other half, our type system needs to be able to model these structures as usable types that not only account for the presence of Symbols as keys, but also make the same uniqueness guarantees that the runtime does. In version 2.7, TypeScript added support for declaring const-named properties on types, along with the unique symbol type that makes each const-declared Symbol distinct at the type level.
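A minimal sketch of a const-named symbol property in a type, assuming TypeScript 2.7 or later. The names `brand`, `lookalike`, and `Branded` are hypothetical.

```typescript
// Each `const` symbol declaration gets its own `unique symbol` type.
const brand = Symbol("brand");
const lookalike = Symbol("brand");

// A type may name the constant as a required property key.
type Branded = { value: string; [brand]: true };

// Only code that can reference `brand` can build a Branded value.
const genuine: Branded = { value: "ok", [brand]: true };

// An object keyed by a different symbol has a different structure, even
// though the two symbols share a description.
const impostor = { value: "nope", [lookalike]: true };

// Uncommenting the next line should fail to compile, because the required
// property at `brand` is missing from `impostor`:
// const bad: Branded = impostor;
```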

This addition gives us everything we need to implement our value-level guarantees with structural types. Let’s see what that could look like.

In our first file, let’s define the gate function and the required shape for the type we want it to guard. In addition, let’s create a symbol that we will combine into our type definition. This will ensure that a valid instance of our type must have a boolean value true at the location of our gmail Symbol.
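A sketch of what that first file could look like. The file name, the generic signature, and the choice to return `undefined` on failure are assumptions for illustration, not necessarily the original listing.

```typescript
// gmail.ts (hypothetical file name)

// Private: never exported, so no other module can reference this exact symbol.
const gmail = Symbol("gmail");

// The guarded type: an `email` field plus `true` at the private symbol key.
export type Gmail = { email: string; [gmail]: true };

// The gate: the only code able to attach the brand. It returns `undefined`
// for anything that is not a Gmail address (an assumed failure convention).
export function Gmail<T extends { email: string }>(
  value: T
): (T & Gmail) | undefined {
  if (!value.email.endsWith("@gmail.com")) {
    return undefined;
  }
  // Copy the caller's fields and add the brand. The cast stays inside the gate.
  return { ...value, [gmail]: true } as T & Gmail;
}
```

Because the function is generic over anything with an `email` field, it preserves whatever other fields the caller's object already has.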

The important thing to notice here is that we are exporting only the gate function, Gmail, and the type definition. Keeping the Symbol instance private is critical to this working. We now have a type, Gmail, that can be imported and used anywhere, but an instance can only be created via the Gmail function we created. Now let’s see what this looks like when we put it into action.
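A sketch of the consuming side. To keep the listing runnable on its own, a condensed copy of the gate is repeated at the top; imagine it living in its own module that exports only the `Gmail` type and the `Gmail` function. The `sendEmail` body, the `outbox` array, and the sample objects are hypothetical.

```typescript
// --- gmail.ts, condensed (only the type and the function would be exported) ---
const gmail = Symbol("gmail");
type Gmail = { email: string; [gmail]: true };
function Gmail<T extends { email: string }>(value: T): (T & Gmail) | undefined {
  return value.email.endsWith("@gmail.com")
    ? ({ ...value, [gmail]: true } as T & Gmail)
    : undefined;
}

// --- app.ts (hypothetical consumer; it would have no access to `gmail`) ---

// Requires proof of validation, but says nothing else about the shape.
function sendEmail(recipient: Gmail): string {
  return `Sending to ${recipient.email}`;
}

const user = Gmail({ name: "Alice", email: "alice@gmail.com" });
const subscriber = Gmail({ email: "carol@gmail.com", plan: "weekly" });

const outbox: string[] = [];
if (user) outbox.push(sendEmail(user));             // a User-like shape
if (subscriber) outbox.push(sendEmail(subscriber)); // a different shape, same guarantee

// Sidestepping the gate should no longer type-check. Uncommenting either
// line below should be rejected, since the property at `gmail` is missing:
// sendEmail({ email: "mallory@example.com" });
// const forged: Gmail = { email: "mallory@gmail.com" };
```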

As we can see, the unsafe sidestepping of our gate function has been prevented. We can now safely write functions like sendEmail that make clear their expectation to receive a valid Gmail address and have the type system enforce the fulfillment of that expectation. We can write sets of functions that presume value-level correctness and put the onus of providing correct data on the user of those functions. Furthermore, and most importantly, we have retained all of the flexibility we have come to expect and rely on from our structural type system! In this example our sendEmail function only requires its parameter to satisfy the Gmail constraint. This means any structure of data is allowed, whether it’s a User, or anything else. All that matters is that it has an email field, and that it passed validation through our Gmail gate function.

With this strategy we can safely enjoy the flexibility that a structural type system gives us while simultaneously letting the designers attach greater meaning to the structures they create.
