Published in

The Startup

Type System Savings

The one advantage dynamic typing pundits seem to bat around with little resistance is that types get in the way of their prototyping, or deciding how they plan to write a program. This cannot be true after the first handful of functions have been written; it is simply impossible for humans to track the explosion of possible function bodies dynamic typing allows.

I was fleshing out these ideas with my friend Louis van der Stam. We found the final numbers somewhat impressive…

Type Cardinality

Let’s first quickly brush over cardinality: let’s consider the type `Boolean` has cardinality `2` and that the type `Byte` has cardinality `256`. This is because there are `256`possible elements of type Byte… We will use the vertical bars like | to show this: `|Byte| = 256` . But in an dynamic languages values have no clear type and lets use say any variable has an infinite possibility of values so it can hold `|var| = ⍵`.

Validating values

Let’s pick a real-world example to continue elaborating on this topic and see some consequences of restricting how many values your code has to deal with, and your mind has to reason about… Let’s use validation as an example; we can choose to represent a validation as a function from some variable to some truthy response: `A => Boolean`.

The cardinality of a function is basically the count of all possible implementations, which you can think of as a table mapping all possible inputs to all possible outputs. This happens to correspond to `|Outputs|^|Inputs|`.

In a statically typed language; a validation of the kind mentioned earlier has a cardinality of `|A => Boolean| = 2^|A|` which is basically choosing `true` or `false` for each possible incoming value of type `A` into the function. On the other hand, in a dynamically typed world this function has a rather big space for its implementation `|fun| = ⍵^⍵` which is a rather big number…

This in a ways demonstrates that for a dynamic language you are always forced to look at the body of a function alongside its name, just to confirm it does what you expect. There are effectively no further sources of trustworthy information that can corroborate what is happening inside a function’s body. You must read the code!

Compact encodings

If dynamic typing gives you functions that can in essence do anything, then why would you restrict the possibilities of intent. In a strict and pure language the generic function `[A] A => Boolean` can basically only have two implementations `_ => true` and `_ => false` since there is nothing that can be done with the arbitrary input to produce a boolean value. This is extremely powerful! Everyone who has worked with a team or in a context where they can immediately know the implementation of the functions they are reading just from its enforced metadata, in other words: from just its type, then the developers on that team stop reading function bodies, or clicking through to the functions they are going to use, just to check it does what they expect. In practice, this is a huge time saver!

And that is just the beginning…

Let’s drive the point further by elaborating on our validation example. Let’s say we are validating some sort of `Motor` type that can house a single engine, either an `ElectricEngine` or a `CombustionEngine`, along with some other features we will ignore (see a valid example in the code sample below). We can think of two ways to specify what engine a `Motor` has. If we enforce the fact it can be an `ElectricEngine` (let’s pick `A` for short) or (`+`) a `CombustionEngine` (`B` is easier to type). We can then write the cardinality of a motor as `|Motor| = |A| + |B|`. Conversely on many common programming languages we have both properties and use null to say when it is one or the other `|Motor2| = (1+|A|) × (1+|B|)`. I recognize I’ve glossed over how I came about how to calculate the cardinality of these composites, to go more in depth please look at this article on Algebraic Data Types.

Let’s pick two cardinalities for the electric and combustion engines: `|A| = 5` and `|B| = 8` respectively. After doing that we get the following table:

You may have experience that in the product case, where one engine is nullable, there are additional illegal states, e.g. marking both engines as null, or the `40` possible combinations where both engines are specified together instead of separately. This makes the total possible `Motor` cases are just `13` instead of the `54` possible values for `Motor2`. In our validation function we need to check that these extra illegal states can be guarded against with some form of null check; but how many possible versions of our validation function `Motor2 => Boolean` are eliminated? and how do we know the implementer added these null checks? Are they unnecessary because the null checks are everywhere the motors can be constructed? These questions are moot if these states cannot be represented by the code in the first place. `1.8×10¹⁶` possible versions of our validate function disappear; just from disallowing illegal states!

If I’m allowed to re-use a generic functions I can shrink all the possible combination of functions I could potentially write to 289.

That is `289` possible ways to validate `Motor`s vs `1801439851000000` ways for `Motor2`. Which codebase is capable of housing more surprises? How many more places are there for bugs to hide when validating a `Motor2`?

Here are equivalent examples given the initial choice of `Motor` vs `Motor2`:

I argue there is a net saving of effort every time I need to read the codebase for `Motor` validation, and we know developers have been spending most of their time reading code. I believe when there is less to think about, and evaluate we win more time for actual writing!

by the author.

--

--

--

More from The Startup

Get smarter at building your thing. Follow to join The Startup’s +8 million monthly readers & +756K followers.

Rodolfo Hansen

Constructive Programming Advocate. Looking for new ways to leverage the connection between categories, logic, language, and lambdas…