Type System Savings

The one advantage dynamic typing pundits seem to bat around with little resistance is that types get in the way of their prototyping, or deciding how they plan to write a program. This cannot be true after the first handful of functions have been written; it is simply impossible for humans to track the explosion of possible function bodies dynamic typing allows.
I was fleshing out these ideas with my friend Louis van der Stam. We found the final numbers somewhat impressive…
Type Cardinality
Let’s first quickly brush over cardinality: let’s consider the type Boolean
has cardinality 2
and that the type Byte
has cardinality 256
. This is because there are 256
possible elements of type Byte… We will use the vertical bars like | to show this: |Byte| = 256
. But in an dynamic languages values have no clear type and lets use say any variable has an infinite possibility of values so it can hold |var| = ⍵
.
Validating values
Let’s pick a real-world example to continue elaborating on this topic and see some consequences of restricting how many values your code has to deal with, and your mind has to reason about… Let’s use validation as an example; we can choose to represent a validation as a function from some variable to some truthy response: A => Boolean
.
The cardinality of a function is basically the count of all possible implementations, which you can think of as a table mapping all possible inputs to all possible outputs. This happens to correspond to |Outputs|^|Inputs|
.
In a statically typed language; a validation of the kind mentioned earlier has a cardinality of |A => Boolean| = 2^|A|
which is basically choosing true
or false
for each possible incoming value of type A
into the function. On the other hand, in a dynamically typed world this function has a rather big space for its implementation |fun| = ⍵^⍵
which is a rather big number…
This in a ways demonstrates that for a dynamic language you are always forced to look at the body of a function alongside its name, just to confirm it does what you expect. There are effectively no further sources of trustworthy information that can corroborate what is happening inside a function’s body. You must read the code!
Compact encodings
If dynamic typing gives you functions that can in essence do anything, then why would you restrict the possibilities of intent. In a strict and pure language the generic function [A] A => Boolean
can basically only have two implementations _ => true
and _ => false
since there is nothing that can be done with the arbitrary input to produce a boolean value. This is extremely powerful! Everyone who has worked with a team or in a context where they can immediately know the implementation of the functions they are reading just from its enforced metadata, in other words: from just its type, then the developers on that team stop reading function bodies, or clicking through to the functions they are going to use, just to check it does what they expect. In practice, this is a huge time saver!
And that is just the beginning…
Addition over Multiplication
Let’s drive the point further by elaborating on our validation example. Let’s say we are validating some sort of Motor
type that can house a single engine, either an ElectricEngine
or a CombustionEngine
, along with some other features we will ignore (see a valid example in the code sample below). We can think of two ways to specify what engine a Motor
has. If we enforce the fact it can be an ElectricEngine
(let’s pick A
for short) or (+
) a CombustionEngine
(B
is easier to type). We can then write the cardinality of a motor as |Motor| = |A| + |B|
. Conversely on many common programming languages we have both properties and use null to say when it is one or the other |Motor2| = (1+|A|) × (1+|B|)
. I recognize I’ve glossed over how I came about how to calculate the cardinality of these composites, to go more in depth please look at this article on Algebraic Data Types.
Let’s pick two cardinalities for the electric and combustion engines: |A| = 5
and |B| = 8
respectively. After doing that we get the following table:

You may have experience that in the product case, where one engine is nullable, there are additional illegal states, e.g. marking both engines as null, or the 40
possible combinations where both engines are specified together instead of separately. This makes the total possible Motor
cases are just 13
instead of the 54
possible values for Motor2
. In our validation function we need to check that these extra illegal states can be guarded against with some form of null check; but how many possible versions of our validation function Motor2 => Boolean
are eliminated? and how do we know the implementer added these null checks? Are they unnecessary because the null checks are everywhere the motors can be constructed? These questions are moot if these states cannot be represented by the code in the first place. 1.8×10¹⁶
possible versions of our validate function disappear; just from disallowing illegal states!
If I’m allowed to re-use a generic functions I can shrink all the possible combination of functions I could potentially write to 289.
That is 289
possible ways to validate Motor
s vs 1801439851000000
ways for Motor2
. Which codebase is capable of housing more surprises? How many more places are there for bugs to hide when validating a Motor2
?
Here are equivalent examples given the initial choice of Motor
vs Motor2
:
I argue there is a net saving of effort every time I need to read the codebase for Motor
validation, and we know developers have been spending most of their time reading code. I believe when there is less to think about, and evaluate we win more time for actual writing!
by the author.