My love/hate relationship with type systems

There are only two hard things in Computer Science: cache invalidation and naming things. – Phil Karlton

Act 2: We Found Love in a Hopeless Place

Last time we checked, I ranted about how much TypeScript type system is like teenage me: an awkward mess. I’ve also hinted that dynamically typed programming languages usually allowed a more expressive programming style. What I’ve yet to talk about is how the “there’s no free lunch. Ever.” rule applies in that context.

Typing Disciplines 101

Before we delve into the rest of this opinion piece, let’s clear up the air a little bit.

Dynamically typed programming languages are not untyped or weakly typed. What differs about types between let’s say JRuby (Ruby on the Java virtual machine) or Java is that types aren’t explicitly stated in the source code. Which means that the JRuby interpreter guarantees you that your program will run if it finds no syntax error. In contrast the Java compiler guarantees you that your program will run if it finds that all types annotations matches and if it finds no syntax error. It’s a bit like memory management. C/C++ used to say that everyone should manage their own memory (the modern C/C++ standard says that smart pointers should manage memory… most of the time) where Java says that an algorithm should do this. Two different take to get a similar results. And it’s still a programming hot topic like space vs tabs. Which is a bit dumb.

Since all modern programming languages have to interact in some capacity with a C/C++ runtime, they often inherit of the data types constraints and semantics imposed by the C/C++ standard and whatever the hardware and OS kernel folks implemented. In short, any new programming languages have to represent numbers, dates, strings, record data types, pointer references, function dispatch tables, branch statement, collections data types, and whatever else in a way that can be represented by the underlying runtime environment, which is, more often than not, a C/C++ style of memory allocation over a x86 or ARM instruction set. Like Rich Hickey once said:

Only the runtime matters

I See What You Mean

We often forget that a programming language is first and foremost a language. Something that is meant for a human to use. All that ceremony about classes hierarchies, tabs vs spaces, K&R vs Allman, and type systems is meaningless to the computer. All it sees are bytes placed in a peculiar order that it can understand. It’s also meaningless to your employer unless you can prove him/her that it has an added value but that’s another story… We humans somewhat care because it allows us to communicate with our team members what we meant to instruct to our robotic underlings.

In that regard, dynamically typed programming languages have this shortcoming: your teammate may not be able to read and quickly understand the code you write. Because of the awkward mess I’ve got to deal with on a daily basis compiles to JavaScript mainly by erasing the type annotations, I’ve come to look at type annotations more as comments about the code its attached to than actual constraints that are be enforced by the compiler. This enforces the notion of what type annotations, comments, and documentation are rarely perceived as: codebase metadata. As you may know, metadata is data that brings a better understanding about data. In other words, metadata adds context to data upon which we can relate other pieces of data.

The work we do as programmers is to bring a piece of data in context with another piece of data. All that context building is obviously useful for our clients. But sadly our industry often fail to recognize that process is also useful for us. This is among the reasons why I feel that a software company that doesn’t actively encourage code documentation or doesn’t encourage their technical staff to actively share knowledge does itself a disservice in the long term. Source code is a liability that is only amortized by the presence of people who are able manage it. And by people I mean programmers. I’ve heard, but cannot prove, that they have been projects that failed because a developer that was the most knowledgeable about a critical part of a codebase had left for greener pastures.

I am not saying that statically typed languages actively help to prevent a project management cluster fuck. But for us the architects below, it somewhat eases the pain when those happen. At the cost of being slightly less expressive, they often have somewhat of an improved legibility over dynamically typed programming languages. I say somewhat, because statically typed programming languages really shine when their type systems often requires the equivalent of a PH.D or a truck load of deductive reasoning.

The Intellectual Ascension to Dependent Types

Earlier, I highlighted that types are metadata about code. While most languages are happy to just encode memory allocation sizes with their types, some languages take this a few steps further. By a few steps I mean that they allow you to constraint a type to a range of values. With a fair amount of pattern matching, languages like Agda or Idris are able provide a limited form of program verification via the type system. I’m still exploring this idea or at least the implications of it.

Show your support

Clapping shows how much you appreciated Philippe Lafontaine’s story.