Take a step back in history with the archives of PragPub magazine. The Pragmatic Programmers hope you’ll find that learning about the past can help you make better decisions for the future.

FROM THE ARCHIVES OF PRAGPUB MAGAZINE SEPTEMBER 2012

Thinking Functionally with Haskell: Types? Tests? We Need a New Word

By Paul Callaghan

PragPub
The Pragmatic Programmers
42 min read · Apr 7, 2022


In which we explore what modern type systems bring to the table.

https://pragprog.com/newsletter/

Imagine an approach to programming where you write down some description of what your code should do, then before running your code you run some automatic tool to see if the code matches the description. That’s test-driven development, you say!

Actually, this is what you are doing when you use static types in most languages too. Types are a description of the code's inputs and outputs, and the check ensures that inputs and outputs match up and are used consistently. Modern type systems — such as Haskell's and beyond — are very flexible and allow these descriptions to be quite detailed; plus they are not too obtrusive in use and often very helpful.

One point I’ll investigate here is how advances in types are converging with
new ideas on testing, to the point where (I claim) the old distinctions are starting to blur and starting to open up exciting new possibilities — hence my
suggestion that we need a new word to describe what we’re doing that is free from preconceptions and outdated thinking.

So put aside your bad experiences from Java, and prepare to be amazed!

Photo by Merve Sehirli Nasir on Unsplash

TL;DR

Apologies for length by the way — but it is a big topic and I wanted to give a high-level view of the whole area. Plus I’ve had to skimp on a few details to keep this introductory discussion readable. Later articles will fill in the gaps.

Here are the essential points, though:

  • Haskell’s type system is pretty flexible and mostly unobtrusive (unlike some static systems).
  • A good type system is a powerful design language, too.
  • We can also work with types in a “type-driven development” style.
  • Parametric polymorphism is about using the same code for everything.
  • Interfaces provide clean overloading and encourage and support approaches like DCI.
  • Often Haskell feels like a dynamically typed language, though with a good safety net in reserve.
  • Haskell has its limits, but the cost-benefit tradeoff is probably in our favor.
  • There are several options beyond Haskell, such as dependent types.
  • Dependently typed languages are an elegant combination of computation
    and reasoning.
  • Many TDD specs can be encoded as (dependent) types, and vice versa.
  • I believe the distinction between tests and types is blurring, and each side can learn much from the other to capitalize on this.

What Are Types For?

Types do sometimes get some bad press. Many people will say types are about nailing details down and trying to avoid errors. This sounds to me a bit dry and negative, too much like a strait-jacket, too conservative (if you read the recent Yegge rant).

Some type systems undoubtedly are too restrictive and just don't give enough return for the work required to use them (Java is a prime example), but I'm talking about Haskell's type system and above here, which is quite a different kettle of fish. In this context, I prefer to see and use types as both a design language and a safety net. That is, they are more about help and opportunity than constraint, and this is pretty much how the type systems have been designed into these modern languages.

Design-wise, I can sketch some of the ideas in my mind as types, add some code, and have the compiler tell me when I’m missing some details or trying something silly. Quite often, I only write down types for the hard stuff when I want to be sure about it, and I let the compiler work out the rest. In this mode, the type checker is definitely the slave and not the master! When my code gets past the type-checker, this gives high confidence that the code isn’t going to have problems at run-time, at least not in the aspects described via the types. It’s a highly valuable tool in development, with obvious similarities to TDD. Many functional programmers even refer to type-directed development (and observe that we’ve been doing it for a while…)

Type systems vary a lot, from minimal (like C) to cumbersome (like Java) to flexible (like Haskell) to awesome (like dependent type theory). The differences involve how articulate the descriptions (i.e. types) can be, and how much reasoning (or inference) power can be applied. That is, how much useful stuff we can say directly in the types, rather than in separate tests or comments or documentation, and how much the tools can exploit this richer knowledge. The more advanced systems offer new ways of using types as well, such as semantic-directed editing and increased automation of the routine aspects of programming.

A Concrete Example: Sorting

Let’s look at various types for a sort routine/method/function. We’ll look at how this is handled in several languages and consider what the type is saying. Obviously these library routines are reliable, etc. — the point here is what the types can tell us or help us do.

  • Ruby, etc. don't say much, maybe just to indicate what can be sorted by virtue of some object being able to respond to a method called sort, and there's a general expectation (though no requirement) that sorting will return an Array. We will, however, get run-time exceptions if the sorting can't proceed, e.g. with [2, "a"].sort it won't permit comparisons of Fixnum and String, or my personal favourite gotcha, [[], nil].sort.
  • C allows something like int sort(int *vals, int num_vals) for sorting an array of ints in place. Typically, we want to generalize our sort routine to work with arbitrary types, so will find in stdlib.h something like void qsort(void *vals, int num_vals, int size_vals, int comp_fn(void *, void *)), which allows in-place sorting of an array of items by using a supplied comparison function.
  • Java has several options, but the basic pattern for general sorting is void sort(T[] a, Comparator<T> c), which allows sorting of an array of some underlying element type T, using some comparison function on T values, wrapped up in a comparator object. The T, etc. is a placeholder in Java's so-called "generics" mechanism, and gets instantiated when the above code is applied to a concrete array. One effect of the generics is to guarantee that the comparison function is suitable for use on the element types. Earlier versions of Java provided a method sort(Object[] a, Comparator c) — this would allow sort to be called on an array containing anything. Notice that the generics version moves some of the expensive run-time checking to a single check at compile time, and provides a bit more confidence that the resulting array hasn't got funny data added.
  • Haskell's version is sort :: Ord a => [a] -> [a], which basically says "given a type 'a' for which ordering is defined, then we can sort a list of such 'a's to provide another list of the same element type." It says pretty much what the Java generics version does, albeit (IMHO) more succinctly and with no OO overhead. We could replace the Ord a => part with an explicit function argument, to give (a -> a -> Bool) -> [a] -> [a], i.e. we require a function that takes two arguments and indicates whether a swap of elements is needed. (The Ord a stuff is effectively wrapping up a particular comparison function; we'll come back to this. A minimal definition of a sort with the Ord a type is sketched just after this list.)
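
To make the Haskell type concrete, here is a minimal sketch of a function with the Ord a => [a] -> [a] type: a simple insertion sort, purely for illustration (the library's real sort is implemented differently).

sort' :: Ord a => [a] -> [a]
sort' = foldr insert []
  where
    insert x []      = [x]              -- a single element is already sorted
    insert x (y:ys)
      | x <= y       = x : y : ys       -- x slots in before y
      | otherwise    = y : insert x ys  -- otherwise keep looking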

But consider now, what is missing? Nothing? We’re saying something about the data structures expected as input, and about the output type, plus ensuring that the ordering test can be used with the data. However, it still can’t rule out bad definitions like sort xs = [] (always return an empty list) or sort xs = reverse xs (return a reversed copy of the list). Do we want to rule these out?

If we're sensible, then we'll probably sketch out some tests, maybe boiling down to checking assertions like sort [] == [] (empty list case) or sort [2,1] == [1,2] or sort [2,1,2] == [1,2,2]. We can even define some properties using the QuickCheck library and run automatic checks on a few hundred random examples to try to uncover counter-examples. Perhaps we'll look at the code and identify some edge cases that need particular attention (depending on the algorithm), or want to check that the implementation has the stability property, i.e. sort_by first [[1,"b"],[1,"a"]] == [[1,"b"], [1,"a"]] and similar cases.
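
For instance, a couple of QuickCheck properties for sorting might look like the following sketch (the property names are just illustrative):

import Data.List (sort)
import Test.QuickCheck

-- the output should have the same length as the input
prop_sameLength :: [Int] -> Bool
prop_sameLength xs = length (sort xs) == length xs

-- adjacent elements of the output should be in order
prop_ordered :: [Int] -> Bool
prop_ordered xs = ordered (sort xs)
  where
    ordered (x:y:rest) = x <= y && ordered (y:rest)
    ordered _          = True

main :: IO ()
main = do
  quickCheck prop_sameLength   -- each check runs 100 random cases by default
  quickCheck prop_ordered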

Do these have to be encoded as separate tests? Why can’t we move some of this thinking into the code itself?

It turns out, we can! And in several ways. Possibilities include "Design by Contract" or some extensions to Haskell. However, I'm going to focus on a more general approach that subsumes all of the others. It is based on "dependent type theory," and effectively it's about merging types and programs into a single language. With this advanced language, we have the tools to encode our tests inside our program in a sensible way.

We can also encode conditions or invariants that are hard to express as tests — such as requiring that the code will never lose elements or duplicate any, and that indeed it does always return a sorted result.

First, we need to understand how Haskell’s type system works, and see how it can be used for good in programming.

The Language of Haskell’s Type System

I suggested that one important feature for a type system is how much it allows us to describe, so we’ll start looking at the language of Haskell types first. It’s perfectly accurate and appropriate to describe a type system as a language — it has words, grammar rules, meanings. Languages differ in what they can be used to express (or articulate), by virtue of the forms of “sentence” available.

Propositional logic can only talk about atomic propositions, like "Rubies are red," and combinations of these via the logical operators, e.g. "Rubies are red and not(Rails is written in C)." Predicate logic allows some more flexibility, by allowing us to quantify over entities, hence enabling sentences like "For all m, m is mortal -> father_of(m) likes Rails" (i.e., every mortal's father likes Rails — probably untrue but we're not talking truth here, just talking about what we can articulate). In this example, we speak of the variable m being bound by the for-all quantifier. We can have many other kinds of logic, e.g. modal logic where we talk about possibility vs necessity, or 2nd order logic where we can quantify over properties of elements, or — well, whatever we want.

Simple type systems (e.g., Pascal) can be like propositional logic, only able to say basic things, such as “this method takes two Ints and returns a String.” Typically, we’d express this as (Int, Int) -> String, using the thin arrow symbol as a kind of implication — in the sense of “if you give me two Ints then you get a String back.”

Programmers can add new type names, e.g. a Person record to contain name, age and address, and use these type names among the ones supplied in the language.

C allows references to functions to be passed around, and these can be represented using nesting, e.g. (Int, Int -> Int) -> String for taking an Int and a function (from Int to Int) and returning a String. This is OK for basic work, but clearly more flexibility is needed.

Haskell's core type system is a restricted form of predicate logic. It allows variables in the type expressions to stand for arbitrary types, enabling a feature called parametric polymorphism. This kind of polymorphism (for there are several) is about using the same piece of code whatever the actual input types. For example, taking the length of a list has type [a] -> Int, and can be computed without considering the type of list elements: all it needs to do is walk through the list structure counting the nodes. It's completely irrelevant what is in the list, and no choice of element type can affect how the counting works! Similarly, list reversal has type [a] -> [a]: again, it is parametrically polymorphic and able to work with any list. Notice that the type variable 'a' appears twice: this means that the element type of the input list is the same as that of the output list, so it can never (e.g.) convert a list of Ints to a list of Strings.
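
As a tiny sketch, a hand-rolled length shows why the element type is irrelevant (the standard length is essentially this):

-- works for any element type: the elements themselves are never inspected
len :: [a] -> Int
len []     = 0
len (_:xs) = 1 + len xs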

Stepping back a bit, this "it works for everything, and works consistently" view is often precisely what we want to articulate, and the benefits for checking code should be clear. For example, the mapping operation on lists has type (a -> b) -> [a] -> [b], meaning in English, "give me a function from some a to some b, and a list of some 'a's, and you'll get back a list of some 'b's." Notice how it says something useful about what map does, that we no longer need to explain in the documentation? Furthermore, when we use map, the compiler can check automatically that its use is appropriate.

Similarly, function composition f . g = \x -> f (g x) has type (b -> c) -> (a -> b) -> a -> c, translated as “if we can convert ‘b’s to ‘c’s and ‘a’s to ‘b’s, then we can convert ‘a’s to ‘c’s, for whatever choice of a,b,c.” Again, the type clearly says what the requirements are and indicates what we can expect back.

We can use polymorphism when defining our own data types too, e.g. data SamePair a = MkSamePair a a to have a constructor that combines two values of the same type. You saw some other examples last time. Let's now look at how the type language is used.

Type Inference and Type Checking

A language by itself isn't much use — we need some reasoning rules and algorithms to make it useful. That's what we'll cover here. The core inference rule is actually quite simple: if you have a function f that expects some value of type A and returns something of type B, and you have a value x of type C such that A and C match in some way, then f(x) is acceptable and has type B.

f : A -> B        x : C
-----------------------   if A = C
        f(x) : B

Notice that this rule doesn’t “do” anything — it just says how we can relate bits of information. So next we can think about the two key algorithms: type inference and type checking.

Inference takes a piece of code and works out what the type could be, and most versions walk through a syntax tree and apply the above rule from the bottom up.

Checking takes a piece of code and a type, and decides whether the type is appropriate. There are several techniques we can use, such as using the code alone to infer a second type and seeing if it matches the original, or a top-down tree walk that basically uses the above rule in reverse (which is also great for inferring omitted type information).

Many compilers (quite reasonably) use a mixture of inference and checking when compiling the code, so don’t get too hung up on the differences. Just be aware of the options, and that we have several techniques to work with the types in our code.

Extra features like type variables fit easily into this framework, with the main change being that our A = C test might need to do unification (just as in Prolog) on two expressions, and anything learned from the unification needs to be applied to B. For example, where reverse :: [a] -> [a] and foo :: [Int], then A is [a] and C is [Int], which matches with the substitution a := Int. So we can conclude reverse foo :: [Int] because the B part is updated from [a] to [Int] because of the substitution. If you’re happy with this example, then you’ve just understood the heart of Haskell’s type checking. (If you want an interesting challenge, see if you can write this simple inference algorithm in about 30 lines of Haskell.)
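
If you fancy that challenge, here is one possible sketch of the application rule with unification in Haskell, for a tiny made-up type language (no occurs check or other niceties, and the names are mine, not from any real library):

-- a toy language of types
data Ty = TVar String | TCon String | TList Ty | TFun Ty Ty
  deriving (Eq, Show)

type Subst = [(String, Ty)]

-- apply a substitution to a type
apply :: Subst -> Ty -> Ty
apply s (TVar v)   = maybe (TVar v) id (lookup v s)
apply s (TList t)  = TList (apply s t)
apply s (TFun a b) = TFun (apply s a) (apply s b)
apply _ t          = t

-- very naive unification
unify :: Ty -> Ty -> Maybe Subst
unify (TVar v) t = Just [(v, t)]
unify t (TVar v) = Just [(v, t)]
unify (TCon a) (TCon b) | a == b = Just []
unify (TList a) (TList b) = unify a b
unify (TFun a1 b1) (TFun a2 b2) = do
  s1 <- unify a1 a2
  s2 <- unify (apply s1 b1) (apply s1 b2)
  return (s2 ++ s1)
unify _ _ = Nothing

-- the application rule: f : A -> B applied to x : C is OK if A unifies
-- with C, and the result type is B under the resulting substitution
inferApp :: Ty -> Ty -> Maybe Ty
inferApp (TFun a b) c = fmap (\s -> apply s b) (unify a c)
inferApp _          _ = Nothing

-- e.g. inferApp (TFun (TList (TVar "a")) (TList (TVar "a"))) (TList (TCon "Int"))
-- gives Just (TList (TCon "Int")), matching the reverse foo example above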

Some Examples

foo1 text = unlines (map reverse (lines text))

Will the above definition pass the type checker? What is its type? Given that lines :: String -> [String] and unlines :: [String] -> String, then it can infer that text should be a String, can approve the use of map, and conclude that the result is going to be a String, i.e. conclude foo1 :: String -> String. No programmer annotations needed, and our safety net has passed the code for use.

Let's try some type-driven development now. Suppose I want a function group_on that works a bit like Ruby's group_by, e.g. where (1..10).group_by {|x| x % 2 == 0 } gives {false=>[1, 3, 5, 7, 9], true=>[2, 4, 6, 8, 10]}. For the sake of making the example a bit more complex, I'm going to do it via a wrapper around one of the Haskell library functions. (It could also be done as a fold loop that adds one element to a hash table on each step.) A Haskell List library function groupBy :: (a -> a -> Bool) -> [a] -> [[a]] can group adjacent equal elements, e.g. groupBy (==) "Mississippi" gives ["M","i","ss","i","ss","i","pp","i"], but it's not quite what we want, so we need some wrapping. Let's get the type system to help us.

To start with, what’s the type we want? Informally, we split a list of something into a hash or map, where the key is the result of some operation and the value is a list of the original items that gave that value. Notice that we’re not talking concrete types so we can expect to make this (parametrically) polymorphic — we can use the same piece of code no matter what the actual payload data is. Instead of a Hash, I’ll use a simpler type of a list of pairs, e.g. for the above example we expect [(false, [1,3,5,7,9]), (true, [2,4,6,8,10])]

First attempt: list in, list of pairs out, so we come up with something like group_on :: [a] -> [(b, [a])] as the type to aim for. But we need a way to generate the keys in the result, so need to pass in a function for this. Hence group_on :: (a -> b) -> [a] -> [(b, [a])] as the next guess. But — how do we know when two ‘b’ values are the same? We’ll need this to be able to group properly. Basically, we don’t know yet. At present, all we know is how to get ‘b’ values from ‘a’ values (with the first argument) and that we want to put ‘b’ values as the keys in the result.

This is actually a key point about polymorphism and worth exploring. Polymorphism in types encodes the minimum we can expect or require, and does not let us know anything else.

More concretely, such polymorphic code can’t mess around with the data it is working on, because it knows very little about it. This is great for enforcing modularity and such-like.

So how do we get to compare ‘b’ values? Well, we have to give the code a way to do it, and to keep it simple, I’ll pass in another function. Hence we now have group_on :: (b -> b -> Bool) -> (a -> b) -> [a] -> [(b, [a])]. Notice how the type now shows the minimum we have to supply in order to use it, and how it succinctly expresses the main aspect of what it’s going to do — convert our list into a hash. (Of course, it does not say everything — we’ll discuss that later.) Note how it’s also acting like part of a specification as well. What we have makes sense, and so next we can develop some code to match the type.

In an ideal world, our Haskell code editor could use the type information to help us build the definition from top to bottom, with us doing the creative parts and it managing the small, inferable details. In our example here, we know we want to use the groupBy library function wrapped up in some way, so we could start by writing ? (groupBy ? (? input)), where ? marks the bits we don’t know yet, and see what the editor infers for the unknowns — after which we can attack each unknown in turn. (This is a divide and conquer approach, of course.) The above is basically saying we want to do something before and something after the groupBy, but we’re not sure yet what it is.

We do have this technology now for dependent type languages like Agda, but as far as I know, it hasn’t quite made it down to Haskell yet (which is a shame…).

Back to the old-fashioned ASCII text file mode!

However, there is a nice trick to get some targeted support from your typechecker: instead of your unknowns, just use the value () (i.e., empty parentheses) where you don't know what to put. This is Haskell's dummy or placeholder value, the one we use when some value is needed but we don't want to use anything meaningful. It's also the zero case for the range of notations for pairs ('a', True), triples ('a', True, 3), quads ('a', True, 3, [Nothing]), etc., i.e. () is a tuple with nothing in it. When you use it for unknowns in your code, the type checker will produce a type error indicating that ()'s type (which is also written as ()) does not match the type expected at that part of the code. So we could write the following in our Haskell file and compile it:

group_on :: (b -> b -> Bool) -> (a -> b) -> [a] -> [(b,[a])]
group_on same eval inp = () (groupBy () (() inp))

And instantly get a type error complaining that () :: () (the right-most one) does not have a type a -> b: it's being used as a function in the context of () inp (i.e. applied to the "inp" argument), but it doesn't match the general a -> b type pattern for functions. We can then start trying to refine the () values with more concrete versions, and work through the consequences. For example, we might guess () should be a map of something, so replace it with map (). Then, realize that the new () should be a function, hence map (\x -> ()). At this point, the checker is satisfied with that part and moves us on to the argument to groupBy. This technique might help some of you.

There are other techniques. My first article said “think data!,” so we can think about the input (a list of ‘a’ things), and the fact that we want to group them on the basis of some ‘b’ result, then consider various ways we can do this while using the library’s groupBy function.

If you’re not happy with the abstract view, it’s perfectly OK to think of concrete data instead and work through some examples on paper, maybe even draw pictures of the various stages of the operation. Also useful is to work with a concrete example in the interpreter’s REPL and see what you can build up. This technique really does work in FP, since many of your definitions will turn out to be short-ish anyway! It also works in Ruby, of course, so do try it if you haven’t already. So let’s do that. We know we want groupBy in there somewhere, and we can still use the () trick, so we can write the following at the prompt and see what it does:

groupBy () [1..10]

A type error of course, but it suggests that () should be a function of two arguments. Next iteration:

groupBy (\a b -> ()) [1..10]

Another type error, because () needs to be a Boolean, so let's think about a suitable test here. By luck, Haskell has an even function in its standard library. We want to group elements if they are both even or both not even, so try this.

groupBy (\a b -> even a == even b) [1..10]

Finally we get something, but [[1],[2],[3],[4],[5],[6],[7],[8],[9],[10]] is not what we want. But, groupBy groups adjacent equal elements, so we need to do a bit more work, maybe trying to sort the list first.

Time passes.

Eventually, we find that the following works for our example:

map (\xs -> (even $ head xs, xs))
$ groupBy (\a b -> even a == even b)
$ sortBy (\a b -> even a `compare` even b) [1..10]

And then we can adapt it to code by pulling out the concrete details, hence:

group_on same eval inp
= map (\es -> (eval $ head es, es))
$ groupBy (\x y -> eval x `same` eval y)
$ sortBy (\x y -> eval x `compare` eval y) inp

We can simplify this in various ways, e.g., avoiding repeat runs of the eval function, or using more overloading (as introduced below), but this will do for now. (One extra detail has slipped in, that we need to know how to order 'b' values, and I'm quietly allowing Haskell's overloading to handle this.) It passes the type checker, and we can see pretty clearly how it works — a combination of sorting into a suitable order, grouping adjacents, then post-processing the result. One for you to consider: what testing should we do now?
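
A quick sanity check at the interpreter prompt (assuming groupBy and sortBy have been imported from Data.List) reproduces the behaviour we wanted at the start:

ghci> group_on (==) even [1..10]
[(False,[1,3,5,7,9]),(True,[2,4,6,8,10])]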

Comfort Break

Here's my second favourite joke. An eminent professor in computer science was once asked which was the best programming language. He (or maybe she) pondered for a while, then answered:

“Graduate student.”

Serious point though: we’re certainly not there yet, and we — everyone — have plenty more to explore first. Always reflect on what you’d like in a language rather than sticking to what existing languages give you and soldiering on. Thinking about type systems is part of this.

Here’s a relevant and poignant example from Java. If you’ve used Java, you might have needed the .clone() method a few times. Recall how it gets used though — typically like this:

Foo orig = new Foo();
Foo copy = (Foo) orig.clone();

Why is that cast needed on the second line? Is it just because that’s the Java way and you’ve got used to it? Or does it point to a weakness in the language? It is rather annoying to have to write down information that is very obvious to the programmer, which also adds to the “noise” in the program and obscures the “signal.”

Sadly, it is a language weakness: Java's type system imposes certain constraints in order to get its version of inheritance to work "safely," and this means that inherited methods get restricted types: an overridden method must return the same type as the method in its ancestors.

Like most interesting aspects of life, designing a type system is a bit of a balancing act or tradeoff, of juggling flexibility and articulacy with being able to retain useful properties and still be usable by programmers.

Some languages have a good balance, others not, and I’d place Java in the latter camp: the benefits are quite modest, and the costs too high. I would like to put Haskell in the first camp.

However, Java's issues don't mean that type checking in OOP is inherently problematic, or that the only alternative is dynamic checking — even if some people try to sell you this argument. There is much good work in the area that develops alternative approaches. I recommend Kim Bruce's Foundations of Object-Oriented Languages (2002) as a useful starting point: it has a highly readable first section which surveys the main languages, including some illuminating discussion of Java's design and implementation, then proposes several attractive alternatives to Java. A key aspect of Bruce's approach is keeping data and functionality separate, making serious use of interfaces for the latter.

More recently, Bruce has been a key mover in the design of Grace, which is intended as a solid teaching language for OO — and some of his ideas appear there too.

Interfaces and Type Classes

Interfaces, in the sense used here, work on several levels. Here’s a standard example from Haskell that will be familiar to Java programmers — the “comparable” interface. I’ll explain it in two stages, first an underlying equality test then the wider less-than-or-equal test.

class Eq a where
  (==) :: a -> a -> Bool

class Eq a => Ord a where
  (<=)    :: a -> a -> Bool
  compare :: a -> a -> Ordering   -- LT or EQ or GT

The technical term for entities like the above is “type class”, intuitively identifying a set of types that provide the listed functionality. I prefer the more informal term “interfaces”, since what we’re doing is describing some functionality that can be assumed for some type, or a promise of what a value in that type can provide. When you have that interface or promise, you can then write code that uses that interface.

The first definition describes an interface called Eq that promises a binary operator == that takes two values and returns a boolean. We can use this operator freely in other code, e.g. counting how many items in a list are equal to a given target, by scanning the list and selecting (i.e. filtering) which items are “the same” as some target, i.e. return True when compared to it.

count_same target list = length [ x | x <- list, x == target ]

There are other ways to write this definition, e.g. length . filter (\x -> x == target), but that’s a bit off topic. Now, how flexible is this code? Which types can we use it with? Well, the only thing we need to know or assume is that the underlying type has == defined for it. For the whole definition, we can say informally: we can count how many values in a list are the same as some target, as long as the target and list elements are the same type, and assuming we can compare values of that type. More formally, expressing this as a type in Haskell we have:

count_same :: Eq a => a -> [a] -> Int

… which literally says, assuming Eq is defined for some ‘a’, and given a value of type ‘a’ and a list whose elements are in the same base type, then we can produce an Int. The “fat arrow” is used to separate the “preconditions” part from the main type.
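
For example, a quick illustration at the prompt:

ghci> count_same 3 [1,3,2,3,3]
3
ghci> count_same "a" ["b","a","c","a"]
2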

Like with the parametric polymorphism above, this type signature can be
inferred automatically by the Haskell compiler — we don’t have to work it out.

This is a huge difference from Java, where the compiler is just checking the type information and so the programmer needs to write in all of the type information explicitly. Haskell instead is doing some type inference first, to work out what the type should be, then does some type checking to ensure that your code is being used appropriately.

I’ll say it again, because it is very important: for a lot of code in Haskell, you can just write down the code and the compiler will infer what the types should be, then check that the types are being used consistently. You do not need to write down lots of type information. It’s a bit like programming in Ruby (little or no type baggage) and enjoys some of the benefits of dynamic languages — but with a good safety net in the background.

There's one more detail to explain from the definition of Eq, and that's the type signature of Eq a => a -> a -> Bool required for (or that can be assumed for) the equality test. That is, assuming Eq is defined for some type 'a', then the equality test can compare two 'a' values and return a Bool. So True == False is fine typewise, since replacing 'a' with Bool gives Bool -> Bool -> Bool and this matches how we're using it. But, we can't do True == (True,True) since Bool and (Bool, Bool) are different types. It doesn't make sense (in this context) to compare different things for equality. Note also that this restriction is being checked at compile-time, not at run-time, and we certainly don't have to write the code to check this ourselves as we do in Java. Nor do we need to add type casts — which is another limitation of Java's type system. Incidentally, Kim Bruce's OO-style LOOM and LOOJ languages lift this restriction by adopting similar techniques to Haskell's type classes. (For more info, see my article on DCI and interfaces.)

So, that’s equality. The next interface provides ordering tests, and it’s introduced as a sub-interface of equality. That is, we’re extending the equality interface with a <= test, with the effect that we can assume == is defined if <= is defined. Alternatively, if we want to define <= then we have to define == too. It’s a reasonable convention, but you can define a different version if you want, i.e., it’s not permanently baked in. (The type class also prescribes an often-useful general comparison test compare which distinguishes between less-than, equals and greater-than cases. This is the detail that snuck into my group_on example — as a result of wanting to sort on the ‘b’ values.)

There are several ways to equip types with suitable definitions for the interface functions, similar to Java's "implements" mechanism. Firstly, we can do it explicitly by providing code for the functions. Below, we just give a direct definition of equality testing for Booleans. For certain standard type classes, the compiler can automatically derive conventional definitions of the relevant functions (including the code below), thus saving us work (and avoiding mistakes), though we won't use that here:

instance Eq Bool where
  True  == True  = True
  False == False = True
  _     == _     = False
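
Alternatively, for types of our own the conventional instances can simply be derived, e.g. (a small illustration):

-- the compiler writes the Eq (and Ord, Show) code for us
data Colour = Red | Green | Blue
  deriving (Eq, Ord, Show)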

Here’s a slightly more complex instance definition, concerning equality of lists:

instance Eq a => Eq [a] where             -- 1
  []     == []     = True                 -- 2
  (x:xs) == (y:ys) = x == y && xs == ys   -- 3
  _      == _      = False                -- 4

Line 1 indicates that this code provides a definition of equality for a list with element type 'a', assuming that we already have equality defined for 'a' itself, which is quite reasonable. The definition by cases says (2) empty lists are equal, (3) that when both lists are non-empty then they are equal only if the two head (or first) elements match and the list tails (the rest) match as well, and finally (4) any other case is not equal.

Notice how == is being used with different types on line 3 — the first is comparing elements of the list, and the second is comparing lists. This is overloading, where the same name (==) is being used safely with different types, and in particular, the language mechanism is selecting appropriate code to run for each case depending on what the types are. Haskell’s class mechanism was originally designed to handle such overloading in a convenient and sound way, though we’ve since found it has other great uses too.

Consider what happens when we compare [[[True], [False]]] and [[[True],[True]]]? I.e., a 3-deep nested list of boolean values, or [[[Bool]]]. Behind the scenes, Haskell will construct a “dictionary” that contains code for doing the comparison, and effectively uses the list equality code three times plus equality on Bool to build code that runs the precise code we need. Put another way, it uses the type information to build the code it needs, without us having to spell it out. This adds a lot of power and flexibility.

What do these interfaces provide?

I think there are several related benefits for programming:

  • Firstly, it’s a great documentation facility — it allows us to clearly say what we’re assuming for certain concepts, and to say what assumptions are behind the code using the interfaces. This alone is worth the price of admission(!)
  • Rationalization by programming to an interface: we say exactly what we can assume, and our code is only able to use what we’ve said it can. In particular, it limits the scope for exploiting other details of the data — which greatly helps with modularity and security.
  • There are benefits for abstraction, reuse and maintainability too — by assuming only as much as we need, then the code is less likely to contain unnecessary details, and hence be more suitable for use elsewhere; plus, it being simpler then it’s going to be easier to change (in fact, if suitably abstract then maybe no changes will be needed, since it has isolated various aspects from each other).
  • Safety — e.g. compare some uses of modules in Ruby, where code is included and has to rely on the free variables (or external references) in code being available in the class it’s included into. The interface says precisely what is assumed, and the type checker ensures only that is used. (This kind of finger-crossing in Ruby is a detail I find particularly concerning — it is quite a bad code smell in my view. I don’t believe we have to accept this weakness.)
  • Convenience & overloading: quite often we want to use the same concept, like equality, with various types, but we don’t want to pollute our namespace with type-specific names. One name should do for all. Haskell’s type system retrieves the particular piece of code to use with a type, just by examining the type of the values involved.
  • Ad hoc polymorphism: Parametric polymorphism is about using the same code for everything, and can be contrasted against a kind of polymorphism where everything “behaves” the same by virtue of responding to the same interface, usually via type-specific code. This is known as “ad hoc” polymorphism, in the sense of “determined by case.”

Ruby can simulate some of the behaviour, but do notice the value of being explicit and clear about interface use — and not relying on certain implicit features of Ruby. So, is it worth adding some kind of explicit interface feature to Ruby? Is the tradeoff favourable, particularly if DCI or similar patterns get more popular? Answers on a postcard please.

It’s useful to consider what kind of language feature this is, and how it works.
Firstly, it's not magic, in the sense that we can represent the underlying mechanisms inside Haskell itself and do some of the leg-work ourselves. Interfaces are basically records (or dictionaries) whose fields are (mostly) functions, and dependence on interfaces equates to requiring relevant records to be passed in to provide the needed implementations. We can build such records as constants, or use functions to convert records for small types into records for larger types (i.e. the list equality code above). The type-checker contains an inference component that ensures that relevant dictionaries get created. Finally, there is some syntactic sugar to allow these dictionaries to be used smoothly and consistently (with some efficiency too).
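
Here is a rough sketch of that dictionary-passing view, written in plain Haskell (the names are illustrative, not what the compiler actually generates):

-- an Eq "dictionary" is just a record holding the equality function
data EqDict a = EqDict { eq :: a -> a -> Bool }

-- a constant record: the instance for Bool
eqBool :: EqDict Bool
eqBool = EqDict (\x y -> (x && y) || (not x && not y))

-- a function from a small dictionary to a bigger one: the instance for lists
eqList :: EqDict a -> EqDict [a]
eqList d = EqDict go
  where
    go []     []     = True
    go (x:xs) (y:ys) = eq d x y && go xs ys
    go _      _      = False

-- comparing [[[Bool]]] values uses the list converter three times over eqBool
eqBool3 :: EqDict [[[Bool]]]
eqBool3 = eqList (eqList (eqList eqBool))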

In summary, this is a mechanism that can mostly be programmed within the language, and just uses some extra support to make it more usable and convenient. I thought it worth highlighting this, just to help you visualize how such ideas might apply in other languages, e.g. it’s not too big a job to simulate it in Ruby.

Some Real-World Examples with Interfaces

There has been a lot of interest recently in DCI (Data, Context, Interaction). DCI is an architectural idea developed by Reenskaug — who also proposed MVC — for simplifying and decoupling components in OO programming, basically by encouraging separation of an object’s data from the various roles that it plays, and only equipping the object with the role functionality it needs in a particular context. For example, a person could have a role as an employee, and a role as a parent, and the employee part of the application will rarely need to know about the parent role.

This DCI approach echoes some of the ideas from Kim Bruce’s work in its separation of data and functionality. It’s reassuring to see the same core idea arise from different approaches; from Reenskaug trying to make OO programs more intelligible, and from Bruce trying to rationalise and improve OO type systems.

Now, we can write Ruby code in a DCI style with some extra legwork. Roles can be encoded as modules, then imported into particular instances to equip that instance with the appropriate functionality. We reduce the size of model classes, and gain reusable components too. It works, but, language-wise, this seems a bit “Heath Robinson.”

Can we do better? Can we add type information in a useful way that also aids checking of the code and doesn’t eat into the flexibility? I believe yes to both. We can set up Haskell-style interfaces to make clear what we are assuming about the source data, and type-checking ensures that we use these interfaces appropriately. Furthermore, the dictionary building process automatically assembles the relevant combination of roles needed for any piece of code. (You can find more detail here.)

The next example was inspired by a talk on Hexagonal Rails at the 2012 Scottish Ruby conference. Delivered by Matt Wynne (of Cucumber Book and Cucumber Recipes fame), Steve Tooke and Kevin Rutherford, this talk discussed the need for avoiding deep hierarchical architectures and encouraging a flatter style that is conceptually simpler and also leads to less coupling in code. More modularity and less coupling tends to make code easier to maintain and to understand. They also presented some experiments in restructuring Rails code along these lines, and the experimentation continued into a popular follow-on session where groups of us tried out our own ideas.

It struck me how such experiments and related thinking were being limited by the constructs available in Ruby. The attempts at abstracting controller functionality had to be represented directly in code, were partly being obscured by code details, and there was no easy way to take a more abstract view. Put another way, a good type system is a powerful language for design too, and in the discussions the lack of such a system was telling.

My response was to start sketching out some code that used interfaces and polymorphism to enforce a clean separation of concerns. The persistence role can be managed via an interface that just allows saving and loading of data. Validity of data can be separated off to a layer independent of persistence. Mapping to URLs can also be abstracted out.

Finally, the update action at the controller level can just pull together the functionality from the interfaces (aka roles in DCI terms) and provide a simple connection between them. Such code would only know that (a) it can build some domain-specific data from the form, (b) it can try to persist it or get an error message back, and (c) it can determine the URLs for the next step. Such code will never be able to access domain details or business logic.

By virtue of the overloading, we can also use the same piece of code for most or all controllers in the app. Here’s a brief and informal excerpt.

-- provides ability to save etc.
class Valid a => Persist m a where
  save :: a -> m (Either ErrorMsg a)

-- allows mapping to a URL
class Url r where
  show :: r -> Url

[...]

create_ constr args
  = do res <- save (constr args)                  -- 1
       case res of                                -- 2
         Right r  -> redirect (show r)            -- 3
         Left msg -> render (template (Just msg)) -- 4

Line (1) builds an object from the incoming params, given a suitable
constructor; line (2) checks the outcome of persisting the new object; then if
OK (line 3) jump to the show page, else (line 4) send the user back to the
form with some error message.

Notice how the details of persistence are very separate from details of Url
routes, and there’s no chance of interference. The controller layer is restricted
to mediating between the core app (for building data and saving it) and its
web interface. Quite naturally, the interactions are flatter, and indeed only
use the minimum info allowed through the interfaces. Notice the DCI feel to
this too, where the data’s roles are separate and only combined when required.

It will be interesting to convert this approach back into Ruby code.

Pros and Cons — the 80–20 Rule

So, that covers the main ideas from Haskell-98’s type system. Let’s evaluate
it.

I described the design of type systems (and programming languages) as a
balancing act: you want the articulacy (freedom to say interesting things) but
want to retain some useful automation, ease of use, and conceptual consistency. It is very rare to get both.

Haskell manages a pretty good balance though, I believe. The type system is
pretty versatile, and only requires explicit hints when you’re using complex
overloading. What can’t Haskell do, or do nicely?

Well, most of the things I'll cover soon under dependent types! But there are some aspects of traditional OOP that don't translate nicely, primarily the ability to build lists of objects of unrestricted mixed types ("heterogeneous lists"), i.e. we can't write [2, 'a', False] in standard Haskell. There are some extensions that allow various cases of such lists, each with their own advantages and disadvantages. If we know in advance which types to expect, then we can code it up as a tagged union and do the decoding ourselves. There are other techniques like reducing the values to a common interface, so they all look the same and respond to the same methods, even if they are different underneath.
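
For example, if we know the list will only ever contain those three types, a tagged union is a one-liner (a sketch, with made-up constructor names):

-- a tagged union covering the types we expect in the "mixed" list
data Item = I Int | C Char | B Bool
  deriving Show

mixed :: [Item]
mixed = [I 2, C 'a', B False]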

It is appropriate to invoke an 80–20 rule here.

In my experience, comparing Haskell to its mainstream peers, (much) more than 80% of my code fits adequately inside Haskell and the remaining fraction isn’t too painful, certainly no deal-breakers — so I’d rather have the convenience and benefits for the 80% part and live with a few awkward bits for a small section of the code. It’s certainly not worth discarding the benefits for the 80% and moving to a dynamic language just because of some issues with a minority part of the program. The 80–20 rule also applies to the programming side of Haskell too — more than 80% of my code works nicely in a pure functional setting, and the awkward stuff takes up less than 20%, and often is much less.

So, what does this say about dynamic typing?

My claim is that Haskell & Co are approaching such a level of flexibility and unobtrusiveness that they are becoming a credible alternative, and many of the traditional objections to static types and traditional justifications of dynamic types are starting to lose their force and need to be re-evaluated.

Some of these objections are a bit wooly, e.g. drawing conclusions about Haskell from conclusions about Java. One recent Ruby book spends a few pages arguing in favor of dynamic typing because it doesn’t need extensive type annotations. It’s a good advertisement for Haskell too(!)

There are still some phenomena that even the very advanced type systems can’t handle well, but these tend to be rare or contrived, and on balance, I personally would prefer help with the 80% or more of my program that can be covered, and do the best I can on the rest. The alternative, of abandoning help on the 80% just to accommodate the minority aspects, is not too appealing from an engineering standpoint.

Now, what do you think?

Beyond Haskell: Dependent Types

Haskell is something of a “sweet spot” in language design, balancing a reasonably flexible type system with useful inference properties, but it is by no means the best we can do. So what else is there?

Language designers continually experiment with modest extensions to Haskell's basic system, to see if they can eke out any extra functionality without making the language too hard to use, and mileage varies here.

Instead, let's look at the next big step in language sophistication, where we allow types to become full first-class citizens of our language. The Haskell we've seen so far still keeps the world of data separate from the world of types, i.e. types are a description of happenings in data-land and they do not meet, and this puts some limits on what we can articulate.

Dependent type theory lifts many of the limits. Data and code can appear in types, and types can appear inside code as data. We end up with a single uniform language, not two (data vs types), and can start to do some very interesting things in this uniform language. Quick, let’s see some practical examples!

reverse :: (A:Type)(n:Nat) Vec A n -> Vec A n

Haskell’s reverse just promised to give a list of the same type back. With dependent types, we can include information about list length too, hence encode that reverse preserves type and input size.

The signature literally says, given some type A (for elements) and a length n, then reverse takes a vector (our name for a sized list) of size n and returns a vector of the same size. So it’s a guarantee that we don’t add or lose elements.

How about appending one vector to another?

The following says that appending vectors of size m and n gives a vector of
their combined size, m + n!

vappend :: (A:Type)(m,n:Nat) Vec A m -> Vec A n -> Vec A (m + n)

So, notice how we've taken a step up in the kinds of information we can represent inside our types, with consequences for what the compiler can do with it! We get more articulacy in data definitions too, e.g. to build sized vectors we need only the following: a vnil constructor that produces a vector of size 0, and a vcons constructor that adds an element onto a vector of size n to get one of size (n + 1).

vnil :: (A:Type) Vec A 0
vcons :: (A:Type)(n:Nat) A -> Vec A n -> Vec A (n+1)

It’s now impossible to get the size of the vector wrong, and we can never use vnil where a vector of size 1 or more is expected (since 0 != n + 1). This has implications for pattern matching too, e.g., if we define the head function for a non-empty vector (size n + 1) then the vnil pattern is impossible and never needs to be handled.
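
For instance, the head function just mentioned could be given a type like the following (in the same informal notation), and then a vnil case is simply impossible:

vhead :: (A:Type)(n:Nat) Vec A (n+1) -> A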

This next bit may seriously blow your mind.

Propositions Are Types, and Proofs Are Programs

We can use any values of arbitrary complexity inside type expressions, not just simple values like numbers. We can even encode logical propositions too, and use these to express pre and post conditions, and more. For example, the following represents a property of equality, that when two things are equal then what we can do with one thing we can do with the other.

foo2 :: (A:Type)(P:A -> Type)(a,b:A)Eq A a b -> P(a) -> P(b)

We can do more than just encode logical propositions: we can work with proofs of them entirely inside the language as well. If you’ve done any logic before, you may recognise the following as a statement of transitivity of implication, i.e. “if B implies C, and A implies B, then A implies C.”

(B -> C) -> (A -> B) -> (A -> C)

How do we “prove” this? The usual way is to find the combination of inference rules that yield the above statement as the final conclusion. Here’s a better proof:

f . g = \x -> f (g x)

No, this isn’t a typo. It is the definition of function composition. Recall that its polymorphic type in Haskell is (b -> c) -> (a -> b) -> (a -> c). Look familiar? This is no accident. There is a striking similarity between how functions work and how logical implications work, and we can use this as a way to encode proofs of propositions as programs. It has a name too, in fact several names because the idea appeared in several places independently around 60 years ago. I’ll use the conventional “Curry-Howard Correspondence” for now.

Pause to consider the huge significance of this idea!

Checking validity of a proof is then a case of checking the type of the program. Finding a proof equates to developing a program with the required type (go tease some mathematicians!). We can also think of the program as a way to generate proofs of the conclusion from proofs of the hypotheses: e.g., if we have a proof of B -> C and one of A -> B then we can use the definition of function composition to build a proof of A -> C! So function composition is a bit like a theorem we can apply to get from one stage of reasoning to another.

How about mathematical induction? Ever noticed that it is simple recursion? We can write down the principle as a type that says a lot of what we understand about induction: we have some property P to establish for all numbers; then a proof that P holds for 0 (the base case); and a proof that if P holds for some m then it holds for m + 1 (the step case); finally, if given some n then the overall result is a proof of P for that n.

nat_induction :: (P : Nat -> Prop)
                 (base : P 0)
                 (step : (m:Nat) P m -> P (m+1))
                 (n : Nat) P n

What does the proof of this look like? It’s a recursive function that works through n to generate the proof it requires! I.e., nat_induction P base step 3
yields step 2 (step 1 (step 0 base)).
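
Written out in the same informal style, the proof is just this recursive definition:

nat_induction P base step 0       = base
nat_induction P base step (m + 1) = step m (nat_induction P base step m)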

So, the more powerful language allows us to encode complex logical propositions as (dependent) types and the programming language inside it allows us to develop and use proofs, all inside the same language, no external tools needed. This really does represent a language that combines — in a clean, elegant, and powerful way — both computation and reasoning.

Another Look at Sorting

We started off looking at the type of sort, and now know enough to see what we can really do with dependent types.

       A : Type        n : Nat        v : Vec A n
       lteq : Sigma[f : A -> A -> Bool] is_ordering(f)
  -----------------------------------------------------------
  sort A n lteq v : Sigma[w : Vec A n] is_permutation_of(v,w)
                                       && is_sorted(w,lteq)

Don’t panic! Let’s take it piece by piece. As before, sorting is polymorphic, so we can use it on vectors of any element type. Input length can be anything, and the result contains a vector of the same length, i.e. length is preserved. We also have an argument lteq that supplies the information about comparing A values — and as you may have guessed, this will be our less-than-or-equal test, where a result of True means “don’t swap” and False means “do swap”.

The extra pieces are to do with properties we require from the input, and properties that will hold for the result. For the lteq function, not just any 2-ary predicate will do, e.g. using a function that always returns True will not produce a sorted result. Briefly, the construct Sigma[x:T]P(x) is pairing some data x with some more data P(x) that depends on x. Above, we’re pairing the lteq function with a proof that it has the properties required for less-than-or-equal. And in the result, we get the result of sorting plus proofs that the output is a permutation of the input (i.e., we’re rearranging data) and that the result really is in the sorted order implied by our choice of lteq.

So yes, this is more detailed than the types we saw at the start, but if these are the properties of sorting that we care about, then no further testing is required. The type checker has it covered. And if we change the code then either the code still type checks, or we have to justify why the change is safe by modifying the code so it does type check.

Are dependent types more verbose? Yes, because there is more going on. However, there are ways to hide away some of the obvious information and push the core ideas to the fore. In the above, A and n are easily inferred from the context, and thus rarely shown, so leaving the important inputs of the comparison function and the vector to be sorted. One thread of research in this area looks at other kinds of abbreviation mechanism to help with practical programming, such as making it easier to bridge between simple data and more complex types (which was one of my specialties).

We’re also making great progress with tool support. It’s often suggested that type information can be used by code editors to make informed decisions about editing support, and this is particularly true with dependent types. There is much more information available in a processable format (i.e. not just in comments or in the programmer’s head), and this can be used to great effect: really powerful type-directed development. Such tools begin with some statement of what we want to develop, represented as a type, then we use a mix of backwards and forwards reasoning to develop the program. We can leave some details as unknown too, kind of what we simulated in Haskell with the () value, but here the tools provide much more powerful handling of unknowns, inferring immediately obvious values and providing kinds of automatic inference for the more complex steps.

Here’s a quick example: developing a proof of (B -> C) -> (A -> B) -> (A -> C). This type is set as our initial goal, and the corresponding proof (i.e. program) as the first unknown. We can shift the arguments into the context to get f : B -> C, g : A -> B, x : A with a new goal of ? : C. That is, we need some combination of f,g,x that generates a C value. We can fill in the full answer if we want, or we can build it up in pieces. For example, we can suggest f ? as the solution to the goal, and get a second goal ? : B, i.e. the argument we need to pass to f. That goal is solved by g x. The editor will then show the final result, that our proof/program is \ f g x -> f (g x). Some editors can also do advanced analysis on the structure of goals and fill in some of the code automatically, such as presenting us with only the data patterns that can occur and leaving the right-hand sides to us.

These tools are very powerful, and have been used to build complete formal proofs of real results like the Four-Color Theorem. They also represent a very different and powerful way to program, and have much potential.

If you want to know more about dependent types right away, I suggest you start looking at the Idris language for a language in the Haskell style. There
are more advanced languages (and tools), such as Agda or Epigram or Coq or
Cayenne, though they can be a bit steep as an introduction.

Back to Earth

I hope you appreciate now that advances in type systems are starting to converge with ideas in TDD. More work is needed, of course, but the exciting possibilities are clear.

What can types learn from TDD? What can TDD learn from types? Personally, I'm in a strange position, of using TDD for Rails work but also being aware of the possibilities in types. It's, um, interesting. And annoying, at times. In the long term, I envisage a framework where the distinction between types and tests isn't so significant, and we are all able to work flexibly and confidently with good support from our tools. I have no idea what we should call this though!

In the meantime, I think work on types can benefit from TDD in several ways:

  • be encouraged by the great ideas and range of work in this area,
    particularly real-world code
  • alternative perspectives on code verification and development practices
  • lots of case studies to think about, analyze, and maybe adapt/translate

And how about this for TDD?

  • Stronger DSL use to encode the properties we care about, and making that DSL more flexible. Specs encoded as types tend to be more succinct than TDD specs, and I believe it’s worth considering how the DSL for specs can be streamlined in a similar way.
  • Encouraging more decomposition of code to promote testability. Types get unwieldy if we try to do too much in a piece of code, so functional programmers often break such code into pieces and develop them independently. It’s well known in TDD that some code is more testable than other code, typically code that is smaller, more modular, less reliant on side effects. Can we promote this in TDD by orienting the DSL for specs toward such code, making the more complex cases less attractive?
  • Approach to side effects. Side-effects can be a pain in type checking, and it seems people are realizing this in tests too, as they make it harder to isolate the things that need to be tested. Again, can the DSL be designed to promote this aspect of testability?
  • Other approaches to mock objects? Mocks seem to cause a lot of confusion. The purpose is to set up fake inputs to some test with predictable behaviour such that certain paths in the tested code are exercised. Put another way, it’s setting up some preconditions or assumptions under which the test should be run. We can do this in a dependent types setting in a more abstract way by passing in hypotheses of how such objects behave, e.g. passing in a value of type (x:A) Eq (f x) (x + 3) indicates that some f adds 3 onto its input, and we can use this hypothesis inside the proof of the test to get values from uses of f. So, instead of building objects that have some run-time behavior, we instead frame that behaviour via hypotheses. This view may also help simplify the DSL for mock-like behavior.
  • Why do tests have to be in a separate file? Is this a practical detail or a theoretical detail? In the types world, we believe the types and code belong together. Is this possible for tests?
  • Using tests to drive code generation. Given some type signature, the tools can sometimes predict what structure the corresponding code should have, and this is something dependent types people often use to cut down on work. Can the same be done for TDD specs? It seems feasible, e.g. to parse a TDD spec and generate as much of the corresponding class as possible, maybe including code fragments, error messages, and helpful comments, to simplify the development work and side-step certain transcription errors. Changes in the spec could automatically flag diff conflicts in the code. (I wish I had time to try this.)

Indeed, considering ideas such as these may pave the way to better inter-operability in future.

Final thought: how far can existing tests be translated to types? Or existing types be translated to tests? I think the overlap is significant. That says something.

About the Author

Dr. Paul Callaghan rides a big, fast motorbike, and suggests that this informs his programming style too. Some people say bike riding is too risky, but you can manage the risk in various ways. Defensive riding means looking ahead, anticipating problems, and actively minimizing risk by avoiding trouble. One can program like this too.

Paul also flies big traction kites and can often be seen being dragged around inelegantly on the beaches of North-east England, much to the amusement of his kids. He blogs at free-variable.org.

Cover from PragPub magazine, September 2012


The Pragmatic Programmers bring you archives from PragPub, a magazine on web and mobile development (by editor Michael Swaine, of Dr. Dobb’s Journal fame).