Some Notes on Haskell Pedagogy

Steven Syrek
18 min readMay 5, 2017

--

This article is meant to accompany What a Haskell Study Group is Not, where I offer what wisdom I have on how best to collaborate on learning this many-layered language. Here, I broadly outline what a pedagogical interaction might look like. The point isn’t to force you into a straightjacket but to suggest ways of learning — and teaching — by doing rather than explaining. Of course, some explanation is always necessary, but all the speeches in the world aren’t going to help anyone learn Haskell. Students don’t need lectures, and they don’t need convincing. They need hands-on experience and as much of it as possible. They need to focus on code and take up residence in the REPL. Ultimately, they need to tame the compiler and make the language their own. You may consider what follows a model for how you might actually run a study session. Obviously, I can’t reproduce a lived experience in written form, but I hope to give you some idea of the process I myself try to follow when working through a coding exercise.

We would like to define a custom List datatype and implement its Functor instance. Let's start with the datatype:

data List a = EmptyList | ListElement a (List a)

The | operator indicates that this is a sum type, and the presence of (List a) in the ListElement data constructor indicates that this type is recursively defined. This information will be relevant later. For now, let's test our type by making sure we can actually use it to create the values we expect:

> testList = ListElement 1 (ListElement 2 (ListElement 3 EmptyList))
> testList

<interactive>:3:1: error:
• No instance for (Show (List a0)) arising from a use of ‘print’
• In the first argument of ‘print’, namely ‘it’
In a stmt of an interactive GHCi command: print it

We forgot to derive an instance of Show for our List datatype. This looks like a trivial mistake, but it's exactly the sort of error a beginner is likely to make. Actually, seasoned programmers also make these sorts of mistakes all the time, but for a Haskell newbie, it's at least an easy error message to parse and an easy bug to fix:

data List a = EmptyList | ListElement a (List a)
deriving Show

When we reload our source file and try again:

> testList
ListElement 1 (ListElement 2 (ListElement 3 EmptyList))

It works! Next, let’s scaffold a Functor instance for our List:

instance Functor List where
fmap = undefined

This also compiles. But what if we tried to break this instance on purpose?

instance Functor (List a) where
fmap = undefined

Since our datatype is a List a, it may seem sensible to implement Functor for List a, too. But when we try to compile this code, we get another informative error:

list.hs:8:19: error:
• Expecting one fewer argument to ‘List a’
Expected kind ‘* -> *’, but ‘List a’ has kind ‘*’
• In the first argument of ‘Functor’, namely ‘List a’
In the instance declaration for ‘Functor (List a)’
Failed, modules loaded: none.

Depending on where you (or your fellow study group members) are in your exploration of Haskell, this is either an opportunity for a discussion of kinds or the premature introduction of an advanced topic. But one useful aspect of Haskell is the ease with which you can explain any given concept by analogy with other, simpler concepts. Since “everything is a function,” and there are few arbitrary language constructs, you can learn unfamiliar ideas using what you already know as a basis — and do so effectively. The above error is not too hard to contend with, even for beginners. So how should we explain it?

A kind is the “type of a type.” Think of the List type as just another function that takes an a as an argument, but at the level of types. That is, List takes an a as an argument and returns a type List a, just as a regular function, a -> b takes an argument of type a and returns a value of type b. Functor is a typeclass for datatypes with kind * -> *. That pretty much looks like a function. So we can presume that a functor instance "takes" a function-like-thing (that is, a type constructor) with kind * -> * as an argument and returns a functor with kind *, just as we might have a function at the value level (a -> b) -> c that takes a function a -> b and returns a value of type c. We can examine kinds as easily as types in the REPL:

> :k List
List :: * -> *
> :k List Int
List Int :: *

The unapplied List constructor has kind * -> *, which is what we want, while a fully applied List a, here made concrete with Int, is only kind *, which doesn't work for Functor. That's why we declare our instance with List alone and not the fully-parameterized List a.

The type signature of fmap provides similar information:

> :t fmap
fmap :: Functor f => (a -> b) -> f a -> f b

The f a it takes as its second argument and the f b it returns are just function applications at the level of types. So List a and * -> * match, just as, for example, ListElement 1 matches Num a => List a -> List a. In the first case, we must apply the type constructor List to the type variable a in order to create a type. In the second, we apply the data constructor ListElement to 1 in order to create a value of that type. Functor can only be implemented for types with kind * -> *, because given a type constructor, such as List, and a type variable, such as a, it needs to work for any type that might inhabit that a. If you made that a concrete in the definition of your Functor, it would no longer be polymorphic. In fact, you couldn't do anything at all with it. Much like the EmptyList data constructor, it would stand only for itself—and that's not what we want.

As a side note, this is a good place to point out that it is understandable if people learning Haskell become confused as to the distinction between kinds, types, and values. But don’t confuse terminology for complexity. The designers of the language had to assign names to things just so we could talk about them. And they really are just names. There is already a proposal to eliminate the distinction between types and kinds, and some languages dispatch with the barrier between types and values, too. As far as the Haskell compiler is concerned, everything really is a function. But we humans need to distinguish between different sorts of functions. The analogy between value-level and type-level functions should be enabling, not obfuscating. Address incomprehension and despair with patience and explication, and the language will help you.

Moving on to the actual implementation of our functor, we should start by returning to the type of fmap. Remember from above that it is Functor f => (a -> b) -> f a -> f b . Reading the type informs us that the arguments to our own fmap function must be a function and some value that is a functor, indicated by the type constraint on f. Let's fill that in:

fmap f xs = undefined

If we can’t figure out what to do next, we can plug in a “hole” and see what the type checker tells us:

fmap f xs = _

Attempting to compile that, we get a nice error:

list.hs:5:15: error:
• Found hole: _ :: List b
Where: ‘b’ is a rigid type variable bound by
the type signature for:
fmap :: forall a b. (a -> b) -> List a -> List b
at list.hs:5:3
• In the expression: _
In an equation for ‘fmap’: fmap f xs = _
In the instance declaration for ‘Functor List’
• Relevant bindings include
xs :: List a (bound at list.hs:5:10)
f :: a -> b (bound at list.hs:5:8)
fmap :: (a -> b) -> List a -> List b
(bound at list.hs:5:3)
Failed, modules loaded: none.

As the error message indicates, our hole has the type List b, which is what we expect, since that's the return value of the entire function. The two parameters we defined, f and xs, also have the types we expect: a -> b and List a, respectively. The key here is the type of the entire fmap function, specialized to our List type: fmap :: (a -> b) -> List a -> List b. If we need visual reinforcement during our exercises, we can go ahead and use a compiler extension and insert that right into our code:

{-# LANGUAGE InstanceSigs #-}

data List a = EmptyList | ListElement a (List a)
deriving Show

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap f xs = _

Making those higher-kinded types more concrete can really help us when writing these implementations. Here, we want to apply the function a -> b to the a trapped inside of List a so we can transform it into a List b. So we need to get to that a somehow. When Haskell gurus talk about "type tetris," this is what they mean—i.e., it isn't rocket science even if it feels like a puzzle. This is a good place to demonstrate destructuring:

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap f (ListElement x xs) = _

Now that we’ve broken down the second parameter xs into its constituent parts, the data constructor ListElement and its two arguments x and xs, standing in respectively for the a and List a in our data declaration, we compile again and get a new error:

list.hs:8:31: error:
• Found hole: _ :: List b
Where: ‘b’ is a rigid type variable bound by
the type signature for:
fmap :: forall a b. (a -> b) -> List a -> List b
at list.hs:7:11
• In the expression: _
In an equation for ‘fmap’: fmap f (ListElement x xs) = _
In the instance declaration for ‘Functor List’
• Relevant bindings include
xs :: List a (bound at list.hs:8:25)
x :: a (bound at list.hs:8:23)
f :: a -> b (bound at list.hs:8:8)
fmap :: (a -> b) -> List a -> List b
(bound at list.hs:8:3)
Failed, modules loaded: none.

This looks similar to the error above, but now we see that we have successfully extracted the a from List a and bound its value to x. With this information, we can easily rewrite our fmap definition to get rid of the hole:

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap f (ListElement x xs) = ListElement (f x) xs

This definition seems plausible. We have a function f of type a -> b and we have a variable x of type a. So we apply f to x and remember to wrap the whole thing in a new List in order to match the data constructor ListElement a (List a). We stick the xs on the end, because maybe we're not sure what to do with that yet. Try it out:

list.hs:8:49: error:
• Couldn't match type ‘a’ with ‘b’
‘a’ is a rigid type variable bound by
the type signature for:
fmap :: forall a b. (a -> b) -> List a -> List b
at list.hs:7:11
‘b’ is a rigid type variable bound by
the type signature for:
fmap :: forall a b. (a -> b) -> List a -> List b
at list.hs:7:11
Expected type: List b
Actual type: List a
• In the second argument of ‘ListElement’, namely ‘xs’
In the expression: ListElement (f x) xs
In an equation for ‘fmap’:
fmap f (ListElement x xs) = ListElement (f x) xs
• Relevant bindings include
xs :: List a (bound at list.hs:8:25)
x :: a (bound at list.hs:8:23)
f :: a -> b (bound at list.hs:8:8)
fmap :: (a -> b) -> List a -> List b
(bound at list.hs:8:3)
Failed, modules loaded: none.

Let’s focus on the most informative part of this lengthy compiler complaint:

Expected type: List b
Actual type: List a

We got a List as output, but it's a List with elements of type a, not type b. Why? We transformed the x from an a into a b, but we didn't do anything with the xs. They could contain the same type as List a, but they aren't necessarily the same given the definition of fmap, which specifies a transformation from a to b. That is, a could be the same type as b, but it doesn't have to be, and we need to allow for that possibility. We have to write our function in a way that allows, therefore, for the transformation of xs. Since we defined our List type recursively, we must also recursively define what it means to map over that type. Recur, rinse, repeat:

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap f (ListElement x xs) = ListElement (f x) (fmap f xs)

And that will compile. Now we can test it:

> testList = ListElement 1 (ListElement 2 (ListElement 3 EmptyList))
> fmap (+1) testList
ListElement 2 (ListElement 3 (ListElement 4 *** Exception: list.hs:8:3-59: Non-exhaustive patterns in function fmap

Oh no! Just when we thought we had figured out the hard part, the compiler reminds us that we still have work to do. Notice where the exception kicked in. We actually did apply the (+1) function successfully to our three list elements, but then the compiler didn't know what to do with the EmptyList it encountered. We have to deal with that case, which is the base case of our recursion. So:

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap f EmptyList = EmptyList
fmap f (ListElement x xs) = ListElement (f x) (fmap f xs)

Let’s try that same test one more time:

> fmap (+1) testList
ListElement 2 (ListElement 3 (ListElement 4 EmptyList))

Huzzah! We have successfully implemented Functor for our List datatype. Take a time out to appreciate all those compiler errors. They were probably annoying at first. A lesser programmer might have complained about that last one, in particular, because naturally we can always account for "edge cases" ourselves. More likely, anyone going through this process will see the value in working with a compiler that forces you to be as correct and comprehensive as possible at the outset, not later on when you won't even remember what you were thinking when you wrote this code.

And seriously, isn’t pattern matching beautiful?

If you really want to check your work — or if you just couldn’t figure this one out — there is even a way to have the compiler solve the problem for you. Simply add another language pragma to the top of your file and derive your instance automatically:

{-# LANGUAGE DeriveFunctor #-}

data List a = EmptyList | ListElement a (List a)
deriving (Functor, Show)

Then, when you reload the file into ghci (or compile it with ghc), use the
-ddump-deriv flag:

> ghci list.hs -ddump-deriv
GHCi, version 8.0.1: http://www.haskell.org/ghc/ :? for help
[1 of 1] Compiling Main ( list.hs, interpreted )

==================== Derived instances ====================
Derived instances:
instance GHC.Base.Functor Main.List where
GHC.Base.fmap f_a1tg Main.EmptyList = Main.EmptyList
GHC.Base.fmap f_a1th (Main.ListElement a1_a1ti a2_a1tj)
= Main.ListElement (f_a1th a1_a1ti) (GHC.Base.fmap f_a1th a2_a1tj)

Almost like magic (actually, with an algorithm), the compiler generates a Functor instance all by itself. Although the code looks a bit obscure, it is identical to what we wrote ourselves above, just a tiny bit less readable. The first line is the same as ours, except this one is fully scoped. Whereas we take Functor for granted in our own code, because it's an identifier that is either part of the Prelude or one which we've explicitly imported, the compiler specifies the entire chain of dependencies for everything. So Functor becomes GHC.Base.Functor and our List becomes Main.List, since the file where List lives is considered the Main module. In the following three lines, GHC.Base.fmap is likewise the same as the bare fmap in the instance we derived, and Main precedes both constructors, EmptyList and ListElement. The various variables that begin with f_ are congruent with the f variables we use to represent the first argument to fmap—obviously, the compiler follows the same naming conventions we do! The rest of the pattern should be straightforward enough to disentangle:

GHC.Base.fmap f_a1tg Main.EmptyList = Main.EmptyList

is the same as

fmap f EmptyList = EmptyList

and

GHC.Base.fmap f_a1th (Main.ListElement a1_a1ti a2_a1tj) = Main.ListElement (f_a1th a1_a1ti) (GHC.Base.fmap f_a1th a2_a1tj)

is the same as

fmap f (ListElement x xs) = ListElement (f x) (fmap f xs)

You may have to squint, but you can see it: the compiler agrees with us. This trick won’t work for every typeclass, since some of them allow for more than one possible instance, but it’s worth knowing about for when it does — and, for this reason, worth learning the basic ins and outs of the compiler, too.

Now, if we want to be pedantic (and who doesn’t want to be pedantic), we can replace the parameter f in the base case of our newly compiler-approved instance with a wildcard, because we only really care about the pattern match on EmptyList:

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap _ EmptyList = EmptyList
fmap f (ListElement x xs) = ListElement (f x) (fmap f xs)

And that still works. We can also mess around with the infix version of fmap to demonstrate how coding style is a matter of taste:

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap _ EmptyList = EmptyList
fmap f (ListElement x xs) = ListElement (f x) (f <$> xs)

And why not practice using where and let while we're at it? Here's a version with a where declaration:

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap _ EmptyList = EmptyList
fmap f (ListElement x xs) = ListElement h t
where h = f x
t = f <$> xs

And here’s one with a let expression:

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap _ EmptyList = EmptyList
fmap f (ListElement x xs) =
let h = f x
t = f <$> xs in
ListElement h t

Early on when learning Haskell, it pays for us to review the difference between the two, so we can just go ahead and invent opportunities for doing so. The clue is in the difference between an expression, which is part of an equation and can therefore be reduced, and a declaration, which creates one or more bindings in a separate scope.

It’s also worth reviewing correct indentation:

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap _ EmptyList = EmptyList
fmap f (ListElement x xs) = let h = f x
t = f <$> xs in
ListElement h t

That will work. And so will this:

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap _ EmptyList = EmptyList
fmap f (ListElement x xs) = let h = f x
t = f <$> xs in
ListElement h t

And this:

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap _ EmptyList = EmptyList
fmap f (ListElement x xs) = ListElement h t where
h = f x
t = f <$> xs

And also this:

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap _ EmptyList = EmptyList
fmap f (ListElement x xs) = ListElement h t where h = f x
t = f <$> xs

This will not:

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap _ EmptyList = EmptyList
fmap f (ListElement x xs) = let h = f x
t = f <$> xs in ListElement h t

And neither will this:

instance Functor List where
fmap :: (a -> b) -> List a -> List b
fmap _ EmptyList = EmptyList
fmap f (ListElement x xs) = ListElement h t
where h = f x
t = f <$> xs

Each of the last two examples will elicit a parse error. The error itself is vague, but it will tell you where exactly you went wrong, so you’ll at least know that syntax is the culprit. Indeed, your best bet, when learning syntax, is to learn a clean and correct style, too, so your code not only compiles but is legible and feels familiar to the broader developer community. Which style looks best to you?

Finally, we can test our implementation against the functor laws — identity and composition — to make sure that, even though it compiles, it is actually correct:

> testList = ListElement 1 (ListElement 2 (ListElement 3 EmptyList))
> f = (+1)
> g = (div 2)
> fmap id testList == id testList
True
> fmap (f . g) testList == (fmap f . fmap g) testList
True

It works! Probably! But we only tried a few inputs that we chose ourselves. That doesn’t provide for very much confidence. But this is pretty much what a typical unit test looks like, right? The vague assurance provided by a statistically insignificant test input or set of inputs may be good enough for some developers working in some languages, but we’re Haskellers now, and we can do better. We have QuickCheck:

import Test.QuickCheck

data List a = EmptyList | ListElement a (List a)
deriving Show

instance Functor List where
fmap f EmptyList = EmptyList
fmap f (ListElement x xs) = ListElement (f x) (fmap f xs)

instance Arbitrary a => Arbitrary (List a) where
arbitrary = do
a <- arbitrary
b <- arbitrary
frequency [(1, return EmptyList),
(3, return $ ListElement a (ListElement b EmptyList))]

functorIdentity :: (Functor f, Eq (f a)) => f a -> Bool
functorIdentity f = fmap id f == f

functorCompose :: (Eq (f c), Functor f) => (a -> b) -> (b -> c) -> f a -> Bool
functorCompose f g x = (fmap g (fmap f x)) == (fmap (g . f) x)

Without getting too bogged down in details, we have essentially created two functions, functorIdentity and functorCompose, to test whether a given functor f satisfies the laws of identity and composition. We also created an instance of Arbitrary for our custom List so QuickCheck knows how to generate random values of that type. In this case, we're arbitrarily generating either empty lists or lists of two elements, the latter three times as frequently as the former. This is not the best way to test this sort of data structure, but it's fine for demonstration purposes. Let's fire it up:

> quickCheck $ \x -> functorIdentity (x :: List Int)
+++ OK, passed 100 tests.
> quickCheck $ \x -> functorCompose (+2) (*5) (x :: List Int)
+++ OK, passed 100 tests.

Note that to create working tests, we have to make List a a concrete type, otherwise QuickCheck won't know what values to generate. By specifying the type of the input x as List Int, we are able to pass the identity test. Doing the same and making up some function arguments, we can pass the composition test. These tests are a little contrived, but they do demonstrate that, in principle, our List instance obeys the functor laws. Using the checkers library, we can do better still. We just need to write a little more code:

import Test.QuickCheck
import Test.QuickCheck.Checkers
import Test.QuickCheck.Classes

data List a = EmptyList | ListElement a (List a)
deriving (Eq, Show)

instance Functor List where
fmap f EmptyList = EmptyList
fmap f (ListElement x xs) = ListElement (f x) (fmap f xs)

instance Arbitrary a => Arbitrary (List a) where
arbitrary = do
a <- arbitrary
b <- arbitrary
frequency [(1, return EmptyList),
(3, return $ ListElement a (ListElement b EmptyList))]

instance Eq a => EqProp (List a) where (=-=) = eq

test :: IO ()
test = let trigger = undefined :: List (String, Int, Int) in
quickBatch (functor trigger)

All we added was a few import statements, a derived Eq instance for List, an instance for List of a special typeclass called EqProp, and a simple test function. EqProp provides a means to customize how values are compared for testing purposes. This is especially useful for instances, such as Applicative, that might generate infinite data structures if you don't customize their testing apparatus. For Functor, this will do. The test function entails certain idiosyncrasies of checkers that aren't worth getting into here. The important point is that we're checking List against the laws of Functor and having the library generate random data for doing so. Let's try it out:

> test

functor:
identity: +++ OK, passed 500 tests.
compose: +++ OK, passed 500 tests.

Ship it.

Having learned, through this exercise, a bit more about how the type system works, it should be a trivial matter to invent random types yourself and then try to write various implementations for them. For homework, you can try writing Functor instances for these:

data F a b c = F a (b c)

data G a b c d e = G (a b c) (d e) e e

data A b c d e f g = A (b (c (d e) f) f) f

data Q o p r = Q o (p -> r)

data Greek a b c = Alpha | Beta b | Gamma a b c

Try not to cheat! If you get stuck, at least cheat using the compiler and type checker to help you. And if the above problems were too easy, what other instances can you write for these fictional datatypes? Can you test them? For the truly desperate, you may refer to the code for this article. I haven’t included every possibility, but you can use it for inspiration.

Many people claim that there are different learning styles. Personally, I only know one: practice, practice, practice. The osmotic methodology is more entertaining, but if you actually want to learn Haskell, you need to roll up your sleeves and do the dirty work. You’ll be saving yourself effort down the line if you invest more in your education upfront, at the beginning of your Haskell journey.

--

--

Steven Syrek

High functioning, mostly functional programmer. Words here, codes at github.com/sjsyrek.