In this article, I’ll describe a cool concept of functional programming, type classes, on the example of the Haskell programming language. I’ll start by providing some Java examples and go through the limitations of the Java language and its object-oriented paradigm. I will then describe how those limitations are lifted when moving into the functional programming world.
The article is targeted mostly at Java developers, but knowledge of any object-oriented language will do. You don’t need to have any prior experience in Haskell before you read the article. However, I would be happy if you become interested in this wonderful language, or functional programming in general, after reading the article. This might help you build a bigger picture of what you’re missing out on, and better understand how the types in different programming languages work.
Interfaces in Java and their limitations
An interface is a ubiquitous concept in Java. You define a set of function signatures and then implement them in a particular class definition:
But what if you want an interface that is more abstract? For instance, there are a lot of types with the following three properties which beg to be extracted to an interface:
First, values of some types can be appended to get another value of the same type:
- two strings may be concatenated to get a new string;
- two integers may be added to get a new integer value;
- two lambda functions may be composed using
Function.compose()which gives us another function — a composition of the two.
Second, all those operations are associative: if you have more than two values combined in succession, you could change the order of combinations using brackets, and it doesn’t affect the result:
1 + (2 + 3)is the same as
(1 + 2) + 3— both are equal to
("hello" + " beautiful ") + "world"is the same as
"hello" + (" beautiful " + "world")— both are equal to
"hello beautiful world"
fun1.compose(fun2.compose(fun3))is the same as
fun1.compose(fun2).compose(fun3)— these are just three functions applied in succession to the output of each other.
And third, there is a special value that, when combined with another one, doesn’t change it. Let’s call it an “empty” value:
0 + 1is the same as
1 + 0which is just
"" + "hello"is the same as
"hello" + ""which is just
fun1.compose(Function.identity())is the same as
Function.identity().compose(fun1)which is just
We see quite a lot in common, so can we make all those types implement some kind of an interface? In algebra, a set with a binary operation and a special value with properties shown above is called a “monoid”, so why don’t we declare a
Monoid interface for all those types:
empty() method returns the special “empty” element, and the
append() method combines this value with another one and returns a new, combined value. Here’s how an implementation would look like for the
String type if we could change its definition in the standard library:
Neat! Now we could write generic code that could work with any monoids, whatever their real types are. For instance, we could write a method that appends multiple monoid values of the same type into one, using
Note that this variant of the reduce method uses the first list element as the initial value, and then appends all other values to it using
append. This is why the method returns
Optional<T> — it’s required for the case when the list is empty (we don’t have any value to use as the initial one), so in this case, the method would return
We could use the
empty() value as the initial one and also cover the case when the list is empty, but we can’t call the
empty() method because it’s an instance method, and if the list is empty, then we don’t have any instance to call it on.
For a list of
Integer values, this method would add them up. For strings, it would make a concatenation of a list of strings. For functions, it would compose a set of functions into one big function. Looks like we can write a simple piece of generic code that is useful for a lot of different cases. However…
Why This Doesn’t Work in Java
First, you can’t add an implementation of an interface to some existing type. At the time of this writing (Java 17 is just around the corner) you can only specify the implemented interfaces in the type definition, which pretty much leaves us waiting for Java 18, 19, 20… or forever.
The second, more conceptual issue: there are different ways of representing the same types as monoids. Here are at least two ways for the integers — one for addition, which we already discussed:
And here’s another one, for multiplication, which also has the same properties: multiplication is associative, and there’s an “empty” element (namely, 1) that doesn’t change the other value after multiplication:
Now we’re completely out of luck. Not only we can’t add an interface for existing types, but we can’t have more than one implementation of an interface for the same type. Java just doesn’t allow it. We’d have to resort to the delegate (or wrapper) pattern which is usually ugly in Java.
How it’s Done in Haskell with Type Classes
Haskell is a functional language, it’s not object-oriented. You can define your data types, but if you compare them to Java classes, they are more like records — they only have data (fields) and no functionality (methods). Here’s how we can use Haskell’s record syntax to define a type with three fields — name, surname (strings), and age (integer value).
But if the type doesn’t contain any logic, how do we abstract over different types in terms of what can be done with them? We define type classes. A type class is a set of function signatures, which looks very similar to an interface in Java.
Let’s look at how our
Monoid interface can be defined in Haskell using the type class syntax (the actual definition from the Haskell library is a little bit more complicated, but the idea is the same):
class keyword in Haskell doesn’t define a new class, like in Java — here, it defines a new type class. Think of it as an interface for now.
In the definition above, we use the
class keyword to define a type class with one type variable m (this is what you would write as
<T> in Java). It defines two functions:
mempty, returning a value of type parameter
mconcat, which takes two values of type
m and produces a third one of the same type
m -> m -> m may look weird, but it’s just a sequence of type variables connected by
-> which is Haskell’s syntax for a function type signature — the last type in the string is always the return type, everything else is function arguments. There’s a reason why it looks like that, but we won’t go into this topic here.
It’s easy to see how this type class corresponds to the Java interface we’ve defined above, with
mempty corresponding to
mappend corresponding to
In Java, we say that a class implements an interface. In Haskell, we say that there’s an instance of a type class for this type. One important difference is that we can define an instance of a type class after the type is already defined.
So how do the type classes solve the second issue with interfaces — that you can only have a single implementation of the interface for one type? Let’s take the addition and multiplication of integers as an example, where you can have two different implementations of the
Monoid interface, but you can’t easily do that in Java. To see how it’s done in Haskell, let’s look at the definitions of types
You can see we use a different keyword to define a type here —
newtype. You can think of it as a lightweight wrapper for an existing type — once it type-checks at compile-time, the original type is used at runtime. The
Sum type has a single property
getSum that returns a wrapped value. Same for the
We can wrap the integer values into those types as simple as:
But what’s so special about those wrappers? They both have a
Monoid type class instance defined for them, according to the properties of multiplication and addition.
Here’s a simplified version of the
Monoid type class definition for the
Product wrapper. In essence, we say that the
Product value can be treated as a
Monoid, and provide the implementations of the
Monoid functions for it:
instance keyword defines a new type class instance. The section
Num a => Monoid (Product a) is somewhat similar to saying
class Product<T extends Number> implements Monoid<T> in Java — it means that the
Product type variable here is bound by the
Num, so this definition only concerns the numbers.
The second and third lines define the
Monoid function implementations for the
mempty is just a value 1 wrapped in
Product, and the
mappend function is defined in infix notation as two values multiplied and again wrapped in
Product. This definition allows us to say something like:
Monoid type class instance for
Sum looks very similar. You see that we can define multiple type class instances for the same type, and we can do this after the initial type is already defined.
Here’s how we could define a function that concatenates multiple monoid values in one, regardless of their actual type:
This is similar to saying
reduce(Monoid::append) in Java, however, here we also can specify the empty monoid value as the initial value, and get a reasonable result even if the set of monoid values is empty.
As you can see, type classes in functional programming are similar to interfaces in object-oriented programming languages, but at the same time they represent a much more powerful concept:
- you can define them for an existing type without changing its signature;
- you can define multiple sets of implementations for the same type.
Hopefully, this article will spark your interest in the Haskell programming language — make it your next programming language to learn!