Demistifying the Monad in Scala, Part 2: A Category Theory Approach

Some of you may have read my article on monads. This time I want to talk about monads from a different, more theoretical point of view. It is not mandatory to read the previous article first (in case you haven’t), but if you don’t know anything about the topic, I guess it couldn’t hurt to become familiar with it before jumping on this one.

Let’s start. Some of you may have heard the popular quote:

A monad is just a monoid in the category of endofunctors, what’s the problem?

Original statement is from Categories for the Working Mathematician by Saunders Mac Lane, but it has been quoted and rephrased numerous times. I took this statement as a reference point; a teaching objective, if you will. So I will attempt to explain monads from the category theory point of view by explaining that particular sentence.

Important note: A lot of stuff written here is simplified and may not be 100% mathematically accurate. Category theory is a very complex field of mathematics (sometimes even jokingly referred to as “abstract nonsense”). 
This is not a math article. This is simply my attempt of creating an introductory, easy-to-follow text that should serve as a starting point to get you all excited about category theory and continue the exploration on your own.

Algebraic structures

What is a category? A category is an algebraic structure. OK, but what is an algebraic structure? Algebraic structure is basically a set (when used in this context of being an algebraic structure’s set, it is commonly referred to as underlying set) of one or more elements with one or more operations on them.

There are two main things that differentiate algebraic structures from one another:

  • laws that need to hold for operations on set elements
  • existence of identity element

Here are some laws that appear across different algebraic structures (operation in an algebraic structure is denoted as ○):

  • Closure: For all a, b in algebraic structure M, the result of operation ab is also in M
  • Invertibility: For each element a in M there is an element b where ab = ba = e , where e is the identity
  • Commutativity: for all a, b in M: a ○ b = b ○ a
  • Associativity: for all a, b, c in M: (a ○ b) ○ c = a ○ (b ○ c)

Identity element is an element which (quoting the wiki) “leaves other elements unchanged when combined with them”. It’s the element that doesn’t affect the result when used in operation ○. We can formulate that into a law too:

  • Identity: for any a in M there is an element e where e a = a e = a

So, starting from a simple set with one or more operations and by adding a certain combination of the laws that are required, we can end up with various algebraic structures. Here are some:

Category

Category is an algebraic structure that consists of objects with defined mappings between some of them. Those mappings are called morphisms or arrows. Arrows go from one object to another. Some pairs of objects have an arrow between them defined, some don’t.

Categories have an important feature: if there is some arrow f: A → B and an arrow g: B → C, there is also automatically a composition of arrows f and g which forms an arrow A → C. We write the composition as g ○ f. Recall that algebraic structure is defined as a set with some operation(s) on its elements; in case of category, those elements are arrows and the operation is arrow composition.

Categories are completely abstract and this is quite tricky to wrap your mind around. “Give me an example of a category”, you ask. OK, here’s one — Set. Set is the category of sets. Which sets? Does it include the set of integers, set of letters in english alphabet, set of all animals with four legs? Yes, it includes any and every set there is. It’s not easy, but try not to think in completely concrete terms. Don’t try to materialise those sets. We don’t care how many sets are inside the category or which elements are inside those sets. We just care about the fact that objects in the Set category are sets and arrows are functions between sets that can be composed. I mean, that’s not all, there is a whole science behind category of sets with lots of axioms and laws. But we have all we need for now.

Regarding our algebraic structure laws, category obeys two: identity and associativity. Since category is an algebraic structure with arrows as its elements and arrow composition as the operation between the elements, following those laws means that we have an identity arrow and that composition of arrows is associative.

Identity arrow is an arrow that goes from an object to itself. Composition of any arrow f with identity arrow e is arrow f:

e ○ f = f = f ○ e

Given arrows f, g, h and their composition , the associativity law is:

h ○ (g ○ f) = (h ○ g) ○ f

Functions in your favourite programming language are another example of a category. Objects in this category are types and morphisms are functions that take a type and return a type. For any type there exists a function that returns that same type (identity) and functions compose easily, obeying associativity law. Note that functions of two or more arguments can be curried and therefore viewed as functions of one argument that return a higher-order function.

That’s it for now regarding categories. Objects, arrows, composition, identity, associativity. Good. We’ll get back to them shortly.

Monoid

Getting warmer. What is a monoid?

Monoid can be defined as a:

- single set S 
- with an associative binary operation
- and an identity element e
-
following two laws:
 (a ○ b) ○ c = a ○ (b ○ c)
 a ○ e = e ○ a = a

Note how similar this looks to a category. We will soon see what this means.

But first an example for a monoid. OK, so we need a set with an associative binary operation and with some identity element. We could take addition on the set of integers Z. Alright, we have an identity element (it’s zero) and we have the associativity, because (x + y) + z= x + (y + z). If you take another look at that table of laws for various algebraic structures, you’ll notice that addition in Z is in fact a group, because it also has an inverse (subtraction).

But here’s something cool and possibly a bit disturbing at the same time — monoid is actually a category, just a bit special one: it has only one object. If you scroll back to the table we saw earlier you’ll see that monoid is actually pretty similar to a category, only difference being the closure law. Recall that closure says “for all a, b in algebraic structure M, the result of operation ab is also in M”. It’s pretty logical we got the closure property in a monoid since we always start from and end up in the same object (the only one we have). Categories don’t have closure because not all arrows can necessarily be composed (e.g. you can’t compose a → b and c → d) so the property “combine any two elements and you’ll arrive at something that’s also a part of the set” doesn’t hold. Other two laws (identity and associativity) are the same.

When switching the viewpoint from monoid as a set to monoid as a category, elements of the set become arrows of the category, identity element becomes the identity arrow, and binary operation becomes arrow composition. And yes, as soon as we enter the twilight zone also known as category theory, stuff gets weird. But with a bit of mindset shifting, our addition-in-Z example can be seen as a category.

First of all, numbers are now arrows. Think of an arrow as an “add X” function. These functions compose quite nicely, since adding 3 to a number and adding 5 to the result is the same as adding 8 to the original number. Furthermore, composing add3 with add5 followed by composing with add7 is the same as composing add3 with the composition of add5 and add7:

(add3 ○ add5) ○ add7 = add3 ○ (add5 ○ add7)

To cite a particular great source on category theory from which I took the adders example:

Instead of giving you the traditional rules of addition, I could as well give you the rules of composing adders, without any loss of information.

So, by shifting our viewpoint a little, we were able to see the monoid both as a set (on the left) and a single-object category (on the right).

(I took the photo from here)

Also, all the laws that we had in the first case are also present in the second case. Nothing more, nothing less.

OK, so we no longer have numbers; we have add-arrows which represent single-parameter adding functions. Second operand of adding operation is “hardcoded” (operation is now unary, not binary) so we have one function for each natural number. Total count of natural numbers and total count of arrows in our category are the same. Finally, arrows all share the same starting and ending object. This object is completely irrelevant; think of it as a massless blob of nothingness. If you want, we could rename arrows from “addX’ to simply “x” and then the whole thing becomes really similar to our normal good old set Z with addition, just instead of “adding two and two equals four” we would say “composing two and two equals four”.

Here are the wikis for both viewpoints, perhaps you’ll find them useful:

Functor

Alright, now we know what algebraic structures are and how they are defined, and we also took a closer look into two of them — category and monoid. We saw that monoid is a category too, just with only one object.

What we need now is a functor (by the way, I already wrote a bit about functors too).

Compared to monoids and categories, functor is a bit different. It is not really an algebraic structure; it’s more of a function that maps from algebraic structures to algebraic structures. But not just any algebraic structures: functor is a mapping between categories. Category, as we know by now, is a set of objects with arrows between them. Well, functor knows how to map the category to another category. It will map all the objects and all the arrows, and it will preserve the connections between objects (if there is an arrow between a and b in original category, there will be one between Fa and Fb in the resulting category). Given some object a or some arrow a → b from the original category, corresponding object (the one functor maps into) is denoted as F(a) and corresponding arrow is denoted as F(a) → F(b).

Functor that maps a category back to that same category is called an endofunctor. Quick detour from the category theory into the real world — in programming, we deal with functors in category of types, and they are all endofunctors. They map a category of types back to category of types. Whenever you mapped something (map in Scala, fmap in Haskell) you had an endofunctor in your hands. For example, Option, List and Future are all valid endofunctors. Basically anything that has a map() function is an endofunctor. Don’t take this for granted though; it’s merely a convention. Some library may provide you with an object whose map() doesn’t obey the functor laws (see the old article) which would mean that the object itself cannot be considered a functor.

Example: in Scala we can map from List[Int] to List[String] (in other words, to map all elements of a List from Int to String):

List(1, 2, 3).map((x: Int) => x.toString)

We can have all kinds of mappings: from Int to String, from String to List[String], from List[String] to Banana etc. All of them are Scala types. This means we always map from a category of Scala types back to the category of Scala types. Hence the “endo” part.

Now let’s “rip out” the map() method from some functor F and view it as a function of two arguments — a functor object and a function which we map it with. For example, List(1, 2, 3).map(f) becomes map(f, List(1, 2, 3)).

If we curry that, it’s easy to see that signature of map is:

(a → b) → F[a] → F[b]

Function which we map with is denoted as a → b, starting functor object is F[a] and result functor is F[b]. Note that instead of providing both arguments, function a → b (e.g. (x: Int) => _.toString()) and a functor instance F[a] (e.g. List(1, 2, 3)), we can provide only the function f, in which case we get back a function F[a] → F[b] (if this is not something you’re comfortable with, do a quick research on currying and partially applied functions). So yeah, instead of providing both arguments of type A and B and getting back a value of type C, we can provide only parameter of type A and get back a function B C. Providing only number 4 to a function which sums two numbers gives us back a function that takes some number n and returns n+4.

This is just one half of being a functor: a function on functions. It takes a function and returns the “lifted” version of that function, that is, one that instead of taking a and returning b now takes F[a] and returns F[b]. In practice we often don’t just lift functions and save them for later. Instead we immediately apply them to a functor instance. For example, instead of lifting a function Int → String to List[Int] → List[String], we usually pass in an instance of List[Int] right away (or, in case of Scala, we invoke map() as a method on an instance of List[Int]).

There is also another side to being a functor: apart from the function on functions there is also a (somewhat implicit) function on types. We know that functor maps some source category to some destination category, which means it maps all source objects to destination objects and all source arrows to destination arrows, right? Well, functor’s mapping of arrows is represented by the function-on-functions part, while mapping of objects is represented by the function-on-types part.

What is this function on types anyway? To be more precise, it’s actually a unary type constructor. This means that, given some type, functor produces another new type. So if we take List functor as an example, given a String it gives us a List[String]. Given some type A, it gives us a List[A]. Type constructor for functors can be described in a more general way as * → * (takes one type as a parameter and produces a new type from it). Here’s some more useful reading on type constructors.

Monad

Now that we know what category, monoid and endofunctor are, we can imagine a category of endofunctors and try to find a monoid in that category. As the famous statement by mr. Mac Lane tells us, monoid in the category of endofunctors is in fact a monad.

What does it mean to be a monoid in some category? For example, a monoid in the category Set (remember, that’s the category of all sets). What is it? Well, take a look at the category; it contains all imaginable sets. Now pick out those that satisfy monoid laws, that is, pick out all those sets for whose elements we can find an identity element and define an associative binary operation. One such set is the set of integers with identity element being zero and binary operation being addition. Note that monoidal binary operation on set S must operate on two operands, both from S, and return something that is also an element of S. Cool, so a monoid in the category Set is any set that has those properties, that is, those two operations (for example, the set of natural numbers with addition). Here are some more examples of monoids in the category of X.

Let’s now see about “monoid in the category of endofunctors”. In this category, objects are endofunctors and arrows are mappings between those functors. An extra bit of terminology before we continue: mappings between functors are called natural transformations. They operate on a yet higher level of abstraction:

  • arrows map objects within a category from one to another
  • functors map categories from one to another
  • natural transformations map functors from one to another

We could also say that a natural transformation is an “arrow between functors”, and that’s exactly what we have here — we have a category of endofunctors, which means that objects are endofunctors themselves, and arrows between those endofunctors are by definition natural transformations.

Back to our search for monads. So what we are doing is reaching into the category of endofunctors and looking for such objects of that category for which two particular arrows are defined — identity and associative binary operation. Let’s examine those two operations into more detail:

  • Identity is a natural transformation mapping from an identity endofunctor (a functor which maps a category to itself) to another endofunctor.
  • Asssociative binary operation is some operation x that knows how to take two endofunctors and turn them into one. Keep in mind that this operation must be associative, so (F x F) x F must yield the same result as F x (F x F).

Let’s write that down. We need an endofunctor F for which the following operations are defined:

  • η: I → F
  • μ: F x F → F

where I is the identity endofunctor, η (eta) is the fancy math name for identity on endofunctors, and μ (mu) is the fancy math name for associative binary operation on endofunctors. It’s hard to intuitively understand where we got these particular expressions from; on wiki you can see one explanation, but I’m warning you — it’s pretty convoluted. However, by the end of the article you will start seeing the pattern between this and a monoid. This great answer on SO could also help. But let’s go on for now, you can revisit those links later.

First, η: I → F. Let’s see… Identity functor I maps some category x back to itself. That means it doesn’t really do anything — functor I for category x is basically just x. So what η really does is — take some object x and wrap it into a functor F context. What’s functor F? We’re working with types now, and if you recall from earlier, functor can be viewed as a type constructor. So if F is, say, a List, then identity takes an object x and passes it into the List[_] type constructor, yielding a List(x).

What about μ: F x F → F? It’s a composition of two type constructors; you start from some type x, pass it to F type constructor, get back an F(x) and then pass that again to another F. This results in type F[F[x]]. But note that μ requires that the result is of type F[x], not F[F[x]]. We have to take care of that.

Based on what we just said, it’s pretty logical to have apply as η and flatten as μ. Method apply constructs a type F[x] , while flatten “flattens” the composition F[F[x]] into F[x].

Let’s see it on the List example. List is a monad with η and μ implemented as apply and flatten (recall that there’s syntax sugar that allows us to just write e.g. List() which is de-sugared into List.apply(); also note how associative binary operation is a composition of the functor type constructors):

  • η: x → List(x)
  • μ: List(List(x)) → List(x)

So whenever you see an endofunctor in Scala (in practice this is almost any object with method map, but remember to check the functor laws before you declare something a functor) that has apply and flatten, you are in fact dealing with a monad.

Three monad definitions and monad laws

Alright. We saw that monad is an endofunctor (remember, any functor in a programming language is actually an endofunctor because it maps from the category of types back to category of types) with two natural transformations: identity/unit implemented as apply and associative binary operation implemented as flatten (note that terms unit and identity are used interchangeably; I tend to use unit when talking from a practical, programming viewpoint, and identity when talking about category theory, but sometimes I may mix them up; they’re the same thing). Also, since we are talking about an (endo)functor, we know that there is a map method available too.

You may have heard about monad consisting of unit and flatMap. Yep, that’s correct; we can simply “squash” the flatten (coming from μ) and map (coming from its functor nature) methods into a single flatMap method. Our monad would still retain the same properties, it’s just that its associative binary operation would no longer be flatten, but flatMap. Monad laws (explained later) are not harmed.

So there are two valid definitions of a monad? One that involves unit and flatMap and one that involves unit, flatten and map?

Yes. And not just that — there’s also a third one. Monads can be expressed in three different ways. All three definitions are equivalent and we can easily transform one into another. Here they are:

  • unit + flatMap
  • unit + flatten + map
  • unit + compose

All three are equally powerful and each one can be expressed by using one of the other two. We’ll talk more about them later.

Now, every monad implementation (e.g. List, Option etc.) needs to obey the monad laws. Monad laws can be expressed in three different ways, depending on which monad definition you are using, but they all resemble the same core concept.

Here are the monad laws presented using the unit + flatMap definition:

  • left-identity law
    unit(x).flatMap(f) == f(x)
  • right-identity law
    m.flatMap(unit) == m
  • associativity law:
    m.flatMap(f).flatMap(g) == m.flatMap(x ⇒ f(x).flatMap(g))

It is obvious that unit + flatMap definition is easily transformed to unit + flatten + map and vice versa because flatMap = flatten + map. I will now show the connection between unit + flatMap and unit + compose. I like the unit + compose definition because it makes the laws easier to express.

Let’s see the compose function:

def compose: (A => F[B]) => (B => F[C]) => A => F[C]

This is a two-parameter function where both parameters are functions of type A → F[B], and the result is also a function of the same type. Note that types A and B are completely free and may represent any concrete types in Scala (A→F[B] could be e.g. Int → List[String] or Int → Set[Int]). We use the letters simply to denote that the parameter of the first function is the same type as the parameter of the result function (here denoted as “A”). By the way, functions of type A F[B] are called Kleisli arrows, just in case you stumble upon the term somewhere. We say that the function we just defined has Kleisli arrows as parameters and also as return type. It’s a composition of Kleisli arrows.

Now let’s also define “unit”:

def unit: A => F[A]

We can easily check that this is indeed unit: it’s a neutral element that, when composed with, gives back the original element. Just take the signature and replace F[C] with F[B] so that the second parameter becomes an identity function:

(A => F[B]) => (B => F[B]) => A => F[B]

Again, types A, B and C are not fixed and can represent any concrete type so A → F[A] from the first identity expression is the same function as B → F[B] in this expression. You can think of identity function not as A F[A], but as Whatever F[Whatever] if you wish:

(A => F[Whatever]) => (Whatever => F[Whatever]) => A => F[Whatever]

So if we apply identity function as a second operand to compose, we will arrive back at our first operand. Good, composing identity with some function results in the same function. Identity holds. Associativity is pretty clear too; I will leave it to you to try and prove it by yourself.

What I wanted to show you is that compose can be expressed by using flatMap. Same goes in the other direction too; flatMap can be expressed by using compose.

Here it is:

trait monad[F[_]] {
  def flatMap[A, B](fb: F[A], f: A => F[B]): F[B] = ???
  def compose[A, B, C](f1: A => F[B], f2: B => F[C]): A => F[C] = {
a => flatMap[B, C](f1(a), f2)
}
}

And the other direction:

trait monad[F[_]] {

def compose[A, B, C](f1: A => F[B], f2: B => F[C]): A => F[C] = ???

def flatMap[A, B](fb: F[A], f: A => F[B]): F[B] = {
compose[Unit, A, B]((u: Unit) => fb, f)()
}
}

As I said earlier, unit + compose is particularly great for one specific purpose, and that is expressing monad laws. They become simpler to read and understand. Here they are again, this time using compose instead of flatMap:

  • left-identity law
    unit.compose(f) == f
  • right-identity law
    f.compose(unit) == f
  • associativity law:
    f.compose(g.compose(h)) == (f.compose(g)).compose(h)

Remember that all three definitions are equally “good”, that is, all three are minimal sets of operations needed to define a monad and each one can be expressed by using one of the other two.

Always remember that monad laws are the essence of what makes monads what they are. Having monad operations (e.g. unit + flatMap) is not enough if the laws are not satisfied.

Summary

Our initial objective is complete. We now understand that famous sentence. Monoid in the category of endofunctors is any endofunctor with operations η and μ, and we call such endofunctor a monad (reminder: objects of that category are endofunctors and arrows are natural transformations).

So, monad can be defined in many ways, such as:

  • monoid in the category of endofunctors
  • object in the category of endofunctors with arrows η and μ
  • endofunctor with natural transformations η and μ

In Scala, η is implemented as apply and μ is implemented as flatten, which means that monad is any functor (that is, a construct with map method) that is additionally equipped with apply and flatten. There are two more completely equally valid ways of defining a monad in terms of its operations: unit + flatMap and unit + compose. They are completely equal and there is nothing one can do that others can’t. We saw how flatMap can be expressed using compose and vice versa; this is possible for all combinations.

And don’t forget about the monad laws.

Final word

Hopefully this article helped you gain some perception as to how certain category theory constructs fit together and, more specifically, what our old friends monads “really are” — monoids in the category of endofunctors.

By the way, name monad comes from “monoid” and “triad”; monoid because it’s a monoid in the category of endofunctors, and triad because it’s a package of three things: an endofunctor equipped with two natural transformations.

That’s all for now. As usual, if you find any mistakes please do let me know via email (sinisalouc@gmail.com). You can also find me on Twitter.

Cheers!