Five Minutes to Monoid

A “monoid” has an identity value, like 0 for addition and 1 for multiplication:

0 + value = value

1 * value = value

Naturally, identity goes both ways:

value + 0 = value

value * 1 = value

A monoid also has an associative operation. You probably learned about the associative property of addition and multiplication when you were ten years old. It just means that the order in which you perform the addition or multiplication of a series of numbers doesn’t matter:

(0 + 1) + 2 = 0 + (1 + 2)

(1 * 2) * 3 = 1 * (2 * 3)

A monoid extends these two ideas, identity and association, to any thing/entity/object for which you can imagine an identity value and an associative operation. A list, or array, is a fine example of a monoid from the world of programming languages. The identity of a list is an empty list. The associative operation is appending. Just as any number added to zero or multiplied by one is the same number, when you append an empty list to a list, you get back the original list:

[] ++ [1,2,3] = [1,2,3]

[1,2,3] ++ [] = [1,2,3]

[1,2,3] ++ [1,2,3] = [1,2,3,1,2,3]

Voilà. In a sense, you’re just defining what identity and append mean for any given monoid. “Appending” to a number can mean either addition or multiplication. Appending to a list means adding zero or more values onto the end of the list. Strings work the same way as lists:

"abc" ++ "" = "abc"
"" ++ "abc" = "abc"
"abc" ++ "xyz" = "abcxyz"

++ is the Haskell operator for appending to lists. I think it’s much nicer than having to type [1,2,3].append([1,2,3]) or some equivalent. And it allows for a more readable nesting of consecutive operations:

[1,2,3] ++ [4,5,6] ++ [7,8,9] = [1,2,3,4,5,6,7,8,9]

The monoid of lists is defined in Haskell more or less like this:

instance Monoid [a] where
mempty = []
mappend = (++)

mempty is Haskellese for the identity value: “monoid empty.” mappend is the associative operation: “monoid append.” Naming things is hard.

Haskell also provides some other interesting monoids for us out of the box:

instance Monoid All where
mempty = All True
mappend (All x) (All y) = All (x && y)

instance Monoid Any where
mempty = Any False
mappend (Any x) (Any y) = Any (x || y)

These are handy boolean monoids for expressing conjunction and disjunction. All is useful for determining whether all of a given set of values are true, like a boolean and (here &&), and Any determines whether any of them are true, like a boolean or (||):

> All True <> All False
All {getAll = False}
> All True <> All True
All {getAll = True}
> Any True <> Any False
Any {getAny = True}
> Any True <> Any True
All {getAll = True}
> Any True <> Any False <> Any False <> Any False
Any {getAny = True}

The <> symbol is an alias for mappend. It makes it a little easier to chain appending operations together. Since mappend is a binary operation like + or *, why not use a binary operator here, too? Note also that the results of evaluating the monoidal expressions above are packaged up in curly braces as a matter of syntactic convention. Actually, we need this packaging, because it would be otherwise difficult to distinguish among different monoids for the same types of values. For example, sums are different from products for numbers, so we can’t just use addition and multiplication monoidally without specifying which monoid we want to use. We need to supply this information by using their respective monoidal wrappers:

> Sum 1 <> Sum 2
Sum {getSum = 3}
> getProduct (Product 2 <> Product 2)

As you can see, getSum and getProduct are just accessor functions, and you can use them to extract the values from Sum and Product, if necessary.

Where monoids come in handy in programming is in any situation where you need to “fold over” or accumulate values into a final sum value, whatever “sum” means in the given context. This operation, called a “fold” in Haskell, is also called “reduce” in some other languages. Using the library function foldMap, we can simply supply a monoidal type and a list of values (more properly, any kind of value that is “foldable”), and the return result is their “sum” or final, accumulated value, as determined by the implementation of whichever monoid we use:

> foldMap Sum [1,2,3]
Sum {getSum = 6}
> foldMap Product [1,2,3]
Product {getProduct = 6}
> foldMap All [True, True, True]
All {getAll = True}
> foldMap Any [False, False, True]
Any {getAny = True}

As long as the type of value we are folding over is a monoid, then the fold will work as expected, and that’s a nice guarantee to have. If you want to understand what’s happening here in detail, go learn Haskell. Essentially, in the cases above, we’re asking for the final value after applying the associative operation of each monoid to every item in a list, starting with the first two and using the respective identity, or mempty, value as the base case of a recursion.

There are still other monoids worth knowing about:

> Just [1,2,3] <> Nothing
Just [1,2,3]
> Just [] <> Just [1,2,3]
Just [1,2,3]

Just and Nothing are the constructors for the Maybe type, which encodes the possibility that an expected value may be absent. Since Maybe has a Monoid implementation, we can mappend the values contained in a Just, if those values are monoids. Nothing serves as the identity in this case.

> First (Just 1) <> First (Just 2)
First {getFirst = Just 1}
> Last (Just 1) <> Last (Just 2)
Last {getLast = Just 2}
> foldMap First [Just True, Nothing, Just False]
First {getFirst = Just True}
> foldMap Last [Just True, Nothing, Just False]
Last {getLast = Just False}

First is a monoid that, given a series of Maybe values, captures the first one that is a Just and not Nothing. Last, predictably, does the opposite. Using foldMap, we can also apply these monoidal constructors to lists of Maybe values.

> Dual [1,2,3] <> Dual [4,5,6]
Dual {getDual = [4,5,6,1,2,3]}
> getDual $ foldMap (Dual . First) [Just True, Nothing, Just False]
First {getFirst = Just False}
> getDual $ foldMap (Dual . Last) [Just True, Nothing, Just False]
Last {getLast = Just True}

This last example is a little strange and not often used. Dual flips the arguments of its corresponding mappend. As you can see, it reverses the functionality of First and Last and, like the monoid for Maybe, also mandates that the values it contains be monoids. Monoids within monoids: they’re a beautiful thing.

Now that you know all about monoids, here’s a bit more vocabulary to help you through those awkward conversations with category theorists: a semigroup is a monoid without an identity, like a non-empty list. An Abelian monoid is a monoid that is also commutative, like addition and multiplication (but not lists or strings, for which order is significant). You can also just say commutative monoid or, if you prefer, not talk about them at all. But do spread the word about monoids. They are, quite literally, everywhere—whether you take advantage of their useful properties or not. Obviously, you should.