Programming Languages as Categories

Published in

Statebox

8 min readApr 29, 2019

In our previous posts we explored how to work with categories and functors in Idris, providing definitions that encode all the required properties and laws.

You actually hear a lot of talking about functors in the context of functional programming. Together with other derived concepts such as applicatives and monads, functors are one of the basic abstractions used to structure the flow of programs.

Functors are one of the main abstractions to structure applications in functional programming.

In this post we will explore in detail the link between general category theory and functional programming, showing that the latter is just one example of the former.

Make sure to read it all until the end, where a little surprise is waiting for you.

Simply put, the main reason why category theory is so relevant for programming is that, in a certain sense, any programming language can be seen as a category.

This is quite a bold statement and in fact it is not one hundred percent true, but there is a lot of truth behind it, so let’s try to make it more precise.

To obtain a category from a given programming language we need to choose carefully which are the objects and which are the morphisms.

The objects will always be the types available in the type system of the language. They could be native types, like int or string, or more abstract types like (list int, string -> int). It’s important to note that in a dynamic language, where changing the type of values is allowed, there is only one big type which contains every possible value. On the other hand, in a statically typed language with a very expressive type system we can have a lot of objects in our category. This is why the categorical approach makes much more sense in strongly typed languages.

So, types are objects in our programming language category, but what are morphisms, then? Well, morphisms are the so-called pure functions, that is, functions which are referentially transparent as expressions. This excludes from being a morphism all the (so-called) functions which have side effects (e.g. throw exceptions, make network calls, generate random numbers, …) or which rely on some external state. Therefore in procedural languages many functions which we are used to write can not be taken into account as morphisms of our category.

Luckily enough, there are programming languages where we don’t have the possibility of writing impure functions, and in such a case every possible function corresponds to a morphism in the category.

Idris as a category

In Idris, thanks to dependent types, we have the possibility of encoding the above described category, specialized for the Idris programming language, in Idris itself! Let’s see how we can do it in practice, defining a Category made with Idris types and functions.

We said that objects of our category are types, so the type of our objects will just be Type.

If you recall, the signature of the record field mor which describes morphisms of a category is obj -> obj -> Type, where obj is the type of objects. So in our case we have Type -> Type -> Type; as we said above this needs to encode all the functions between two types, so it will be practically defined as

TypeMorphism : Type -> Type -> Type
TypeMorphism a b = a -> b

Our category is slowly shaping itself, now we need to define identities and composition, but that’s almost trivial, since we are talking about functions

identity : (a : Type) -> TypeMorphism a a
identity a = idcompose :
     (a, b, c : Type)
  -> (f : TypeMorphism a b)
  -> (g : TypeMorphism b c)
  -> TypeMorphism a c
compose a b c f g = g . f

As one could expect, identities and composition of the category are just identity functions and function composition.

The axioms for being a category, left and right identity and associativity, are at this point trivial and they can all be implemented as Refl.

Summing up, we can formally define our category as follows

typesAsCategory : Category
typesAsCategory = MkCategory
  Type
  TypeMorphism
  identity
  compose
  (\a, b, f => Refl)
  (\a, b, f => Refl)
  (\a, b, c, d, f, g, h => Refl)

Functors are functors (part 1)

As soon as we have a category, our instinct should be that of immediately looking for what functors are in that category. For the above category the answer is that functors are, well, functors.

In Idris we already have a concept of functor, which looks as follows

interface Functor (f : Type -> Type) where
  map : (func : a -> b) -> f a -> f binterface Functor f => VerifiedFunctor (f : Type -> Type) where
  functorIdentity : {a : Type}
                 -> (g : a -> a)
                 -> ((v : a) -> g v = v)
                 -> (x : f a)
                 -> map g x = x
  functorComposition : {a : Type}
                    -> {b : Type}
                    -> (x : f a)
                    -> (g1 : a -> b)
                    -> (g2 : b -> c)
                    -> map (g2 . g1) x = (map g2 . map g1) x

We would like to prove that an Idris VerifiedFunctor can be seen as a CFunctor as we defined it, so we would like to implement a function with the following signature

functorToCFunctor :
     VerifiedFunctor f
  -> CFunctor typesAsCategory typesAsCategory
functorToCFunctor {f} func = ?cFunc

The first bits of the implementation are pretty straightforward

functorOnMorphisms :
     VerifiedFunctor f
  -> (a, b : Type)
  -> TypeMorphism a b
  -> TypeMorphism (f a) (f b)
functorOnMorphisms _ _ _ (MkTypeMorphism g) =
  MkTypeMorphism (map g)functorToCFunctor {f} func = MkCFunctor
  f
  (functorOnMorphisms func)
  ?funcPreserveId
  ?funcPreserveComp

As you could probably guess, we use f itself to map objects and the map function provided to us by the Functor instance, to map the morphisms.

Now we are left to show that our definition of functor is lawful, that is, it respects identities and composition. But, as a matter of fact, with our current definitions this is actually not possible; in the next paragraph we’ll explain why and we’ll adapt things to solve the issue.

Intensional and extensional equality

If you try to prove that our functor preserves identity, you’ll find yourself at the point where you need to provide evidence that id : f a -> f a and (map id) : f a -> f a are the same function. It turns out that equality between functions is quite a complicated subject; the main question is: when do we say that two functions are equal?

Pure function can be thought as boxes. Are two boxes equal based on their exterior or their content?

From a mathematical standpoint, if we look at functions as relations, it makes sense to say that two functions f, g : a -> b are equal whenever, for any element x : a, we have f x = g x. This definition of equality is called extensional and it observes only the external behaviour of functions.

From a computational point of view, instead, it may make sense to have a more strict criterion to differentiate between functions. For example, if we consider the functions \x => 2 * x and \x => x + x defined on the natural numbers, do we want to always consider them as the same function? Or, if we consider several different sorting algorithms implementations, do we want to consider them equal as functions?

For a machine it is definitely easier to check whether the abstract syntax trees used to define two functions are equal, but it could turn out very hard to understand whether two functions are extensionally equal, because it would have to compute the functions for every possible value. Therefore in programming the default form of equality between functions, called intensional, concerns the equality of internal definitions.

Coming back to our functor definition, we can now spot where the problem is. Our definition of equality between functions uses intensional equality, while the VerifiedFunctor interface is using extensional equality. Since intensional equality is more fine-grained that the extensional one, we can not hope to adapt the VerifiedFunctor interface to our definition. We will need to slightly change our definitions to use extensional equality.

What we need to do is change our definition of morphisms, so that we can assert that extensional equality holds

record ExtensionalTypeMorphism (a : Type) (b : Type) where
  constructor MkExtensionalTypeMorphism
  func : a -> bpostulate
funExt :
     {f, g : ExtensionalTypeMorphism a b}
  -> ((x : a) -> func f x = func g x)
  -> f = g

We wrap functions in a new type ExtensionalTypeMorphism and we postulate that two elements of this type are equal whenever we have a proof of the extensional equality of the inner functions. This corresponds to quotienting the TypeMorphism type with respect to the equivalence relation given by extensional equality. Moreover, we can use this new type for defining morphisms of a new category which we will call typesAsCategoryExtensional.

Functors are functors (part 2)

We can now come back to prove that VerifiedFunctors can be seen as CFunctors. Let’s resume from where we left off, adjusting things with our newly introduced ExtensionalTypeMorphism and typesAsCategoryExtensional.

functorToCFunctor :
     VerifiedFunctor f
  -> CFunctor typesAsCategoryExtensional    
              typesAsCategoryExtensional
functorToCFunctor {f} func = MkCFunctor
  f
  (functorOnMorphisms func)
  ?funcPreserveId
  ?funcPreserveComp

We can now use our postulate funExt to complete the missing pieces

functorPreserveId :
     (func : VerifiedFunctor f)
  -> (a : Type)
  -> functorOnMorphisms func a a (extIdentity a)
   = extIdentity (f a)
functorPreserveId _ a =
  funExt (\x => functorIdentity {a} id (\v => Refl) x)

Here we use the functorIdentity field of the VerifiedFunctor interface, which gives us the extensional equality between id : f a -> f a and (map id) : f a -> f a. Using funExt we can transform the extensional equality into a concrete equality and complete the proof.

Similarly, we can prove that an Idris functor respects function composition according to our definition.

functorPreserveCompose :
     (func : VerifiedFunctor f)
  -> (a, b, c : Type)
  -> (g : ExtensionalTypeMorphism a b)
  -> (h : ExtensionalTypeMorphism b c)
  -> functorOnMorphisms func a c (extCompose a b c g h)
   = extCompose (f a) (f b) (f c) (functorOnMorphisms func a b g)  
                                  (functorOnMorphisms func b c h)
functorPreserveCompose
  func _ _ _
  (MkExtensionalTypeMorphism g’)
  (MkExtensionalTypeMorphism h’)
  = funExt (\x => functorComposition x g’ h’)

Conclusion

Today we saw how types and functions of a programming language, Idris in our case, can be interpreted as a category. Moreover, we proved that the concept of functor commonly used in functional programming correspond exactly to functors from a categorical point of view. This implies that we can use concepts from category theory, reason with them in their very abstract setting, and then apply what we found out to our software. Or, from another point of view, we can try to write our software thinking just in terms of categories, so that we know that our code can be used not only to talk about programming languages, but has a broader scope and application.

I promised a little surprise at the beginning, so here it is: we released open source our category theory library! You can have a look at the parts of the code which I didn’t have the time and space to investigate in this series of blog posts. It would be really awesome if you could provide us with some feedback on it and, if you will, maybe contribute with a new category or some other concept we haven’t defined yet.