Day 7: Haxl — The Reader Monad, When Reads Are Expensive

Ben Clifford
Twelve Monads of Christmas
Dec 14, 2016

In the Reader monad, all you can do is query the environment by asking for the whole environment. If you only want a little bit of that environment, you can apply a function to the environment. For example, you might ask for the username from some global configuration environment like this:

getUsername <$> ask :: Reader Config UserName
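To make that concrete, here is a minimal sketch using Control.Monad.Reader from the mtl package. The Config record and its fields are made up for illustration; only Reader, ask, asks and runReader come from the library.

```haskell
import Control.Monad.Reader (Reader, ask, asks, runReader)

-- Hypothetical configuration record; the field names are illustrative only.
data Config = Config
  { getUsername :: String
  , getHomeDir  :: FilePath
  }

-- Ask for the whole environment, then project out the one field we want.
username :: Reader Config String
username = getUsername <$> ask

-- asks performs the same projection in a single step.
username' :: Reader Config String
username' = asks getUsername

-- Reader computations compose like any other monad.
greeting :: Reader Config String
greeting = do
  name <- username
  return ("hello, " ++ name)
```

In GHCi, runReader greeting (Config "ben" "/home/ben") should give "hello, ben". Nothing expensive happens here: the whole environment is already sitting in memory.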

You can read other things in the world, but it might be much more expensive to do so and you’d usually do those inside IO — for example, database queries or talking to a web API.

Database queries don’t look like Reader — you don’t want the whole of your large database returned into a Haskell value, over which you then fmap some kind of projection that throws it mostly away. Instead you might want to send off some query to be remotely executed (for example with postgresql-simple); and that query might take a while to execute while your local hardware sits round doing nothing.

If we have a few queries we know we need to run, we might run them simultaneously. That’s achievable, but a bit awkward to do in IO: you can forkIO one thread per query and collect the results, but that adds a bunch of mess that makes your program harder to follow and to debug.
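As a rough sketch of what that hand-rolled concurrency looks like, the following uses only forkIO and MVar from base. slowSqrtIO is a hypothetical stand-in for a slow remote query, and the 0.1 second delay is invented.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical stand-in for a slow remote query.
slowSqrtIO :: Double -> IO Double
slowSqrtIO d = do
  threadDelay 100000  -- pretend the round trip takes 0.1 seconds
  return (sqrt d)

-- Run two queries at once: one thread and one MVar per query.
bothRoots :: Double -> Double -> IO Double
bothRoots a b = do
  boxA <- newEmptyMVar
  boxB <- newEmptyMVar
  _ <- forkIO (slowSqrtIO a >>= putMVar boxA)
  _ <- forkIO (slowSqrtIO b >>= putMVar boxB)
  -- Block until both workers have delivered their results.
  x <- takeMVar boxA
  y <- takeMVar boxB
  return (x + y)
```

Even this small version has plumbing that is unrelated to square roots, and it does not yet handle a worker that throws an exception, which would leave the corresponding takeMVar waiting on a result that never arrives.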

With Haxl, you can write queries in one place without caring too much about how they will be executed; and write data sources that know how to evaluate those queries; and then let the library manage how those pieces fit together.

Let’s pretend we can send doubles off over the internet to get their square root, slowly. A Haxl query for that might have this type signature:

slowSqrt :: Double -> GenHaxl () Double

After boilerplate initialisation, Haxl will take queries:

> env <- initialise
slow square root engine: initialising
> runHaxl env (slowSqrt 64)
slow square root engine: evaluating 1 roots
slow square root engine: evaluating root for 64.0
slow square root engine: done with this batch
8.0

We asked for a single root, and got back the answer. Long-winded but pretty straightforward. This is a monad, so we can bind sequences of actions:

> runHaxl env $ do
| x <- slowSqrt 16
| y <- slowSqrt 25
| return (x + y)
|
slow square root engine: evaluating 1 roots
slow square root engine: evaluating root for 16.0
slow square root engine: done with this batch
slow square root engine: evaluating 1 roots
slow square root engine: evaluating root for 25.0
slow square root engine: done with this batch
9.0

Our engine ran a query batch (of one root), then ran another query batch (for the other root). But it could have done them at the same time.

This is where being a monad starts getting in the way. The do block desugars to slowSqrt 16 >>= \x -> slowSqrt 25 >>= \y -> return (x+y), and the right-hand side of each >>= is a function: you have to have a value for x from the first query before you can even find out that the next step is slowSqrt 25.

Luckily there’s a structure that is not quite a monad, Applicative, where we have to specify all our actions ahead of time. Where a monad sequences actions with >>=, an applicative combines them with <*>. The above monadic query can be written applicatively like this:

> runHaxl env $ (\x y -> x + y) <$> slowSqrt 16 <*> slowSqrt 25
slow square root engine: evaluating 2 roots
slow square root engine: evaluating root for 25.0
slow square root engine: evaluating root for 16.0
slow square root engine: done with this batch
9.0

It gives the same answer, but our square root engine got both square root requests in one batch: all of the actions were evaluated without feeding values between each other, and then bound to variables to add up in a single lambda.
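To see why <*> can batch where >>= cannot, here is a toy model written against base only. It is not Haxl's real implementation: the Fetch type, runFetch, and this version of slowSqrt are all invented for illustration, and the "remote engine" is just a call to sqrt.

```haskell
-- A computation is either finished, or blocked on a batch of inputs
-- together with a continuation that consumes that batch's results.
data Fetch a
  = Done a
  | Blocked [Double] ([Double] -> Fetch a)

instance Functor Fetch where
  fmap f (Done a)       = Done (f a)
  fmap f (Blocked rs k) = Blocked rs (fmap f . k)

instance Applicative Fetch where
  pure = Done
  Done f         <*> x              = fmap f x
  Blocked rs k   <*> Done a         = Blocked rs (\res -> k res <*> Done a)
  -- Both sides are visible and blocked: merge their requests into one batch.
  Blocked rs1 k1 <*> Blocked rs2 k2 =
    Blocked (rs1 ++ rs2) $ \res ->
      let (res1, res2) = splitAt (length rs1) res
      in k1 res1 <*> k2 res2

instance Monad Fetch where
  Done a       >>= f = f a
  -- The next action is hidden inside f until the left side has a value.
  Blocked rs k >>= f = Blocked rs (\res -> k res >>= f)

-- Request one root; the continuation expects exactly one result back.
slowSqrt :: Double -> Fetch Double
slowSqrt d = Blocked [d] $ \res -> case res of
  (r : _) -> Done r
  []      -> error "slowSqrt: engine returned no result"

-- Run to completion, returning the answer and the batches that were sent.
runFetch :: Fetch a -> (a, [[Double]])
runFetch (Done a)       = (a, [])
runFetch (Blocked rs k) =
  let (a, batches) = runFetch (k (map sqrt rs))
  in (a, rs : batches)

monadic, applicative :: Fetch Double
monadic     = slowSqrt 16 >>= \x -> slowSqrt 25 >>= \y -> return (x + y)
applicative = (\x y -> x + y) <$> slowSqrt 16 <*> slowSqrt 25
```

In this model runFetch monadic gives (9.0,[[16.0],[25.0]]), two batches of one, while runFetch applicative gives (9.0,[[16.0,25.0]]), one batch of two. The answer is the same either way; only the batching differs.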

As with writing out >>= expressions, writing stuff explicitly with <*> can be awkward, so GHC (8.0 and later) has do notation that desugars to Applicative where it can, and Monad where it cannot. This shouldn’t change the behaviour of code much, but if we re-run the first do expression with -XApplicativeDo set, magically we get a single query batch:

> runHaxl env $ do
| x <- slowSqrt 16
| y <- slowSqrt 25
| return (x + y)
|
slow square root engine: evaluating 2 roots
slow square root engine: evaluating root for 25.0
slow square root engine: evaluating root for 16.0
slow square root engine: done with this batch
9.0

Even better, ApplicativeDo will mix and match monadic and applicative binds as needed. In the following, we get two query batches because the third query, for z, needs the result x of the first query.

> runHaxl env $ do
| x <- slowSqrt 16
| y <- slowSqrt 25
| z <- slowSqrt x
| return (x + y + z)
|
slow square root engine: evaluating 2 roots
slow square root engine: evaluating root for 25.0
slow square root engine: evaluating root for 16.0
slow square root engine: done with this batch
slow square root engine: evaluating 1 roots
slow square root engine: evaluating root for 4.0
slow square root engine: done with this batch
11.0
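Written out by hand, that do block is roughly equivalent to the sketch below. This is an approximation for illustration, not the exact code GHC generates, and the parameter q is a hypothetical stand-in for slowSqrt so that the example works in any monad.

```haskell
-- Hand-written approximation of the mixed do block, for any monad m.
mixed :: Monad m => (Double -> m Double) -> m Double
mixed q =
  -- x and y do not depend on each other, so they are combined with <*> ...
  ((,) <$> q 16 <*> q 25) >>= \(x, y) ->
    -- ... but z needs x, so it has to sit behind a monadic bind.
    (\z -> x + y + z) <$> q x
```

For instance, mixed (Just . sqrt) evaluates to Just 11.0. The shape is what matters: one applicative group, then a bind, which is why Haxl sees a batch of two followed by a batch of one.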

So we’ve managed to convey what was happening in do notation to the monad (and applicative) instance, in a way that for Haxl happens to be useful for parallelisation.

It’s not always useful, though — for example, in IO things have to run sequentially even if there is no value dependency because one IO action might affect another IO action via the real world rather than via a value — something that is avoided in Haxl.
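A small sketch of that hazard, using an IORef from base as the shared piece of "real world" (the names bump and scale are invented for this example). Neither action uses the other's result, yet the order they run in changes the answer.

```haskell
import Data.IORef (newIORef, modifyIORef, readIORef)

-- Two actions with no value dependency that still interfere through
-- shared mutable state.
orderMatters :: IO (Int, Int)
orderMatters = do
  ref <- newIORef 0
  let bump  = modifyIORef ref (+ 1)  >> readIORef ref
      scale = modifyIORef ref (* 10) >> readIORef ref
  -- IO's <*> runs its left argument first, then its right.
  (,) <$> bump <*> scale
```

Run left to right, as IO does, this returns (1,10). If scale ran before bump, the same pair of actions would produce (1,0), so IO cannot reorder or merge them the way Haxl does with its read-only queries.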
