Functional database request accumulation

Ryan Brewster
Keep It Up
Published in
3 min readMay 13, 2016

This blog post deals with optimizations for large numbers of strictly independent database queries.

Dependent vs Independent Queries

It is common in our codebase to have code like

where we make multiple requests to the database and each depending on the previous request. These types of computations form a monad, and Slick 3.0 has done an excellent job of allowing you to harness a lot of abstractions based on that fact. However, by their very nature they must be performed in sequence.

By contrast, consider code like

which involves multiple independent queries. The code, as written, will require many round-trips to the database. We can make this faster by performing the requests in parallel, but that may overload the database, and it doesn’t address the overhead of multiple round-trips. A better strategy is to make a “batch” request, which retrieves many records from the database in a single trip, eliminating much of the overhead.

This becomes harder to do when the input is more complex.

Inefficiently Performing Many Independent Requests

Suppose you want to send a bunch of notifications to users. For simplicity, let’s start by considering a setup where users can share links with each other by posting them to a group page.

This will involve 3n round trips to the database. As we discussed previously, one way to cut down on this overhead is to perform a batch retrieval; in this case, we can prefetch all of the required user/group/post entities from the database in batch and store them in-memory.

This prefetching approach has reduced the number of database queries to 3, but it pushes some bookkeeping logic onto the programmer. Consider what happens when we want to generate notifications for more than just new posts.

The complexity ramps up quickly, and it becomes easy to introduce bugs by forgetting to prefetch the full set of required entities (can you spot the bug in the above code?).

Introducing BatchFetchable, a functional database accumulator

One solution is to create a wrapper around these computations. We call the wrapper for type T a BatchFetchable[T]. A BatchFetchable[T] knows what entities it needs, and how to generate a T if it were given those entities.

We can generate BatchFetchable[String] instead of String, and then later on decide how we want to actually go about fetching the right entities.

There are still a few issues with this. We haven’t addressed how to take a Seq[BatchFetchable[String]] and turn it into something useful, and we still have to manually provide the required keys — if we forget one we’ll end up throwing exceptions.

Play! has really amazing functional combinators (featured in a previous blog post about JSON formatting) that can get us out of this mess.

Combinators for BatchFetchable

The broad overview of this is that we’re going to define primitive BatchFetchable methods for each entity, and combinators that can compose them together easily, so we’ll never have to manually specify the keys we want at all.

The Play! combinators require two implementations, a Functor[BatchFetchable] and a FunctionalCanBuild[BatchFetchable].

which means now we can write pretty simple code like

With these combinators in hand, the generateNotifications code can be written without every manually defining the keys required; the BatchFetchable will keep track of everything we ask for and compose the keys and entities appropriately.

Once we’re ready, we can “run” the BatchFetchable.

Using BatchFetchable allows us to write code that is simple and performant for cases where we need to retrieve a bunch of independent records from the database. It reduces load on the database and is less error-prone than prefetching.

We wrote this post while working on Kifi — Connecting People with Knowledge.

--

--