Unfolding paginated APIs

Johan Boberg
Engineering at Earnest
4 min readDec 13, 2017

--

Here’s a common problem in the world of web development: you have to retrieve a list of items from an API endpoint but the endpoint only returns a subset of the whole list and a pointer to the next chunk. To fetch all items you have to page through the data set by repeatedly calling the same endpoint with an updated offset parameter. An example of this is Spotify’s API.

Implementing an client for such an API usually creates code that is mixing concerns such as building up the data structure, construct the url for the next call and error handling. There exists however a solution to this, and essentially any other problem that consists of building up a data structure using a value from the previous iteration, and it’s called unfold.

Folding and unfolding

To understand what an unfold is we’ll start by describing fold. Using fold, or reduce in JavaScript terms, one can take a data structure and reduce it down to a single value:

The code for fold looks like this:

Unfold is the inverse of this. You start with a single value and unfold it into a data structure. You also need to indicate when you’re done so that the algorithm will stop unfolding. Here’s unfold for a list:

Here’s a recursive version. Although this will cause a stack overflow if used to unfold too long of a sequence, it will come in handy later.

What fold and unfold have in common is that they abstract over the traversal of the data structure and let you focus on the algorithm. downToOne only has to care about the logic of what data to generate, not how to generate it.

Making unfold asynchronous

Here’s an example of a typical way to access a paginated API. Let’s ignore error handling for now:

Notice that this code is very similar to the recursive version of unfold. The only differences are:

  • The work is now being done inside of a Promise*.
  • There’s an extra concat in the last step before the array is returned.

* This is also what makes this recursive version work. doGetItems is not recursing right away, but later when the Promise resolves.

Let’s see if we can make those modifications to the recursive unfold. As a first step, let’s modify unfold to work with Promises.

The next step is to return a value for the terminal condition. We could just modify the contract of fn and include an element when done == true but let’s go a little further.

Let’s formalize the contract by enumerating the possible return values of fn. We can do this by using a concept from functional programming called a tagged union, or sum type, which is a type of algebraic data type (ADT for short). The functional programming library folktale has a module for ADTs that we can use.

This yields the following for our new asynchronous unfold, notice the 0 at the end, which is returned as the final element. The use of the Union ADT makes it very clear that we have separated what should be done from how to do it.

With this we’re ready to tackle the API client. There’s just one problem. The apiClient function returns a Promise of the response from the API and not an instance of Unfold, which unfoldAsync expects. We can solve that by transforming the response from the apiClient using then.

While this works we have now mixed the what and the how, exactly the thing we wanted to avoid by using unfold. Let’s move the transformation to unfoldAsync. As our element is actually a list we’ll also remove the [] around element in concat.

Error handling

Calling an API might not always succeed so we should add error handling as well. For this we will extend our Unfold data type with two new cases — Fail for when we want to abort the unfolding and Retry for when we want to retry the last operation. We’ll also make it possible to define how long we should wait before retrying. I’ll leave adding the number of retries as an exercise for the reader.

If we receive a Fail we’ll return a rejected Promise with the supplied error, which will cause the whole Promise chain to fail. If we receive a Retry we will, after a delay, call ourselves again using the same parameters as the current call.

Putting it all together

Most of the time all endpoints follow the same rules for success and error handling. We should therefore be able to reuse the logic we implemented for those cases. Due to the way unfoldAsync is defined we can partially apply it to the success and error handlers to create a new function that takes an endpoint handler and returns a function that automatically fetches all pages of that endpoint when given an offset.

For an extended example, check out this Spotify API client where I use this pattern in practice.

Summary

By using unfold to separate the concerns of building up a data structure from what to do with the items we have taken a problem and broken it down into its underlying components. Many problems that we face as developers can be broken down in a similar fashion. By doing so the code becomes easier to understand as the same patterns are reused over and over again.

Further Reading

This blog post came about after having read the excellent blog post Anamorphisms in JavaScript by Reginald Braithwaite. I would highly recommend checking it out as it explains unfold in much more detail than I have done here. You will recognize some of the examples as they were my inspiration for the examples in this post.

--

--

Johan Boberg
Engineering at Earnest

Not a sinner nor a saint when it comes to software development. Devoted covert anomaly hunter.