Unfolding paginated APIs
Here’s a common problem in the world of web development: you have to retrieve a list of items from an API endpoint but the endpoint only returns a subset of the whole list and a pointer to the next chunk. To fetch all items you have to page through the data set by repeatedly calling the same endpoint with an updated offset parameter. An example of this is Spotify’s API.
Implementing an client for such an API usually creates code that is mixing concerns such as building up the data structure, construct the url for the next call and error handling. There exists however a solution to this, and essentially any other problem that consists of building up a data structure using a value from the previous iteration, and it’s called unfold.
Folding and unfolding
To understand what an unfold is we’ll start by describing fold. Using fold, or reduce in JavaScript terms, one can take a data structure and reduce it down to a single value:
The code for fold
looks like this:
Unfold is the inverse of this. You start with a single value and unfold it into a data structure. You also need to indicate when you’re done so that the algorithm will stop unfolding. Here’s unfold for a list:
Here’s a recursive version. Although this will cause a stack overflow if used to unfold too long of a sequence, it will come in handy later.
What fold
and unfold
have in common is that they abstract over the traversal of the data structure and let you focus on the algorithm. downToOne
only has to care about the logic of what data to generate, not how to generate it.
Making unfold asynchronous
Here’s an example of a typical way to access a paginated API. Let’s ignore error handling for now:
Notice that this code is very similar to the recursive version of unfold. The only differences are:
- The work is now being done inside of a Promise*.
- There’s an extra
concat
in the last step before the array is returned.
* This is also what makes this recursive version work. doGetItems
is not recursing right away, but later when the Promise resolves.
Let’s see if we can make those modifications to the recursive unfold. As a first step, let’s modify unfold
to work with Promises.
The next step is to return a value for the terminal condition. We could just modify the contract of fn
and include an element
when done == true
but let’s go a little further.
Let’s formalize the contract by enumerating the possible return values of fn
. We can do this by using a concept from functional programming called a tagged union, or sum type, which is a type of algebraic data type (ADT for short). The functional programming library folktale has a module for ADTs that we can use.
This yields the following for our new asynchronous unfold, notice the 0 at the end, which is returned as the final element. The use of the Union
ADT makes it very clear that we have separated what should be done from how to do it.
With this we’re ready to tackle the API client. There’s just one problem. The apiClient
function returns a Promise of the response from the API and not an instance of Unfold
, which unfoldAsync
expects. We can solve that by transforming the response from the apiClient
using then
.
While this works we have now mixed the what and the how, exactly the thing we wanted to avoid by using unfold. Let’s move the transformation to unfoldAsync
. As our element
is actually a list we’ll also remove the []
around element
in concat
.
Error handling
Calling an API might not always succeed so we should add error handling as well. For this we will extend our Unfold
data type with two new cases — Fail
for when we want to abort the unfolding and Retry
for when we want to retry the last operation. We’ll also make it possible to define how long we should wait before retrying. I’ll leave adding the number of retries as an exercise for the reader.
If we receive a Fail
we’ll return a rejected Promise with the supplied error, which will cause the whole Promise chain to fail. If we receive a Retry
we will, after a delay, call ourselves again using the same parameters as the current call.
Putting it all together
Most of the time all endpoints follow the same rules for success and error handling. We should therefore be able to reuse the logic we implemented for those cases. Due to the way unfoldAsync
is defined we can partially apply it to the success and error handlers to create a new function that takes an endpoint handler and returns a function that automatically fetches all pages of that endpoint when given an offset.
For an extended example, check out this Spotify API client where I use this pattern in practice.
Summary
By using unfold to separate the concerns of building up a data structure from what to do with the items we have taken a problem and broken it down into its underlying components. Many problems that we face as developers can be broken down in a similar fashion. By doing so the code becomes easier to understand as the same patterns are reused over and over again.
Further Reading
This blog post came about after having read the excellent blog post Anamorphisms in JavaScript by Reginald Braithwaite. I would highly recommend checking it out as it explains unfold in much more detail than I have done here. You will recognize some of the examples as they were my inspiration for the examples in this post.