How Do I Make Pagination With GraphQL Dataloader

Published in The Startup · 4 min read · Sep 7, 2020

Since its public release in 2015, GraphQL, developed by Facebook, has often been predicted to replace REST as the way clients and servers communicate. According to the official documentation, GraphQL is a query language for APIs and a runtime for fulfilling those queries with your existing data. It provides a complete and understandable description of the data in the API, gives clients the freedom to request only the data they need, and makes it easier to evolve the API over time, which makes it a powerful development tool.

Dataloader, according to its official documentation, is “a generic utility to be used as part of your application’s data fetching layer to provide a simplified and consistent API over various remote data sources such as databases or web services via batching and caching”. It is also a library developed by Facebook. Although it can be used elsewhere, Dataloader is generally paired with GraphQL to batch the many small database requests that nested resolvers would otherwise issue one by one (the N+1 problem).
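To make the N+1 problem concrete, here is a minimal sketch that counts queries against a hypothetical in-memory `db` object. The `db` helpers and the sample cities are assumptions made up for illustration; no real database or the Dataloader package is involved.

```javascript
// Hypothetical in-memory "database" that counts how many queries it receives.
const db = {
  queryCount: 0,
  merchants: [{ id: "m1" }, { id: "m2" }, { id: "m3" }],
  locations: [
    { merchantId: "m1", city: "Jakarta" },
    { merchantId: "m2", city: "Bandung" },
    { merchantId: "m3", city: "Surabaya" },
  ],
  findMerchants() {
    this.queryCount += 1;
    return this.merchants;
  },
  findLocation(merchantId) {
    this.queryCount += 1;
    return this.locations.find((row) => row.merchantId === merchantId);
  },
  findLocations(merchantIds) {
    this.queryCount += 1;
    return this.locations.filter((row) => merchantIds.includes(row.merchantId));
  },
};

// N+1: one query for the list, then one more query per merchant.
function naiveFetch() {
  db.queryCount = 0;
  const merchants = db.findMerchants();
  merchants.forEach((merchant) => db.findLocation(merchant.id));
  return db.queryCount;
}

// Batched: one query for the list, one query for all locations at once.
function batchedFetch() {
  db.queryCount = 0;
  const merchants = db.findMerchants();
  db.findLocations(merchants.map((merchant) => merchant.id));
  return db.queryCount;
}
```

With three merchants, the naive version issues four queries and the batched version two. Dataloader automates the batched form by collecting the keys requested within a single tick of the event loop and handing them to one batch function.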

After using it for a while, we noticed that Dataloader doesn’t have a built-in API for pagination. So how do we solve that? And do we really need pagination?

Let’s hack!

ERD

We will build an e-commerce platform where each merchant has one location (one-to-one) and many products (one-to-many). The ERD is as follows.

GraphQL Schema

With the ERD defined, we define the schema next. Here is the initial GraphQL schema (without pagination).

Basic schema
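A minimal sketch of what such a schema could look like, written as an SDL string in JavaScript. The `location` and `products` fields and the product columns `id`, `name`, and `merchantId` come from this article; every other type and field name is an assumption.

```javascript
// Hypothetical SDL for the merchant/location/product model described above.
const typeDefs = `
  type Location {
    id: ID!
    merchantId: ID!
    address: String
  }

  type Product {
    id: ID!
    merchantId: ID!
    name: String!
  }

  type Merchant {
    id: ID!
    name: String!
    location: Location      # one-to-one
    products: [Product!]!   # one-to-many, returns every product
  }

  type Query {
    merchants: [Merchant!]!
  }
`;
```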

GraphQL Resolver

And here is the resolver, also without pagination.

Basic resolver
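A resolver map along these lines might look like the sketch below. The names `context.db`, `findMerchants`, and `context.loaders` are hypothetical; the only thing assumed about the loaders is that they expose Dataloader's `load(key)` method.

```javascript
// Resolver map sketch: the relation fields delegate to per-request loaders.
const resolvers = {
  Query: {
    // Top-level list of merchants, fetched with a single query.
    merchants: (_parent, _args, context) => context.db.findMerchants(),
  },
  Merchant: {
    // Each merchant asks the shared loader for its own rows; the loader can
    // then combine these per-merchant calls into one database round trip.
    location: (merchant, _args, context) =>
      context.loaders.location.load(merchant.id),
    products: (merchant, _args, context) =>
      context.loaders.products.load(merchant.id),
  },
};
```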

GraphQL Instance and Dataloader

To create fresh Dataloader instances on every request, and for convenience, I place them in the GraphQL context.

GraphQL instance with Dataloader on its context
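The per-request setup can be sketched as a small context factory. To keep the example free of external packages, the loader class is passed in as `LoaderClass`; it is expected to behave like the `dataloader` package's `DataLoader`, which is constructed with a batch function. The `db.locationsByMerchantIds` and `db.productsByMerchantIds` helpers are hypothetical.

```javascript
// Build a fresh set of loaders for every incoming request so that cached rows
// are never shared between requests.
function createContext(LoaderClass, db) {
  return {
    db,
    loaders: {
      location: new LoaderClass(async (merchantIds) => {
        const rows = await db.locationsByMerchantIds(merchantIds);
        // Return exactly one value per key, in the same order as the keys.
        return merchantIds.map(
          (id) => rows.find((row) => row.merchantId === id) || null
        );
      }),
      products: new LoaderClass(async (merchantIds) => {
        const rows = await db.productsByMerchantIds(merchantIds);
        // One array of products per merchant id, again in key order.
        return merchantIds.map((id) =>
          rows.filter((row) => row.merchantId === id)
        );
      }),
    },
  };
}
```

Many GraphQL servers accept a function that builds the context for each request; calling `createContext` from that function gives every request its own loaders.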

The code snippet above is a typical example of using a Dataloader. In our schema, the location and products fields return all the data related to a merchant.

The problem arises when a merchant has a lot of products. Let’s say merchant A has 1000 products. Returning all 1000 at once is not a good idea: besides the memory cost, the response also consumes a lot of bandwidth.

The solution? Pagination!

A common technique for handling large data sets is pagination, and yes, we need it to work with Dataloader. To paginate, the client generally sends supporting input variables such as limit, offset, orderBy, sortBy, and search. Let’s implement it!

GraphQL schema with pagination
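A sketch of how the pagination input could be added to the schema, again as an SDL string. The five input names come from the list above; the `PaginationInput` and `SortDirection` type names, the default values, and the reading of `orderBy` as the column and `sortBy` as the direction are assumptions.

```javascript
// Hypothetical SDL: the products field now accepts a pagination argument.
const typeDefs = `
  enum SortDirection {
    ASC
    DESC
  }

  input PaginationInput {
    limit: Int = 10
    offset: Int = 0
    orderBy: String = "name"      # column to sort by (assumed meaning)
    sortBy: SortDirection = ASC   # sort direction (assumed meaning)
    search: String
  }

  type Product {
    id: ID!
    merchantId: ID!
    name: String!
  }

  type Merchant {
    id: ID!
    products(pagination: PaginationInput): [Product!]!
  }
`;
```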

After defining the pagination input, we have to pass the pagination arguments to the Dataloader. We could pass them together with the id as the argument of the load method. In my opinion, however, that is “messy” and goes against how arguments are meant to be used in the Dataloader. So, instead of exposing the Dataloader instance directly, I expose a function that takes the pagination arguments and returns the Dataloader instance.
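One possible shape for such a function is sketched below. It keeps one loader per distinct set of pagination arguments, so repeated calls with the same arguments receive the same instance. The names `createPaginatedLoader` and `createLoader` are hypothetical, and the cache-by-arguments approach is an assumption about how to keep instances stable, not necessarily the exact code behind this article.

```javascript
// Returns a function `(pagination) => loader`. Calls with equal pagination
// arguments get the same loader instance back, so their keys can still be
// batched together. `createLoader` builds a DataLoader-like object for one
// set of pagination arguments.
function createPaginatedLoader(createLoader) {
  const instances = new Map();

  return function getLoader(pagination = {}) {
    // Sort the keys so { limit, offset } and { offset, limit } share a slot.
    const cacheKey = JSON.stringify(
      Object.keys(pagination)
        .sort()
        .map((name) => [name, pagination[name]])
    );
    if (!instances.has(cacheKey)) {
      instances.set(cacheKey, createLoader(pagination));
    }
    return instances.get(cacheKey);
  };
}
```

The returned function would be created once per request and stored in the context, so the instances live only as long as the request does.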

After passing the pagination arguments to the Dataloader, are we done? Not yet. The next challenge is fetching all the data we need with a single request to the database. Do we need additional libraries? No. We only need a rarely used SQL operator, namely UNION. UNION combines the result sets of several SELECT statements into one and removes duplicate rows, so each merchant can have its own ORDER BY and LIMIT while everything still travels in one round trip. An example of using union with Knex is as follows.

Dataloader with pagination
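The query-building step can be sketched without any library as a function that produces the SQL text and its bindings, plus a helper that regroups the rows per merchant. This is a plain-string sketch in MySQL syntax, not the Knex builder code; the function names, the column whitelist, and the default values are assumptions.

```javascript
// Assumed whitelist: column names cannot be bound as parameters, so only
// known columns are allowed into the ORDER BY clause.
const SORTABLE_COLUMNS = ["id", "name"];

// Builds one parenthesized SELECT per merchant and joins them with UNION.
function buildUnionQuery(merchantIds, pagination = {}) {
  const { limit = 10, offset = 0, orderBy = "name", sortBy = "ASC" } = pagination;
  if (merchantIds.length === 0) {
    throw new Error("at least one merchant id is required");
  }
  if (!SORTABLE_COLUMNS.includes(orderBy)) {
    throw new Error("unsupported orderBy column: " + orderBy);
  }
  const direction = String(sortBy).toUpperCase() === "DESC" ? "DESC" : "ASC";
  const subquery =
    "(SELECT * FROM products WHERE `merchantId` = ? " +
    "ORDER BY `" + orderBy + "` " + direction + " LIMIT ?, ?)";
  return {
    sql: merchantIds.map(() => subquery).join(" UNION "),
    // MySQL's two-argument LIMIT takes the offset first, then the row count.
    bindings: merchantIds.flatMap((id) => [id, Number(offset), Number(limit)]),
  };
}

// A batch function must return one entry per key, in key order.
function groupByMerchant(merchantIds, rows) {
  return merchantIds.map((id) => rows.filter((row) => row.merchantId === id));
}
```

With Knex, the pair could be passed to `knex.raw(sql, bindings)`. Since MySQL does not guarantee the order of rows in the combined result, each group may need to be re-sorted in application code after grouping.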

Then, we need to update our resolvers as well.

Call the loader as a function with an argument instead of as an object
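In resolver terms, the change could look like the sketch below: `context.loaders.products` is now a function that receives the pagination arguments and returns an object with Dataloader's `load(key)` method. The `pagination` argument name and the `context.loaders` layout are assumptions.

```javascript
// The products loader is called with the field arguments first, and only
// then asked to load the rows for this merchant.
const resolvers = {
  Merchant: {
    products: (merchant, args, context) =>
      context.loaders.products(args.pagination).load(merchant.id),
  },
};
```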

To understand how this loader works, take a look at the illustration below.

>> merchantIds = ["0713c0b4-a068-4909-b851-a4fcdad4d111", "3ec5d4f6-1459-4fcc-990a-3791bad34843"]

>> (SELECT * FROM products WHERE `merchantId` = "0713c0b4-a068-4909-b851-a4fcdad4d111" ORDER BY `name` ASC LIMIT 0,10)
   UNION
   (SELECT * FROM products WHERE `merchantId` = "3ec5d4f6-1459-4fcc-990a-3791bad34843" ORDER BY `name` ASC LIMIT 0,10)

<< [{"id":"98bfa6a5-5558-4d21-9c6c-db5d6267acc4","name":"Kecap Bango 400ml","merchantId":"0713c0b4-a068-4909-b851-a4fcdad4d111"}, ..., {"id":"9d44ca99-6f87-41c0-bf70-bc50db3d2707","name":"Apple iPhone 11 Pro","merchantId":"3ec5d4f6-1459-4fcc-990a-3791bad34843"}, ...]

And that’s all. Now we can paginate the products field instead of returning all its data. Easy, right?

You may be asking: why should I declare the Dataloader instance in a variable first instead of returning it right away, as in the snippet below?

Don't do this: returning a new Dataloader instance directly

The purpose of calling the Dataloader in the relations defined on the resolver is to collect the individual requests and then send them together through the same instance. If we create a new instance on every call, each instance only ever sees a single key, so nothing is batched and the result is the same as not using the Dataloader at all. Maybe you should try it yourself to get a better understanding of this concept.
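The difference can be observed by counting batch calls. The sketch below uses a tiny stand-in for Dataloader's batching, written only for this illustration and not the real package, so that it runs without dependencies; `makeLoader`, `freshLoader`, and `sharedLoader` are hypothetical names.

```javascript
// Tiny stand-in for DataLoader's batching (not the real package): keys loaded
// in the same tick are handed to the batch function together.
function makeLoader(batchFn) {
  let queue = [];
  return {
    load(key) {
      return new Promise((resolve) => {
        queue.push({ key, resolve });
        // The first key queued in this tick schedules the single flush.
        if (queue.length === 1) {
          process.nextTick(async () => {
            const pending = queue;
            queue = [];
            const values = await batchFn(pending.map((item) => item.key));
            pending.forEach((item, index) => item.resolve(values[index]));
          });
        }
      });
    },
  };
}

let batchCalls = 0;
const batchProducts = async (merchantIds) => {
  batchCalls += 1;
  return merchantIds.map((id) => "products of " + id);
};

// Anti-pattern: a new loader per call, so every queue holds a single key.
const freshLoader = () => makeLoader(batchProducts);

// Intended use: one shared instance that sees all keys of the request.
const shared = makeLoader(batchProducts);
const sharedLoader = () => shared;

// Loads three merchants through the given accessor and reports how many
// times the batch function ran.
async function countBatches(getLoader) {
  batchCalls = 0;
  await Promise.all(["m1", "m2", "m3"].map((id) => getLoader().load(id)));
  return batchCalls;
}
```

Here `countBatches(freshLoader)` resolves to 3, one batch call per merchant, while `countBatches(sharedLoader)` resolves to 1.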

That’s all I can share with you this time. Hopefully, this article helps if you face the same problem. Next, I’ll write about Dataloader's ambiguity and how to solve it. So, until next time 👋