Why Batching Your Apollo SQL DataSources Is Invaluable
An in-depth guide to this amazing feature
In this article, I will explore why batching your database queries with a DataLoader is crucial to keeping your Apollo Server resolvers fast.
If you are using SQL with a DataSource (or directly in your resolvers), you are more than likely to come across the N+1 problem or hit unsolved performance issues. Although Apollo and the community offer a few different data sources, which can be seen below, none of the SQL DataSource options have a solution for batching your requests.
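To make the N+1 problem concrete before going any further, here is what a naive, unbatched resolver pair can look like. The foo/bar schema and the knex instance below are made up purely for illustration and are not taken from any particular data source:

// Hypothetical schema, for illustration only: each foo row has a barId
// pointing at a row in bar; "knex" is assumed to be a configured knex instance.
const resolvers = {
  Query: {
    // 1 query to fetch N foos...
    foos: () => knex.select("*").from("foo"),
  },
  Foo: {
    // ...then this runs once per foo, firing N additional queries.
    bar: (foo) => knex.select("*").from("bar").where({ id: foo.barId }).first(),
  },
};

Resolving a list of 100 foos this way issues 101 SQL queries; a DataLoader collapses the 100 bar lookups into a single batched query.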
For this reason, I created a new DataSource which can be seen here:
https://www.npmjs.com/package/@nic-jennings/batched-sql-datasource
This DataSource lets you add batching to your DataSource where it is needed, without scattering DataLoader initialisations all over your class and making it messy.
I will demonstrate how to implement this and then show the difference in performance when using a DataLoader as part of your DataSource.
Setting up the Batched SQL DataLoader
I will assume you have an Apollo GraphQL server installed, but if not, there is a full example in this GitHub Repository.
To utilise the DataSource, first install the package:
npm i @nic-jennings/batched-sql-datasource
// or
// yarn add @nic-jennings/batched-sql-datasource
Then create a new file called loader.js, define a DataSource that extends BatchedSQLDataSource, and import it into your Apollo Server:
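The gist from the original article is not reproduced here, but a minimal sketch of loader.js looks roughly like the following. Treat it as an approximation rather than the package's exact API: the db.query and db.batch helpers and their signatures are assumptions based on my reading of the package README, so defer to the README for the authoritative version.

// loader.js: a rough sketch, not the exact code from the article.
import { BatchedSQLDataSource } from "@nic-jennings/batched-sql-datasource";

export class MyDataSource extends BatchedSQLDataSource {
  // Standard knex query, no batching.
  getFoos() {
    return this.db.query.select("*").from("foo");
  }

  // Batched query: every key requested in the same tick is collected by a
  // DataLoader and resolved with a single WHERE IN query. The exact
  // signature of db.batch is an assumption; check the package README.
  getBar = this.db.batch({
    query: (keys) => this.db.query.select("*").from("bar").whereIn("id", keys),
    extractFn: (row) => row.id,
  });
}

Then wire it into your server. This sketch uses the Apollo Server 4 standalone style and assumes typeDefs and resolvers are defined elsewhere; the constructor options are again an assumption:

// index.js: a sketch of the server wiring.
import { ApolloServer } from "@apollo/server";
import { startStandaloneServer } from "@apollo/server/standalone";
import { MyDataSource } from "./loader.js";
import { typeDefs, resolvers } from "./schema.js"; // assumed to exist

const knexConfig = {
  client: "pg",
  connection: process.env.DATABASE_URL, // assumption: a local Postgres instance
};

const server = new ApolloServer({ typeDefs, resolvers });

const { url } = await startStandaloneServer(server, {
  context: async () => ({
    // Constructor shape is an assumption; see the package README.
    dataSources: { db: new MyDataSource({ knexConfig }) },
  }),
});

console.log(`Server ready at ${url}`);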
In your resolver, you can now use the DataLoader we created for bar:
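Again a hedged sketch rather than the original gist, assuming the batched method exposes DataLoader's load function and that the DataSource is available on the context as dataSources.db:

// Hypothetical resolvers; type and field names are illustrative only.
const resolvers = {
  Query: {
    foos: (_parent, _args, { dataSources }) => dataSources.db.getFoos(),
  },
  Foo: {
    // Each call goes through the DataLoader, so the bars for every foo in the
    // response are fetched with one batched SQL query instead of N queries.
    bar: (foo, _args, { dataSources }) => dataSources.db.getBar.load(foo.barId),
  },
};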
As you can see, the setup is clean and quick.
Setting Up an Example Server
I created the following example here: https://github.com/nic-jennings/batched-sql-datasource/tree/main/example
Running this with docker-compose and a seeded PostgreSQL database, I executed the queries and got the following results:
Events
Events batched
As you can see, the batched query takes only 57.9ms, whereas the standard DataSource takes 567ms, which is roughly 9.79 times faster! However, you are probably thinking that a single query in Apollo Studio isn't representative of real-world traffic. So I wrote some load tests using k6 and ran them against my local Docker Compose build. You can see them below:
Load Testing — Batched vs Standard
I set up the thresholds below, which I feel are fairly lenient, taking into consideration that I am running them against a local Docker Compose stack:
- HTTP errors should be less than 1%
- 95% of requests should be below 250ms
thresholds: {
http_req_failed: ["rate<0.01"],
http_req_duration: ["p(95)<250"],
},
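For reference, a self-contained k6 script using those thresholds looks roughly like this. The port, the GraphQL query, and the virtual-user settings are placeholders rather than the exact tests from the repository:

// load-test.js: a minimal k6 sketch; the real tests live in the example repo.
import http from "k6/http";
import { check } from "k6";

export const options = {
  // Assumption: the actual tests may use different VU counts or stages.
  vus: 50,
  duration: "30s",
  thresholds: {
    http_req_failed: ["rate<0.01"],
    http_req_duration: ["p(95)<250"],
  },
};

export default function () {
  // Assumption: the example server listens on localhost:4000; replace the
  // query with the nested events query from the example schema.
  const res = http.post(
    "http://localhost:4000/graphql",
    JSON.stringify({ query: "{ events { id } }" }),
    { headers: { "Content-Type": "application/json" } }
  );
  check(res, { "status is 200": (r) => r.status === 200 });
}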
Running the tests, which can be seen here, provides the following results:
Standard datasource
Batched datasource
Results
The first thing to note is that the standard DataSource failed the load test. Over 95% of requests took more than 250ms, the average request duration was 1.15 seconds, and the slowest took just under three seconds, which, with only a small dataset, is a shocking response time. I think this truly demonstrates the N+1 issue and how a DataLoader can solve it. By comparison, the batched DataSource took on average 22.31ms to complete a request, with the longest taking 132.98ms.
Conclusion
I personally feel the above undeniably shows the difference a DataLoader makes to your GraphQL request times. I hope you agree that the package I created makes creating and using DataLoaders in Apollo easy and keeps your code clean.
Thank you for taking the time to read my article. I hope you have found it informative and interesting. I will write more articles around TypeScript, Node, React, Vue, GraphQL, Performance, Go, and more.