Using Data Loaders with Apollo Federation

Christopher Gustafson
Volvo Car Mobility Tech


This is the third part in a series of blog posts on the topic of Apollo Federation, and how we use it at Volvo Car Mobility. If you want to read more, make sure to check out the other posts we wrote about Apollo Federation.

In this blog post we will cover Data Loaders and the N+1 problem. We assume you have some basic knowledge of how GraphQL and Apollo Federation work. There will also be some code examples written in Kotlin.

The N+1 Problem

A common problem when creating GraphQL APIs is the N+1 problem. Let’s consider an example. We have one subgraph responsible for resolving information about a car, like its car model, and we have another subgraph responsible for keeping track of stations and what cars are at a certain station:

# Car subgraph
type Car @key(fields: "id") {
  id: ID!
  model: String!
}

# Station subgraph
type Station {
  id: ID!
  cars: [Car!]!
}

A station contains a list of N cars. Say we want to query the model field of each car in the station. This information lives in the car subgraph, which means N queries need to be made to resolve the model of each car. So the result is one query for the list of cars, and N queries for the model of each car. Hence, N+1 queries.
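
To make the counting concrete, here is a minimal sketch in plain Kotlin. It is not tied to GraphQL or DGS: CarRepository, its two lookup functions and the sample car models are hypothetical stand-ins for whatever data source backs the car subgraph. It only counts how many lookups each approach would make for a station with three cars.

```kotlin
// Hypothetical in-memory stand-in for the car subgraph's data source.
data class Car(val id: String, val model: String)

class CarRepository(private val cars: Map<String, Car>) {
    var queryCount = 0
        private set

    // One lookup per call: the naive approach invokes this once per car.
    fun findById(id: String): Car? {
        queryCount++
        return cars[id]
    }

    // One lookup for the whole collection of ids.
    fun findByIds(ids: Collection<String>): List<Car> {
        queryCount++
        return ids.mapNotNull { cars[it] }
    }
}

// Made-up sample data for illustration only.
fun sampleRepository() = CarRepository(
    listOf(Car("1", "Model A"), Car("2", "Model B"), Car("3", "Model C"))
        .associateBy { it.id }
)

// 1 query for the station's list of cars + N queries for the models.
fun naiveQueryCount(carIds: List<String>): Int {
    val repo = sampleRepository()
    val stationQueries = 1
    carIds.forEach { repo.findById(it) }
    return stationQueries + repo.queryCount
}

// 1 query for the station's list of cars + 1 batched query for all models.
fun batchedQueryCount(carIds: List<String>): Int {
    val repo = sampleRepository()
    val stationQueries = 1
    repo.findByIds(carIds)
    return stationQueries + repo.queryCount
}

fun main() {
    val ids = listOf("1", "2", "3")
    println("naive: ${naiveQueryCount(ids)} queries")
    println("batched: ${batchedQueryCount(ids)} queries")
}
```

With N = 3 the naive version makes N+1 lookups, while the batched version makes two regardless of how many cars the station holds.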

With Federation, the gateway avoids making N queries to the car subgraph by instead making one batched query for the model of all cars. However, the reference resolver of Car in the car subgraph needs to be modified to properly handle a batched query. If not, the resolver will still be called N times, which in turn might cause N database queries. This can be avoided by using Data Loaders in DGS (Domain Graph Service), the GraphQL framework from Netflix.

Using a Data Loader

When a DGS data loader is used, the framework recognises that many cars need to be loaded, batches up all the calls to the data loader, and calls it once with the collected car keys, which avoids the N+1 problem. The following is an example of a CarDataLoader in the service responsible for the car subgraph:

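import com.netflix.graphql.dgs.DgsDataLoader
import org.dataloader.MappedBatchLoader
import java.util.concurrent.CompletableFuture
import java.util.concurrent.CompletionStage

@DgsDataLoader
class CarDataLoader(
    private val carService: CarService
) : MappedBatchLoader<String, Car> {

    // Called once per batch with the car ids collected for that batch.
    override fun load(keys: Set<String>): CompletionStage<Map<String, Car>> =
        CompletableFuture.supplyAsync {
            // Key each car by its id so results are matched by key, not by position.
            carService.getCars(keys.toList()).associateBy { it.id }
        }
}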

In the Data Loader, we define the function load(), which takes a set of keys, fetches the matching cars in one batch by calling carService.getCars(), and returns them as a map keyed by car id.

We then use the data loader in our reference resolver by calling DataLoader.load() with the key we want to load. In the case of a reference resolver, the key is the id of the entity we are resolving, which is passed in by the federation gateway.

@DgsEntityFetcher(name = "Car")
fun car(
    map: Map<String, Any>,
    dfe: DgsDataFetchingEnvironment
): CompletableFuture<Car> {
    val carId = map["id"].toString()
    val dataLoader: DataLoader<String, Car> = dfe.getDataLoader(CarDataLoader::class.java)
    return dataLoader.load(carId)
}
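
To illustrate what "batching up the calls" means, the following is a toy loader written in plain Kotlin. It is only a sketch of the idea and not how DGS or java-dataloader is actually implemented: ToyBatchingLoader, loadModels and the sample cars are hypothetical names and data introduced for this example. Each load() call only registers a key and hands back a future; the batch function runs once when dispatch() is called.

```kotlin
import java.util.concurrent.CompletableFuture

data class Car(val id: String, val model: String)

// Toy loader for illustration only: NOT the java-dataloader implementation.
class ToyBatchingLoader<K, V>(private val batchFn: (Set<K>) -> Map<K, V>) {
    private val pending = LinkedHashMap<K, CompletableFuture<V?>>()

    var batchCalls = 0
        private set

    // Registers interest in a key; nothing is fetched yet.
    fun load(key: K): CompletableFuture<V?> =
        pending.getOrPut(key) { CompletableFuture<V?>() }

    // Fetches all pending keys in one call and completes each future by key.
    fun dispatch() {
        if (pending.isEmpty()) return
        val batch = LinkedHashMap<K, CompletableFuture<V?>>(pending)
        pending.clear()
        batchCalls++
        val values = batchFn(batch.keys)
        for ((key, future) in batch) {
            // A key missing from the result completes with null.
            future.complete(values[key])
        }
    }
}

// Loads the model of every id and reports how often the batch function ran.
fun loadModels(ids: List<String>): Pair<List<String?>, Int> {
    // Made-up sample data standing in for carService.getCars().
    val cars = listOf(Car("1", "Model A"), Car("2", "Model B"), Car("3", "Model C"))
        .associateBy { it.id }
    val loader = ToyBatchingLoader<String, Car> { keys -> cars.filterKeys { it in keys } }

    val futures = ids.map { loader.load(it) } // N load() calls, no fetch yet
    loader.dispatch()                          // a single batched fetch
    return futures.map { it.join()?.model } to loader.batchCalls
}

fun main() {
    val (models, calls) = loadModels(listOf("1", "2", "3"))
    println("models=$models, batch calls=$calls")
}
```

In DGS you do not call a dispatch function yourself; as described above, the framework decides when to hand the collected keys to the data loader.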

We prefer MappedBatchLoader over BatchLoader

We want to highlight one detail in the data loader above: the use of MappedBatchLoader instead of BatchLoader. Using BatchLoader, our data loader would look like the following:

@DgsDataLoader
class CarDataLoader(
    private val carService: CarService
) : BatchLoader<String, Car> {

    override fun load(keys: List<String>): CompletionStage<List<Car>> =
        CompletableFuture.supplyAsync {
            carService.getCars(keys)
        }
}

In the DGS data loader documentation, the first example presented uses BatchLoader, which makes it look like the default option for data loaders. But a small implementation detail, which can only be found in the source code documentation of BatchLoader, reveals why it is error-prone:

There are a few constraints that must be upheld:

- The list of values must be the same size as the list of keys.

- Each index in the list of values must correspond to the same index in the list of keys.

This pitfall can be quite tricky to spot when just reading the online documentation. BatchLoader requires you to guarantee that the values are returned in the same order as the keys. If you rely on, for example, a database call to fetch your values, that order might not be guaranteed. The effects are also hard to catch, as a mismatched order doesn’t cause any errors: the BatchLoader implementation still returns a list of cars, but when you finally load a single car, you might end up getting the wrong one. This can be devastating! MappedBatchLoader avoids the problem because each value is returned in a map under its own key, so the order no longer matters.
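
The pitfall can be reproduced without any framework. The sketch below uses hypothetical helpers: pairByIndex mimics matching values to keys purely by position, as the constraints quoted above describe, and alignToKeys is one possible way to reorder a result if you have to stay with BatchLoader. The out-of-order list stands in for a database result and is made-up sample data.

```kotlin
data class Car(val id: String, val model: String)

// Mimics index-based matching: the value at index i is assumed to belong to key i.
fun <K, V> pairByIndex(keys: List<K>, values: List<V>): Map<K, V> =
    keys.zip(values).toMap()

// Reorders values to follow the key order. Keys without a value become null,
// so the result always has the same size as the key list.
fun <K, V> alignToKeys(keys: List<K>, values: List<V>, keyOf: (V) -> K): List<V?> {
    val byKey = values.associateBy(keyOf)
    return keys.map { byKey[it] }
}

fun main() {
    val keys = listOf("1", "2", "3")
    // Hypothetical database result that comes back in a different order.
    val fromDb = listOf(Car("3", "Model C"), Car("1", "Model A"), Car("2", "Model B"))

    // No error is raised, but key "1" is silently paired with the car whose id is "3".
    val misaligned = pairByIndex(keys, fromDb)
    println("key 1 -> ${misaligned["1"]}")

    // After re-aligning, key "1" is paired with the car whose id is "1".
    val aligned = pairByIndex(keys, alignToKeys(keys, fromDb) { it.id })
    println("key 1 -> ${aligned["1"]}")
}
```

Returning a map from MappedBatchLoader makes this reordering step unnecessary, which is why we prefer it.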

If you want to read more about our work with Federation at Volvo Car Mobility, make sure to check out the other blog posts we wrote on the topic.

Join the movement

If this blog post made you interested in working with Apollo Federation at Volvo Car Mobility, make sure to explore our careers page for open roles.

Written by Christopher Gustafson, Iman Radjavi and Alexander Lindquister.
