Big movements in the graphql-java world

Brad Baker
4 min read · Aug 31, 2017


graphql-java 4.0 is out and it's kinda a big deal!

graphql-java 4.0 has been released and it contains some great new capabilities that make it an even more compelling approach to getting data.

Fully Asynchronous Queries

The one capability I am most excited about is that the execution of queries is now fully asynchronous. This allows you to resolve fields in parallel which will speed up most queries.

For example imagine this StarWars query:

query {
  hero {
    name
    friends {
      name
    }
    appearsIn
  }
}

The hero field needs to be resolved first because it is an input to the field selection below it. But the name, friends and appearsIn fields can technically all be processed at the same time because their values do not feed into each other.

Doing three things at once is faster than doing three things one after the other

yes yes I know…not always but most of the time…stay with me here…

Truly asynchronous behaviour is an opt-in thing. The default execution strategy, AsyncExecutionStrategy, runs the fields on the same thread by default, but you can opt in by having your DataFetcher instances return java.util.concurrent.CompletableFuture values.

A CompletableFuture is a promise to some data. graphql-java will stitch all the promises back together into the required order as defined by the graphql query.
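To make the "three things at once" idea concrete, here is a minimal sketch that uses only the JDK, not graphql-java itself. The fetchName, fetchFriends and fetchAppearsIn methods are hypothetical stand-ins for DataFetchers that return promises, and the episode names are placeholder values. The three promises are started together and then stitched back into the order the query asked for.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

class ParallelFieldsSketch {

    // Hypothetical stand-ins for three independent field fetches on the hero.
    static CompletableFuture<String> fetchName() {
        return CompletableFuture.supplyAsync(() -> "R2-D2");
    }

    static CompletableFuture<List<String>> fetchFriends() {
        return CompletableFuture.supplyAsync(
                () -> Arrays.asList("Luke Skywalker", "Han Solo", "Leia Organa"));
    }

    static CompletableFuture<List<String>> fetchAppearsIn() {
        // Placeholder episode names, chosen for illustration only.
        return CompletableFuture.supplyAsync(
                () -> Arrays.asList("NEWHOPE", "EMPIRE", "JEDI"));
    }

    static Map<String, Object> resolveHero() {
        // Start all three fetches before waiting on any of them.
        CompletableFuture<String> name = fetchName();
        CompletableFuture<List<String>> friends = fetchFriends();
        CompletableFuture<List<String>> appearsIn = fetchAppearsIn();

        // Wait until every promise has completed.
        CompletableFuture.allOf(name, friends, appearsIn).join();

        // Assemble the values in the order the query listed the fields.
        Map<String, Object> hero = new LinkedHashMap<>();
        hero.put("name", name.join());
        hero.put("friends", friends.join());
        hero.put("appearsIn", appearsIn.join());
        return hero;
    }

    public static void main(String[] args) {
        System.out.println(resolveHero());
    }
}
```

The order in which the futures finish does not matter here; the result map is built in query order once they are all done.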

Introducing Java DataLoader

The other capability I am excited about is one that the CompletableFuture support unlocks: java-dataloader.

If you use graphql, you are likely to be making queries on a graph of data (surprise surprise). Nodes in graphs often point back to themselves, so the same data is often repeated.

java-dataloader will help you make this a more efficient process by both caching and batching requests for that graph of data items. If the dataloader has seen a data item before, it will have cached the value and will return it without having to ask for it again.

Imagine we have the StarWars query outlined below. It asks us to find a hero, their friends' names and their friends' friends' names. It is likely that many of these people will have friends in common.

query {
  hero {
    name
    friends {
      name
      friends {
        name
      }
    }
  }
}

The result of this query is displayed below. You can see that Han, Leia, Luke and R2-D2 are a tight-knit bunch and share many friends in common.

[hero: [name: 'R2-D2', friends: [
    [name: 'Luke Skywalker', friends: [
        [name: 'Han Solo'], [name: 'Leia Organa'], [name: 'C-3PO'], [name: 'R2-D2']]],
    [name: 'Han Solo', friends: [
        [name: 'Luke Skywalker'], [name: 'Leia Organa'], [name: 'R2-D2']]],
    [name: 'Leia Organa', friends: [
        [name: 'Luke Skywalker'], [name: 'Han Solo'], [name: 'C-3PO'], [name: 'R2-D2']]]]]
]

A naive implementation would call its DataFetchers to retrieve a person object every time a person was referenced.

In this case that would be 15 calls over the network (1 for the hero, 3 for the hero's friends and 11 for their friends), even though the group of characters has a lot of friends in common. With java-dataloader you can make the graphql query much more efficient.

As graphql descends each level of the query (e.g. as it processes hero, then friends, and then each of their friends), the data loader is called to "promise" to deliver a person object.

At each level of the query, dataLoader.dispatch() is called to fire off the batch requests for that part of the query. With caching turned on (the default), any previously seen person is returned at no cost.

More formally, a series of promises to retrieve by keys “A”, “B” and “C” will be batched together and passed to the “batch loader function” as a single call to get the list of keys “A”, “B” and “C”.

N key lookup promises boil down to 1 call for a list of N keys.

In the above example only 5 unique people are mentioned, and with caching and batching in place there will be only 3 calls to the batch loader function.

3 calls over the network or to a database are much better than 15, you will agree.
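If you want to check the 15 versus 3 arithmetic for yourself, here is a small sketch that replays the query by hand. TinyLoader is a toy class written for this post and is not the java-dataloader API; it only mimics the idea of load() handing back a cached promise and dispatch() sending the queued keys to a batch function in one call. The friend lists come from the result shown earlier, and C-3PO's own list is not shown there, so the sketch leaves it empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class BatchingSketch {

    // Toy loader illustrating the idea only; this is NOT the java-dataloader API.
    static class TinyLoader<K, V> {
        private final Function<List<K>, List<V>> batchFunction;
        private final Map<K, CompletableFuture<V>> cache = new HashMap<>();
        private final List<K> pending = new ArrayList<>();
        int loadCalls = 0;
        int batchCalls = 0;

        TinyLoader(Function<List<K>, List<V>> batchFunction) {
            this.batchFunction = batchFunction;
        }

        // Returns a promise for the key; a key seen before reuses its cached promise.
        CompletableFuture<V> load(K key) {
            loadCalls++;
            CompletableFuture<V> promise = cache.get(key);
            if (promise == null) {
                promise = new CompletableFuture<>();
                cache.put(key, promise);
                pending.add(key);
            }
            return promise;
        }

        // Sends every key queued since the last dispatch to the batch function in one call.
        void dispatch() {
            if (pending.isEmpty()) {
                return;
            }
            List<K> keys = new ArrayList<>(pending);
            pending.clear();
            batchCalls++;
            List<V> values = batchFunction.apply(keys);
            for (int i = 0; i < keys.size(); i++) {
                cache.get(keys.get(i)).complete(values.get(i));
            }
        }
    }

    // Friend lists from the query result; C-3PO's list is not shown there, so it is empty here.
    static final Map<String, List<String>> FRIENDS = new HashMap<>();
    static {
        FRIENDS.put("R2-D2", Arrays.asList("Luke Skywalker", "Han Solo", "Leia Organa"));
        FRIENDS.put("Luke Skywalker", Arrays.asList("Han Solo", "Leia Organa", "C-3PO", "R2-D2"));
        FRIENDS.put("Han Solo", Arrays.asList("Luke Skywalker", "Leia Organa", "R2-D2"));
        FRIENDS.put("Leia Organa", Arrays.asList("Luke Skywalker", "Han Solo", "C-3PO", "R2-D2"));
        FRIENDS.put("C-3PO", new ArrayList<String>());
    }

    // Walks hero -> friends -> friends of friends, dispatching once per level.
    static TinyLoader<String, List<String>> runQuery() {
        TinyLoader<String, List<String>> loader = new TinyLoader<String, List<String>>(keys -> {
            List<List<String>> result = new ArrayList<>();
            for (String key : keys) {
                result.add(FRIENDS.get(key));
            }
            return result;
        });

        CompletableFuture<List<String>> hero = loader.load("R2-D2");
        loader.dispatch(); // batch 1: [R2-D2]

        List<CompletableFuture<List<String>>> friends = new ArrayList<>();
        for (String name : hero.join()) {
            friends.add(loader.load(name));
        }
        loader.dispatch(); // batch 2: [Luke Skywalker, Han Solo, Leia Organa]

        for (CompletableFuture<List<String>> friend : friends) {
            for (String name : friend.join()) {
                loader.load(name);
            }
        }
        loader.dispatch(); // batch 3: [C-3PO]; everyone else was already cached
        return loader;
    }

    public static void main(String[] args) {
        TinyLoader<String, List<String>> loader = runQuery();
        // prints "15 load() calls, 3 batch calls"
        System.out.println(loader.loadCalls + " load() calls, " + loader.batchCalls + " batch calls");
    }
}
```

The 15 load() calls are the same 15 lookups a naive implementation would send over the wire one by one; here they collapse into one batch per level of the query.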

If you use multithreading capabilities like CompletableFuture.supplyAsync() then you can make it even more efficient by making the remote calls happen in parallel.
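As a rough illustration of that last point, the sketch below uses only the JDK. The fetchPeopleRemotely method is a hypothetical stand-in for a real network or database call that answers a whole list of keys at once, and batchLoad wraps it in CompletableFuture.supplyAsync() so that the caller gets a promise back immediately.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class AsyncBatchSketch {

    // Hypothetical remote lookup: one call that answers a whole list of keys.
    static List<String> fetchPeopleRemotely(List<String> keys) {
        List<String> people = new ArrayList<>();
        for (String key : keys) {
            people.add("person:" + key);
        }
        return people;
    }

    // Batch function that hands the remote call to another thread and returns a promise at once.
    static CompletableFuture<List<String>> batchLoad(List<String> keys) {
        return CompletableFuture.supplyAsync(() -> fetchPeopleRemotely(keys));
    }

    public static void main(String[] args) {
        // Two independent batches can be in flight at the same time.
        CompletableFuture<List<String>> first = batchLoad(Arrays.asList("A", "B", "C"));
        CompletableFuture<List<String>> second = batchLoad(Arrays.asList("D", "E"));
        System.out.println(first.join());
        System.out.println(second.join());
    }
}
```

With no executor argument, supplyAsync() typically runs on the common fork-join pool; for blocking remote calls you may prefer the overload that takes your own Executor.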

Here is how you might put this in place:

One thing to note is that this only works if you use DataLoaderDispatcherInstrumentation, which makes sure dataLoader.dispatch() is called. If it were not in place, the promises to data would never be dispatched to the batch loader function and hence nothing would ever resolve.

java-dataloader is a pure Java 8 port of the original Facebook dataloader library, which is a proven access pattern for making remote data access more efficient.

In closing

The combination of asynchronous graphql field processing and batch loading of data behind those fields will make graphql-java an even more compelling way to access your data.
