Querying a database with GraphQL and DataLoader, an introduction in Go.

Mikael Paladiants
Jul 7, 2019 · 10 min read

In this article we are going to cover building a simple GraphQL API to retrieve data from an online store database. The focus will be on the GraphQL side of things, but the source code of a fully-working example can be found here.

If you are completely new to GraphQL, there is a great introduction to it at https://graphql.org/learn/. For the purposes of this walkthrough, you can think of it as an alternative to REST that provides a flexible and type-safe way to query APIs for data.

The Database

You can use a database of your choice for this tutorial. In the linked GitHub repository, a Dockerized Postgres database is used as an easy and disposable solution. The database access code is pretty standard and straightforward, so we will skip listing it here. But here is the Dockerfile for our database:

FROM postgres

Here, we let Postgres know what the user name, password, and database name will be. Additionally, we tell it to run the SQL from init.sql on startup. That will create a simple database of an online store. Fig. 1 shows its schema.

Fig. 1. Schema of the store database

To have our database running, we first need to build a Docker image from the Dockerfile:

docker build -t store .

And then run it:

docker run --rm -p 5432:5432 store

The GraphQL library

There are many different GraphQL libraries for Go. Here, we are going to use one by graph-gophers.

GraphQL

In GraphQL, resolvers are responsible for returning, or resolving, object data. For example, an order resolver would resolve order properties such as the order ID or the time of the order by returning their corresponding simple-type values, an integer and a time value. It would also resolve the customer that made the order, which could be represented as a complex type that encapsulates the customer’s ID, name, address, etc. In this case, the returned value would be another resolver, a customer resolver. The list of products in the order could be resolved by yet another resolver. Nesting like that results in a directed acyclic graph of resolvers, each of them responsible for providing a part of a potentially larger object.

Fig. 2. Directed acyclic graph of resolvers

The diagram in Fig. 2 reflects this concept. The root query resolves three orders. Then, the customer and product resolvers could be used to retrieve the corresponding data, but this is optional if all you need is a list of order IDs and order times, for example.
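The nesting described above can be sketched in plain Go without any GraphQL library. All type names, fields, and sample data below are hypothetical stand-ins; the point is only that scalar fields are returned directly, while the customer field returns a child resolver that is touched only when it is asked for.

```go
package main

import (
	"fmt"
	"time"
)

// customer and order are hypothetical records standing in for database rows.
type customer struct {
	ID   int32
	Name string
}

type order struct {
	ID         int32
	Time       time.Time
	CustomerID int32
}

// CustomerResolver resolves the fields of one customer.
type CustomerResolver struct{ c customer }

func (r *CustomerResolver) Name() string { return r.c.Name }

// OrderResolver resolves scalar fields itself and delegates the complex
// customer field to a child resolver.
type OrderResolver struct {
	o         order
	customers map[int32]customer // stand-in for a data source
	lookups   int                // how many times customer data was fetched
}

func (r *OrderResolver) ID() int32       { return r.o.ID }
func (r *OrderResolver) Time() time.Time { return r.o.Time }

// Customer runs only when a query selects the customer field.
func (r *OrderResolver) Customer() *CustomerResolver {
	r.lookups++
	return &CustomerResolver{c: r.customers[r.o.CustomerID]}
}

// newDemoOrderResolver builds a resolver over made-up sample data.
func newDemoOrderResolver() *OrderResolver {
	return &OrderResolver{
		o:         order{ID: 2, Time: time.Date(2019, 6, 26, 9, 40, 39, 0, time.UTC), CustomerID: 1},
		customers: map[int32]customer{1: {ID: 1, Name: "andrew"}},
	}
}

func main() {
	r := newDemoOrderResolver()
	// Only scalar fields are read here, so no customer lookup happens.
	fmt.Println(r.ID(), r.Time().Format(time.RFC3339), r.lookups)
	// Selecting the customer walks one edge further down the graph.
	fmt.Println(r.Customer().Name(), r.lookups)
}
```
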

The Schema

A schema is a way of describing the object model and operations supported by your GraphQL service in a strongly-typed fashion. Queries received by the service are validated and executed against the schema. Additionally, there is out-of-the-box support for introspection of the service’s API, which can be leveraged either by querying schema-specific fields with your own code (see https://graphql.org/learn/introspection) or by using tools such as GraphiQL (note the subtle “i” in the name).

Below is the schema for our service.

scalar Time

On the first line, the Time scalar type is declared. There is no built-in GraphQL type for time values, but it is possible to define custom ones. The graph-gophers library defines the Time type, and here we are just making our schema aware of it. Also note that the Int type corresponds to Go’s int32, and Float to float64.

Every GraphQL schema has a query type (defined as Query here) that serves as the entry point into any of the queries a service supports, and that is what makes it special. mutation, which we are not going to use here, would be another example of a special type.

Query’s properties define what information it is possible to request from our GraphQL service. For example, orders says that you can get a list of the first N Orders, where N is the value of the first parameter. The exclamation marks mean that the first parameter is required, and that neither the returned slice nor any item in it is going to be nil. Conversely, removing the !s makes parameters optional and return values nilable.

Similarly, the Order type, aside from its two scalar properties, has a property called customer that returns the order customer data, and a property called products that can be used to request the list of products the order contains. The definitions of the Customer and Product types are straightforward.

You can see how the resolver graph maps onto the schema through the complex type properties.
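Since only the first line of the schema listing is shown above, here is a reconstruction pieced together from the prose and the sample responses later in the article. It is a sketch, not the original file: the exact field lists of Customer and Product (including the customer id field) are assumptions. The schema is held in a Go string constant because Go GraphQL libraries commonly accept the schema as a string; declaresType is a hypothetical helper added only to make the sketch checkable.

```go
package main

import (
	"fmt"
	"strings"
)

// schemaSketch is a reconstruction based on the article's description.
// Field names beyond those mentioned in the text are assumptions.
const schemaSketch = `
scalar Time

schema {
	query: Query
}

type Query {
	orders(first: Int!): [Order!]!
}

type Order {
	id: Int!
	time: Time!
	customer: Customer!
	products: [Product!]!
}

type Customer {
	id: Int!
	name: String!
}

type Product {
	name: String!
	quantity: Int!
	price: Float!
}
`

// declaresType reports whether the schema text declares an object type with
// the given name. It is a plain substring check, not a GraphQL parser.
func declaresType(schema, name string) bool {
	return strings.Contains(schema, "type "+name+" {")
}

func main() {
	for _, name := range []string{"Query", "Order", "Customer", "Product"} {
		fmt.Println(name, declaresType(schemaSketch, name))
	}
}
```
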

The Resolvers

On to resolvers now. We will start from the root.

Query

This is the entry point to the orders query we have defined in the schema, as well as all other future queries we might add. db.Client is a client of the database storing order data.

package resolver

The first parameter is the current request’s context, the second encapsulates the query field’s arguments — just one in our case. A list of OrderResolvers, one per order, is the return type.

We query the database for orders by calling the GetOrders method of the DB client, then create an OrderResolver for each order before returning them.
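Because the listing above is cut short, the following is a minimal sketch of what such a root resolver could look like. The OrderGetter interface, the GetOrders signature, and fakeDB are assumptions introduced so the example runs without Postgres; only the overall shape (context first, then an argument struct, returning a slice of OrderResolvers) comes from the description above.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Order is a hypothetical database model.
type Order struct {
	ID   int32
	Time time.Time
}

// OrderGetter stands in for the part of db.Client used here. The GetOrders
// signature is an assumption.
type OrderGetter interface {
	GetOrders(ctx context.Context, first int32) ([]Order, error)
}

// OrderResolver wraps one order.
type OrderResolver struct{ order Order }

func (r *OrderResolver) ID() int32 { return r.order.ID }

// Query is the root resolver.
type Query struct{ DB OrderGetter }

// Orders resolves the orders field: the request context comes first, then
// a struct holding the field's arguments.
func (q *Query) Orders(ctx context.Context, args struct{ First int32 }) ([]*OrderResolver, error) {
	orders, err := q.DB.GetOrders(ctx, args.First)
	if err != nil {
		return nil, err
	}
	resolvers := make([]*OrderResolver, 0, len(orders))
	for _, o := range orders {
		resolvers = append(resolvers, &OrderResolver{order: o})
	}
	return resolvers, nil
}

// fakeDB serves orders from memory.
type fakeDB struct{ orders []Order }

func (f fakeDB) GetOrders(_ context.Context, first int32) ([]Order, error) {
	if first < 0 {
		return nil, fmt.Errorf("first must not be negative, got %d", first)
	}
	if int(first) > len(f.orders) {
		first = int32(len(f.orders))
	}
	return f.orders[:first], nil
}

// firstOrderIDs runs the resolver against sample data and collects the IDs.
func firstOrderIDs(n int32) ([]int32, error) {
	q := &Query{DB: fakeDB{orders: []Order{{ID: 2}, {ID: 4}, {ID: 5}}}}
	resolvers, err := q.Orders(context.Background(), struct{ First int32 }{First: n})
	if err != nil {
		return nil, err
	}
	ids := make([]int32, 0, len(resolvers))
	for _, r := range resolvers {
		ids = append(ids, r.ID())
	}
	return ids, nil
}

func main() {
	ids, err := firstOrderIDs(2)
	fmt.Println(ids, err)
}
```
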

OrderResolver

Now let’s see what OrderResolver looks like.

package resolver

Here, OrderResolver just returns values of the simple-type properties like ID and Time. For the complex-type ones, however, the corresponding child resolvers are returned. Both CustomerResolver and OrderProductResolver are similar to OrderResolver, so we’ll omit their listings.

Before returning a CustomerResolver from the Customer method, we need to get the customer’s data from the database. Same goes for order products. So, if there are three orders, we will need to make three database queries to get the corresponding customer records, and three more to get product lists for each of the orders. Clearly, this is not optimal as it puts a lot of additional load on the database.
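To make the cost concrete, here is a small sketch that counts queries when every order fetches its own customer. countingDB is a hypothetical in-memory stand-in for the database client, and the sample data is made up.

```go
package main

import "fmt"

// Customer is a hypothetical database model.
type Customer struct {
	ID   int32
	Name string
}

// countingDB is an in-memory stand-in for the database client that counts
// how many queries it receives.
type countingDB struct {
	customers map[int32]Customer
	queries   int
}

// GetCustomer simulates one database round trip per call.
func (d *countingDB) GetCustomer(id int32) (Customer, bool) {
	d.queries++
	c, ok := d.customers[id]
	return c, ok
}

func newCountingDB() *countingDB {
	return &countingDB{customers: map[int32]Customer{
		1: {ID: 1, Name: "andrew"},
		2: {ID: 2, Name: "max"},
		3: {ID: 3, Name: "kate"},
	}}
}

// customerNames fetches the customer of each order separately, the way
// independent resolvers would.
func customerNames(db *countingDB, customerIDs []int32) []string {
	names := make([]string, 0, len(customerIDs))
	for _, id := range customerIDs {
		if c, ok := db.GetCustomer(id); ok {
			names = append(names, c.Name)
		}
	}
	return names
}

func main() {
	db := newCountingDB()
	names := customerNames(db, []int32{1, 2, 3})
	// One query per order: the count grows linearly with the result size.
	fmt.Println(names, "queries:", db.queries)
}
```
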

DataLoader

To optimize the database querying approach, we will use DataLoader, a utility that provides batching and caching capabilities to an application’s data-fetching layer. Returning to the previous paragraph’s example of three orders, the goal here is to get all three customer database records at once instead of doing it one by one.

First, the database client’s GetCustomer method needs to be updated to return a batch of customer records, i.e., from

package db

to

package db
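Both listings are truncated here, so the following is only a sketch of the batched variant under assumptions: memDB is an in-memory stand-in, and the GetCustomers signature (a slice of IDs in, a slice of customers out) is a guess at the shape the text describes. With Postgres, such a method could issue a single statement along the lines of SELECT ... WHERE id = ANY($1), which is likewise an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// Customer is a hypothetical database model.
type Customer struct {
	ID   int32
	Name string
}

// memDB is an in-memory stand-in for the database client.
type memDB struct {
	customers map[int32]Customer
	queries   int
}

func newMemDB() *memDB {
	return &memDB{customers: map[int32]Customer{
		1: {ID: 1, Name: "andrew"},
		2: {ID: 2, Name: "max"},
		3: {ID: 3, Name: "kate"},
	}}
}

// GetCustomers returns every stored customer whose ID is in ids, using a
// single simulated query no matter how many IDs are requested.
func (d *memDB) GetCustomers(ids []int32) ([]Customer, error) {
	d.queries++
	found := make([]Customer, 0, len(ids))
	for _, id := range ids {
		if c, ok := d.customers[id]; ok {
			found = append(found, c)
		}
	}
	// A database makes no promise that rows come back in the order of the
	// requested IDs; sorting by ID here imitates that.
	sort.Slice(found, func(i, j int) bool { return found[i].ID < found[j].ID })
	return found, nil
}

func main() {
	db := newMemDB()
	customers, err := db.GetCustomers([]int32{3, 1})
	fmt.Println(customers, err, "queries:", db.queries)
}
```

Note that the rows come back sorted by ID rather than in the order requested, which is why the caller has to reorder them.
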

Then, we will define a loader for customers, which will be used inside CustomerResolver to get customer data instead of querying the database directly via db.Client.

package loader

The following method adds a batch-loading capability to the struct.

func (l *customerLoader) loadBatch(ctx context.Context, keys dataloader.Keys) []*dataloader.Result {
	n := len(keys)
	ids, err := ints(keys)
	if err != nil {
		return loadBatchError(err, n)
	}

DataLoader will batch requests for individual customers from multiple CustomerResolvers, and then call this method passing in a collection of customer IDs. The collection is represented by the dataloader.Keys type, which is defined as a slice of dataloader.Key. dataloader.Key implements the fmt.Stringer interface and thus can return its string representation. Since the customer ID in our case is represented as int32, keys will be string representations of numeric customer IDs. All we need to do to obtain the requested IDs is to perform a conversion from dataloader.Keys to []int32. That’s what the call ints(keys) does.
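The article does not list the ints helper, so here is one way it could be written. Key and Keys below are local stand-ins that mirror only the String method mentioned above, not the library’s actual types, and stringKey is a hypothetical key implementation.

```go
package main

import (
	"fmt"
	"strconv"
)

// Key is a local stand-in exposing only the String method relied on here.
type Key interface{ String() string }

// Keys is a slice of Key, like the collection described above.
type Keys []Key

// stringKey is a hypothetical key backed by a plain string.
type stringKey string

func (k stringKey) String() string { return string(k) }

// ints converts keys holding decimal customer IDs into []int32 and fails
// on the first key that is not a valid 32-bit integer.
func ints(keys Keys) ([]int32, error) {
	ids := make([]int32, 0, len(keys))
	for _, k := range keys {
		n, err := strconv.ParseInt(k.String(), 10, 32)
		if err != nil {
			return nil, fmt.Errorf("key %q is not a valid int32: %w", k.String(), err)
		}
		ids = append(ids, int32(n))
	}
	return ids, nil
}

func main() {
	ids, err := ints(Keys{stringKey("3"), stringKey("2"), stringKey("1")})
	fmt.Println(ids, err)

	_, err = ints(Keys{stringKey("abc")})
	fmt.Println(err != nil)
}
```
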

With the IDs in hand, the actual database records can now be retrieved as a batch by calling the database client’s updated GetCustomers method.

DataLoader expects the number of *dataloader.Results returned from the batch-loading method to match the number of dataloader.Keys it has been passed via the keys parameter. This requirement also stands in case of an error. The loadBatchError method simply returns a slice of the required length in which every element is a pointer to the same dataloader.Result whose Error property is set to the error that occurred.
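A helper with that behavior might look like the sketch below. Result is a local stand-in with Data and Error fields rather than the library type, and errDemo is a made-up error used only for the demonstration.

```go
package main

import (
	"errors"
	"fmt"
)

// Result is a local stand-in for a single batch result.
type Result struct {
	Data  interface{}
	Error error
}

// errDemo is a made-up error for the demonstration.
var errDemo = errors.New("database unavailable")

// loadBatchError returns n results that all point to one shared Result
// carrying err, so the result count still matches the key count.
func loadBatchError(err error, n int) []*Result {
	shared := &Result{Error: err}
	results := make([]*Result, n)
	for i := range results {
		results[i] = shared
	}
	return results
}

func main() {
	results := loadBatchError(errDemo, 3)
	fmt.Println(len(results), results[0] == results[2], results[1].Error)
}
```
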

Another expectation set by DataLoader is that the result order matches the key order. That is, if the keys parameter represents customer IDs [3, 2, 1] (in that order), then the corresponding return value []*dataloader.Result must represent customer records in that same order, [customer 3, customer 2, customer 1]. The mustIndex function finds the index of an ID in a list of IDs, and is used to ensure the correct order of the returned values.
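The reordering step can be sketched as follows. mustIndex is written from the description above, while orderResults, the local Result and Customer types, and the handling of IDs the database did not return are assumptions added for illustration.

```go
package main

import "fmt"

// Customer is a hypothetical database model.
type Customer struct {
	ID   int32
	Name string
}

// Result is a local stand-in for a single batch result.
type Result struct {
	Data  interface{}
	Error error
}

// mustIndex returns the position of id in ids and panics if it is absent,
// since that would mean the database returned a row nobody asked for.
func mustIndex(ids []int32, id int32) int {
	for i, v := range ids {
		if v == id {
			return i
		}
	}
	panic(fmt.Sprintf("id %d was not requested", id))
}

// orderResults places each customer at the position of its ID in ids, so
// the results line up with the keys regardless of database row order.
func orderResults(ids []int32, customers []Customer) []*Result {
	results := make([]*Result, len(ids))
	for _, c := range customers {
		results[mustIndex(ids, c.ID)] = &Result{Data: c}
	}
	// Any ID the database did not return gets an error result, keeping the
	// result count equal to the key count.
	for i, r := range results {
		if r == nil {
			results[i] = &Result{Error: fmt.Errorf("customer %d not found", ids[i])}
		}
	}
	return results
}

func main() {
	ids := []int32{3, 2, 1}
	rows := []Customer{{ID: 1, Name: "andrew"}, {ID: 3, Name: "kate"}}
	for i, r := range orderResults(ids, rows) {
		fmt.Println(ids[i], r.Data, r.Error)
	}
}
```
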

Typically, DataLoader instances are created per request to avoid problems with using a single cache for multiple users with different access permissions. We will follow this best practice.

package loader

The Map type maps contextKeys to batch-load functions. dataloader.BatchFunc is defined as a function type with the same signature as customerLoader’s loadBatch method. We will call the Init method during application bootstrapping. Then, the returned Map's Attach method will be invoked per request to add new DataLoaders, created by calls to dataloader.NewBatchedLoader, to the request’s context.
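The per-request wiring can be illustrated with the standard library alone. In this sketch, Loader and factory are simplified stand-ins (a factory plays the role of a batch function plus loader construction), and the body of Init, Attach, and extract is an assumption about how such code could be written, not the repository’s implementation.

```go
package main

import (
	"context"
	"fmt"
)

// contextKey is a private type so loader keys cannot collide with other
// values stored in the context.
type contextKey string

const customerLoaderKey contextKey = "customer"

// Loader is a stand-in for a per-request loader with its own cache.
type Loader struct{ cache map[string]interface{} }

// factory creates a fresh Loader for one request.
type factory func() *Loader

// Map associates context keys with loader factories.
type Map map[contextKey]factory

// Init is meant to be called once during application bootstrapping.
func Init() Map {
	return Map{
		customerLoaderKey: func() *Loader { return &Loader{cache: map[string]interface{}{}} },
	}
}

// Attach returns a context carrying a brand-new Loader for every key, so
// cached data is never shared between requests.
func (m Map) Attach(ctx context.Context) context.Context {
	for k, newLoader := range m {
		ctx = context.WithValue(ctx, k, newLoader())
	}
	return ctx
}

// extract finds the loader stored under k, if any.
func extract(ctx context.Context, k contextKey) (*Loader, error) {
	l, ok := ctx.Value(k).(*Loader)
	if !ok {
		return nil, fmt.Errorf("no loader for key %q in context", k)
	}
	return l, nil
}

// distinctLoadersPerRequest reports whether two simulated requests each
// receive their own Loader instance.
func distinctLoadersPerRequest(m Map) bool {
	a, errA := extract(m.Attach(context.Background()), customerLoaderKey)
	b, errB := extract(m.Attach(context.Background()), customerLoaderKey)
	return errA == nil && errB == nil && a != b
}

// missingWithoutAttach reports whether extract fails on a bare context.
func missingWithoutAttach() bool {
	_, err := extract(context.Background(), customerLoaderKey)
	return err != nil
}

func main() {
	fmt.Println(distinctLoadersPerRequest(Init()), missingWithoutAttach())
}
```

In an HTTP service, Attach would typically be called from middleware that wraps the GraphQL handler, passing in the incoming request’s context.
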

Now, we can leverage the DataLoaders in the context by providing a LoadCustomer function.

package loader

extract finds the requested DataLoader in the context, and the loader’s Load method is called with a dataloader.Key representing the customer’s ID (key(id)). If LoadCustomer, and subsequently Load, is called multiple times for different customer IDs, i.e., from multiple resolvers, then DataLoader batches the IDs and queries the database for all of them at once. Then, Load returns the requested customer’s model.
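To show the batching idea in isolation, here is a deliberately simplified toy loader. It is not how the DataLoader library is implemented and it is not safe for concurrent use; toyLoader, batchFn, and the sample data are all hypothetical. Load only records the ID and returns a function, and the single batched fetch happens when the first of those functions is invoked.

```go
package main

import "fmt"

// Customer is a hypothetical database model.
type Customer struct {
	ID   int32
	Name string
}

// batchFn fetches all requested customers in one call.
type batchFn func(ids []int32) map[int32]Customer

// toyLoader collects IDs and fetches them lazily as one batch.
type toyLoader struct {
	fetch   batchFn
	pending []int32
	cache   map[int32]Customer
	batches int // number of batched fetches performed
}

func newToyLoader(fetch batchFn) *toyLoader {
	return &toyLoader{fetch: fetch, cache: map[int32]Customer{}}
}

// Load registers id and returns a function that yields the customer. No
// fetch happens until one of the returned functions is called.
func (l *toyLoader) Load(id int32) func() (Customer, error) {
	if _, ok := l.cache[id]; !ok {
		l.pending = append(l.pending, id)
	}
	return func() (Customer, error) {
		l.dispatch()
		c, ok := l.cache[id]
		if !ok {
			return Customer{}, fmt.Errorf("customer %d not found", id)
		}
		return c, nil
	}
}

// dispatch fetches every pending ID in a single batch and caches the rows.
func (l *toyLoader) dispatch() {
	if len(l.pending) == 0 {
		return
	}
	l.batches++
	for id, c := range l.fetch(l.pending) {
		l.cache[id] = c
	}
	l.pending = nil
}

// newDemoLoader wires the loader to made-up in-memory data.
func newDemoLoader() *toyLoader {
	store := map[int32]Customer{1: {ID: 1, Name: "andrew"}, 2: {ID: 2, Name: "max"}}
	return newToyLoader(func(ids []int32) map[int32]Customer {
		found := make(map[int32]Customer, len(ids))
		for _, id := range ids {
			if c, ok := store[id]; ok {
				found[id] = c
			}
		}
		return found
	})
}

func main() {
	l := newDemoLoader()
	first, second := l.Load(1), l.Load(2) // two resolvers ask independently
	c1, _ := first()
	c2, _ := second()
	fmt.Println(c1.Name, c2.Name, "batches:", l.batches)
}
```
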

Finally, we can rewrite OrderResolver’s Customer method to make use of DataLoader.

package resolver

The Result

A GraphQL service always has a single API endpoint which, by convention, is available at /graphql, e.g., http://localhost:8080/graphql.

In order to list, say, the first two orders, we would POST the following payload to the endpoint.

{
"query": "{
orders(first: 2) {
id,
time,
}
}"
}
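The payload above is laid out across several lines for readability; strict JSON does not allow raw line breaks inside a string, so a client has to escape them. A minimal sketch of building the body in Go, where graphQLRequest, buildPayload, and roundTrip are hypothetical names, lets encoding/json handle the escaping:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// graphQLRequest models the request body with its single "query" field.
type graphQLRequest struct {
	Query string `json:"query"`
}

// ordersQuery asks for the ID and time of the first two orders.
const ordersQuery = `{
  orders(first: 2) {
    id
    time
  }
}`

// buildPayload encodes the query as a JSON body; line breaks in the query
// are escaped by the encoder.
func buildPayload(query string) ([]byte, error) {
	return json.Marshal(graphQLRequest{Query: query})
}

// roundTrip encodes and decodes the query to confirm nothing is lost.
func roundTrip(query string) (string, error) {
	body, err := buildPayload(query)
	if err != nil {
		return "", err
	}
	var decoded graphQLRequest
	if err := json.Unmarshal(body, &decoded); err != nil {
		return "", err
	}
	return decoded.Query, nil
}

func main() {
	body, err := buildPayload(ordersQuery)
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(string(body))
}
```

The resulting bytes could then be sent with http.Post, using the content type application/json, to the endpoint mentioned above.
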

Here is what the response looks like:

{
"data": {
"orders": [
{
"id": 2,
"time": "2019-06-26T09:40:39.656311Z"
},
{
"id": 4,
"time": "2019-06-29T09:40:39.6591Z"
}
]
}
}

You can see how the shape of the response matches that of the request. We only asked for the ID and time of each order, so only OrderResolver had work to do and, effectively, only the Orders database table was queried.

Let’s extend the query by also requesting customer names.

{
"query": "{
orders(first: 2) {
id,
time,
customer {
name,
},
}
}"
}

The change is reflected in the response:

{
"data": {
"orders": [
{
"customer": {
"name": "andrew"
},
"id": 2,
"time": "2019-06-26T09:40:39.656311Z"
},
{
"customer": {
"name": "max"
},
"id": 4,
"time": "2019-06-29T09:40:39.6591Z"
}
]
}
}

Each order now has a new customer property that contains a sub-property called name with the name of the customer that placed the corresponding order. The customer data has been provided by two CustomerResolvers which internally used DataLoader to get it from the data store. DataLoader batched both customer IDs and made only one database query.

Now, let’s not request customer details, but instead ask for the product list of each order.

{
"query": "{
orders(first: 2) {
id,
time,
products {
name,
quantity,
price,
},
}
}"
}

And here is the response:

{
"data": {
"orders": [
{
"id": 2,
"products": [
{
"name": "soap",
"price": 2.49,
"quantity": 4
}
],
"time": "2019-06-26T09:40:39.656311Z"
},
{
"id": 4,
"products": [
{
"name": "toothpaste",
"price": 3.49,
"quantity": 2
},
{
"name": "face wash",
"price": 8.99,
"quantity": 2
}
],
"time": "2019-06-29T09:40:39.6591Z"
}
]
}
}

Our service will not query the database for customer data in this case, but it will query it for product lists for both orders, as a batch.

Summary

We have looked at a few GraphQL concepts, including the strongly-typed schema, custom data types, resolvers, and flexible queries, and gained a basic understanding of how GraphQL works and what it can do. A more advanced topic that we have covered is database query optimization using DataLoader.

Nevertheless, we have only scratched the surface here as there is so much more to GraphQL. There are plenty of resources online where you can learn more about it, including the official website, https://graphql.org.
