Evolving Content Delivery at USA TODAY NETWORK — Part 2

Tim Kuhlman
Published in USA TODAY NETWORK
4 min read · Jul 17, 2018

In the first post of this series, I explained the advantages of a GraphQL-based API for retrieving USA TODAY NETWORK content. In this post, I’ll explain how we took our existing data and created the GraphQL schema used in that new API.

Constraints

Our project's goal was simple enough: serve our data using GraphQL. As with any project, there are some constraints on the solution:

  • First, we have to support our existing data, which consists of many different data types and hundreds of fields.
  • Golang is our language of choice for new code, so we also wanted to use it for building our GraphQL API.
  • The data will continue to evolve (one of the reasons we selected GraphQL), so we need to predictably and reliably handle this data evolution.

GraphQL Schema in Golang

After evaluating several options, we decided to use the GraphQL implementation from https://github.com/graphql-go/graphql/. It allows us to build a Go-based API service around GraphQL. There is a catch, though: the example code is nearly 500 lines long. Defining the schema and the requisite resolve functions for each field in our data was going to require thousands of lines.
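
To give a sense of that overhead, here is a minimal sketch of what hand-defining a single type and query looks like with graphql-go. The Asset type, its fields, and the in-line resolver are illustrative placeholders, not our actual content schema:

    package main

    import (
        "encoding/json"
        "fmt"

        "github.com/graphql-go/graphql"
    )

    // A hand-written GraphQL type. Every field in the data needs an entry
    // like this, and real assets have hundreds of fields across many types.
    var assetType = graphql.NewObject(graphql.ObjectConfig{
        Name: "Asset",
        Fields: graphql.Fields{
            "id":       &graphql.Field{Type: graphql.String},
            "headline": &graphql.Field{Type: graphql.String},
            "url":      &graphql.Field{Type: graphql.String},
        },
    })

    var queryType = graphql.NewObject(graphql.ObjectConfig{
        Name: "Query",
        Fields: graphql.Fields{
            "asset": &graphql.Field{
                Type: assetType,
                Args: graphql.FieldConfigArgument{
                    "id": &graphql.ArgumentConfig{Type: graphql.String},
                },
                Resolve: func(p graphql.ResolveParams) (interface{}, error) {
                    // A real resolver would look the asset up by p.Args["id"].
                    return map[string]interface{}{
                        "id":       p.Args["id"],
                        "headline": "Example headline",
                        "url":      "https://example.com/story",
                    }, nil
                },
            },
        },
    })

    func main() {
        schema, err := graphql.NewSchema(graphql.SchemaConfig{Query: queryType})
        if err != nil {
            panic(err)
        }
        result := graphql.Do(graphql.Params{
            Schema:        schema,
            RequestString: `{ asset(id: "123") { headline url } }`,
        })
        out, _ := json.Marshal(result)
        fmt.Println(string(out))
    }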

Thousands of lines of code isn't necessarily a problem in and of itself, but the amount of complexity it represents is a challenge. Building this is mostly repetitive work with slight variations, exactly the kind of work where people easily make mistakes. Worse yet, this code will have to change, possibly frequently, as we evolve our data schema. Finally, we don't want to maintain another representation of the schema, because we have already defined one using the JSON Schema standard, which can easily be shared with other projects.

All of this made it very clear: we needed to automate the creation of our GraphQL schema, ideally leveraging our existing schema.

Open Source Tooling

We automated the schema generation for GraphQL with a few tools we wrote for the project and proudly made Open Source.

  • jstransform generate — https://github.com/GannettDigital/jstransform
    This tool leverages go generate to build, from a JSON schema, the Golang structs used to represent the data internally.
  • graphql-gen — https://github.com/GannettDigital/graphql-gen
    Continuing where jstransform generate leaves off, the graphql-gen library takes Golang structs and builds GraphQL types from them, including default resolve functions. The GraphQL schema is then finalized by defining specific queries, any custom resolve functions, custom fields, or explicit type relationships. In a nutshell, the repetitive parts of defining a GraphQL schema are generated automatically, and only the sections needing a thoughtful implementation remain (a rough sketch of this pipeline follows the list). Examples are found at http://godoc.org/github.com/GannettDigital/graphql-gen
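
As a rough sketch of the pipeline: the schema properties, struct, and field names below are illustrative assumptions, not the tools' literal output or API.

    package content

    import "time"

    // Illustrative only: hand-written to show the shape of the pipeline.
    //
    // Given a JSON schema describing an asset with "id", "headline" and
    // "datePublished" properties, jstransform generate emits a Go struct
    // roughly like this one:
    type Asset struct {
        ID            string    `json:"id"`
        Headline      string    `json:"headline"`
        DatePublished time.Time `json:"datePublished"`
    }

    // graphql-gen then walks structs like Asset and builds the matching
    // graphql-go object types and default resolve functions, so only custom
    // queries, custom fields, and custom resolvers are written by hand.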

Also as part of the jstransform repository, we have Open Sourced a library that defines an extension to the JSON Schema standard for transforming JSON data into a format that matches the schema. Though not involved in making schema changes itself, it is a tool that helps us manipulate the data to match a schema. It aids in data evolution by ensuring that changes to the GraphQL schema are not dependent on the input data schema, and vice versa.
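
To illustrate the kind of decoupling this gives us, here is a hand-rolled sketch of such a transform. The field names and the transform function are hypothetical; in practice the mapping is driven by transform instructions kept alongside the JSON schema, whose exact syntax isn't shown here, rather than by hand-written Go code:

    package main

    import (
        "encoding/json"
        "fmt"
    )

    // inputAsset mirrors the shape a producer sends us (hypothetical fields).
    type inputAsset struct {
        Title string `json:"title"`
        Link  string `json:"link"`
    }

    // Asset is the shape the GraphQL schema exposes (also hypothetical).
    type Asset struct {
        Headline string `json:"headline"`
        URL      string `json:"url"`
    }

    // transform maps producer data onto the delivery schema. jstransform
    // performs this kind of mapping automatically, driven by transform
    // instructions in the JSON schema, so either side can change without
    // forcing a change on the other.
    func transform(in inputAsset) Asset {
        return Asset{Headline: in.Title, URL: in.Link}
    }

    func main() {
        raw := []byte(`{"title": "Example headline", "link": "https://example.com/story"}`)
        var in inputAsset
        if err := json.Unmarshal(raw, &in); err != nil {
            panic(err)
        }
        out, _ := json.Marshal(transform(in))
        fmt.Println(string(out)) // {"headline":"Example headline","url":"https://example.com/story"}
    }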

Schema Evolution Workflow

With these tools, our workflow for evolving the schema is handled entirely in the JSON schema file itself. That brings these advantages:

  • It is a standard format that many who are not Go developers understand, so those outside the core team can contribute to the schema.
  • Cognitive load is reduced. When working on schema changes, the schema is the focus; there is no need to also be thinking about the Go code.
  • After updating the JSON schema, changes in the Go code are most often just a version bump to pull in the new schema, plus any needed updates to specific test data. This enables quick rollout of schema changes approved by the data owners.
  • The input schema and the GraphQL schema can evolve independently, allowing small, focused changes, reducing coordination friction, and eliminating cases where schema changes in one part of the system block changes to the rest of the system. Additionally, the relationship between the two is not opaque but clearly laid out in the same schema file in which the data is defined.

Conclusion

In part one of this series, I explained how using GraphQL allowed us to more closely align the technical and business representations of our data. The Open Source tooling introduced here makes it easy to evolve our data by concentrating the details in the data schema. This schema is a standard, language-neutral format that content producers and consumers can use to discuss changes without considering the details of the content delivery service. The end result: data can be evolved quickly, based only on the business needs under consideration.
