Five Common GraphQL Problems and How Neo4j-GraphQL Aims To Solve Them

Digging Into the Goals of A Neo4j-GraphQL Integration

--

A few weeks ago I came across an article from Sacha Greif on freeCodeCamp titled “Five Common Problems in GraphQL Apps (And How to Fix Them)”. I thought this was a good overview of some of the problems developers encounter when adopting GraphQL. As I read through the list of common problems, I realized these were some of the exact same issues that users had complained about when we were researching what a Neo4j-GraphQL integration would look like. Ultimately, the design of our integration aimed to help developers be more productive when building GraphQL services backed by Neo4j.

The Neo4j-GraphQL integration is available either as a plug-in extension for Neo4j or as a JavaScript library designed to work with any of the JS GraphQL server implementations (graphql-tools, graphql-js, etc.). Arne Wossning wrote a great overview of getting started with Neo4j-GraphQL; check it out here.

In this post I’d like to revisit each of the five problems that Sacha points out and show how Neo4j-GraphQL addresses each of those issues. Sacha’s post does a great job of showing how each of the problems is addressed currently, both in the GraphQL community in general and specifically in his VulcanJS framework, so be sure to read his post to see some of the tools and patterns that exist for solving these problems with standard GraphQL deployments.

The Problems

The 5 common GraphQL problems that Sacha points out are:

  1. Schema Duplication
  2. Server/Client Data Mismatch
  3. Superfluous Database Calls
  4. Poor Performance
  5. Boilerplate Overdose

Problem 1: Schema Duplication

Namely, you need one schema for your database, and another one for your GraphQL endpoint. — Sacha Greif

GraphQL makes use of a strictly defined schema, which defines the types available and the entry points for the API. This schema acts as the specification for the GraphQL API, and with introspection enables powerful developer tools such as query completion, mocking, and documentation generation.

However, standard GraphQL implementations often require working with both a schema for your database and a schema for your GraphQL API. To simplify the process of building GraphQL applications backed by Neo4j, the Neo4j-GraphQL integration uses the GraphQL schema to infer what the Neo4j data model should be.

Solution: Use the GraphQL schema to drive the Neo4j database model.

The neo4j-graphql project uses the GraphQL schema to drive the Neo4j database model (although there is also the option to generate a GraphQL schema from an existing Neo4j database). This fits with a development paradigm known as GraphQL First Development: everything starts with the GraphQL schema.

GraphQL schemas are typically defined using the SDL (schema definition language), now part of the GraphQL specification. Consider this simple GraphQL schema describing movies, genres, and actors:
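The schema listing itself is not reproduced in this copy of the post, so here is a minimal sketch of what such a schema could look like, written as a TypeScript string that a GraphQL server library could take as type definitions. The field names and the arguments passed to the `@relation` directive (including writing `direction` as a string) are assumptions for illustration, and `impliedGraphModel` is a hypothetical helper, not part of Neo4j-GraphQL. It only lists the node labels and relationship types the schema implies.

```typescript
// Illustrative schema: field names and directive arguments are assumptions.
const typeDefs: string = `
type Movie {
  movieId: ID!
  title: String
  year: Int
  imdbRating: Float
  genres: [Genre] @relation(name: "IN_GENRE", direction: "OUT")
  actors: [Actor] @relation(name: "ACTED_IN", direction: "IN")
}

type Genre {
  name: String
  movies: [Movie] @relation(name: "IN_GENRE", direction: "IN")
}

type Actor {
  id: ID!
  name: String
  movies: [Movie] @relation(name: "ACTED_IN", direction: "OUT")
}
`;

// Collect the first capture group of every match, without duplicates.
// The regular expression must carry the "g" flag.
function collect(re: RegExp, text: string): string[] {
  const found: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    if (found.indexOf(m[1]) === -1) found.push(m[1]);
  }
  return found;
}

// Hypothetical helper (not a library function): each GraphQL type becomes a
// node label, and each @relation name becomes a relationship type.
function impliedGraphModel(sdl: string): { labels: string[]; relationshipTypes: string[] } {
  return {
    labels: collect(/^type\s+(\w+)/gm, sdl),
    relationshipTypes: collect(/@relation\(name:\s*"(\w+)"/g, sdl),
  };
}

const model = impliedGraphModel(typeDefs);
// model.labels            -> ["Movie", "Genre", "Actor"]
// model.relationshipTypes -> ["IN_GENRE", "ACTED_IN"]
```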

Starting with this GraphQL schema, using the Neo4j-GraphQL integration we automatically infer a data model for Neo4j that looks like this:

The labeled property graph data model generated by neo4j-graphql from the GraphQL schema shown above.

Because the data model used by GraphQL is a graph, this maps quite nicely to the labeled property graph model used by graph databases like Neo4j, resulting in only a single schema to define both the data available in the GraphQL API and the data model for the database.

Problem 2: Server/Client Data Mismatch

Your database and GraphQL API will have different schemas, which translate into different document shapes. — Sacha Greif

If the backend for your GraphQL service is not a graph database, then there is some mapping and translation that must occur to transform the data from how you model it at the data persistence layer to the shape of a graph for GraphQL. By using a graph database as the data layer for our GraphQL service we preempt this problem, so instead I’ll talk about how we fetch data from Neo4j using Neo4j-GraphQL.

The Neo4j-GraphQL integration translates any arbitrary GraphQL request to Cypher, the query language for graphs, and handles the database call as part of the GraphQL resolver.

Solution: Translate GraphQL to Cypher

The Cypher query generated from a GraphQL request using neo4j-graphql

Because Cypher queries are generated from GraphQL requests, developers do not need to implement resolver functions (which define how to actually fetch data from the data layer); this is all handled by Neo4j-GraphQL. However, individual resolvers can be overridden, allowing for custom logic when the default resolver is not desired.
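To make the translation idea concrete, here is a toy sketch of how a nested GraphQL selection could be turned into one Cypher statement using a map projection and a pattern comprehension. This is not the integration's actual translator: `FieldSelection`, `projection`, and `toCypher` are hypothetical names, and the real generated query may differ in shape and variable naming.

```typescript
// A selected field: a scalar property, or a relationship with nested selections.
interface FieldSelection {
  name: string;
  relation?: { type: string; direction: "IN" | "OUT"; label: string };
  selections?: FieldSelection[];
}

// Build a Cypher map projection for `variable`. Relationship fields become
// pattern comprehensions, so nesting stays inside a single statement.
function projection(variable: string, selections: FieldSelection[]): string {
  const parts = selections.map((field) => {
    if (!field.relation) return "." + field.name;
    const child = variable + "_" + field.name;
    const { type, direction, label } = field.relation;
    const pattern =
      direction === "OUT"
        ? `(${variable})-[:${type}]->(${child}:${label})`
        : `(${variable})<-[:${type}]-(${child}:${label})`;
    return `${field.name}: [${pattern} | ${child} ${projection(child, field.selections || [])}]`;
  });
  return `{ ${parts.join(", ")} }`;
}

// Translate a top-level query on a type into one MATCH ... RETURN statement.
function toCypher(label: string, selections: FieldSelection[]): string {
  const variable = label.toLowerCase();
  return `MATCH (${variable}:${label}) RETURN ${variable} ${projection(variable, selections)} AS ${variable}`;
}

// GraphQL request being modeled: { Movie { title actors { name } } }
const cypher = toCypher("Movie", [
  { name: "title" },
  {
    name: "actors",
    relation: { type: "ACTED_IN", direction: "IN", label: "Actor" },
    selections: [{ name: "name" }],
  },
]);
// cypher is:
// MATCH (movie:Movie) RETURN movie { .title, actors: [(movie)<-[:ACTED_IN]-(movie_actors:Actor) | movie_actors { .name }] } AS movie
```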

Problem 3: Superfluous Database Calls

Imagine a list of posts, each of which has a user attached to it. You now want to display 10 of these posts, along with the name of their author. — Sacha Greif

As Sacha points out, for the example above a typical GraphQL implementation makes one database query for the list of posts, then one query per post to fetch the user. This results in 11 round-trip requests to the database! 😲

This is known as the n+1 query problem, and the common solution is to use a tool like DataLoader to batch our queries and cache objects by id so that each is fetched only once from the database.

We can certainly use DataLoader with Neo4j (it is designed to be data-layer agnostic), but with Neo4j-GraphQL we have the advantage of generating a single Cypher query for any arbitrary GraphQL request. This means that any GraphQL request results in only a single request to the database.
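The difference in round trips can be sketched with a small in-memory stand-in for a database that counts how often it is called. Everything here (`FakeDb`, `naiveResolve`, `singleQueryResolve`) is hypothetical and only simulates the two access patterns; no real database or GraphQL library is involved.

```typescript
interface User { id: number; name: string }
interface Post { id: number; title: string; authorId: number }
interface PostView { title: string; author: string }

// In-memory stand-in for a database; every public method is one round trip.
class FakeDb {
  roundTrips = 0;
  private users: User[] = [];
  private posts: Post[] = [];

  constructor(postCount: number) {
    for (let i = 1; i <= postCount; i++) {
      this.users.push({ id: i, name: "user" + i });
      this.posts.push({ id: i, title: "post" + i, authorId: i });
    }
  }

  listPosts(): Post[] {
    this.roundTrips++;
    return this.posts.slice();
  }

  getUser(id: number): User {
    this.roundTrips++;
    return this.users[id - 1];
  }

  // One query that returns posts with their authors already attached,
  // which is what a single generated Cypher query would give us.
  listPostsWithAuthors(): PostView[] {
    this.roundTrips++;
    return this.posts.map((p) => ({ title: p.title, author: this.users[p.authorId - 1].name }));
  }
}

// Resolver-per-field style: one query for the list, then one per post.
function naiveResolve(db: FakeDb): PostView[] {
  return db.listPosts().map((p) => ({ title: p.title, author: db.getUser(p.authorId).name }));
}

// Single-query style: the whole selection is fetched in one call.
function singleQueryResolve(db: FakeDb): PostView[] {
  return db.listPostsWithAuthors();
}

const naiveDb = new FakeDb(10);
const naiveResult = naiveResolve(naiveDb);    // naiveDb.roundTrips is now 11

const singleDb = new FakeDb(10);
const singleResult = singleQueryResolve(singleDb); // singleDb.roundTrips is now 1
```

Both functions return the same ten post/author pairs; only the number of calls to the data layer differs.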

Solution: Translate GraphQL to a single Cypher query

Problem 4: Poor Performance

on one hand you want to take full advantage of GraphQL’s graph traversal features (“show me the authors of the comments of the author of the post of…” etc.). But on the other hand, you don’t want your app to become slow and unresponsive. — Sacha Greif

While it is true that GraphQL enables the expression of graph traversals like the example above, many of the database systems responsible for resolving the data are not optimized for these workloads. Imagine the JOIN statements required in SQL or multiple round trips to Mongo to resolve documents by id in a graph traversal and you start to see where the performance of these queries breaks down.

Graph databases like Neo4j are optimized for graph traversal queries like this. By translating GraphQL to Cypher we can take advantage of the powerful performance benefits of using a graph database execution engine like Neo4j.

Furthermore, GraphQL lacks the semantics of a database query language for expressing things like filtering, projections, or aggregations. Through the use of GraphQL schema directives we can use the power of Cypher with GraphQL to map a GraphQL field to the result of an arbitrary Cypher query.

Solution: Expose the power of Cypher in GraphQL

Consider an update to our Movie type in the GraphQL schema, adding a similar field:
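The updated type is not reproduced in this copy of the post, so the following is a sketch of what it could look like, again as a TypeScript string of type definitions. The Cypher statement, the other field names, and the relationship type `IN_GENRE` are assumptions for illustration; as I understand the directive, `this` refers to the node currently being resolved.

```typescript
// Illustrative only: the statement and field names are assumptions.
// The query finds other movies sharing a genre with the current movie,
// ordered by how many genres they share.
const typeDefsWithSimilar: string = `
type Movie {
  movieId: ID!
  title: String
  genres: [Genre] @relation(name: "IN_GENRE", direction: "OUT")
  similar: [Movie] @cypher(statement: "MATCH (this)-[:IN_GENRE]->(:Genre)<-[:IN_GENRE]-(other:Movie) WITH other, count(*) AS shared ORDER BY shared DESC RETURN other")
}

type Genre {
  name: String
}
`;
```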

This similar field on the Movie type evaluates to an array of other movies (in this case movies that share the same genre, but you could imagine a more complex, collaborative-filtering-style personalized recommendation query).

The annotated Cypher query runs as a sub-query only when the similar field is requested in a GraphQL query, so the request still results in a single call to the database.

Fields annotated with a `@cypher` directive in the schema allow for mapping the results of a custom Cypher query to a GraphQL field. This exposes the power of Cypher in GraphQL.

These @cypher schema directive fields can be used for custom mutations and on Query types as well, which allows custom logic to be defined using Cypher.

Problem 5: Boilerplate Overdose

This is by no means an issue exclusive to GraphQL apps, but it’s true that they generally require you to write a lot of similar boilerplate code. — Sacha Greif

Implementing a typical GraphQL service involves writing a schema for the GraphQL service, a schema for the database, resolver functions to fetch the data, and mutations for creating and updating data. Much of this is boilerplate code that can be generated by inspecting the GraphQL schema.

Solution: Auto-generate Query and Mutation types from the GraphQL schema.

We mentioned previously that resolvers are implemented automatically by inferring the database schema from the GraphQL schema and translating GraphQL to Cypher (also handling the database call). Additionally, the entry points for the GraphQL service (Query and Mutation types) are auto-generated as well, reducing the boilerplate code necessary to implement a GraphQL service backed by Neo4j. Arguments for pagination (first, offset), filtering, and ordering are also generated, both for top-level queries and for fields that point to other entities.
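As a rough illustration of the kind of code that can be derived mechanically from a type definition, here is a toy generator that emits Query and Mutation entry points with `first`, `offset`, and ordering arguments. `generateEntryPoints` and the generated names (`CreateMovie`, `_MovieOrdering`) are hypothetical, filter arguments are left out for brevity, and the names and arguments the integration actually generates may well differ.

```typescript
interface FieldSpec { name: string; type: string }
interface TypeSpec { name: string; fields: FieldSpec[] }

// Hypothetical generator: derive entry points from scalar field definitions.
function generateEntryPoints(types: TypeSpec[]): string {
  const enums: string[] = [];
  const queries: string[] = [];
  const mutations: string[] = [];

  for (const t of types) {
    // Query arguments are optional lookups, so drop any trailing "!".
    const lookupArgs = t.fields.map((f) => `${f.name}: ${f.type.replace(/!$/, "")}`);
    const pagingArgs = ["first: Int", "offset: Int", `orderBy: _${t.name}Ordering`];
    queries.push(`  ${t.name}(${lookupArgs.concat(pagingArgs).join(", ")}): [${t.name}]`);

    // The create mutation keeps the original nullability of each field.
    const createArgs = t.fields.map((f) => `${f.name}: ${f.type}`).join(", ");
    mutations.push(`  Create${t.name}(${createArgs}): ${t.name}`);

    // One ascending and one descending ordering value per field.
    const orderings = t.fields.map((f) => `${f.name}_asc ${f.name}_desc`).join(" ");
    enums.push(`enum _${t.name}Ordering { ${orderings} }`);
  }

  return [
    enums.join("\n"),
    "type Query {\n" + queries.join("\n") + "\n}",
    "type Mutation {\n" + mutations.join("\n") + "\n}",
  ].join("\n\n");
}

const generated = generateEntryPoints([
  { name: "Movie", fields: [{ name: "movieId", type: "ID!" }, { name: "title", type: "String" }] },
]);
```

For the single Movie type above, the generated text contains a `Movie(movieId: ID, title: String, first: Int, offset: Int, orderBy: _MovieOrdering): [Movie]` query field and a `CreateMovie(movieId: ID!, title: String): Movie` mutation field.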

Query and Mutation types are generated automatically when using Neo4j-GraphQL

Resources

You can learn more about the neo4j-graphql and GRANDstack projects here:
