Securing your GraphQL Endpoint

GraphQL is the way of the future when it comes to client-server communication. Its ability to return exactly what the client asks for and to mutate data on the spot has almost eliminated my use of old REST endpoints. But those old endpoints provided something that GraphQL doesn’t offer out of the box: security and consistency. Every time you make a request to a standard REST endpoint, you know exactly what structure the returned data will have and how computationally expensive the request is going to be.

The Problem

Let’s take an example from an app my company recently completed. The app lets our customer’s members create posts, as well as comment on and like them. A query for the app’s home feed looked something like this:

[Image: a small snippet from one of our queries]

This query is not that complicated; with a few optimizations and Facebook’s awesome DataLoader, we can get away with sending only a handful of simple queries to our database.

This is all fine and dandy, but if the GraphQL endpoint is open to the public, without restrictions, what stops someone from running a query like this:

[Image: query snippet, captioned “Yikes!!!”]

This is a 100% valid query, but I’m sure you can see that it would wreak havoc on the database. We would have to query for over 10,000,000,000 nodes; no amount of data-loading or query optimization is going to make this easy to run. Nothing stops a malicious user from throwing this at your server a few times and bogging down the entire service for everyone else.
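To see how nesting produces a number of that size, here is a rough sketch of the arithmetic. The original query is not reproduced here, so the shape is an assumption made purely for illustration: five nested lists with 100 items requested at each level. `countNodes` is a hypothetical helper, not part of any library.

```typescript
// Hypothetical helper: given the page size requested at each nesting level,
// count how many nodes the server would have to resolve in the worst case.
function countNodes(pageSizes: number[]): number {
  let nodesAtLevel = 1; // start from a single root
  let total = 0;
  for (const size of pageSizes) {
    nodesAtLevel *= size; // every parent fans out into `size` children
    total += nodesAtLevel; // add this level's nodes to the running total
  }
  return total;
}

// Assumed shape: five nested lists, 100 items requested at each level.
const worstCase = countNodes([100, 100, 100, 100, 100]);
console.log(worstCase);
```

Under that assumption the deepest level alone accounts for 100^5 = 10,000,000,000 nodes, which is why batching and caching cannot rescue a query like this.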

The Solution

So, what can we do about it? Well, there are two types of GraphQL endpoints we need to consider. Public endpoints are designed so that anyone can use your data in their own application; GitHub’s GraphQL API is an example. Internal endpoints are designed for your own frontend engineers to use when building an app, and no one else. This is the type we have been using at A-Squared Digital, and it is probably what most companies have implemented. Each type has its own way of being secured.

Public Endpoints

These are a little tricky, because they require some analytics on the speed of the different queries within your application. The easiest way to block complicated queries is to assign a cost to each query. The cost is calculated before the query is executed, and any query over a certain cost cap is denied.
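As a minimal sketch of the idea, the cost of a field can be estimated as its own cost plus its list size multiplied by the cost of everything selected beneath it. The `FieldSelection` type, the default cost of 1, and the `MAX_COST` cap below are all assumptions for illustration; a real server would walk the parsed GraphQL document and use a library for the estimate rather than this hand-rolled tree.

```typescript
// Simplified stand-in for a parsed GraphQL selection (not a real library type).
interface FieldSelection {
  name: string;
  cost?: number; // assumed per-field cost, defaults to 1
  multiplier?: number; // assumed list size (e.g. a `first` argument), defaults to 1
  children?: FieldSelection[];
}

// Cost of a field = its own cost + multiplier * cost of all fields beneath it.
function estimateCost(field: FieldSelection): number {
  const own = field.cost ?? 1;
  const childCost = (field.children ?? []).reduce(
    (sum, child) => sum + estimateCost(child),
    0
  );
  return own + (field.multiplier ?? 1) * childCost;
}

const MAX_COST = 1000; // hypothetical cap; tune it to your own server

// Runs before execution: returns the cost, or throws if the query is too expensive.
function assertWithinBudget(root: FieldSelection, maxCost: number = MAX_COST): number {
  const cost = estimateCost(root);
  if (cost > maxCost) {
    throw new Error(`Query cost ${cost} exceeds the cap of ${maxCost}`);
  }
  return cost;
}

// Hypothetical feed query: 20 posts, each with 10 comments and their authors.
const feedQuery: FieldSelection = {
  name: "feed",
  multiplier: 20,
  children: [
    { name: "id" },
    {
      name: "comments",
      multiplier: 10,
      children: [{ name: "id" }, { name: "author", children: [{ name: "name" }] }],
    },
  ],
};

console.log(assertWithinBudget(feedQuery));
```

Asking for 100 posts instead of 20 pushes the same query over the assumed cap, so it would be rejected before it ever reaches the database.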

To calculate the cost, you can use any of the many libraries that exist for this purpose. At A-Squared, we use type-graphql to dynamically generate our GraphQL schema, and it supports on-the-fly query complexity calculation; see the type-graphql documentation on query complexity.

This is a general solution, and it will have to be fine-tuned based on your server capacity and how much your clients are paying for your endpoint. It is also important to authenticate your users so that each request comes with an authorization token. Without that, there would be no way to determine whether a specific user is causing trouble or whether your customer base as a whole is using more complicated queries than expected.
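One way to act on those tokens is to keep a running total of query cost per token. The sketch below is only an illustration under assumptions: `CostLedger`, its per-window budget, and the in-memory `Map` are hypothetical, and a real deployment would more likely keep these totals in a shared store.

```typescript
// Hypothetical in-memory ledger of query cost spent per authorization token.
class CostLedger {
  private spent = new Map<string, number>();
  private budgetPerWindow: number;

  constructor(budgetPerWindow: number) {
    this.budgetPerWindow = budgetPerWindow;
  }

  // Charge a query's cost to a token. Returns false (and records nothing)
  // if the charge would push the token over its budget for this window.
  charge(token: string, cost: number): boolean {
    const total = (this.spent.get(token) ?? 0) + cost;
    if (total > this.budgetPerWindow) {
      return false;
    }
    this.spent.set(token, total);
    return true;
  }

  // Tokens ordered from heaviest to lightest usage, to spot who is causing trouble.
  heaviest(): Array<[string, number]> {
    return Array.from(this.spent.entries()).sort((a, b) => b[1] - a[1]);
  }

  // Call at the start of each window (for example, once a minute).
  reset(): void {
    this.spent.clear();
  }
}

const ledger = new CostLedger(1000); // assumed budget per window
console.log(ledger.charge("token-a", 641)); // first query fits the budget
console.log(ledger.charge("token-a", 641)); // second one would exceed it
```

If the heaviest list is dominated by one token, a single user is the problem; if usage is high across the board, the cap or the cost weights probably need revisiting.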

Internal Endpoints

Knowing the entire list of queries that any front-end is going to use gives us the ability to completely lock down the endpoint. We know the solution is to generate a list of every query that is going to be used, so how do we go about implementing this?

Apollo GraphQL, the front-end GraphQL client we use, has a library called persistgraphql that can help us generate this list. The library assigns each query an ID from an internal sequential counter: the first query the script finds gets the ID 1, the second gets 2, and so on. The ApolloNetworkInterface allows the client to replace the query it was going to send with the ID generated earlier. The server then converts that ID back into a query and executes it as normal. The benefit is that we can deny any request that does not carry a known ID, which shuts out arbitrary queries entirely. There was one problem with persistgraphql, though, that required us to fork the project and use our fork for development.
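The server side of that exchange can be sketched roughly as follows. The query strings, the shape of `queryMap` (query text mapped to ID), and the `resolvePersistedQuery` helper are assumptions for illustration, not the library's actual API.

```typescript
// Hypothetical query map, assumed to be produced at build time:
// each known query document is mapped to its generated ID.
const queryMap: { [query: string]: number } = {
  "query Feed { feed { id title } }": 1,
  "query Post($id: ID!) { post(id: $id) { id title } }": 2,
};

// Invert the map so the server can look a query up by the ID it receives.
const queriesById = new Map<number, string>();
for (const query of Object.keys(queryMap)) {
  queriesById.set(queryMap[query], query);
}

// What the client sends instead of the full query text (assumed shape).
interface PersistedRequest {
  id: number;
  variables?: { [key: string]: unknown };
}

// Convert the ID back into a query, refusing anything that is not on the list.
function resolvePersistedQuery(request: PersistedRequest): string {
  const query = queriesById.get(request.id);
  if (query === undefined) {
    throw new Error(`Unknown query id ${request.id}`);
  }
  return query;
}

console.log(resolvePersistedQuery({ id: 1 }));
```

Because the client never sends query text, a request for an ID outside the map is rejected before any GraphQL parsing or execution happens.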

The library only supports sequential IDs, and this completely breaks cross-version support. If one user is on version 1.0 of an app and we add more components with different queries for version 1.1, we have to rerun the persistgraphql CLI, and it creates entirely new IDs that have no correlation to the previously generated set. This means that with sequential IDs we would also have to send along the version of the app, which is far from ideal.
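The breakage is easy to reproduce with a small sketch. It assumes IDs are handed out in the order queries are discovered and that the new version's query happens to be found first; the query strings and `assignSequentialIds` are hypothetical.

```typescript
// Assign IDs in discovery order, starting at 1 (duplicates keep their first ID).
function assignSequentialIds(queries: string[]): Map<string, number> {
  const ids = new Map<string, number>();
  queries.forEach((query) => {
    if (!ids.has(query)) {
      ids.set(query, ids.size + 1);
    }
  });
  return ids;
}

const FEED = "query Feed { feed { id } }";
const PROFILE = "query Profile { me { id } }";
const COMMENTS = "query Comments { comments { id } }"; // added in version 1.1

// Version 1.0 ships two queries; version 1.1 adds one that is discovered first.
const idsV10 = assignSequentialIds([FEED, PROFILE]);
const idsV11 = assignSequentialIds([COMMENTS, FEED, PROFILE]);

console.log(idsV10.get(FEED), idsV11.get(FEED));
```

In this scenario a 1.0 client that sends ID 1 expecting the feed would be served the comments query by a server running the 1.1 map.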

In my fork of the library, I added a few CLI options that allow for consistency across updates. Among them is a --hash=[type] option, where the type is one of sequential, md5, sha1, or uuid. Sequential is the library’s original behavior and remains the default. Using a hash like md5 or sha1 gives us consistency between updates: a query that has a specific md5 hash in version 1.0 will have the exact same hash in version 1.1, while new or changed queries get a completely new hash. This lets us upload our query map to a Redis instance for extremely fast conversion from a hash to a usable query.
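The stability of hash-based IDs can be illustrated with Node's built-in `crypto` module. This is a sketch under assumptions, not the fork's actual implementation: `hashId` and `buildQueryMap` are hypothetical helpers, the query text is hashed exactly as written (so even a whitespace change yields a new ID), and a plain `Map` stands in for the Redis instance.

```typescript
import { createHash } from "crypto";

type HashType = "md5" | "sha1";

// Derive an ID from the query text itself, so it does not depend on discovery order.
function hashId(query: string, type: HashType = "sha1"): string {
  return createHash(type).update(query, "utf8").digest("hex");
}

// Build a hash -> query lookup table (a stand-in for the Redis-backed map).
function buildQueryMap(queries: string[], type: HashType = "sha1"): Map<string, string> {
  const byHash = new Map<string, string>();
  for (const query of queries) {
    byHash.set(hashId(query, type), query);
  }
  return byHash;
}

const FEED = "query Feed { feed { id } }";
const PROFILE = "query Profile { me { id } }";
const COMMENTS = "query Comments { comments { id } }"; // added in version 1.1

const mapV10 = buildQueryMap([FEED, PROFILE]);
const mapV11 = buildQueryMap([COMMENTS, FEED, PROFILE]);

// A 1.0 client's ID for the feed still resolves against the 1.1 map.
console.log(mapV11.get(hashId(FEED)) === FEED);
```

Since every ID from the older map is still present in the newer one, the server can merge maps across releases instead of asking clients to send their app version.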

Conclusion

These two methods of securing your endpoint let you deploy GraphQL safely, without worrying about a denial-of-service attack based on query complexity. The methods are independent of each other, so you could run a service that accepts customizable simple queries directly but requires a hash for more complex ones, keeping those within your control.