The Beginner’s Guide to GraphQL

It’s hard to understand what’s going on with GraphQL, or any technology really, just by reading somebody’s blog. I highly recommend checking out the GraphQL interactive documentation as well as various GraphQL APIs and playgrounds, changing up the queries, and watching the outputs change.


Since Facebook open-sourced React in May of 2013, it has skyrocketed in popularity, vastly outpacing Angular, Vue and other frontend frameworks/libraries. React Native followed up on React, applying the lightweight framework’s core principles to mobile development. React Native works for both iOS devices and Android devices, making it the de facto choice for many mobile developers these days. So it comes as no surprise that when Facebook says “jump” in the world of API architecture, the developer community responds with a resounding “how high?”. Okay, maybe “how high” is a little too emphatic and too willing to follow trends blindly. But as a developer keeping up with the latest concepts and implementations, you would be remiss to ignore this technology out of Menlo Park.

Enter GraphQL

GraphQL is a query language for APIs and a server-side runtime for executing those queries. Rather than hitting a particular endpoint to retrieve or submit some data, clients send every request to a single endpoint backed by that runtime. What information you want, and what shape you want it to take, is dictated by how you structure your query. Instead of the server deciding which fixed response belongs to which endpoint, the GraphQL runtime resolves the requested fields against your data sources and gives you back exactly what you (the client) requested, in the shape you requested it.
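
To make the "shape of the query is the shape of the response" idea concrete, here is a toy sketch in Python. It is not a GraphQL implementation: the `select` helper, the `USER` record, and the nested dict standing in for a query are all assumptions I made up for illustration.

```python
# Toy illustration only: a nested dict stands in for a GraphQL query such as
#   { name friends { name } }
# `USER` and `select` are hypothetical names, not part of any GraphQL library.

USER = {
    "id": 1,
    "name": "Ada",
    "email": "ada@example.com",
    "friends": [
        {"id": 2, "name": "Grace", "email": "grace@example.com"},
        {"id": 3, "name": "Linus", "email": "linus@example.com"},
    ],
}


def select(data, selection):
    """Return only the requested fields, mirroring the shape of `selection`."""
    if isinstance(data, list):
        return [select(item, selection) for item in data]
    result = {}
    for field, sub_selection in selection.items():
        value = data[field]
        # `None` marks a leaf field; a dict means "descend into this object".
        result[field] = value if sub_selection is None else select(value, sub_selection)
    return result


query = {"name": None, "friends": {"name": None}}
response = select(USER, query)
print(response)
```

The client asked for names only, so ids and emails never appear in the response, even though the underlying record has them.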

Facebook’s motivation for creating a graph query language as opposed to utilizing traditional RESTful routes was the rise in popularity of mobile applications. An API was the obvious choice to communicate Facebook’s data in a consistent format to be consumed by native and web applications alike. If you think about the mind-bogglingly large number of object relationships that must exist in Facebook’s backend model (a User has many Friends, of type User, Mutual Friends also of type User, Photos, Photos that are owned by one User but have several other Users tagged, and so on), it turns out to be much more like a social graph than a set of tabular relationships. It doesn’t really make sense to have to make so many queries to specific endpoints, or to make custom endpoints for each possible screen one could end up on. Instead, the client can structure a single request, including nested objects and relationships and fields, and get back exactly what it needs: nothing more, nothing less.

Ask and you shall receive

There are many benefits to using GraphQL over REST APIs.

Efficiency

As discussed, sometimes you need complex webs of data to render the full picture (a graph, if you will). The GraphQL layer between the client and the data storage removes the need for several round-trips. Instead, the client can say precisely what data it wants and the GraphQL server responds with exactly that, in a shape mirroring the request.
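
Here is a rough way to see the round-trip savings, sketched in Python. Everything in it is hypothetical: `FakeBackend` is an in-memory stand-in for a server, and each call to its `fetch` method counts as one HTTP round-trip.

```python
# Toy model: every call to `fetch` stands in for one HTTP round-trip.
# `FakeBackend`, its data, and both helper functions are hypothetical.

class FakeBackend:
    def __init__(self):
        self.round_trips = 0
        self.users = {
            1: {"name": "Ada", "friend_ids": [2, 3]},
            2: {"name": "Grace", "friend_ids": [1]},
            3: {"name": "Linus", "friend_ids": [1]},
        }

    def fetch(self, resolve):
        """Count one round-trip, then let `resolve` read the data server-side."""
        self.round_trips += 1
        return resolve(self.users)


def friend_names_endpoint_per_resource(backend, user_id):
    # One request for the user, then one more request per friend.
    user = backend.fetch(lambda users: users[user_id])
    return [
        backend.fetch(lambda users, fid=fid: users[fid])["name"]
        for fid in user["friend_ids"]
    ]


def friend_names_single_query(backend, user_id):
    # One request; the server walks the relationship before responding.
    return backend.fetch(
        lambda users: [users[fid]["name"] for fid in users[user_id]["friend_ids"]]
    )


rest_backend = FakeBackend()
rest_names = friend_names_endpoint_per_resource(rest_backend, 1)

graph_backend = FakeBackend()
graph_names = friend_names_single_query(graph_backend, 1)

print(rest_names, rest_backend.round_trips)
print(graph_names, graph_backend.round_trips)
```

Both approaches return the same names, but the endpoint-per-resource version pays one round-trip per friend on top of the first request, while the single-query version pays one in total.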

Flexibility

As front-end teams and backend teams work alongside one another, their needs are often changing at different speeds. One change in UI can drastically change the shape or amount of data needed. GraphQL provides flexibility in that the client can easily tweak its request with no real implication on the backend.

Versioning (or lack thereof)

In a REST API, changing the shape of an endpoint’s response can break any client that depends on the old shape. Versioning allows API owners to add features and phase out others without breaking the clients that still rely on the old behavior. A GraphQL server, on the other hand, only responds with the fields that are requested. New fields and types can be added without affecting existing queries, all in one neat up-to-date version.
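
A tiny Python sketch of why additive changes are safe when clients name their fields. The `pick` helper and the records are hypothetical stand-ins, not a real GraphQL server.

```python
# Hypothetical stand-in for "the server returns only the requested fields".
def pick(record, fields):
    return {name: record[name] for name in fields}


# An older client that was written before `avatar_url` existed.
old_client_fields = ["id", "name"]

record_before = {"id": 1, "name": "Ada"}
record_after = {"id": 1, "name": "Ada", "avatar_url": "https://example.com/ada.png"}

response_before = pick(record_before, old_client_fields)
response_after = pick(record_after, old_client_fields)

# The old client sees the same response before and after the new field ships.
print(response_before == response_after)
```

Removing or renaming a field is a different story: a client that still asks for it would break, so the "no versions" benefit applies to additions.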

Protocol, not storage

GraphQL doesn’t dictate how to store your data. It is merely a layer that sits on top of your data store and does the heavy lifting: understanding a request, finding the relevant data, and shaping the response. GraphQL has many server libraries, allowing you to add a GraphQL layer over your existing databases and giving you flexibility in choosing your data storage. This obviously gets harder the more mature an application is, but the option remains.

Tradeoffs…

…because there always are.

Vulnerability

The point that the client can specify exactly what data it wants has hopefully been driven home by now. However, in theory, it exposes your server to requests that are simply too large to fill. If you are creating a public API with GraphQL, it is important to be cautious of these overly-demanding requests and implement time-outs, rate limits, whitelists, or some other defenses.
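
One common defense is to reject queries that nest too deeply before doing any work. Below is a minimal sketch of that idea in Python. It assumes the query has already been parsed into a nested dict (with `None` marking leaf fields); `MAX_DEPTH`, `selection_depth`, and `check_depth` are names I invented, and a real server would run this kind of check against the parsed query produced by its GraphQL library.

```python
MAX_DEPTH = 3  # assumed limit; pick whatever fits your schema


def selection_depth(selection):
    """Depth of a nested selection; a leaf field (None) adds one level."""
    if not selection:
        return 0
    return 1 + max(
        selection_depth(sub) if sub is not None else 0
        for sub in selection.values()
    )


def check_depth(selection, max_depth=MAX_DEPTH):
    """Raise before execution if the query nests deeper than allowed."""
    depth = selection_depth(selection)
    if depth > max_depth:
        raise ValueError(f"query depth {depth} exceeds limit {max_depth}")
    return depth


# user -> friends -> name: three levels, allowed.
shallow = {"user": {"name": None, "friends": {"name": None}}}

# user -> friends -> friends -> friends -> name: five levels, rejected.
deep = {"user": {"friends": {"friends": {"friends": {"name": None}}}}}

print(check_depth(shallow))
try:
    check_depth(deep)
except ValueError as error:
    print("rejected:", error)
```

Depth is only one measure of cost. A shallow query that asks for thousands of list items can still be expensive, which is why time-outs and rate limits stay relevant.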

HTTP by Default

In a typical setup, every request from the client is a POST request to that single GraphQL endpoint, carrying either a “query” (roughly what we think of as a GET request) or a “mutation” (roughly what we think of as POST/PUT/PATCH/DELETE requests) in the body. GraphQL uses HTTP mostly because it has to, as HTTP is the default transport for information on the web. It largely disregards the capabilities of HTTP and uses POST requests as one gigantic pipeline. The problem? POST requests are often the most difficult to cache. Which leads us to a variety of caching issues …
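
To show what "everything is a POST to one endpoint" looks like on the wire, here is a sketch using only Python's standard library. The URL is a placeholder and the `renameUser` mutation is a made-up field; the JSON body with `query` and `variables` keys is the conventional way GraphQL requests are sent over HTTP. The requests are built but never sent.

```python
import json
import urllib.request

GRAPHQL_URL = "https://example.com/graphql"  # placeholder endpoint


def build_graphql_request(query, variables=None):
    """Build (but do not send) a POST request carrying a GraphQL operation."""
    payload = json.dumps({"query": query, "variables": variables or {}})
    return urllib.request.Request(
        GRAPHQL_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# A read and a write look identical at the HTTP level: same URL, same verb.
query_request = build_graphql_request("query { user(id: 1) { name } }")
mutation_request = build_graphql_request(
    # `renameUser` is a hypothetical mutation for illustration.
    "mutation ($name: String!) { renameUser(id: 1, name: $name) { name } }",
    {"name": "Ada"},
)

for request in (query_request, mutation_request):
    print(request.get_method(), request.full_url)
    print(json.loads(request.data)["query"])
```

Since the URL and verb never change, an HTTP cache sitting in front of the server has nothing to tell a read from a write, or one read from another, without inspecting the body.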

Cache Complexities

Because GraphQL is so useful when your data is deeply interconnected through various relationships, there are many ways you can arrive at a single data point. There are also many permutations of fields you can request about a single data point. Straightforward caching of data based solely on how you got to it or solely on the object identifier will lead to a ton of cache misses — there are just too many ways to get the same information and too many permutations of that information! This leads to the need for superficial fixes (inefficient and not long-term solutions) or caching layers around the globe (infeasible for those of us who do not have Facebook’s resources).
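
A small Python sketch of the cache-miss problem, assuming a naive cache keyed on the query text plus its variables. The `cache_key` function is my own illustration, not how any particular GraphQL client or server caches.

```python
import hashlib
import json


def cache_key(query, variables=None):
    """Naive key: hash of the whitespace-normalized query text and variables."""
    normalized = " ".join(query.split())
    raw = json.dumps(
        {"query": normalized, "variables": variables or {}},
        sort_keys=True,
    )
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


# Same query, different formatting: normalization makes these match.
key_a = cache_key("{ user(id: 1) { name email } }")
key_b = cache_key("{ user(id: 1) {\n  name\n  email\n} }")

# Same data, fields in a different order: a different key, so a cache miss.
key_c = cache_key("{ user(id: 1) { email name } }")

# A subset of the same data: also a miss.
key_d = cache_key("{ user(id: 1) { name } }")

print(key_a == key_b)
print(key_a == key_c)
print(key_a == key_d)
```

Every permutation of fields gets its own entry even though the underlying user is the same, which is the many-paths, many-shapes problem described above in miniature.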

Getting Under The Hood

The value of knowing certain metrics about the requests to your API cannot be overstated — knowing which requests are being made and how often can help you understand how your data is being used. Knowing which requests are resulting in time lags or errors can help you focus your triage efforts. Having detailed metrics can help you understand the impacts of your efforts and if they’re worth continuing. Because GraphQL schemas are strongly typed and strictly structured around fields and attributes, each field’s individual performance can be measured as well as bigger-picture data like request origins, frequencies, response times, etc.
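
Field-level measurement is possible because each field is typically backed by its own resolver function. Here is a minimal Python sketch of timing resolvers individually. The `timed` decorator, the resolver functions, and the `User.name` style labels are all hypothetical; tracing tools do something along these lines with far more detail.

```python
import time
from collections import defaultdict

# Maps a field label to the list of durations (in seconds) recorded for it.
field_timings = defaultdict(list)


def timed(field_name):
    """Decorator that records how long each call to a resolver takes."""
    def decorator(resolver):
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return resolver(*args, **kwargs)
            finally:
                field_timings[field_name].append(time.perf_counter() - start)
        return wrapper
    return decorator


@timed("User.name")
def resolve_name(user):
    return user["name"]


@timed("User.friendCount")
def resolve_friend_count(user):
    return len(user["friends"])


user = {"name": "Ada", "friends": ["Grace", "Linus"]}
resolve_name(user)
resolve_name(user)
resolve_friend_count(user)

for field, samples in sorted(field_timings.items()):
    average = sum(samples) / len(samples)
    print(f"{field}: {len(samples)} calls, avg {average:.6f}s")
```

With per-field counts and durations in hand, you can see which fields are popular and which are slow, instead of only knowing that "the endpoint" took a while.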

Depending on how you implement GraphQL, there are various tools and technologies to help you access and understand the core analytics of your API. Apollo is an external dependency that does much of the heavy lifting of implementing GraphQL on top of your existing backend. Within the Apollo suite of technologies, Apollo Engine displays metrics regarding the performance of your API in digestible formats.

With Apollo Engine, you can:

  • Trace and time query executions
  • Cache query results at the edge of your GraphQL layer
  • Track errors
  • Analyze the performance and popularity of each field in your schema
  • See the history of your data and the trends that emerge over time

Scaphold, a GraphQL backend-as-a-service provider, also provides tons of metrics, including:

  • Request counts
  • Average response times
  • User growth (number of new users signing up)
  • Data throughput (amount of data, in MB, being sent to client)
  • Resolvers by type (to know which parts of the API are being used the most)
  • Error counts, overall and broken down by type
  • Application logs

RealScout, a data-driven real estate search company, utilized New Relic in their GraphQL layer, allowing them to get a better understanding of each call being made to the API. RealScout has published the specifics of their implementation, including some of the initial challenges they faced and how they solved them.

GraphQL changes the way we think about APIs: it places importance on queries rather than endpoints, on graph-like rather than linear connections, and on empowering the client rather than the server. By no means has GraphQL entered the API world and rendered REST obsolete. However, by posing an alternative and more flexible structure, GraphQL gives developers one more thing in their toolkit to utilize when the situation calls for it.

Stay tuned as I walk through my own experience building a GraphQL API. Part two coming soon …
