Why GraphQL?
Introduction to GraphQL and reasons to consider using it
- Have you heard the term GraphQL but aren’t sure what it is about?
- Do you think GraphQL is some sort of graph database?
- Do you believe GraphQL is a replacement for REST?
- Do you think GraphQL is only used by React/JavaScript developers?
- Do you own and maintain a fleet of REST endpoints, and are you tired of constantly checking with the client teams on how these endpoints are used or on the return types in the responses?
If you have answered “yes” to any of these questions, then this could be a good read for you. There are some common misconceptions about what GraphQL is, and I hope to address some of these with a series of articles.
I was at Chain React 2018 recently and had some great interactions with application developers and engineers from around the world. One thing that stood out in these discussions is that, although there are some great resources on GraphQL, there aren’t many that explain GraphQL from scratch, walk you through a step-by-step process of setting up a GraphQL server, and demonstrate building client-side applications using GraphQL. I’m going to try to fill this void. Along the way, I will recommend some best practices for designing your schemas so you can better model relationships between your types.
GraphQL is a query language that was first built and used inside Facebook in 2012. It was then open sourced and released as a specification in 2015. Back then, as Facebook’s iOS and Android apps became more complex, they started experiencing poor performance and frequent crashes. Up until that point, the content was also delivered as HTML. When Facebook evaluated different options for making this better, the result was the GraphQL project, which decoupled the data used by the mobile applications from the server queries required to fetch it. GraphQL also shifted their focus from thinking of data in terms of resource URLs to thinking of it as a graph of objects and the models used in their apps. You can read more about this on the Facebook blog, in a post published back in 2015.
In REST, each entity is a resource that is uniquely identified by a URL and accessed over HTTP. When anyone requests data using a URL, the request is handled by a web server, which typically authenticates the request, fetches the data from the datasource specified by the resource path, and returns it to the client. GraphQL is similar to REST in several ways: it is used to fetch data from any backend service (relational tables, NoSQL, file systems, or even REST endpoints!), it is stateless, and its responses can be cached. It is also typed, through a schema. However, it is different in the way you fetch this data. Since GraphQL was released as a specification, you can pretty much implement a GraphQL server in the language of your choice.
GraphQL is useful to know, even if this is your first foray into data-fetching APIs. REST is not a prerequisite for understanding GraphQL. If you have used REST for mobile or web development, you will appreciate the power and simplicity that GraphQL brings to your cloud-connected apps.
So, why use GraphQL then?
GraphQL simplifies the workflow for building client applications on platforms such as iOS, Android, and React Native. Almost every business today is going mobile and is investing heavily in building cross-platform applications. Customer interaction with a product or service increases significantly when customers can access it from their phones, which in turn increases the business’s revenue. There are over 2.5 billion smartphone users today, and this is projected to rise in the future. This makes it even more important for mobile apps to be responsive and to have low latency. It is equally important to build these apps fast. Apps are also being built for multiple form factors, and there is a need to simplify application development on all these platforms. This is where GraphQL comes in handy. It helps clients fetch the right amount of data needed to render the view. GraphQL lets clients define the shape of the response for each request. In addition, it removes the complexity of API endpoint management on these clients, as it exposes a single HTTP endpoint (usually /graphql) to fetch the required data.
Alleviates the issues with over-fetching and under-fetching of data
Consider the following example: let’s assume you are building an app and you want to list all NFL teams, grouped by their divisions, in the main view. When a particular team is selected, you then want to navigate to another view that shows the team details along with the name and position of each active player on the roster. It seems like a fun app to build, considering the football season is just around the corner.
If you were to build this using REST, you would typically have the following endpoints:
- To fetch the list of all available teams:
- To fetch the details of a selected team:
- To fetch the roster information:
To keep it simple, my example shows only 2 teams and 2 players within each team. When a team is selected, you would typically make 2 HTTP requests, one for fetching the team info, the other for fetching the roster information for this team. You would then combine these results on the client to render the view.
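As a rough sketch of that client-side flow, the snippet below uses an in-memory dictionary in place of a real server so it runs without a network. The paths (api/teams/, api/teams/1/, api/teams/1/roster/), the helper names, and the sample teams and players are all assumptions made up for illustration; they are not taken from a real API.

```python
# In-memory stand-in for a REST server. Paths and payloads are hypothetical.
FAKE_SERVER = {
    "api/teams/": [
        {"id": 1, "name": "Team One", "division": "NFC West"},
        {"id": 2, "name": "Team Two", "division": "AFC East"},
    ],
    "api/teams/1/": {"id": 1, "name": "Team One", "division": "NFC West"},
    "api/teams/1/roster/": [
        {"id": 10, "name": "Player A", "position": "QB"},
        {"id": 11, "name": "Player B", "position": "WR"},
    ],
}

request_log = []  # every simulated HTTP request is recorded here


def http_get(path):
    """Pretend to issue an HTTP GET and return the parsed JSON body."""
    request_log.append(path)
    return FAKE_SERVER[path]


def load_team_view(team_id):
    """Fetch team info and roster separately, then combine them on the client."""
    team = http_get(f"api/teams/{team_id}/")
    roster = http_get(f"api/teams/{team_id}/roster/")
    return {**team, "roster": roster}


view = load_team_view(1)
print(len(request_log), "requests for one team view")  # prints "2 requests for one team view"
```

The point of the sketch is only that the client has to know both paths and stitch the two responses together itself.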
We are in good shape so far, so we ship this application and become millionaires :)
Over time, there is growing demand for a single-page website for our application. As a web page offers significantly more screen space than a mobile screen, we will ideally want to display additional information in each view. For example, we may want to display more details (such as height and weight) about each player in the roster view.
The easy fix is to add these properties to the roster endpoint (from the above example), so that it returns a list of players with all the required fields (including height, weight, etc.).
- The endpoint for fetching roster information will now look like:
We have now introduced a problem for our mobile application: it does not need these newly added fields, yet they will still be fetched and then ignored. This is over-fetching of data, and it results in network delays and latency issues in your mobile app. As you build the same application for yet another form factor (say, a smart watch), you will end up fetching more data than needed on one or more of these devices.
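To make the over-fetching cost concrete, here is a small sketch that compares the size of a roster payload with the part of it a mobile view would actually use. The payload, the field names (height_in, weight_lb), and the values are invented for illustration only.

```python
import json

# Hypothetical roster payload after the web-only fields were added.
roster_response = [
    {"id": 10, "name": "Player A", "position": "QB", "height_in": 74, "weight_lb": 215},
    {"id": 11, "name": "Player B", "position": "WR", "height_in": 72, "weight_lb": 198},
]

# Assumption: the mobile roster view only renders these two fields.
MOBILE_FIELDS = ("name", "position")

# What the mobile client actually needs out of the response.
used_by_mobile = [
    {field: player[field] for field in MOBILE_FIELDS} for player in roster_response
]

fetched_bytes = len(json.dumps(roster_response))
used_bytes = len(json.dumps(used_by_mobile))
print(f"fetched {fetched_bytes} bytes, used {used_bytes} bytes")
```

With only two players the difference is small, but it grows with every extra field and every extra row in the response.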
On the other hand, there can be an issue with under-fetching of data. This is a very common issue with any 1-N relationship between the entities in your data. Your parent entity will be fetched over an HTTP endpoint, and the response will typically contain a list of child entities. You would then lazily load each child entity in a separate request. So one query issued to the parent resource leads to an additional N requests to fetch the related child resources. The above example is optimized for our application’s use case. However, consider that your team info endpoint was designed in the following manner:
- To fetch the details of a selected team:
Now you wouldn’t need the roster endpoint like before; instead, you would have an endpoint to get player details for a given player id. This endpoint is returned as part of the team response.
- The endpoint for fetching player details:
This will lead to fetching the team details using one HTTP request, and then, for each item in the players response, making a new HTTP request to fetch the details of that player. This is the gist of the well-known N+1 problem that arises when fetching data with 1-N relationships. If you are interested, look up the N+1 problem online. There are plenty of good resources that discuss this problem and the client-side latency and performance issues it causes.
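The request count is easy to see in a small simulation. In this sketch an in-memory dictionary stands in for the server, and the paths (api/teams/1/, api/players/10/, and so on), the players field holding player endpoints, and the sample data are all hypothetical.

```python
# In-memory stand-in for a REST server. Paths and payloads are hypothetical.
FAKE_SERVER = {
    "api/teams/1/": {
        "id": 1,
        "name": "Team One",
        # The team response lists player endpoints instead of embedding players.
        "players": ["api/players/10/", "api/players/11/", "api/players/12/"],
    },
    "api/players/10/": {"id": 10, "name": "Player A", "position": "QB"},
    "api/players/11/": {"id": 11, "name": "Player B", "position": "WR"},
    "api/players/12/": {"id": 12, "name": "Player C", "position": "K"},
}

request_log = []  # every simulated HTTP request is recorded here


def http_get(path):
    """Pretend to issue an HTTP GET and return the parsed JSON body."""
    request_log.append(path)
    return FAKE_SERVER[path]


def load_team_with_players(team_id):
    # 1 request for the parent entity...
    team = http_get(f"api/teams/{team_id}/")
    # ...plus N requests, one per child entity.
    players = [http_get(url) for url in team["players"]]
    return {"name": team["name"], "players": players}


team = load_team_with_players(1)
print(f"{len(request_log)} requests for {len(team['players'])} players")
# prints "4 requests for 3 players"
```

A roster of 50 players would turn the same view into 51 requests.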
API Endpoint Management
You could argue from the above over-fetching example that you can add separate endpoints that return different data for individual form factors (for example, api/web/teams/1/roster/, api/mobile/teams/1/roster/, etc.). While this is true, both your server and your clients now have to maintain and manage an additional endpoint. In some cases, you may fetch the data from an external endpoint over which you have no control. With REST, the client is aware of, and saddled with the management of, the endpoints for every resource in the app. As the complexity of your application increases, it becomes very hard to maintain and manage a fleet of endpoints.
Additionally, it becomes the responsibility of the client app to know the shape of the payload defined by the server. It would be more helpful for clients to specify which fields they need to render the data for a particular view. With GraphQL, you get just one endpoint for your entire application. If you’re wondering how this works, we’ll go over that in detail in the next chapter.
Avoids multiple roundtrips to fetch hierarchical or related data
As you may have noticed in the above example, when a particular team was selected, we first had to fetch the team details, then query the roster information for this team, and tie them together on the client. You could in fact make the two requests in parallel, but that is not always possible. Imagine that in the application above, you now want users to select multiple teams so they can compare the team rosters. With REST, you would first fetch the details for the selected team ids, and then, for each selection, make another network call to fetch the roster information. This leads to multiple roundtrips to the server, directly impacting the latency and UX of your application. Using GraphQL, you can simplify this workflow by constructing a single nested query that fetches all the required information in a single roundtrip! The responsibility of fetching related data is pushed to the server side.
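The idea of moving that work to the server can be illustrated with a toy handler. This is not a GraphQL server and does not parse GraphQL; it is a hand-written function, with hypothetical names and made-up data, that only shows one incoming request being answered with teams, their rosters, and just the player fields the client asked for.

```python
# Hypothetical server-side data stores.
TEAMS = {1: {"id": 1, "name": "Team One"}, 2: {"id": 2, "name": "Team Two"}}
ROSTERS = {
    1: [
        {"name": "Player A", "position": "QB", "height_in": 74},
        {"name": "Player B", "position": "WR", "height_in": 72},
    ],
    2: [
        {"name": "Player C", "position": "RB", "height_in": 70},
        {"name": "Player D", "position": "TE", "height_in": 77},
    ],
}

roundtrips = 0  # counts client-to-server requests


def handle_compare_request(team_ids, player_fields):
    """Server side: one request in, related data joined here instead of on the client."""
    global roundtrips
    roundtrips += 1
    return [
        {
            "name": TEAMS[team_id]["name"],
            # Keep only the player fields the client asked for.
            "roster": [
                {field: player[field] for field in player_fields}
                for player in ROSTERS[team_id]
            ],
        }
        for team_id in team_ids
    ]


comparison = handle_compare_request([1, 2], ["name", "position"])
print(roundtrips, "roundtrip for", len(comparison), "teams")  # prints "1 roundtrip for 2 teams"
```

Compared with the REST flow, the client no longer issues one call per team and per roster; it describes what it wants once and the server does the joining.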
Just so you know how a GraphQL query is structured, the client could construct the following query to fetch the roster info of the selected teams using a single network call:
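A minimal sketch of such a query, assuming a hypothetical schema: the teams field, its ids argument, and the roster, name, and position fields are illustrative names, not a published API. The snippet also wraps the query in the JSON body that GraphQL servers conventionally accept over HTTP (a "query" key), without actually sending it anywhere.

```python
import json

# Hypothetical nested query: the selected teams and their rosters in one request.
QUERY = """
query {
  teams(ids: [1, 2]) {
    name
    roster {
      name
      position
    }
  }
}
"""

# The body a client would typically POST to the single GraphQL endpoint.
request_body = json.dumps({"query": QUERY})

print(QUERY)
```

Note how the shape of the query mirrors the shape of the data the view needs: teams at the top, each with its nested roster.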
Don’t worry too much if you don’t understand the above syntax; we’ll go over this in more detail in the following articles.
Conclusion
To summarize, GraphQL is meant to be used for client applications where network bandwidth and latency are critical. It gives clients the ability to query an object graph (a hierarchical structure of related objects). Using GraphQL, clients also get to choose which fields are included in the response. This makes data fetching a whole lot simpler to use and manage on the client’s end.
GraphQL is not a replacement for REST; rather, it complements REST and can be used for specific use cases. In the next article, I will talk more about REST vs. GraphQL and give you guidance on when to use which. So keep an eye out for my next article!