UPDATE: scroll all the way down to see a short video I made about how to run GraphQL from a Node.js app.
This article is aimed at people who want to discover GraphQL, the query language developed by Facebook. I will try to show, with real-world examples, why the usual REST strategy seriously lacks flexibility and why GraphQL offers so much more granularity.
So you have heard about GraphQL, and maybe you have already experimented with it. If you are struggling, or you feel you are missing a little something to see the big picture, this article may help you. From now on, I will assume that you have some skills with Node.js, but if this is not the case, you may find some interesting stuff anyway.
If you have any thoughts about all this, please leave a comment.
GraphQL is language independent: it doesn’t care about the code of your app or your database. It is more a way to ask:
Hey! I want this data, please send it to me in this shape.
Any piece of code can request some data, and any server can serve the response. What really matters is the query (how your client requests the data) and how your server understands this query.
Fortunately, there is an npm module to handle most of the work. GraphQL is based on concepts like Field, Type and Schema, and this module will help you link them all together.
You may already know that GraphQL is used to request data from a remote API, but we already have REST for the same purpose. Why should you consider using something else to do the same job?
Well, allow me to do a quick review of REST.
REST
REST is a technique to request data from a remote API. It takes advantage of the HTTP protocol, both as a transport and through the URL. The URL describes the resource you want to fetch and the body is the response you get. It’s very much like browsing a web page with your browser.
The client side is a “no-brainer”: you just request a URL. Let’s say you want to get some data from a movie database; you may end up with this request:
GET http://moviedatabase.io/movie/123
On the server side, this is pretty simple too. You have to:
- figure out what kind of request this is (getMovieFromId or something like this)
- get the specific ID of this movie (in our case 123)
- do some internal work to get the real data (connect to a database maybe)
- send the response in a common format like JSON
This is also language independent, because both client and server can be written in any language you want. The only thing the client has to do is make an HTTP request, and the only thing the server has to do is listen for a request and send back a response with some data. This job is commonly done by a router.
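The four server-side steps can be sketched in a few lines of plain Node.js. This is only an illustration: the movies object stands in for a real database, and handleRequest is a hypothetical helper of mine, not part of any library.

```javascript
// Hypothetical in-memory "database" standing in for the real one.
const movies = {
  123: {
    id: 123,
    title: 'Star Wars',
    updatedAt: '2016-06-01',
    pictures: ['poster.jpg', 'still-1.jpg'],
    actors: ['Benjamin', 'Julie'],
  },
};

// A tiny router: recognise the request, extract the id,
// fetch the data and answer in JSON.
function handleRequest(method, path) {
  const match = /^\/movie\/(\d+)$/.exec(path); // 1. what kind of request is this?
  if (method !== 'GET' || !match) {
    return { status: 404, body: JSON.stringify({ error: 'Unknown route' }) };
  }
  const id = match[1];                          // 2. the id of the movie
  const movie = movies[id];                     // 3. the "internal work"
  if (!movie) {
    return { status: 404, body: JSON.stringify({ error: 'Movie not found' }) };
  }
  return { status: 200, body: JSON.stringify(movie) }; // 4. the JSON response
}

const response = handleRequest('GET', '/movie/123');
console.log(response.status, response.body);
```

Notice that the client gets the whole movie object back, whatever it actually needs.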
This is perfectly fine: a lot of services work this way, and you will surely keep using REST in the coming years.
But REST is not perfect
There are two potential problems here:
- A: You know you will receive data about a movie (that’s what you asked for), but you will probably receive more data than you really want, like a last update timestamp, or a list of pictures related to this movie. Maybe you just want the main title and the poster image.
- B: You want more detailed information about the actors, like their picture instead of just their name.
You could end up with too much data and a lack of data at the same time.
A naive approach to deal with those problems…
The first problem is easy to ignore: if you only need the title, just use it and don’t care about the rest. You may have received a big payload, but… just use what you need. No big deal.
The second problem is also easy to solve, by sending more requests after the first one to get all the actors’ pictures. Assuming you have 5 actors, you will have to make 6 requests… but who cares, you will have your pictures and you will be happy.
So what’s wrong?
The network is slow, even very very slow
Whenever a remote request is made, a bunch of things happen in the background, like DNS resolution, SSL encryption, socket handling, latency, timeout management… a lot of things you do not really care about. It’s like a black box: you have no idea what is going on, and the timing may change with every request.
For every request you make, all of this has to be repeated (not by you, but by the system that runs your code), over and over, and you want to speed up the process as much as you can.
You could optimise your server response time, and you could optimise your network, but the truth is: if you have to make 6 requests to your server, you will have to wait 6 times. Even if your network is blazing fast, 6 will always be more than 1.
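To make the count concrete, here is a small simulation. Nothing goes over the network: request is a fake function that only counts calls, the /actor/… URLs are made up for the example, and the 100 ms round trip is an arbitrary assumption.

```javascript
// Fake API: URLs and payloads are invented for this illustration.
const fakeApi = {
  '/movie/123': { title: 'Star Wars', actorIds: [1, 2, 3, 4, 5] },
};
for (const id of fakeApi['/movie/123'].actorIds) {
  fakeApi[`/actor/${id}`] = { name: `Actor ${id}`, picture: `actor-${id}.jpg` };
}

let requestCount = 0;

// Stand-in for an HTTP call: it only counts how often we "hit the network".
function request(path) {
  requestCount += 1;
  return fakeApi[path];
}

const movie = request('/movie/123');       // 1 request for the movie
const pictures = movie.actorIds.map(
  (id) => request(`/actor/${id}`).picture  // + 1 request per actor
);

const ROUND_TRIP_MS = 100; // assumed latency, purely illustrative
console.log(
  `${requestCount} requests, about ${requestCount * ROUND_TRIP_MS} ms if done one after another`
);
```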
Can we simply pass some extra data in the URL to get the actors’ data back with the movie data, and avoid the extra requests?
Good idea! Let’s try something like…
http://moviedatabase.io/movie/123?withActor=true&actorField=picture
Tada! If you handle the withActor and actorField parameters somewhere in your app, you will remove the extra queries… But what if you need more, different data? You will probably end up with a very ugly, unreadable URL, very specific to this request… And do you remember problem A? Now you have the full movie data, decorated with all the actors’ data…
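Here is roughly what handling those parameters could look like on the server. It is a sketch: buildMovieResponse and the sample data are hypothetical, only the withActor and actorField parameter names come from the URL above.

```javascript
// Sample data, invented for the example.
const movie = { id: 123, title: 'Star Wars', updatedAt: '2016-06-01' };
const actors = [
  { name: 'Benjamin', picture: 'benjamin.jpg', birthYear: 1980 },
  { name: 'Julie', picture: 'julie.jpg', birthYear: 1982 },
];

function buildMovieResponse(rawUrl) {
  const { searchParams } = new URL(rawUrl);
  const response = { ...movie }; // problem A is still there: the full movie

  if (searchParams.get('withActor') === 'true') {
    const fields = searchParams.getAll('actorField'); // here: ['picture']
    response.actors = actors.map((actor) => {
      const picked = { name: actor.name };
      for (const field of fields) picked[field] = actor[field];
      return picked;
    });
  }
  return response;
}

console.log(
  buildMovieResponse('http://moviedatabase.io/movie/123?withActor=true&actorField=picture')
);
```

Every new need means a new parameter and a new branch in this function.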
GraphQL to the rescue
GraphQL lets you make a single network request, with a description of what you want back. This is called a query, and the query lists all the data you want in an elegant and smart way: a tree.
query {
  myMovie: getMovieFromId(id: 123) {
    title
  }
}
A query is a simple string which looks like a “tree”. title is the deepest element, let’s call it a leaf. myMovie, on the other hand, acts like a branch: it’s a field too, but more like an entity or an object. All the fields below it describe this movie.
What does the server send back?
{
  "data": {
    "myMovie": {
      "title": "Star Wars"
    }
  },
  "errors": [
    // if you do your job correctly, you will never see me!
    // (the error messages)
  ]
}
The structure of the response mirrors the structure of the request.
The data root field always wraps the real data (because you could also have an “errors” field at the root to expose some errors). The “myMovie” key is an alias I chose in the query. It is not mandatory, but it can be meaningful, for instance when you request several movies in the same query (myMovie1, myMovie2…).
If you make a REST request, you have to know the URL.
In GraphQL you have to know all the field names, and the arguments to pass for some of them.
In a way, this is like using SELECT * in REST and SELECT field1, field2… with GraphQL.
Don’t be fooled: a field can be empty if there is no value… The only guarantee you have is that the field will be in the response.
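The SELECT analogy can be shown with a few lines of plain JavaScript. This is not how GraphQL is implemented, just a toy pickFields helper (my own name) that keeps the requested keys and fills the missing ones with null.

```javascript
// Toy helper: keep only the requested fields. A field without a value becomes
// null, so the key is always present in the result.
function pickFields(source, fieldNames) {
  const result = {};
  for (const name of fieldNames) {
    result[name] = source[name] === undefined ? null : source[name];
  }
  return result;
}

// What a REST endpoint might return: everything (sample data).
const fullMovie = { id: 123, title: 'Star Wars', updatedAt: '2016-06-01', pictures: [] };

console.log(pickFields(fullMovie, ['title', 'poster']));
// → { title: 'Star Wars', poster: null }
```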
Hello from the Server side
It’s time to see how the server handles all this. I will focus on JavaScript because this language suits the GraphQL philosophy very well.
The first thing you have to define is a Schema, which is like a manifest. It defines the list of fields you can request and their respective Types.
Note: there is something I misunderstood for a long time before my “aha” moment, and it confused me a lot: everything is a field. I mean literally. Each level in your query is a field. The deepest fields (the leaves) must be primitives (string, number, etc.); the fields between them and the root are there to structure the data, giving it sense and order.
A type describes a piece of data. The most basic type is a primitive (string, number, boolean, etc.). When you define the movie title to be a string, you are using the String type. But what about the movie itself? Is it an object or something else?
Movie is a custom Type. It’s like a definition of “what is a movie”.
Well, a Movie contains a title String and a list of Person… OK, but what is a Person? Well, a Person is a custom Type with a name String and a poster String… You get the idea, it’s like composition.
Movie and Person are both custom Types, and they are defined by a name and a list of fields.
No more talking, show me some code !!!
Let’s go back to our first example and just add a dumb field named date
query {
  myMovie: getMovieFromId(id: 123) {
    title
    date
  }
}
If the Movie Type does not implement the date field… GraphQL will yell at us, screaming that date is not a known field. The validation fails and the whole process is aborted… same player shoots again.
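The check itself is easy to picture. The snippet below is a simplified stand-in for the real validation, with a hypothetical findUnknownFields helper and the Movie fields written as a plain array.

```javascript
// Fields known by our (simplified) Movie type.
const movieFields = ['title'];

// Return the requested fields that the type does not define.
function findUnknownFields(knownFields, requestedFields) {
  return requestedFields.filter((name) => !knownFields.includes(name));
}

const unknown = findUnknownFields(movieFields, ['title', 'date']);
if (unknown.length > 0) {
  // The real library rejects the whole query at this point.
  console.log(`Unknown field(s) on Movie: ${unknown.join(', ')}`);
}
```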
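Let’s go deeper and create a brand new Movie Type
// the building blocks used in this article, from the graphql npm module
const {
  GraphQLSchema, GraphQLObjectType, GraphQLList, GraphQLString, GraphQLInt
} = require('graphql')

const Movie = new GraphQLObjectType({
  name: 'Movie',
  fields: {
    title: { type: GraphQLString }
  }
})
Movie is just a type of object; you can’t really use it on its own, so let’s create a field which uses the Movie type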
const getMovieFromId = {
  type: Movie,
  args: {
    // the queries in this article pass id:123, an integer literal
    id: { type: GraphQLInt }
  },
  resolve: (obj, args, root, ast) => {
    // a real resolver would use args.id to fetch the movie
    return {
      title: "Matrix"
    }
  }
}
Now we have an object named “getMovieFromId” which has the Movie type. Again, this is a simple object definition, not really useful yet. But we have also defined two important things:
- “args” which is a list of arguments we can use for this field
- “resolve” which is a function that will return the data/value for this field
Finally, let’s create our schema, which behaves like a trunk.
const schema = new GraphQLSchema({
  query: new GraphQLObjectType({
    name: 'RootQueryType',
    fields: {
      getMovieFromId
    }
  })
})
This is the root of every query: you can’t access a second-level branch directly, you have to start from the RootQuery.
Let’s go back to our initial query
query {
  myMovie: getMovieFromId(id: 123) {
    title
    date
  }
}
Here we ask for a couple of things:
- we access the field “getMovieFromId” with a parameter (id:123)
- we list some fields to get back (title and date)
- we assign the value of the field getMovieFromId to “myMovie”
On the server side, when GraphQL parses our query, some checks are made against the rootQuery:
- Does “getMovieFromId” exist as a field in the rootQuery? → yes
- What type is “getMovieFromId”? → Movie
- Does “getMovieFromId” have a resolve function, and do the arguments match? → yes
- Execute this function with the arguments
- Get the value returned by the resolve function, and do the same checks one level down
- …
- Are title and date both part of the fields defined in Movie? → no (date is not defined)
- … you get the idea
Every time GraphQL meets an object, the same process starts over again with this new object, to finally deliver the leaves of the query.
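This level-by-level walk can be imitated with a small recursive function. To be clear, this is a toy and not the graphql module: the types are plain objects, the query is already written as a JavaScript object instead of being parsed, and checks and execution are mixed together for brevity, whereas the real library validates the whole query first. All the names below are mine.

```javascript
// Toy types: a map of field definitions. `type: null` marks a leaf (primitive).
const Movie = {
  title: { type: null },
};

const RootQuery = {
  getMovieFromId: {
    type: Movie,
    resolve: (parent, args) => ({ id: args.id, title: 'Matrix' }),
  },
};

// Walk the selection level by level: check the field, resolve it,
// then start over one level down until only leaves remain.
function execute(type, parent, selection) {
  const result = {};
  for (const [key, node] of Object.entries(selection)) {
    const fieldName = node.field || key; // `key` is the alias, if any
    const field = type[fieldName];
    if (!field) throw new Error(`Unknown field "${fieldName}"`);

    const value = field.resolve
      ? field.resolve(parent, node.args || {})
      : parent[fieldName]; // no resolve function: read the property

    if (value === undefined || value === null) {
      result[key] = null;
    } else if (field.type === null) {
      result[key] = value; // leaf reached
    } else {
      result[key] = execute(field.type, value, node.selection || {});
    }
  }
  return result;
}

// Hand-written equivalent of: query { myMovie: getMovieFromId(id:123) { title } }
const query = {
  myMovie: { field: 'getMovieFromId', args: { id: 123 }, selection: { title: {} } },
};

console.log(JSON.stringify({ data: execute(RootQuery, {}, query) }));
// → {"data":{"myMovie":{"title":"Matrix"}}}
```

Asking for date in the selection would make this toy throw, which is its crude way of yelling that the field is unknown.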
To be clear, if you request this
query {
  myMovie: getMovieFromId(id: 123) {
    title
    date
    images {
      url
    }
  }
}
and the resolve function of the field “getMovieFromId” does not return any images data, your response will be
{
  "data": {
    "myMovie": {
      "title": "Star Wars",
      "date": 2016,
      "images": null
    }
  }
}
The result of the resolve function is used by GraphQL to continue the process.
From top to bottom, every key/value pair returned by those functions is treated as a field.
Nested fields and related data
You now have a simple query to get the title of the movie, but what about the actors? The actors are a list of Person, so we need to enhance our query to get their name and poster
query {
  myMovie: getMovieFromId(id: 123) {
    title
    actors {
      name
      poster
    }
  }
}
Now we have actors{}, which reflects a list of actors, and for each of them we want a name and a poster. But the Movie type does not define the actors field, right? So let’s fix this
const Movie = new GraphQLObjectType({
  name: 'Movie',
  fields: {
    title: { type: GraphQLString },
    actors: { type: new GraphQLList(Person) }
  }
})
actors is now a field: a list of Person, like an array of Person.
But what is Person? Remember, it’s a custom type… so let’s create this Type (in a real file it has to be declared before Movie, which uses it)
const Person = new GraphQLObjectType({
  name: 'Person',
  fields: {
    name: { type: GraphQLString },
    poster: { type: GraphQLString }
  }
})
Now, when we use Person, it has a meaning, because we know which fields are inside. It is reusable and composable. A more detailed Movie Type could be
const Movie = new GraphQLObjectType({
  name: 'Movie',
  // a function returning the fields, so Person can be declared later in the file
  fields: () => ({
    title: { type: GraphQLString },
    actors: { type: new GraphQLList(Person) },
    directors: { type: new GraphQLList(Person) },
    composers: { type: new GraphQLList(Person) },
    // ...
  })
})
Because actors, directors and composers are all lists of Person, we can request them the same way
query {
  myMovie: getMovieFromId(id: 123) {
    title
    actors {
      name
      poster
    }
    directors {
      name
    }
    composers {
      name
    }
  }
}
Here directors and composers will come back without the poster field, even though requesting it would be allowed.
Request related data
So far we only get the title of the movie, because the getMovieFromId field only returns the title. How do you get the other data?
Any field can have a resolve function. If you do not provide one, GraphQL will simply return the value found under the same key in the parent object. This is the case for leaf fields: as they are dumb values (primitives), they are just returned as they are.
Let’s change our Movie Type to get a list of actors. From now on we will focus only on them; composers and directors are wiped out.
const Movie = new GraphQLObjectType({
  name: 'Movie',
  fields: {
    title: { type: GraphQLString },
    actors: {
      type: new GraphQLList(Person),
      resolve: (_, args, root, ast) => {
        // get the list of actors from the database or return a simple []
        // yes, all my family was in this movie!
        return [
          { name: "Benjamin", poster: "benjamin.jpg" },
          { name: "Julie", poster: "julie.jpg" },
          { name: "Anna", poster: "anna.jpg" },
          { name: "Louisa", poster: "louisa.jpg" }
        ]
      }
    }
  }
})
From the resolve function you can access the “parent” object through the first argument, here named “_”. _.id will return the movie’s id, for instance.
You can also get access to the current field name with ast.fieldName and the corresponding value with _[ast.fieldName]. It is useful when you do not really know which field you are on (don’t forget you can extract the resolve function and reuse it on different fields).
It’s important to notice a couple of things:
- The first query to the database returns only the title (and OK, maybe the id too); there is no “actors” field involved here
- The actors field comes from the definition of the Movie Type. Even if the list of fields does not really mirror the database fields, it’s OK and not very important. You can enhance your movie data with some computed fields like actors
- If the query to the database returns a field named “actors” with some data, that data will be replaced by the values returned by the resolve function. You can take advantage of this or not, it’s up to you.
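These points boil down to one rule, sketched below in plain JavaScript. resolveField is my own simplified stand-in for what the library does internally, and the row object plays the part of a database result.

```javascript
// Simplified rule: use the field's resolve function when there is one,
// otherwise read the property with the same name on the parent object.
function resolveField(fieldDefinition, fieldName, parent, args = {}) {
  if (typeof fieldDefinition.resolve === 'function') {
    return fieldDefinition.resolve(parent, args, undefined, { fieldName });
  }
  return parent[fieldName];
}

// Pretend database row: it happens to contain an "actors" column.
const row = { id: 123, title: 'Star Wars', actors: ['raw', 'database', 'value'] };

const titleField = {}; // leaf without a resolve function
const actorsField = {
  // Computed field: ignores row.actors and builds its own list.
  resolve: (movie) => [{ name: 'Benjamin', poster: `benjamin-${movie.id}.jpg` }],
};

console.log(resolveField(titleField, 'title', row));   // the value stored in the row
console.log(resolveField(actorsField, 'actors', row)); // the list built by resolve
```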
Using arguments
So far so good: we now have a movie with a list of actors. But if we only want 2 of them and the server gives us all of them, what can we do?
The resolve function of the actors field returns the data, all the data. If you want less data, it’s up to you to limit the amount of data returned. actors is an array of Person, so like with every array you can use Array.slice and return only what you want (Array.splice would also work, but it modifies the original list).
...
resolve: (_, args, root, ast) => {
  const list_of_actors = [{ name: "Benjamin" }, {}, {} /* … */]
  return list_of_actors.slice(0, 2)
}
...
But this is not very useful, because the number of Person returned is hard-coded. Let’s use args instead
...
args: {
  limit: { type: GraphQLInt }
},
resolve: (_, args, root, ast) => {
  const list_of_actors = [{ name: "Benjamin" }, {}, {} /* … */]
  // no limit given: return everybody
  if (!args.limit) return list_of_actors
  return list_of_actors.slice(0, args.limit)
}
...
Now you can use this function with no argument to get the whole list of Person, or you can simply specify a limit to get fewer Person.
GraphQL is strongly typed. Guess what: here you will be happy that args.limit will always be an integer (or missing), but never a string… simply because Array.slice() does not want anything else than an integer!
How does this look from the query side?
query {
  myMovie: getMovieFromId(id: 123) {
    title
    actors(limit: 2) {
      name
      poster
    }
  }
}
Because limit is expected to be an integer, if you try something else, GraphQL will yell at you: limit should only be an integer or not exist at all.
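Here is a rough picture of what this type check buys you, in plain JavaScript. checkInt and resolveActors are hypothetical helpers written for this example; the real library performs the check for you before resolve is ever called.

```javascript
// Sample data, same family as before.
const actors = [
  { name: 'Benjamin' }, { name: 'Julie' }, { name: 'Anna' }, { name: 'Louisa' },
];

// Accept an integer or nothing at all; reject everything else.
function checkInt(name, value) {
  if (value === undefined || value === null) return null;
  if (!Number.isInteger(value)) {
    throw new TypeError(`Argument "${name}" must be an integer`);
  }
  return value;
}

function resolveActors(args) {
  const limit = checkInt('limit', args.limit);
  // slice leaves the original array untouched
  return limit === null ? actors : actors.slice(0, limit);
}

console.log(resolveActors({ limit: 2 }).length); // → 2
console.log(resolveActors({}).length);           // → 4
```

Passing { limit: "2" } makes checkInt throw, which is the toy version of GraphQL rejecting the query.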
Using GraphQL in my day to day work
My first look at GraphQL was almost a year ago. I was very impressed, but I didn’t see the potential at the time. A couple of weeks ago, I was worried about the time taken by some queries on the yo-video.net website, and about their payload.
For the home page, where you can find movies in theaters, the response from the server was about 1Mb… Even if the request/response actually happens between 2 servers (not from your browser), this large amount of data consumed a lot of memory to parse and extract.
GraphQL was a pretty good solution to this problem, and both the memory and the time used shrank. It was a smart move.
I started this article complaining about REST, but I still use it a lot on this website. On a movie detail page, for example, we need almost every field, so using REST to get that data was simple.
A movie has about 60 different fields (layered under many levels)… so if you need them all, the associated GraphQL query will not be very light to write. In this specific case, and in my opinion, sticking to REST saves time.
I recommend using GraphQL IDE, which is so much better than GraphiQL, with a lot of very useful options like saved queries by project, tabs, headers… You should have a look.
As I am a big fan of egghead.io courses, I definitely recommend this series: https://egghead.io/courses/build-a-graphql-server
The content is restricted to pro users, but it is worth a subscription.
Thank you dad, for the proofreading and advice
Bonus
I made a 3-minute video showing how to execute a GraphQL query from a Node.js app.