Introduction to GitHub GraphQL API

Viduranga Gunarathne (vlgunarathne)
Mar 26, 2018 · 10 min read

This article will cover the following topics:

  1. GitHub
  2. GitHub APIs
  3. What is GraphQL?
  4. Advantages of using GraphQL
  5. Using the GitHub GraphQL API v4
  6. Obtaining an access token
  7. Writing GraphQL queries
  8. GraphQL variables
  9. GraphQL Pagination
  10. Running GraphQL using POSTMAN

“Git” in GitHub

Git is an open source version control system introduced by none other than the creator of Linux, Linus Torvalds. It serves the same purpose as other version control systems such as Subversion and CVS. Git enables developers to collaborate on software applications and to conveniently manage versions and releases of those applications by recording every modification in a repository.

“Hub” in GitHub

Although Git is a rich and capable version control system, everything it does happens on your local machine (the local repository). When you work on a collaborative project, you need to share the project files with other developers, and especially when you work on open source projects, you need some place to host the files so that other developers in the community can access them and contribute. This is where the “Hub” comes in. It is an online centralized repository (the remote repository) to which you can push the work you have done, and from which others can view and download it.

GitHub APIs

We live in a world where almost all web services are driven by APIs (Application Programming Interfaces). Most online service providers offer their services to external and internal users through APIs, and GitHub is no exception. Although the local repository can be managed with a command-line tool, the hub provides its services through an API. As of today, GitHub offers two API versions: REST API version 3 and GraphQL API version 4. The latter uses GraphQL, a query language developed by Facebook for fetching application data. Although the name seems to imply it, GraphQL is not a database query language like SQL. It is an application-level query language that can sit in front of any backend data store, such as MySQL, MongoDB or PostgreSQL.

We should think of GraphQL as a better alternative to REST.

So, in this article we will take a look into the basics of the GitHub GraphQL API v4.

What is GraphQL?

GraphQL is an entirely new way to build and consume an API. So far, REST APIs have dominated this space, and GraphQL is the new kid in town that has gained considerable attention. In my opinion, the reason is that it aligns with the idea we have of the hierarchy of data in any given data source, and developers mostly like to see data as graphs of objects.

REST design had this idea of data hierarchy;

“http://<org_host:port>/{department}/employees/{employee_id}”

The above REST API call would return a single employee object with the specified employee identification, in the specified department of an organization. However, the consumer of this API (i.e. the person who sends the HTTP request to this resource) has no control over which pieces of information come back.

With GraphQL, the above API call would look like this;

query {
  department(name: "engineering") {
    name,
    employee(id: "3500401") {
      name,
      age,
      address
    }
  }
}

And the response would be;

{
  "department": {
    "name": "engineering",
    "employee": {
      "name": "John Doe",
      "age": 35,
      "address": "50, Wilfred Avenue, Colombo 07"
    }
  }
}

Here, the API only gave us the information that we requested.
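To make the idea concrete, here is a toy sketch in Python. It is not how a GraphQL server is implemented; it only shows the principle of returning exactly the fields that were asked for. The extra fields “floor” and “salary” are hypothetical, added to stand for data a REST endpoint might send whether you want it or not.

```python
def select(record, wanted):
    """Return only the fields of `record` named in `wanted`.

    `wanted` maps a field name to None (plain field) or to another
    mapping (nested object), mirroring the shape of a GraphQL query.
    """
    result = {}
    for field, sub in wanted.items():
        value = record[field]
        result[field] = select(value, sub) if sub is not None else value
    return result


# Hypothetical full record, as a REST endpoint might return it.
department = {
    "name": "engineering",
    "floor": 4,
    "employee": {
        "name": "John Doe",
        "age": 35,
        "address": "50, Wilfred Avenue, Colombo 07",
        "salary": 100000,
    },
}

# Same shape as the query: department { name, employee { name, age, address } }
wanted = {"name": None, "employee": {"name": None, "age": None, "address": None}}

print(select(department, wanted))
```

The printed dictionary has the same shape as the GraphQL response, and the two unrequested fields are gone.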

Advantages of using GraphQL:

  • Get what you need : A GraphQL API will only return the information that you explicitly ask for in the query.
  • Nested data : GraphQL queries can be written to obtain nested data across relationships. With REST, this would often require multiple HTTP calls.
  • Strongly typed : With a REST API, you often have no prior insight into the format of the response. A GraphQL API is described by a typed schema, and the response mirrors the shape of the query, so you know exactly what you will get back.
  • Introspective : A GQL server can itself be queried to find out which queries it supports.
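As a small illustration of the last point, here is a sketch of what an introspection request could look like. The fields “__schema” and “__type” come from the GraphQL specification itself; choosing “Repository” as the type to describe is only my example. The code just builds the JSON body and does not send anything.

```python
import json

# `__schema` and `__type` are introspection fields defined by the GraphQL
# specification, so they are not specific to GitHub.
ROOT_QUERY_TYPE = "query { __schema { queryType { name } } }"

# Asks the server to describe one type; "Repository" is just an example.
DESCRIBE_REPOSITORY = """
query {
  __type(name: "Repository") {
    name
    kind
    fields { name }
  }
}
"""


def request_body(query):
    """Wrap a query string in the JSON envelope a GraphQL endpoint expects."""
    return json.dumps({"query": query})


print(request_body(ROOT_QUERY_TYPE))
print(request_body(DESCRIBE_REPOSITORY))
```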

So, it’s time to get our hands dirty. Let’s see how we can use this knowledge to consume the GitHub GraphQL API v4.

Using the GitHub GraphQL API v4

In this section, I will use the GitHub GraphQL Explorer, an excellent application developed by the GitHub team, to run various API calls and experiments against the GitHub GraphQL API. Later, we will look at how to make the same API calls from a third-party application such as POSTMAN.

GraphQL Explorer

Obtaining an Access Token

Before doing any of this, you will need to obtain a GitHub API access token.

For that, you need to go to your profile settings.

Then, to Developer settings.

After that, generate a personal access token. You also have the option to create an OAuth app or a GitHub app instead.

Select the necessary scopes that you need access to, before generating the access token and finally hit the “Generate” button. Then copy down and save the generated token some place safe, because you will not be able to see the token again once you leave the page.

Now that we have a brand new access token, we can start playing around with the API.
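If you prefer to call the API from code rather than from the Explorer, the token goes into the “Authorization” header of a POST request to https://api.github.com/graphql. Below is a minimal sketch using only the Python standard library. The environment variable name GITHUB_TOKEN and the function names are my own choices, and the code assumes the token is kept out of the source file.

```python
import json
import os
import urllib.request

GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"


def build_request(query, token):
    """Build (but do not send) an authenticated POST request for a query."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        GITHUB_GRAPHQL_URL,
        data=body,
        headers={
            "Authorization": f"bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_query(query, token):
    """Send the query and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(query, token)) as response:
        return json.load(response)


if __name__ == "__main__":
    # GITHUB_TOKEN is an assumed variable name; export it before running.
    token = os.environ.get("GITHUB_TOKEN")
    if token:
        # `viewer` is the account that owns the token.
        print(run_query("query { viewer { login } }", token))
```

Reading the token from the environment is a deliberate choice here: a token pasted into source code tends to end up in a repository sooner or later.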

In the GraphQL Explorer, the left-hand-side is for us to write the query and the right-hand-side shows the result.

Writing GraphQL queries

Let’s write our first query. It fetches a single organization that hosts its repositories on GitHub.

query {
  organization(login: "facebook") {
    name
    url
  }
}

Here, we ask the API to retrieve information about an organization by the name “facebook” and we specify that we only need the name and url of the organization. Go ahead, try it out! You will get a response like this;

{
  "data": {
    "organization": {
      "name": "Facebook",
      "url": "https://github.com/facebook"
    }
  }
}

In this example, “organization” is an object and “name, url” are fields. In comparison, the REST API will return some 30+ fields. “login” is an argument and the REST equivalent would be query parameters.
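When you consume such a response in code, the requested fields sit under the “data” key. A GraphQL response may also carry an “errors” list, so a small helper that checks for it first can save some confusion. The sketch below is one possible way to do that in Python; the helper name and the choice to raise on any error (even when partial data is present) are my own assumptions.

```python
import json


class GraphQLError(Exception):
    """Raised when a response carries an "errors" list."""


def unwrap(response):
    """Return the "data" part of a GraphQL response, or raise on errors."""
    if response.get("errors"):
        messages = "; ".join(
            error.get("message", "unknown error") for error in response["errors"]
        )
        raise GraphQLError(messages)
    return response["data"]


# The response shown above, as it would arrive over the wire.
raw = '{"data": {"organization": {"name": "Facebook", "url": "https://github.com/facebook"}}}'

organization = unwrap(json.loads(raw))["organization"]
print(organization["name"], organization["url"])
```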

Next, let’s see how we can fetch a repository under this organization;

query {
  organization(login: "facebook") {
    name
    url
    repository(name: "react") {
      name,
      url
    }
  }
}

The output would look like this;

{
  "data": {
    "organization": {
      "name": "Facebook",
      "url": "https://github.com/facebook",
      "repository": {
        "name": "react",
        "url": "https://github.com/facebook/react"
      }
    }
  }
}

A repository belongs to an organization, so naturally it is nested under an organization. Likewise, pull requests live inside a single repository. So, let’s see how we can get a list of pull requests in the above repository.

query {
  organization(login: "facebook") {
    name
    url
    repository(name: "react") {
      name,
      url
      pullRequests (first:5) {
        nodes {
          title
        }
      }
    }
  }
}

And the result would be;

{
  "data": {
    "organization": {
      "name": "Facebook",
      "url": "https://github.com/facebook",
      "repository": {
        "name": "react",
        "url": "https://github.com/facebook/react",
        "pullRequests": {
          "nodes": [
            {
              "title": "Run each test in its own <iframe>"
            },
            {
              "title": "[docs] Fix button links on bottom of home"
            },
            {
              "title": "[docs] Fix couple minor typos/spelling"
            },
            {
              "title": "[docs] Improve \"Event Handling\" documentation."
            },
            {
              "title": "Fix links in root README.md"
            }
          ]
        }
      }
    }
  }
}

Note the argument “first : 5”. It tells the API to return the first 5 pull requests in the list of all pull requests. Similarly, you can specify “last : 5” to get the last five records. GitHub allows a maximum of 100 records per list in each query that you make to its API.

# pullRequests (first:5, states:OPEN) {}
# pullRequests (first:5, states:[OPEN,CLOSED]) {}

You can also filter the pull requests by state, as in the two commented examples above, depending on the information you need.
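In code, the titles can be pulled out of the nested response by walking the same path as the query. The sketch below also includes a small guard for the 100-record limit, so that an oversized “first” or “last” value fails before a request is made. Both function names are my own, and the sample is a trimmed copy of the response shown earlier.

```python
MAX_PAGE_SIZE = 100  # the per-request limit mentioned above


def check_page_size(count):
    """Reject a `first`/`last` value outside 1..100 before it is sent."""
    if not 1 <= count <= MAX_PAGE_SIZE:
        raise ValueError(
            f"page size must be between 1 and {MAX_PAGE_SIZE}, got {count}"
        )
    return count


def pull_request_titles(response):
    """Collect titles from organization -> repository -> pullRequests -> nodes."""
    nodes = response["data"]["organization"]["repository"]["pullRequests"]["nodes"]
    return [node["title"] for node in nodes]


# Trimmed copy of the response shown earlier.
sample = {
    "data": {
        "organization": {
            "repository": {
                "pullRequests": {
                    "nodes": [
                        {"title": "Run each test in its own <iframe>"},
                        {"title": "Fix links in root README.md"},
                    ]
                }
            }
        }
    }
}

print(check_page_size(5))
print(pull_request_titles(sample))
```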

GraphQL Variables

Apart from hard-coded values, GraphQL also allows us to use variables.

query ($organization: String!, $count: Int!) {
  organization(login: $organization) {
    name
    url
    repositories (first: $count) {
      pageInfo {
        hasNextPage,
        endCursor
      },
      nodes {
        name
      }
    }
  }
}
====================================================================
Query Variables
{
  "organization": "facebook",
  "count": 3
}

“$organization” is a variable of type “String!” and “$count” is of type “Int!”. The trailing “!” marks a variable as required (non-null). The values for these variables can be set under the “Query Variables” section of the explorer.
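Because required variables must always be supplied, it can be handy to check for missing ones before sending a request. Here is a rough sketch of such a check in Python. It relies on a simple regular expression rather than a real GraphQL parser, so treat it as an illustration that works for queries shaped like the one above and ignores cases such as default values.

```python
import re

QUERY = """
query ($organization: String!, $count: Int!) {
  organization(login: $organization) {
    name
    url
    repositories(first: $count) {
      pageInfo { hasNextPage endCursor }
      nodes { name }
    }
  }
}
"""

# Matches declarations such as "$count: Int!" in the query header.
# A rough pattern for illustration, not a GraphQL parser.
DECLARATION = re.compile(r"\$(\w+)\s*:\s*([\w\[\]!]+)")


def missing_variables(query, variables):
    """List required (non-null, "!") variables that were not supplied."""
    return [
        name
        for name, type_name in DECLARATION.findall(query)
        if type_name.endswith("!") and name not in variables
    ]


print(missing_variables(QUERY, {"organization": "facebook", "count": 3}))  # []
print(missing_variables(QUERY, {"organization": "facebook"}))  # ['count']
```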

GraphQL Pagination

Earlier, I mentioned that the GitHub API returns a maximum of 100 records per API call. However, we may obviously need more than 100 records from the API. So how can we do this?

The GitHub API has a solution called pagination. It allows us to traverse all the pages of results and obtain all the information that we need with ease.

When you request a list of records (for simplicity, let’s say 5 records out of a total of 20), the GraphQL API also gives you a “cursor” that marks where the returned page ends. By passing this cursor in the next request, you obtain the next set of records, which in turn comes with another cursor, and so on.

Let’s see how this works;

For this we need to add the following object to our query.

pageInfo {
  hasNextPage,
  endCursor
}

hasNextPage : Lets you know whether another page of records is available.
endCursor : The cursor of the last record on the current page. Pass it as the “after” argument to get the next set of records.

Now, let’s write a query with this “pageInfo”

query {
  organization(login: "facebook") {
    name
    url
    repositories (first:3) {
      pageInfo {
        hasNextPage,
        endCursor
      },
      nodes {
        name
      }
    }
  }
}

Here, I have limited the list of repositories to 3 so that there will definitely be a next page of records. The result of the above query would be;

{
  "data": {
    "organization": {
      "name": "Facebook",
      "url": "https://github.com/facebook",
      "repositories": {
        "pageInfo": {
          "hasNextPage": true,
          "endCursor": "Y3Vyc29yOnYyOpHOAAigsg=="
        },
        "nodes": [
          {
            "name": "codemod"
          },
          {
            "name": "hhvm"
          },
          {
            "name": "pyre2"
          }
        ]
      }
    }
  }
}

Note that we have got “hasNextPage : true” and “endCursor : Y3Vyc29yOnYyOpHOAAigsg==”.

Now write another query by adding the above cursor to get the next set of results;

query {
  organization(login: "facebook") {
    name
    url
    repositories (first:3, after:"Y3Vyc29yOnYyOpHOAAigsg==") {
      pageInfo {
        hasNextPage,
        endCursor
      },
      nodes {
        name
      }
    }
  }
}
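Copying cursors by hand gets tedious quickly, so in a program you would normally loop until “hasNextPage” becomes false. The following Python sketch shows one way to write that loop. The function “run_query” is passed in by the caller and is assumed to send the request and return the decoded JSON; here a canned stand-in replaces the real API so the loop can be tried offline. The second page and the name “example-repo” are made up for the demonstration.

```python
REPOSITORIES_PAGE = """
query ($organization: String!, $count: Int!, $cursor: String) {
  organization(login: $organization) {
    repositories(first: $count, after: $cursor) {
      pageInfo { hasNextPage endCursor }
      nodes { name }
    }
  }
}
"""


def all_repository_names(run_query, organization, page_size=100):
    """Follow endCursor from page to page until hasNextPage is false."""
    names = []
    cursor = None  # a null `after` means "start from the beginning"
    while True:
        response = run_query(
            REPOSITORIES_PAGE,
            {"organization": organization, "count": page_size, "cursor": cursor},
        )
        repositories = response["data"]["organization"]["repositories"]
        names.extend(node["name"] for node in repositories["nodes"])
        page_info = repositories["pageInfo"]
        if not page_info["hasNextPage"]:
            return names
        cursor = page_info["endCursor"]


# Stand-in for the real API. The first page is the result shown above;
# the second page and "example-repo" are made up.
PAGES = {
    None: (True, "Y3Vyc29yOnYyOpHOAAigsg==", ["codemod", "hhvm", "pyre2"]),
    "Y3Vyc29yOnYyOpHOAAigsg==": (False, None, ["example-repo"]),
}


def fake_run_query(query, variables):
    """Return the canned page that matches the requested cursor."""
    has_next, end_cursor, names = PAGES[variables["cursor"]]
    return {"data": {"organization": {"repositories": {
        "pageInfo": {"hasNextPage": has_next, "endCursor": end_cursor},
        "nodes": [{"name": name} for name in names],
    }}}}


print(all_repository_names(fake_run_query, "facebook", page_size=3))
```

Passing the cursor as a variable rather than pasting it into the query text means the query string stays the same for every page.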

Running GraphQL using POSTMAN

Now let’s see how we can make an API call from POSTMAN.

POSTMAN Application

First, set the HTTP method to “POST”, because we need to send the GQL query to the API endpoint as a JSON payload. Type the endpoint URL (https://api.github.com/graphql) into the URL bar, and add an “Authorization” header with the value “bearer ” followed by the access token generated earlier. Next, select “raw” and “JSON (application/json)” as the payload type. Finally, the query has to be slightly modified in order to be sent to the API.

{
  "variables": {
    "organization": "facebook",
    "count": 3
  },
  "query": "query ($organization: String!, $count:Int!){organization(login: $organization) {name,url,repositories(first: $count) {pageInfo {hasNextPage,endCursor},nodes {name}}}}"
}

Note that we have added the query as a string value under the key “query”, and the values of all the variables as a separate JSON object under the key “variables”. Now paste this payload into the POSTMAN request body and hit “Send”.
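Squeezing a query onto one line by hand is error-prone, so you may want to generate the payload instead. The sketch below shows one possible way in Python; the function name is my own. Collapsing the whitespace is purely cosmetic, since the JSON encoder would escape the line breaks anyway, and it is only safe when the query contains no string literals with meaningful spacing.

```python
import json

QUERY = """
query ($organization: String!, $count: Int!) {
  organization(login: $organization) {
    name
    url
    repositories(first: $count) {
      pageInfo { hasNextPage endCursor }
      nodes { name }
    }
  }
}
"""


def postman_body(query, variables):
    """Return the raw JSON text to paste into the request body."""
    one_line = " ".join(query.split())  # collapse newlines and indentation
    return json.dumps({"variables": variables, "query": one_line}, indent=2)


print(postman_body(QUERY, {"organization": "facebook", "count": 3}))
```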

If you’ve done everything right, you will see a JSON response with the same structure as the pagination result shown earlier.

So that is a basic overview of the GitHub GraphQL API v4. Up to this point we have covered only some of the basics, and there is certainly a lot more to learn. I hope this serves as a good quick start guide to the GitHub APIs and to GraphQL.

Viduranga Gunarathne
vlgunarathne

Computer Science Graduate | Software Engineer @WSO2 | Tech enthusiast | Cinephile