Discovering GraphQL endpoints and SQLi vulnerabilities

Introduction

GraphQL is an open source data query language (DQL) and data manipulation language (DML). It was initially developed by Facebook around 2012 and publicly released in 2015. One of its main advantages is that it provides a more efficient and powerful alternative to other web-service architectures (like REST).

Architecture

An interesting thing to point out is that GraphQL isn’t tied to any specific database (or storage engine, for that matter) and is instead backed by existing code. With a REST API, the client interacts directly with arbitrary code written by the programmer(s), and that code reaches the database. With GraphQL, the client first interacts with the GraphQL layer, which in turn calls that arbitrary code, which ultimately talks to the database. The following diagram depicts the difference:

Image credits: http://bearcatjs.org/graphql-versus-rest-api/.

This change in architecture has a lot of advantages. For example, it’s possible to get all the data the client needs in a single request (whereas a REST API often needs multiple requests).

Basics

Object types and fields

A basic GraphQL query may look like this:

{
  films {
    id,
    name,
    genre,
    rating
  }
}

In this example, we are asking for the id, name, genre and rating fields tied to the films object. A standard answer we could get would be this:

{
  "data": {
    "films": [
      {
        "id": "1",
        "name": "Blade Runner",
        "genre": "sci-fi",
        "rating": 8.2
      },
      {
        "id": "2",
        "name": "Back to the Future",
        "genre": "sci-fi",
        "rating": 8.5
      },
      {
        "id": "3",
        "name": "The Shawshank Redemption",
        "genre": "drama",
        "rating": 9.3
      }
    ]
  }
}

It’s interesting to note that the response has the same shape as the request. This is another benefit of using GraphQL: we always know what to expect.
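To make the layering concrete, here is a toy sketch in Python. It is not a real GraphQL server: there is no query parser or schema, and the names `resolve_films` and `execute` are made up for illustration. It only shows the idea that the GraphQL layer calls programmer-written code (a resolver) and returns the selected fields under the same keys that were requested.

```python
# Toy illustration only: a real GraphQL server parses the query text and
# validates it against a schema. Here the "query" is already a field list.

# Hypothetical in-memory stand-in for the database behind the resolvers.
FILMS = [
    {"id": "1", "name": "Blade Runner", "genre": "sci-fi", "rating": 8.2},
    {"id": "2", "name": "Back to the Future", "genre": "sci-fi", "rating": 8.5},
    {"id": "3", "name": "The Shawshank Redemption", "genre": "drama", "rating": 9.3},
]


def resolve_films():
    """Programmer-written code that the GraphQL layer calls to fetch data."""
    return FILMS


def execute(root_field, requested_fields):
    """Return only the requested fields, nested under the same keys as the query."""
    resolvers = {"films": resolve_films}
    rows = resolvers[root_field]()
    selected = [{field: row[field] for field in requested_fields} for row in rows]
    return {"data": {root_field: selected}}


result = execute("films", ["name", "rating"])
print(result["data"]["films"][0])
```

The important part for an auditor is `resolve_films`: whatever bugs live in that code are reachable through the GraphQL layer.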

Arguments

This is where things start to get interesting. If the underlying code allows it, we can issue requests like this one:

{
  film(id: 3) {
    name,
    genre
  }
}

Essentially what we are doing here is indicating which entry to get (id: 3) and asking for the name and genre fields:

{
  "data": {
    "film": {
      "name": "The Shawshank Redemption",
      "genre": "drama"
    }
  }
}
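Over HTTP, a query like this is commonly sent as a JSON document with a `query` key and, optionally, a `variables` key. The sketch below builds such a body in Python. The helper name `build_payload` is made up, and the `Int!` type for `id` is an assumption, since the schema of this example is not shown.

```python
import json


def build_payload(query, variables=None):
    """Serialize a GraphQL query (and optional variables) as a JSON request body."""
    payload = {"query": query}
    if variables:
        payload["variables"] = variables
    return json.dumps(payload)


# Same request as above, but with the argument passed as a variable.
# Assumption: the id argument is declared as Int! in the schema.
FILM_QUERY = """
query Film($id: Int!) {
  film(id: $id) {
    name
    genre
  }
}
"""

body = build_payload(FILM_QUERY, {"id": 3})
print(body)
```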

GraphQL has a lot more features (Aliases, Fragments, Variables, etc.). I’m obviously not going to describe all of them; the documentation goes into detail about everything and is really easy to read.

GraphQL endpoints

Don’t be discouraged if the web server we are auditing doesn’t seem to expose a GraphQL endpoint at first. Common GraphQL endpoint paths include (among others):

  • /graphql/
  • /graphql/console/
  • /graphql.php
  • /graphiql/
  • /graphiql.php
  • […]
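A small sketch of how the list above could be turned into candidate URLs to check. The host `https://target.example` is a placeholder and `candidate_urls` is a made-up helper; only the paths come from the list.

```python
from urllib.parse import urljoin

# Paths taken from the list above; extend with your own wordlist.
COMMON_PATHS = [
    "/graphql/",
    "/graphql/console/",
    "/graphql.php",
    "/graphiql/",
    "/graphiql.php",
]


def candidate_urls(base_url, paths=COMMON_PATHS):
    """Join each candidate path onto the target's base URL."""
    return [urljoin(base_url, path) for path in paths]


# "https://target.example" is a placeholder, not a real target.
urls = candidate_urls("https://target.example")
for url in urls:
    print(url)
```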

Also, bear in mind that GraphQL endpoints can be as descriptive as this:

GraphiQL endpoint (fully interactive).

Or as terse as this:

A not-so-descriptive GraphQL endpoint.

A good way to confirm that we are indeed dealing with a GraphQL endpoint is to issue a request containing an invalid query. For example:

Take a look at the query parameter: we are forcing the endpoint to produce a syntax error.

If, after issuing said request, we get something like “Syntax Error: Expected Name, found }”, we can safely confirm that we are dealing with a GraphQL endpoint.
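The check can be sketched in Python as follows. This is a heuristic under a few assumptions: that the endpoint accepts the query in a `query` URL parameter, that errors come back as JSON in an `errors` list, and that the message contains the words "Syntax Error" (the exact wording varies between implementations). The function names and the sample body are made up for illustration.

```python
import json
from urllib.parse import urlencode


def probe_url(endpoint, bad_query="{}"):
    """Build a URL whose query parameter holds a deliberately invalid query."""
    return endpoint + "?" + urlencode({"query": bad_query})


def looks_like_graphql(body):
    """Heuristic: does the response body contain a GraphQL-style syntax error?"""
    try:
        doc = json.loads(body)
    except ValueError:
        return False  # not JSON at all (an HTML 404 page, for example)
    if not isinstance(doc, dict):
        return False
    errors = doc.get("errors")
    if not isinstance(errors, list):
        return False
    return any(
        isinstance(err, dict)
        and "syntax error" in str(err.get("message", "")).lower()
        for err in errors
    )


# Hand-written sample body, modelled on the error message quoted above.
sample = '{"errors": [{"message": "Syntax Error: Expected Name, found }"}]}'
print(probe_url("https://target.example/graphql/"))
print(looks_like_graphql(sample))
```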

Another important thing to take into consideration is that some endpoints only allow certain HTTP methods (GET, POST, etc.), so try different methods first:

POST method not allowed.
GET method allowed.
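As a sketch, the same query can be prepared for both methods with Python's standard library. The endpoint URL is a placeholder and `build_probes` is a made-up helper. The code only builds the requests; nothing is sent until they are passed to `urllib.request.urlopen`.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_probes(endpoint, query):
    """Prepare the same GraphQL query as a GET request and as a JSON POST request."""
    get_req = Request(endpoint + "?" + urlencode({"query": query}), method="GET")
    post_req = Request(
        endpoint,
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return [get_req, post_req]


# "{ __typename }" is a query that any GraphQL schema can answer.
probes = build_probes("https://target.example/graphql/", "{ __typename }")
for req in probes:
    print(req.get_method(), req.full_url)
```

When sending these with `urlopen`, a rejected method (405, for example) surfaces as a `urllib.error.HTTPError`, so each attempt should be wrapped in a `try`/`except`.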

Introspection

We discovered a GraphQL endpoint and we can interact with it. Great! Now what? The next step is to query the schema in order to know how to talk to it. GraphQL allows this through its Introspection system. With it, we can get information about the server’s available queries, types, fields, mutations and more.

If we want to get all the information available, we can use the built-in introspection query from GraphQL-JS. If we are dealing with a fully interactive GraphiQL endpoint (in this example, I’m using a modified version of this project), we can simply go to the < Docs section; everything we need to construct valid queries will be there:

Documentation explorer in GraphiQL (Schema).
Documentation explorer in GraphiQL (RootQueryType).

We now know that we can ask for a specific bacon object (with id as an argument), or we can ask for multiple bacons objects (with type and price as arguments). Some examples:

Showing all bacons (notice that no arguments are specified).
Filtering bacons by price.
Getting a bacon by its id.

If, however, we are not dealing with an interactive endpoint, we have several other options to obtain the schema:

  • Issuing the introspection query by hand and figuring everything out by reading the response (painful, I know). Here is the full URL-encoded payload if you want to go this route.
  • Using graphql-ide (it will fetch everything automatically).
  • Using GraphQL_Introspection.py (an excellent Python script written by Doyensec).
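For the by-hand route, a much smaller query than the full GraphQL-JS one is often enough to get oriented. The sketch below uses a trimmed-down introspection query and a made-up helper, `summarize_schema`, that reduces the response to type, field and argument names. The sample response is hand-written to mirror the bacon example above; it is not captured output from a real server.

```python
# Trimmed-down introspection query: type names, their fields, and field arguments.
INTROSPECTION_QUERY = """
{
  __schema {
    queryType { name }
    types {
      name
      fields {
        name
        args { name }
      }
    }
  }
}
"""


def summarize_schema(response):
    """Map each object type to its fields and the argument names of each field."""
    summary = {}
    for gql_type in response["data"]["__schema"]["types"]:
        if gql_type["name"].startswith("__"):
            continue  # skip the introspection system's own types
        fields = gql_type.get("fields") or []
        if not fields:
            continue  # scalars and enums have no fields
        summary[gql_type["name"]] = {
            field["name"]: [arg["name"] for arg in field["args"]]
            for field in fields
        }
    return summary


# Hand-written sample response (an assumption, shaped like the example schema).
sample_response = {
    "data": {
        "__schema": {
            "queryType": {"name": "RootQueryType"},
            "types": [
                {
                    "name": "RootQueryType",
                    "fields": [
                        {"name": "bacon", "args": [{"name": "id"}]},
                        {"name": "bacons", "args": [{"name": "type"}, {"name": "price"}]},
                    ],
                },
                {"name": "String", "fields": None},
                {"name": "__Schema", "fields": [{"name": "types", "args": []}]},
            ],
        }
    }
}

print(summarize_schema(sample_response))
```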

SQLi

As I showed earlier, GraphQL ends up interacting with arbitrary code written by the programmer(s). GraphQL by itself doesn’t prevent any kind of attack, so if the programmers made mistakes (not using parameterized queries, for example), the application may be vulnerable to SQL injection. In this example, simply adding a single quote ' to the type argument is enough to generate a MySQL syntax error:

SQL injection vulnerability in a GraphQL query.
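To illustrate the kind of mistake involved, here is a minimal sketch of a vulnerable resolver next to a parameterized one. It uses Python's built-in `sqlite3` instead of MySQL so that it runs anywhere; the table layout, the rows and the function names are made up for illustration and are not taken from the application shown above.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the application's MySQL backend.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bacons (id INTEGER, type TEXT, price REAL)")
conn.execute("INSERT INTO bacons VALUES (1, 'smoked', 4.5)")
conn.execute("INSERT INTO bacons VALUES (2, 'streaky', 3.0)")


def resolve_bacons_unsafe(bacon_type):
    """Vulnerable: the GraphQL argument is concatenated into the SQL text."""
    sql = "SELECT id, type, price FROM bacons WHERE type = '" + bacon_type + "'"
    return conn.execute(sql).fetchall()


def resolve_bacons_safe(bacon_type):
    """Parameterized: the argument is passed as data, never as SQL text."""
    sql = "SELECT id, type, price FROM bacons WHERE type = ?"
    return conn.execute(sql, (bacon_type,)).fetchall()


# A stray single quote breaks the concatenated statement.
try:
    resolve_bacons_unsafe("smoked'")
    unsafe_outcome = "no error"
except sqlite3.OperationalError:
    unsafe_outcome = "syntax error"

print(unsafe_outcome)
print(len(resolve_bacons_unsafe("' OR '1'='1")))  # injected condition matches every row
print(resolve_bacons_safe("' OR '1'='1"))         # treated as a literal type name
```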

Remember that the application might not throw an error but can still be vulnerable to blind, time-based or even out-of-band SQL injection. Also, don’t be fooled into thinking that scalar types can’t be vulnerable; in a lot of cases we can simply wrap the value in double quotes "" and inject there:

SQLi in a scalar type (wrapped around double quotes).
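A sketch of how such a query string could be built in Python. The helper `quoted_arg_query` is made up, and the selected field `id` is an assumption about the bacon type. `json.dumps` is used to produce the double-quoted literal because its escaping rules are compatible with GraphQL string literals. Whether the server accepts a quoted value for a given argument depends on the schema and the implementation.

```python
import json


def quoted_arg_query(field, arg, payload, selection="id"):
    """Build a query where the argument value is sent as a double-quoted string."""
    literal = json.dumps(payload)  # adds the surrounding quotes and escapes " and \
    return "{ %s(%s: %s) { %s } }" % (field, arg, literal, selection)


# Wrap the id value in double quotes and append a single quote as a first probe.
query = quoted_arg_query("bacon", "id", "1'")
print(query)
```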

The rest is history; we can save the HTTP request and fire up sqlmap or construct the SQLi by hand and fetch everything within our reach in the database:

Exploiting the SQLi (by hand).
Exploiting the SQLi (sqlmap).
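For the sqlmap route, one option is to save a raw HTTP request to a file and load it with sqlmap's `-r` option, marking the injection point with `*` (sqlmap's custom injection marker). The sketch below builds such a request as text. The host, the path, the argument value `smoked` and the helper name are placeholders, and it assumes the endpoint accepts GET requests with a `query` parameter.

```python
from urllib.parse import quote


def raw_get_request(host, path, graphql_query):
    """Build a raw GET request, keeping '*' unencoded as the injection marker."""
    encoded = quote(graphql_query, safe="*")
    return "GET %s?query=%s HTTP/1.1\nHost: %s\n\n" % (path, encoded, host)


# The '*' right after the argument value marks where payloads should be injected.
request = raw_get_request(
    "target.example",
    "/graphql/",
    '{ bacons(type: "smoked*") { id } }',
)
print(request)
```

After saving the text to a file such as request.txt, sqlmap can be pointed at it with `sqlmap -r request.txt`.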

Other vulnerabilities

GraphQL-based web applications (and endpoints alone) can be vulnerable to many other kinds of vulnerabilities, from bypassing access controls to sensitive data exposure, NoSQL injection and many more. If you want to read some examples and real-world scenarios, I highly recommend this, this and this post.

References