Monitoring GraphQL like REST

Meir Levi
Published in Fiverr Tech
4 min read · Jan 26, 2022

How we are monitoring our GraphQL requests.

When we started thinking about moving from REST to GraphQL, our main concern was, “How are we going to monitor it?”

In GraphQL, you have only one entry point to the server. This entry point serves all the queries and mutations that exist in your schema.

How can we know which query increased the response time? Which query started to return errors?

Searching the Internet didn’t help us much. We found many articles explaining how difficult (some even said impossible) it is to monitor GraphQL requests. We knew we had to find a way to determine the health of our GraphQL queries, just as we monitor the health of REST calls.

Assumptions

To match what we had with REST, we wanted the following for GraphQL:

  1. A unique name for each query
  2. The response time of each query
  3. The number of queries
  4. The number of failed and successful queries

Apollo plugins

Plugins enable you to extend Apollo Server’s core functionality by performing custom operations in response to certain events. Currently, these events correspond to individual phases of the GraphQL request lifecycle, and to the startup of Apollo Server itself.

We used two simple Apollo plugin events:

  1. requestDidStart
  2. willSendResponse

requestDidStart

The requestDidStart event fires whenever Apollo Server begins fulfilling a GraphQL request.

willSendResponse

The willSendResponse event fires whenever Apollo Server is about to send a response for a GraphQL operation. This event is triggered (and Apollo Server sends a response) even if the GraphQL operation encounters one or more errors.

How does it work?

Our GraphQL service is written in TypeScript using the NestJS framework. We created a simple MonitorRequestPlugin in our service that uses the two Apollo plugin events mentioned above.

The basic structure of it looks like this:

Base plugin structure
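As an illustration only, a minimal sketch of such a plugin could look like the following. The local interfaces are simplified stand-ins for Apollo Server's plugin types, and the lifecycleLog array is a hypothetical sink added so the example runs without a server; neither is taken from our service.

```typescript
// Simplified local stand-ins for Apollo Server's plugin types (assumption:
// a real service would import the equivalents from the Apollo Server packages).
interface RequestContext {
  operationName?: string | null;
}

interface RequestListener {
  willSendResponse(requestContext: RequestContext): Promise<void>;
}

// Hypothetical in-memory log so the sketch can run without a server.
const lifecycleLog: string[] = [];

class MonitorRequestPlugin {
  // Fires whenever Apollo Server begins fulfilling a GraphQL request.
  async requestDidStart(_requestContext: RequestContext): Promise<RequestListener> {
    lifecycleLog.push("request started");

    // The returned object carries the hooks for the rest of this request.
    return {
      // Fires right before the response is sent, even when errors occurred.
      async willSendResponse(_ctx: RequestContext): Promise<void> {
        lifecycleLog.push("response about to be sent");
      },
    };
  }
}
```

A plugin like this is normally registered through the plugins array of the Apollo Server configuration. In a NestJS service, the class would typically also carry the framework's @Plugin() decorator, which is left out here to keep the sketch free of dependencies.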

Now that we have this simple plugin, with hooks for logic at the start and end of each request, we can extend it with more capabilities.

Operation name

The willSendResponse handler can accept a parameter: the requestContext. It contains the entire request context: the operation name, the response, an errors array, and more. You can read more about the requestContext here.
We gave each query a unique name, the operationName, which identifies the query used by a specific screen.
We will start by obtaining the operation name of the query.

Extract operation name from requestContext

Here, we pass the request context to the willSendResponse event and read the query's operationName from it.
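A minimal sketch of that lookup, under a few assumptions: the RequestContext interface below is a trimmed stand-in for Apollo's request context, and the fallback label for anonymous operations is a hypothetical choice, not something our service necessarily uses.

```typescript
// Trimmed stand-in for Apollo's request context (assumption: only the
// fields needed for this example are modelled).
interface RequestContext {
  // Operation name resolved by the server, if the operation is named.
  operationName?: string | null;
  // The raw incoming request; clients may also send operationName explicitly.
  request: { operationName?: string | null };
}

// Hypothetical label for operations that were sent without a name.
const UNNAMED_OPERATION = "unnamed_operation";

// Returns the name under which metrics for this request should be reported.
function getOperationName(requestContext: RequestContext): string {
  return (
    requestContext.operationName ??
    requestContext.request.operationName ??
    UNNAMED_OPERATION
  );
}
```

On the client side, the name comes from the query document itself; for example, a query written as `query GigPage { ... }` would be reported as GigPage. Keeping a fallback label makes anonymous operations visible in the metrics instead of dropping them silently.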

Response time

Now that we have the operationName of a specific request, we can calculate its response time.

Calculate the response time of the query

Using the requestDidStart event (query start), we save a high-resolution timestamp when the request begins. When the willSendResponse event fires (query done), we take another high-resolution timestamp and subtract the start time from it. The difference is the total time the query took to run.
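The timing logic can be sketched as follows. This is an illustration under stated assumptions: performance.now() stands in for whichever high-resolution timer is used, the clock is injectable so the example can be tested deterministically, and the timings array is a hypothetical replacement for a real stats client.

```typescript
// A clock returns a timestamp in milliseconds.
type Clock = () => number;

// Trimmed stand-ins for Apollo's plugin types (assumption).
interface RequestContext {
  operationName?: string | null;
}
interface RequestListener {
  willSendResponse(requestContext: RequestContext): Promise<void>;
}

// Hypothetical in-memory sink standing in for a stats client.
const timings: Array<{ operationName: string; durationMs: number }> = [];

function createTimingPlugin(now: Clock = () => performance.now()) {
  return {
    async requestDidStart(_ctx: RequestContext): Promise<RequestListener> {
      // Captured once per request; the closure keeps it private to this request.
      const startedAt = now();

      return {
        async willSendResponse(ctx: RequestContext): Promise<void> {
          // End timestamp minus start timestamp gives the total response time.
          const durationMs = now() - startedAt;
          timings.push({
            operationName: ctx.operationName ?? "unnamed_operation",
            durationMs,
          });
        },
      };
    },
  };
}
```

Storing the start time in the closure returned by requestDidStart, rather than on the plugin instance, keeps concurrent requests from overwriting each other's timestamps.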

Query count

Similarly, in the willSendResponse event we can report each request that our service handles to our stats infrastructure, keyed by operation name.
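A minimal sketch of per-operation counting is shown below. The StatsClient interface, the in-memory implementation, and the metric naming scheme are all hypothetical and only stand in for whatever stats infrastructure is actually in place.

```typescript
// Trimmed stand-in for Apollo's request context (assumption).
interface RequestContext {
  operationName?: string | null;
}

// Hypothetical stats client interface; real metrics clients will differ.
interface StatsClient {
  increment(metric: string): void;
}

// In-memory implementation used only to make the sketch runnable.
class InMemoryStats implements StatsClient {
  readonly counters = new Map<string, number>();

  increment(metric: string): void {
    this.counters.set(metric, (this.counters.get(metric) ?? 0) + 1);
  }
}

// The metric naming scheme is an assumption made for this example.
function requestMetric(operationName?: string | null): string {
  return `graphql.${operationName ?? "unnamed_operation"}.requests`;
}

function createCountingPlugin(stats: StatsClient) {
  return {
    async requestDidStart(_ctx: RequestContext) {
      return {
        // Count every request once, right before its response is sent.
        async willSendResponse(ctx: RequestContext): Promise<void> {
          stats.increment(requestMetric(ctx.operationName));
        },
      };
    },
  };
}
```

Counting in willSendResponse rather than in requestDidStart means the operation name has already been resolved by the time the metric is emitted.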

Error handling

In GraphQL, errors are returned in the errors field of the response, while the HTTP status code is typically 200 for both successes and failures.

Clients can understand which field caused the error by looking at the errors array, and they can still use and show partial data from the response. However, we still require a REST-like interface so we can monitor the health of the service at the backend level.

As I mentioned before, the requestContext found in the willSendResponse event contains the entire request context, including any errors the query received. From the errors array field, we can determine the REST-like status code that should be reported to our metrics.

Extract the errors

We iterate over the errors array, save the “real” status code, and report it to our stats infrastructure.
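One way to derive such a status code is sketched below. It assumes each error carries a code in extensions.code, following Apollo's convention; the mapping from codes to HTTP-like statuses, the rule of reporting the most severe status, and the default of 500 for unknown codes are all assumptions made for this example, not the exact rules our service applies.

```typescript
// Trimmed stand-in for a GraphQL error (assumption: only the fields
// needed for this example are modelled).
interface GraphQLErrorLike {
  message: string;
  extensions?: { code?: string };
}

// Hypothetical mapping from error codes to REST-like status codes.
const STATUS_BY_CODE = new Map<string, number>([
  ["BAD_USER_INPUT", 400],
  ["GRAPHQL_VALIDATION_FAILED", 400],
  ["UNAUTHENTICATED", 401],
  ["FORBIDDEN", 403],
  ["INTERNAL_SERVER_ERROR", 500],
]);

// Returns 200 when there are no errors, otherwise the highest mapped status.
// Errors without a known code are treated as 500 (an assumption).
function statusFromErrors(errors?: ReadonlyArray<GraphQLErrorLike>): number {
  if (!errors || errors.length === 0) {
    return 200;
  }

  let status = 0;
  for (const error of errors) {
    const code = error.extensions?.code ?? "";
    const mapped = STATUS_BY_CODE.get(code) ?? 500;
    status = Math.max(status, mapped);
  }
  return status;
}
```

Reporting only the most severe status keeps one metric per request, which mirrors how a REST endpoint returns a single status code even when several things went wrong.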

So, finally, our plugin looks like this:

Monitor plugin
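Putting the pieces together, a complete plugin might look roughly like the sketch below. It is an illustration rather than the code running in our service: the interfaces are simplified stand-ins for Apollo's types, and the StatsClient interface, metric names, and status mapping are hypothetical.

```typescript
// Simplified stand-ins for Apollo Server's plugin types (assumption).
interface GraphQLErrorLike {
  message: string;
  extensions?: { code?: string };
}
interface RequestContext {
  operationName?: string | null;
  errors?: ReadonlyArray<GraphQLErrorLike>;
}
interface RequestListener {
  willSendResponse(requestContext: RequestContext): Promise<void>;
}

// Hypothetical stats interface; swap in your own metrics client.
interface StatsClient {
  increment(metric: string): void;
  timing(metric: string, durationMs: number): void;
}

// Hypothetical mapping from error codes to REST-like status codes.
const STATUS_BY_CODE = new Map<string, number>([
  ["BAD_USER_INPUT", 400],
  ["UNAUTHENTICATED", 401],
  ["FORBIDDEN", 403],
  ["INTERNAL_SERVER_ERROR", 500],
]);

// 200 without errors; otherwise the most severe mapped status (default 500).
function statusFromErrors(errors?: ReadonlyArray<GraphQLErrorLike>): number {
  if (!errors || errors.length === 0) return 200;
  return Math.max(
    ...errors.map((e) => STATUS_BY_CODE.get(e.extensions?.code ?? "") ?? 500),
  );
}

class MonitorRequestPlugin {
  private readonly stats: StatsClient;
  private readonly now: () => number;

  // The clock is injectable so the plugin can be tested deterministically.
  constructor(stats: StatsClient, now: () => number = () => performance.now()) {
    this.stats = stats;
    this.now = now;
  }

  async requestDidStart(_requestContext: RequestContext): Promise<RequestListener> {
    const startedAt = this.now(); // query start

    return {
      willSendResponse: async (ctx: RequestContext): Promise<void> => {
        const name = ctx.operationName ?? "unnamed_operation";
        const durationMs = this.now() - startedAt; // query done
        const status = statusFromErrors(ctx.errors);

        // One timing metric and one status counter per request.
        this.stats.timing(`graphql.${name}.response_time`, durationMs);
        this.stats.increment(`graphql.${name}.status.${status}`);
      },
    };
  }
}
```

With metrics shaped like this, each operation name can be charted and alerted on much like a REST route: request volume, latency, and the share of non-200 statuses.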

Conclusion

Monitoring GraphQL requests requires some thought before implementation, since it is not something that comes out of the box.

In our case, we chose to monitor our GraphQL requests as if they were REST requests. Note, however, that there are many other ways to monitor the requests, such as monitoring each resolver or tracing the entire request. Choose the methodology that fits and serves your needs.

Fiverr is hiring in Tel Aviv and Kyiv. Learn more about us here.
