When you develop an application, performance often matters. The most important factor in perceived performance is the response time of your backend services. If you use GraphQL, it is most likely the system that sits right behind the user-facing application, so it is the natural place to look when it comes to performance.
If you only use a GraphQL backend, with no other backend systems and no frontend under your control, read no further: Apollo GraphQL comes with a built-in tracing solution for this use case, its Performance Analytics. It gives you a lot of insight, but only about GraphQL. This is why I would like to show you a second option that allows you to include performance data from all of your systems: apollo-opentracing
Apollo offers the great concept of extensions to introduce cross-cutting functionality to your GraphQL server. As a user of an extension, you just add it to an array as a function that returns an instance of the extension.
In our case, we initialize the Opentracing extension with two tracers: one for the GraphQL server's internal traces, called local, and one for the traces the application receives in its role as a server.
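A setup along these lines might look like the following sketch. The `{ server, local }` options are the two tracers described above; the jaeger-client initialization details are placeholders you would replace with your own tracer configuration.

```javascript
const express = require("express");
const { graphqlExpress } = require("apollo-server-express");
const { initTracer } = require("jaeger-client"); // or any Opentracing-compatible tracer
const OpentracingExtension = require("apollo-opentracing").default;

// Two tracers: one for spans this service receives as a server,
// one for spans produced locally while resolving the query.
const serverTracer = initTracer(/* your tracer config */);
const localTracer = initTracer(/* your tracer config */);

const app = express();
app.use(
  "/graphql",
  graphqlExpress({
    schema, // your executable GraphQL schema
    extensions: [
      () =>
        new OpentracingExtension({
          server: serverTracer,
          local: localTracer,
        }),
    ],
  })
);
```

This is a configuration sketch, not a drop-in file: `schema` and the tracer config have to come from your own project.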
Ideally, you could also add your frontend as the first layer, seeing exactly which action in which frontend took how long and went to which system.
With this in place, you make it very easy for everyone in your organization to reason about performance and you enable new developers to understand your software system by exploring it, while also getting a feeling for the timings of each service.
Sounds great, what else do I need to know?
Here are some small insights that may help you make the most out of this extension:
Which Tracers can I use?
- jaeger, and there is a great intro article by RisingStack
What does it cost?
In performance tracing you always pay for insight with actual performance. Different tracers have different performance characteristics, but in general they need to make HTTP requests to store their tracing data, and this puts at least a little stress on your system.
The tracing community's answer to this is the concept of samplers. A sampler is a function that decides whether a span should actually be traced. By tracing only a certain percentage of incoming requests, you still get enough data to reason about performance, without putting the extra load on all of your customers.
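The idea can be sketched in a few lines. The names here are illustrative, not the API of any specific tracer library; real tracers like jaeger ship their own sampler implementations.

```javascript
// Minimal probabilistic sampler sketch: decide per trace whether
// it should be recorded, based on a configured rate between 0 and 1.
function createProbabilisticSampler(rate) {
  if (rate < 0 || rate > 1) {
    throw new Error("sample rate must be between 0 and 1");
  }
  return function shouldSample() {
    // Math.random() returns a value in [0, 1), so a rate of 1
    // traces everything and a rate of 0 traces nothing.
    return Math.random() < rate;
  };
}

// Trace every request in development, roughly 1% in production.
const devSampler = createProbabilisticSampler(1);
const prodSampler = createProbabilisticSampler(0.01);
```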
In development I always work with a 100% sample rate, because I want to see everything that is going on in my system.
What can I learn here?
- You can learn about performance tracing (see further reading) and why it might help you
- By looking into the very minimal source code of apollo-opentracing, you can learn about the APIs Apollo provides to hook into the different lifecycle events of resolving a GraphQL query
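To give a feeling for what such an extension looks like, here is a minimal timing extension sketch. The lifecycle method names (`requestDidStart`, `willResolveField`) follow the style of the extensions API that apollo-opentracing builds on, but the class body is purely illustrative.

```javascript
// Sketch of an extension: each lifecycle method may return a callback
// that is invoked when the corresponding phase ends, which makes it
// easy to measure durations.
class TimingExtension {
  constructor() {
    this.timings = [];
  }

  requestDidStart() {
    const start = Date.now();
    // The returned callback fires when the whole request is finished.
    return () => {
      this.timings.push({ phase: "request", durationMs: Date.now() - start });
    };
  }

  willResolveField(source, args, context, info) {
    const start = Date.now();
    // The returned callback fires when this single field has resolved.
    return () => {
      this.timings.push({
        phase: `field:${info.fieldName}`,
        durationMs: Date.now() - start,
      });
    };
  }
}
```

apollo-opentracing does essentially this, except that instead of pushing into an array it opens and finishes Opentracing spans.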
If you are interested in diving into performance tracing, here are some introduction blog posts I wrote: