GraphQL Load testing

Nicolaus Christian Gozali
Published in Moodah POS
Dec 6, 2019 · 4 min read

Measuring the performance of a system (src: https://www.kiwiqa.com/getting-to-know-the-fundamentals-of-performance-testing-a-guide-for-amateurs-and-professionals/)

A functionally correct application covered by unit tests is not enough. In production, there may be hundreds of people accessing it at the same time, and simply hoping the application can cope with that demand is a bad idea. That is why there are some nifty tools you can use to simulate actual load. In this post, I will be sharing findings from testing the performance of my group's project, Moodah POS, which uses a middleware GraphQL server connected to Odoo, an open-source ERP system, as the backend hosted by RubyH.

There are two kinds of these tests: load tests and stress tests. A load test applies the expected load to the application to check whether the system can handle it with minimal performance degradation, whereas a stress test overloads the application, possibly to the point of failure, to see how the system handles and recovers from it.

For GraphQL, there is a JavaScript library called easygraphql-load-tester that helps with load testing. Under the hood it uses artillery or k6, both load testing tools, and wraps them to work seamlessly with GraphQL.

Setup

Here, I chose artillery because its output contains more information than k6's. To run a load test, two files need to be prepared.

artillery.yml sets the server endpoint and the load phases, which define how many new virtual users are generated over a time period. Under payload, I used a CSV file to supply a session token, which is placed in the request header. Under phases, we define our load; the configuration below spawns 10 new virtual users each second for 10 seconds.
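The original configuration file is not shown in the post, so below is a sketch of what artillery.yml could look like. The target URL, the CSV file name, and the header field name are placeholder assumptions; the scenario shape (a `testCases` processor function looping over generated queries) follows easygraphql-load-tester's documented artillery integration.

```yaml
config:
  # Placeholder endpoint -- substitute your GraphQL server's URL.
  target: "http://localhost:7000"
  phases:
    # 10 new virtual users per second, for 10 seconds (100 scenarios total).
    - duration: 10
      arrivalRate: 10
  # Hypothetical CSV supplying a session token per virtual user.
  payload:
    path: "./tokens.csv"
    fields:
      - "token"
  # index.js exports the testCases function that generates the queries.
  processor: "./index.js"
scenarios:
  - name: "GraphQL Query load test"
    flow:
      - function: "testCases"
      - loop:
          - post:
              url: "/"
              headers:
                # Hypothetical header name for the session token.
                session_token: "{{ token }}"
              json:
                query: "{{ $loopElement.query }}"
                variables: "{{ $loopElement.variables }}"
        over: cases
```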

index.js reads the server schema and automatically generates the queries used to test the load. Which queries to test, and their arguments, can be specified. Moodah POS uses GraphQL JS to build its schema; to generate a graphql file, simply use the printSchema utility function, which accepts a schema object, and write the output string to a file.

The last step is to run the test with artillery run artillery.yml. I tested two queries: paymentMethods, which is a small query, and posConfigs, which has more fields and, I believe, suffers from the n+1 query problem.

N+1 query problem

A GraphQL request only makes one trip between client and server; however, the number of trips to the database or a backend server should also be considered. The n+1 query problem means that querying n items with child fields in a parent-child model relationship requires n+1 database requests: one for the list of parents, plus one per child. This can happen because of how nested resolvers are executed in GraphQL, with each child field resolving independently.
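To make this concrete, here is a dependency-free sketch. The data is made up, and each fetch function stands in for one round trip to the database or backend; the batched variant mirrors the DataLoader-style fix commonly used in GraphQL servers.

```javascript
// Counts round trips to a pretend backend to show the n+1 problem.
let trips = 0;

const db = {
  posConfigs: [
    { id: 1, paymentMethodId: 10 },
    { id: 2, paymentMethodId: 20 },
    { id: 3, paymentMethodId: 30 },
  ],
  paymentMethods: { 10: "cash", 20: "card", 30: "voucher" },
};

function fetchPosConfigs() {
  trips += 1; // one trip for the parent list -- the "+1"
  return db.posConfigs;
}

function fetchPaymentMethod(id) {
  trips += 1; // one trip per child -- the "n"
  return db.paymentMethods[id];
}

function fetchPaymentMethodsBatch(ids) {
  trips += 1; // a single batched trip for all children
  return ids.map((id) => db.paymentMethods[id]);
}

// Naive nested resolvers: each child field fetches on its own.
trips = 0;
fetchPosConfigs().forEach((c) => fetchPaymentMethod(c.paymentMethodId));
const naiveTrips = trips; // n + 1 = 4 trips for 3 configs

// Batched (DataLoader-style): collect the keys, fetch once.
trips = 0;
const configs = fetchPosConfigs();
fetchPaymentMethodsBatch(configs.map((c) => c.paymentMethodId));
const batchedTrips = trips; // 2 trips regardless of n

console.log(naiveTrips, batchedTrips); // 4 2
```

The batched version stays at two round trips no matter how many posConfigs are returned, which is why batching is the standard mitigation for this problem.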

Findings

Here are my findings after running each query under two different load levels.

paymentMethods arrival rate 10

All virtual users finished
Summary report @ 19:30:36(+0700) 2019-12-05
Scenarios launched: 100
Scenarios completed: 100
Requests completed: 200
RPS sent: 9.35
Request latency:
min: 984.5
max: 10469.3
median: 2464.4
p95: 9575.4
p99: 9973.1
Scenario counts:
GraphQL Query load test: 100 (100%)
Codes:
200: 200

The output above tells us that 100 virtual users were created and a total of 200 requests were fired, with 95% of queries completing under 9575 ms and all of them returning a 200 success code. Request latency is on the high side because, in our case, each query triggers network calls to the backend instead of direct database queries.

paymentMethods arrival rate 100

All virtual users finished
Summary report @ 20:04:20(+0700) 2019-12-05
Scenarios launched: 1000
Scenarios completed: 1000
Requests completed: 2000
RPS sent: 95.24
Request latency:
min: 997.2
max: 9710
median: 3110.4
p95: 7447.4
p99: 8743.6
Scenario counts:
GraphQL Query load test: 1000 (100%)
Codes:
200: 1871
502: 129

Under higher load, some queries start to fail, as indicated by the returned response codes. Upon closer inspection, here is a response that returned 502 (Bad Gateway). According to the AWS docs, one cause of a 502 is the origin rejecting traffic on the server's ports.

Thu, 05 Dec 2019 12:22:16 GMT http:response {
"content-type": "application/json",
"content-length": "36",
"connection": "keep-alive",
"date": "Thu, 05 Dec 2019 12:22:16 GMT",
"x-amzn-requestid": "db1d903f-e86d-4a4f-8a39-da3e7d477c2a",
"x-amz-apigw-id": "EOtx1Ew5oAMFRHQ=",
"x-cache": "Error from cloudfront",
"via": "1.1 0230bfe4b11b7df94cc75eb42cc72778.cloudfront.net (CloudFront)",
"x-amz-cf-pop": "SIN2-C1",
"x-amz-cf-id": "rEMWyYblc_ZP4VsTiilC6tdzgCdvjzoxRwgrdIs5-I5Yi6yYEIgjow=="
}
Thu, 05 Dec 2019 12:22:16 GMT http:response {
"message": "Internal server error"
}

posConfigs arrival rate 10

All virtual users finished
Summary report @ 19:22:25(+0700) 2019-12-05
Scenarios launched: 100
Scenarios completed: 100
Requests completed: 200
RPS sent: 10.91
Request latency:
min: 1464.5
max: 7254
median: 3234.8
p95: 6819.7
p99: 7128.9
Scenario counts:
GraphQL Query load test: 100 (100%)
Codes:
200: 150
502: 50

posConfigs arrival rate 100

All virtual users finished
Summary report @ 19:40:06(+0700) 2019-12-05
Scenarios launched: 1000
Scenarios completed: 1000
Requests completed: 2000
RPS sent: 70.03
Request latency:
min: 4450.9
max: 15032.6
median: 8036.3
p95: 11250.9
p99: 14201.9
Scenario counts:
GraphQL Query load test: 1000 (100%)
Codes:
200: 189
502: 1811

From these metrics, it can be seen that the performance of our GraphQL server is still lacking. Improvements may be achieved by resolving the aforementioned n+1 query problem, scaling the server, or moving it closer to where the backend is hosted by RubyH. That is all from me; may this brief look at performance testing be useful for your projects. Cheers~
