GraphQL Proof of Concept in Mercadona

Juan Bautista Cabota Soro
Published in MercadonaIT
Nov 22, 2021

Introduction

GraphQL is a query language for APIs and a runtime for executing those queries against an exposed data set.

In GraphQL, the data sets to expose are defined using schemas, written in SDL (Schema Definition Language).

A GraphQL API is typically served over HTTP and, unlike a conventional REST API, exposes a single endpoint. The GraphQL query is sent in the request body and specifies both the parameters and the exact data to fetch. This allows consumers to request exactly the data they need, without the limitations of traditional REST endpoints, which have predefined inputs and outputs.
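As an illustration of a query travelling in the request body, the following minimal sketch builds the JSON payload that a client would POST to the single endpoint. The query, field names and variable values are hypothetical and are not taken from the Mercadona schema; no network call is made.

```python
import json

# Hypothetical query: the field and argument names are illustrative only.
QUERY = """
query ProductsByStore($storeId: ID!, $language: String!) {
  products(storeId: $storeId, language: $language) {
    id
    name
    brand
  }
}
"""


def build_graphql_request(query: str, variables: dict) -> bytes:
    """Serialise a GraphQL operation into the JSON body of an HTTP POST."""
    payload = {"query": query, "variables": variables}
    return json.dumps(payload).encode("utf-8")


body = build_graphql_request(QUERY, {"storeId": "1234", "language": "es"})
decoded = json.loads(body)
print(sorted(decoded.keys()))  # ['query', 'variables']
```

The consumer chooses the returned fields (`id`, `name`, `brand`) in the query itself, so a different consumer could ask for a different subset without any change on the server.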

GraphQL makes it possible to query related data. As well as retrieving the attributes of one resource, it is possible to retrieve the attributes of other related resources in the same query, in a similar way to SQL queries with joins.
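The idea of selecting fields and following relations in one query can be sketched with a toy resolver. This is not a GraphQL engine: the data, the `RELATIONS` table and the `resolve` function are hypothetical stand-ins that only show how a nested selection is answered from two separate in-memory sources.

```python
# In-memory stand-ins for two data sources (hypothetical data).
PRODUCTS = {"p1": {"id": "p1", "name": "Olive oil", "brand_id": "b1", "size": "1 L"}}
BRANDS = {"b1": {"id": "b1", "name": "Example Brand", "country": "ES"}}

# How to follow a relation: field name -> function(parent record) -> related record.
RELATIONS = {"brand": lambda product: BRANDS[product["brand_id"]]}


def resolve(record: dict, selection: dict) -> dict:
    """Return only the selected fields, following relations recursively.

    `selection` maps a field name to True (plain attribute) or to a nested
    selection (related resource).
    """
    result = {}
    for field, sub_selection in selection.items():
        if sub_selection is True:
            result[field] = record[field]
        else:
            related = RELATIONS[field](record)
            result[field] = resolve(related, sub_selection)
    return result


# Roughly equivalent to the query: { product(id: "p1") { name brand { name } } }
selection = {"name": True, "brand": {"name": True}}
result = resolve(PRODUCTS["p1"], selection)
print(result)  # {'name': 'Olive oil', 'brand': {'name': 'Example Brand'}}
```

Unselected attributes such as `size` or `country` never reach the consumer, and the related brand arrives in the same response as the product.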

It is important to mention that the exposed data can be from different data sources (e.g. different databases or other APIs).

All of this simplifies the development and maintenance of APIs, especially those with multiple consumers and use cases with very different needs, and provides great flexibility when accessing data.

Why GraphQL?

As part of our current digital transformation process, we are working on migrating various Master Data sources to SAP and redefining the access to such data from different applications.

In the specific case of product master data, we found that many Mercadona applications need to consume this data and that the needs of each application are very different. Furthermore, it is very likely that new consumer applications will emerge, that the needs of each application will change and that the data model will also undergo some changes.

Therefore, defining a catalogue of REST operations that solve all the use cases is a complex task and, in addition, this catalogue is likely to require continuous maintenance, with all the ramifications that this implies (e.g. making changes in consumer applications).

We believe that GraphQL is a technology that can help to deal with this complexity and that it makes particular sense in internal consumption APIs, in which some flexibility can be offered to consumers when it comes to obtaining information.

Therefore, we decided to test it in the form of a Proof of Concept.

REST API vs GraphQL (a priori):

[Image: REST API vs GraphQL (a priori)]

Our Testing

We have carried out different tests to better understand the technology, its advantages and disadvantages, and its performance (in different scenarios and compared to conventional REST APIs), and to study the value it could contribute at Mercadona.

We evaluated two options:

  • GraphQL API with an embedded library: Spring Boot is the standard framework we use at Mercadona for Cloud applications. We have developed a preliminary version of the GraphQL API for product data using this framework, including the GraphQL Java libraries (https://www.graphql-java.com).
  • GraphQL API with Hasura: We have also used Hasura (https://hasura.io), a platform that automatically generates GraphQL APIs from one or more data sources (in our case, a PostgreSQL database). We have tested both the Community and the Enterprise versions. The Enterprise version provides caching and advanced management and monitoring tools.

Caches:

We tested different cache technologies on top of the GraphQL APIs:

In the GraphQL API with an embedded library, we have tested two kinds of cache: the GraphQL cache (GraphQL DataLoaders set up with active cache) and Redis.
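The batching and caching idea behind a DataLoader can be sketched as follows. This is a simplified, synchronous illustration: the DataLoader used with GraphQL Java is asynchronous, and the class and function names below (`SimpleDataLoader`, `fetch_brands`) are hypothetical.

```python
from typing import Callable, Dict, Hashable, List


class SimpleDataLoader:
    """Toy loader: batches pending keys into one call and caches the results."""

    def __init__(self, batch_fn: Callable[[List[Hashable]], Dict[Hashable, object]]):
        self._batch_fn = batch_fn
        self._cache: Dict[Hashable, object] = {}
        self._pending: List[Hashable] = []

    def load(self, key: Hashable) -> None:
        """Register a key; it is fetched on the next dispatch()."""
        if key not in self._cache and key not in self._pending:
            self._pending.append(key)

    def dispatch(self) -> None:
        """Fetch all pending keys with a single batch call."""
        if self._pending:
            self._cache.update(self._batch_fn(self._pending))
            self._pending = []

    def get(self, key: Hashable) -> object:
        return self._cache[key]


batch_calls = []  # records every call made to the backing data source


def fetch_brands(keys):
    # Stand-in for one "SELECT ... WHERE id IN (...)" query.
    batch_calls.append(list(keys))
    return {key: {"id": key, "name": f"brand-{key}"} for key in keys}


loader = SimpleDataLoader(fetch_brands)
for brand_id in ["b1", "b2", "b1"]:  # resolvers asking for brands, with a repeat
    loader.load(brand_id)
loader.dispatch()

loader.load("b1")  # already cached: no new backend call
loader.dispatch()
print(batch_calls)  # [['b1', 'b2']]
```

Three lookups plus a later repeat result in a single backend call, which is why this kind of cache consumes application memory rather than database resources.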

In Hasura, we have tested the Redis cache that can be enabled in the Enterprise version.
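The general idea of a response cache in front of the database can be sketched as a cache-aside lookup. The article does not describe how either Redis cache was configured, so the dict-backed `TTLCache`, the 60-second expiry and the cache key below are assumptions chosen purely for illustration.

```python
import time


class TTLCache:
    """Dict-backed stand-in for Redis with a per-entry expiry time."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: behave as a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)


db_calls = []  # records every query that actually reaches the database


def query_database(key):
    db_calls.append(key)
    return {"key": key, "products": ["p1", "p2"]}


def get_products(cache, store, section, date, language):
    """Cache-aside: try the cache first, fall back to the database on a miss."""
    key = (store, section, date, language)
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = query_database(key)
    cache.set(key, value)
    return value


cache = TTLCache(ttl_seconds=60)  # hypothetical expiry
first = get_products(cache, "store-1", "dairy", "2021-11-22", "es")
second = get_products(cache, "store-1", "dairy", "2021-11-22", "es")
print(len(db_calls))  # 1
```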

[Image: Architecture]

Autoscaling:

To check the autoscaling capability of the APIs and the associated performance, we performed each test with and without autoscaling. For the autoscaling tests, we set a minimum of 5 and a maximum of 50 replicas, scaling on CPU usage (the system scaled out when average CPU usage exceeded 50%). The tests without autoscaling were done with 5 replicas.
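The scaling rule can be illustrated with the proportional formula used by the Kubernetes Horizontal Pod Autoscaler, assuming that this (or a similar) autoscaler was in use; the article does not name it. The sketch omits details such as the tolerance band and stabilisation windows, and the CPU readings in the examples are made up.

```python
import math


def desired_replicas(current_replicas: int, current_cpu_percent: int,
                     target_cpu_percent: int = 50,
                     min_replicas: int = 5, max_replicas: int = 50) -> int:
    """Scale proportionally to the CPU ratio, then clamp to the allowed range."""
    raw = math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(5, 90))   # 9: 5 * 90 / 50
print(desired_replicas(5, 30))   # 5: would be 3, clamped to the minimum
print(desired_replicas(40, 80))  # 50: would be 64, clamped to the maximum
```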

Queries:

The queries were the same for the different tests. We performed random requests for product data for a given centre (store), section, date and language, drawn from a set of about 1,000 requests. The payload ranged from approximately 500 to 600 KB.

Test Conditions:

Resources:

  • Pods of 1.1 CPU, 1536 MB RAM.
  • PostgreSQL Database with 6 CPU, 22.5 GB RAM.

Concurrency: 120 users.

Duration: 15 minutes.

Repetitions: Two repetitions of each test.

Testing Tool: Locust (http://locust.io), deployed in the same cluster as the application.

GraphQL API with an Embedded Library Results:

[Image: GraphQL API with an embedded library]

In these tests, we used a database read replica.

During the tests without cache, the database became overloaded at 8–9 replicas, reaching 100% CPU usage. With the Redis cache, we were able to free database resources, and the application handled up to 54 rps (requests per second) while autoscaling up to 35 replicas.

Before Redis, we tested the GraphQL cache, which uses the application's own resources to improve access to data. It improved performance by about 20%.

GraphQL API with Hasura Results:

[Image: GraphQL API with Hasura]

With only 5 replicas of Hasura, the database was overloaded, reaching 100% CPU usage. Even so, Hasura handled up to 60.8 rps without cache and up to 67.2 rps with the Redis cache.
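To put these throughput figures in perspective, Little's law relates concurrency, throughput and response time. The sketch below assumes that all 120 simulated users were continuously busy with no wait time between requests, which the test description does not state, so the results are rough estimates and not measured latencies.

```python
def mean_response_time(concurrent_users: int, throughput_rps: float) -> float:
    """Little's law for a closed system without think time: R = N / X."""
    return concurrent_users / throughput_rps


USERS = 120  # concurrency stated in the test conditions

results = {
    "Embedded library + Redis (54 rps)": mean_response_time(USERS, 54),
    "Hasura, no cache (60.8 rps)": mean_response_time(USERS, 60.8),
    "Hasura + Redis (67.2 rps)": mean_response_time(USERS, 67.2),
}
for label, seconds in results.items():
    print(f"{label}: ~{seconds:.2f} s per request")
# Embedded library + Redis (54 rps): ~2.22 s per request
# Hasura, no cache (60.8 rps): ~1.97 s per request
# Hasura + Redis (67.2 rps): ~1.79 s per request
```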

REST API:

To compare the performance of these GraphQL API approaches with a traditional REST API, we developed a REST API exclusively for the tests, implementing only the operations needed for the defined query.

[Image: REST API Architecture]

It is important to mention that we developed the REST API according to a preliminary design of the Swagger/OpenAPI specification contract.

In this case, to obtain the product data of each query, we had to perform several requests:

  • An initial request to retrieve the IDs of the products whose data we wanted (the N products from the store, section, date and language in the query).
  • N requests to obtain the details of each product (e.g. name, description, brand, size, etc.).
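The difference in round trips between the two access patterns above can be sketched with an in-memory backend that counts requests. The class, the data and the value N = 50 are hypothetical; the article does not state how many products each query returned.

```python
class FakeProductBackend:
    """In-memory stand-in for the product data source (hypothetical data)."""

    def __init__(self, n_products: int):
        self.products = {
            f"p{i}": {"id": f"p{i}", "name": f"Product {i}", "brand": "Example"}
            for i in range(n_products)
        }
        self.requests = 0  # number of HTTP-like round trips served

    # REST-style operations: one round trip each.
    def list_product_ids(self) -> list:
        self.requests += 1
        return list(self.products)

    def get_product(self, product_id: str) -> dict:
        self.requests += 1
        return self.products[product_id]

    # GraphQL-style operation: one round trip returns the selected fields.
    def query_products(self, fields: list) -> list:
        self.requests += 1
        return [{f: p[f] for f in fields} for p in self.products.values()]


def fetch_via_rest(backend: FakeProductBackend) -> list:
    """One request for the IDs, then one request per product (1 + N)."""
    ids = backend.list_product_ids()
    return [backend.get_product(product_id) for product_id in ids]


rest_backend = FakeProductBackend(n_products=50)
rest_products = fetch_via_rest(rest_backend)

graphql_backend = FakeProductBackend(n_products=50)
graphql_products = graphql_backend.query_products(["id", "name", "brand"])

print(rest_backend.requests, graphql_backend.requests)  # 51 1
```

Both paths return the same data, but the REST-style path pays the network and processing overhead of 1 + N requests.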

REST API Results:

[Image: REST API results]

Summary Table:

In the following table, we compare the best results obtained in each approach:

[Image: Summary Table]

Conclusions

Our first conclusion from working with GraphQL APIs is that GraphQL does indeed provide a lot of flexibility in exposing and accessing data. There is no need to define and maintain a set of operations to be consumed; instead, the API exposes a data set and lets consumers query it according to their needs.

In addition, the performance tests showed that, compared with REST, retrieving the required data in a single GraphQL query is much more efficient than performing several requests to retrieve the same data.

Regarding the development of the GraphQL API, it is important to consider the complexity of our implementation with an embedded library. Developers need to understand the mechanisms that GraphQL uses to resolve queries (e.g. resolvers and DataLoaders); in return, this approach allows more customisation and optimisation.

An alternative to a self-developed GraphQL API is to use a platform such as Hasura, which automatically provides a GraphQL API from one or more data sources. In our tests, Hasura performed better, reaching up to 67.2 rps with only 5 replicas. By contrast, the self-developed GraphQL API needed 35 replicas of the application to reach 54 rps.
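The gap becomes clearer when the reported throughput is normalised per replica. The figures come from the tests described above; the comparison is only indicative, since the two setups hit different bottlenecks (the Hasura runs saturated the database CPU, while the embedded-library runs used a read replica and Redis).

```python
def rps_per_replica(throughput_rps: float, replicas: int) -> float:
    """Average throughput handled by each application replica."""
    return throughput_rps / replicas


hasura = rps_per_replica(67.2, 5)    # Hasura with Redis cache
embedded = rps_per_replica(54, 35)   # embedded library with Redis cache

print(f"Hasura: {hasura:.2f} rps per replica")              # 13.44
print(f"Embedded library: {embedded:.2f} rps per replica")  # 1.54
print(f"Ratio: {hasura / embedded:.1f}x")                   # 8.7x
```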

However, the Hasura approach has other limitations. For example, it is not possible to customise the structure of the data that we want to offer (the SDL schema is generated automatically and reflects the whole database) and, in our case, we were not able to retrieve data from different sources (several databases, other services, etc.). Moreover, deployment is more complex in terms of infrastructure: it involves the deployment, resource allocation and configuration of a new component on top of the data sources.

API Technology Comparison:

[Image: API Technology Comparison]

Next steps

The following are the next steps we have proposed regarding the use of GraphQL:

  • Complete the development of our own GraphQL API to expose all Product data.
  • Deploy the API in our test environment with API management (i.e., Google Cloud Endpoints) and corporate security.
  • Test the consumption of the GraphQL API from different applications (mobile and desktop), including the use of subscriptions for automatic updates.

Contributors:

  • Juan Cabotà
    Technology Innovation Lead
  • Luis Morales
    Technology Innovation Engineer
  • Sergio Pajares
    Chief Technology Officer
