Cache GraphQL POST requests with Service Worker

This article covers how to implement a service worker that caches POST requests with GraphQL APIs.

If you are new to Service Worker, please take a look at this introduction: https://developers.google.com/web/fundamentals/primers/service-workers/. If you are not sure what GraphQL is, here’s an excellent place to start: https://graphql.org/. From here on, I assume you have a basic understanding of both technologies.

TL;DR

  • GraphQL typically uses POST for all types of requests, while REST uses GET, POST, PUT and DELETE for different types of actions.
  • Workbox, a high-level JS library for caching web assets with a service worker, can’t cache POST requests because the Cache Storage API it relies on only stores GET requests. Most GraphQL engines use POST requests with a single endpoint by default.
  • The solution is to implement a Service Worker to cache POST requests based on the query string in the request body and store responses in IndexedDB.

You can check out the code example at https://github.com/jonchenn/sw-graphql or the live demo.

A Progressive Web App (PWA) uses a Service Worker, a proxy program written in JavaScript, to fetch and cache resources like JS, CSS, HTML files, or even API responses. This little client-side JavaScript program is powerful because it gives us full control over what, when and how to cache resources. With Service Worker, you can build a web app with advanced features such as prefetching resources, fast rendering, offline browsing, and push notifications.

On the other hand, GraphQL has been gaining popularity in recent years thanks to its easy-to-use query interface, which aims to reduce the dependency on backend API endpoints and accelerate frontend development. In a GraphQL-based backend, it is common to see a setup like the one below:

  • It uses the same API endpoint for all types of requests (e.g., https://examples.com/graphql)
  • It only accepts POST requests with a query string in the request body

With this type of setup, caching GraphQL requests with a service worker becomes a problem.

A recap of GraphQL vs. REST

REST is the leading standard for designing web APIs. It utilizes multiple types of HTTP verbs to handle different use cases: GET for fetching data, POST for creating new data, PUT for an update, and DELETE for data removal. A resource (e.g., users) is handled by one or more dedicated URLs, and each action is handled with a different verb. For example:

GET /api/users — Fetch all user profiles
GET /api/users/123  — Fetch a specific user profile
POST /api/users  — Create a new user profile
PUT /api/users/123  — Update a specific user’s profile

With the clear separation of each use case, each RESTful API endpoint needs a corresponding implementation to handle its underlying logic.

When it comes to complex queries, REST reaches its limits. In some cases, this clear separation forces multiple round trips to fetch resources. E.g., on a social network site, you need at least two API calls to fetch both a user’s profile and posts, because user profiles and posts are served by separate API endpoints. Displaying a news feed like the ones on Facebook or Twitter requires multiple round trips to collect everything the page needs. This blog has good examples of RESTful API limitations.

On the other hand, GraphQL handles requests through a single API endpoint and returns corresponding resources based on query strings. For example:

POST /api/graphql — Handles all types of requests, including fetching all or specific users, creating user profiles, and fetching user profiles with corresponding users’ posts.

A query string defines what resources to fetch using conditions and limits, similar in concept to SQL. In short, the most attractive feature of GraphQL is that frontend developers can fetch whatever dataset they need from the same API endpoint with the same query language, without worrying about the underlying logic. For backend engineers, it means fewer API endpoints and less code to maintain. However, it also introduces the complexity of data binding logic in the backend.

Caching API responses

When building a fast website, a common approach is to cache API responses, especially for resources that don’t change frequently. For example, to load a list of products faster for returning users, we want to cache the JSON response from the API that returns the product list. In a RESTful design, the API endpoint may look like this:

GET /api/products — returns a JSON of a list of products.

In this use case, we can implement a service worker using Workbox to cache the JSON based on the specific API endpoint, i.e., /api/products. You can find examples for Workbox implementation here.

However, when it comes to GraphQL, all requests are POST requests that point to a single API endpoint. This constraint causes some problems with caching GraphQL responses.

Problems of caching GraphQL’s POST requests

  • Workbox is unable to cache POST requests: Workbox is a high-level JS library for caching web assets using service worker. It’s the recommended way of implementing Service Worker in most cases. Because Workbox relies on the Cache Storage API, which rejects any request whose method is not GET, it can’t cache POST requests, as stated in the spec: https://w3c.github.io/ServiceWorker/
  • Workbox uses URLs as cache keys: Another problem is that Workbox caches resources based on URL due to the design of Cache Storage API. Since all GraphQL POST requests share the single API endpoint (e.g., https://examples.com/graphql), Service Worker won’t be able to tell which cached content to return purely based on the request URLs.
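
To make the second problem concrete, here is a small sketch that uses a plain Map as a stand-in for any URL-keyed cache. The Map and the cacheByUrl helper are illustrative only and are not part of Workbox or the Cache Storage API.

```javascript
// A Map keyed by URL only; a stand-in for any URL-keyed cache (illustrative).
const urlKeyedCache = new Map();

function cacheByUrl(url, responseJson) {
  urlKeyedCache.set(url, responseJson);
}

const endpoint = 'https://examples.com/graphql';

// Two different GraphQL queries, both POSTed to the same endpoint.
cacheByUrl(endpoint, { data: { restaurants: ['three-star results'] } });
cacheByUrl(endpoint, { data: { restaurants: ['two-star results'] } });

// The second response replaced the first, because the URL is identical.
const entryCount = urlKeyedCache.size;
const survivingEntry = urlKeyedCache.get(endpoint);
```

With the URL as the only key, the cache cannot hold both results at once, which is why the key has to come from the request body instead.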

Solution 1: Persisted queries using GET requests

The easiest solution is to enable GET requests in the GraphQL backend. E.g., the API endpoint that takes POST requests:

POST /api/graphql

… now becomes

GET /api/graphql?query=a_long_query_string

You can check out the reference for more details: https://medium.com/@coreyclark/graphql-persisted-queries-using-get-requests-8a6704aba9eb
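
As a rough sketch of what the client side could look like, the helper below builds such a GET URL with the standard URL API. The function name toGetUrl and the parameter name query are assumptions; your GraphQL server may expect a different parameter.

```javascript
// Builds a GET URL that carries the GraphQL query as a URL parameter.
// The parameter name "query" is an assumption about the backend.
function toGetUrl(endpoint, query) {
  const url = new URL(endpoint);
  // searchParams.set() takes care of URL-encoding braces, spaces, etc.
  url.searchParams.set('query', query);
  return url.toString();
}

const getUrl = toGetUrl(
  'https://examples.com/graphql',
  '{ restaurants(type: null, stars: 3) { name type map stars } }'
);
```

Because each distinct query now yields a distinct URL, a URL-keyed cache such as Cache Storage can tell the responses apart.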

But not every GraphQL-based application fits this solution. E.g., you might end up with a “Request-URI Too Long” error for a long and complex query, like here: https://github.com/graphql/graphiql/issues/590.

Moreover, enabling GET requests in GraphQL requires non-trivial changes to the backend, such as implementing and maintaining a new API endpoint, writing additional tests, and configuring network routing and firewall policies. These tasks can easily become blockers for frontend development.

The second option is to build a service worker capable of caching JSON responses from POST API requests.

Solution 2: Cache serialized POST responses using query string as a key

In short, we want to implement a Service Worker that caches JSON responses of POST requests. Since Cache Storage doesn’t support caching POST requests, we use IndexedDB to store cached JSON. Here’s how it works in a high-level flow:

  • Service Worker intercepts a POST request and composes an MD5-hashed cache key based on the query string in the request body.
  • Service Worker checks the IndexedDB with the cache key. If the key exists, it returns the cached JSON.
  • If no cached content exists for the key, Service Worker sends the POST request to the GraphQL endpoint and receives the response in JSON format.
  • Service Worker stores the new JSON response in IndexedDB using the cache key.
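
IndexedDB stores plain, structured-cloneable values rather than Response objects, so the response has to be flattened before it is stored and rebuilt when it is read back. The sketch below shows one way to do that; the helper names and the shape of the stored entry are assumptions, not necessarily what the demo uses.

```javascript
// Flattens a Response into a plain object that IndexedDB can store.
async function serializeResponse(response) {
  return {
    status: response.status,
    statusText: response.statusText,
    headers: Object.fromEntries(response.headers.entries()),
    // clone() so the caller can still read the original body.
    body: await response.clone().json(),
    timestamp: Date.now(),
  };
}

// Rebuilds a Response from a stored entry.
function deserializeResponse(entry) {
  return new Response(JSON.stringify(entry.body), {
    status: entry.status,
    statusText: entry.statusText,
    headers: entry.headers,
  });
}
```

The timestamp is stored next to the body so that an expiry check can be added later without changing the entry format.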

Implementation Example of Solution 2: NYC Michelin-starred Restaurant List

This sample web app is a showcase of caching GraphQL POST requests with Service Worker. You can check out the code at https://github.com/jonchenn/sw-graphql, or check the live demo here.

Now let’s start the web server by running the command:

npm start

After the Node.js server runs, open up http://localhost:4000, and you’ll see the index page like the screenshot below.

In the UI, you can search for restaurants by changing two filters: Michelin-stars and Restaurant Types. E.g., the list updates as below if you select “3” in the stars filter.

When you pick a new combination of filters that you haven’t chosen before, it takes roughly 3 seconds to update the list with a hardcoded delay at the backend level. If the filtering combination has been chosen before, the page refreshes the list instantly from the cache.

Now, let’s walk through the implementation details.

Implementation Details

This sample code contains several components:

  • index.html : the main page with UI to filter restaurants
  • sw.js : the Service Worker that caches POST requests in IndexedDB (using idb-keyval)
  • server.js : a GraphQL-based server in Node.js

Whenever the filters (stars and types) get updated in the UI, the JavaScript sends a POST request to the GraphQL backend whose body carries a query such as { restaurants(type: null, stars: 3) { name type map stars } }.
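
A minimal sketch of how the page might build that request is shown below. The helper names, the /graphql path and the assumption that type is a string argument are illustrative; check index.html in the repository for the actual code.

```javascript
// Builds the GraphQL query text from the two UI filters.
// JSON.stringify renders null as null, numbers as-is, and strings in quotes.
function buildRestaurantQuery({ stars = null, type = null } = {}) {
  return `{ restaurants(type: ${JSON.stringify(type)}, stars: ${JSON.stringify(stars)}) { name type map stars } }`;
}

// Builds the options object for fetch(): a JSON POST with the query in the body.
function buildGraphQLRequestInit(query) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query }),
  };
}

// Usage in the page (the '/graphql' path is an assumption):
// fetch('/graphql', buildGraphQLRequestInit(buildRestaurantQuery({ stars: 3 })))
//   .then((response) => response.json());
```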

The Service Worker (sw.js) intercepts all POST requests in its fetch event handler. For now, it only deals with POST requests and ignores all other types of requests to keep the demo simple.
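
The interception could be sketched as follows. The createFetchListener helper is a hypothetical wrapper introduced here so that the logic can be exercised outside a browser; the actual sw.js may register its listener directly.

```javascript
// Returns a fetch event listener that hands POST requests to `handler`
// and leaves every other request to the browser's default handling.
function createFetchListener(handler) {
  return function onFetch(event) {
    if (event.request.method !== 'POST') {
      return; // Not calling respondWith() lets the request go through as usual.
    }
    event.respondWith(handler(event));
  };
}

// Registration inside sw.js, assuming a staleWhileRevalidate(event) function:
// self.addEventListener('fetch', createFetchListener(staleWhileRevalidate));
```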

The Service Worker uses a caching strategy called Stale-While-Revalidate; you can find more details here. The staleWhileRevalidate() function performs the following steps:

  1. Computes the cache key based on the query string in the POST body.
  2. Looks up IndexedDB for a cached response with the cache key. If the cached response exists, returns it to the request caller (i.e., the JavaScript in index.html).
  3. At the same time, sends the intercepted POST request without any modification and gets its response from GraphQL backend.
  4. Serializes the JSON response and updates it back to IndexedDB with the cache key.
  5. If (2) didn’t find a cached response, returns the latest response to the request caller (i.e., the JavaScript in index.html).
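
The five steps above can be sketched as below. This is a simplified illustration, not the demo's exact code: it takes the request plus injected dependencies, uses the raw query string as the key, and uses an in-memory Map as a stand-in for IndexedDB (the demo uses idb-keyval). The option names store, fetchFn and waitUntil are assumptions.

```javascript
// In-memory stand-in for IndexedDB with an async get/set interface.
const memoryStore = {
  map: new Map(),
  async get(key) { return this.map.get(key); },
  async set(key, value) { this.map.set(key, value); },
};

async function staleWhileRevalidate(
  request,
  { store = memoryStore, fetchFn = fetch, waitUntil = () => {} } = {}
) {
  // 1. Compute the cache key from the query in the POST body.
  const { query } = await request.clone().json();
  const key = query;

  // 2. Look up a cached response.
  const cached = await store.get(key);

  // 3 + 4. Forward the unmodified request and store the fresh JSON.
  const networkUpdate = fetchFn(request.clone()).then(async (response) => {
    if (response.ok) {
      const body = await response.clone().json();
      await store.set(key, { body, timestamp: Date.now() });
    }
    return response;
  });

  if (cached) {
    // Keep the worker alive until the background update finishes.
    waitUntil(networkUpdate.catch(() => {}));
    return new Response(JSON.stringify(cached.body), {
      headers: { 'Content-Type': 'application/json' },
    });
  }

  // 5. Nothing cached: the caller waits for the network response.
  return networkUpdate;
}
```

In a service worker, waitUntil would be bound to the fetch event, e.g. staleWhileRevalidate(event.request, { waitUntil: (p) => event.waitUntil(p) }), so the background update is not cut short once the cached response has been returned.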

Service Worker generates a cache key based on the query string like below.

{ restaurants(type: null, stars: 3) { name type map stars } }

You can curate or simplify the key into whatever format you’d like. However, if the cache keys are oversimplified, responses from different GraphQL queries will end up sharing the same key and overwrite each other.

Also, you can hash a query string to keep your cache keys short. (Hashing is one-way; it is not encryption.) For example, you can use Crypto-JS to turn the query string above into an MD5 hash like below:

a2ce713cceb76af0ec3035cf2d99d8ae
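
If you would rather not load an extra library, the browser's built-in Web Crypto API can produce a similar fixed-length key. Note that this is a swapped-in technique: Web Crypto does not offer MD5, so the sketch below uses SHA-256 and yields a 64-character key rather than the 32-character MD5 value above. The function name hashQuery is hypothetical.

```javascript
// Web Crypto is global in service workers; the require() fallback only
// exists so this sketch also runs on older Node.js versions.
const subtle = globalThis.crypto && globalThis.crypto.subtle
  ? globalThis.crypto.subtle
  : require('node:crypto').webcrypto.subtle;

// Returns the SHA-256 digest of the query string as lowercase hex.
async function hashQuery(query) {
  const bytes = new TextEncoder().encode(query);
  const digest = await subtle.digest('SHA-256', bytes);
  return Array.from(new Uint8Array(digest), (b) =>
    b.toString(16).padStart(2, '0')
  ).join('');
}
```

Unlike Crypto-JS's MD5 call, subtle.digest() is asynchronous, so the key has to be awaited before the IndexedDB lookup.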

Now, let’s take a look at what the Service Worker has stored in the IndexedDB. Let’s open up Chrome DevTools, and select IndexedDB > GraphQL-Cache > PostResponses.

As you can see, the PostResponses table contains a few rows of cached data. Each row uses a hashed query string as the key and stores the API response and a timestamp as the value. You can manually delete these rows or clear the entire table to reset the cache.

Custom caching handler in Workbox

Alternatively, you can still use Workbox’s routing and register a custom handler for POST requests to the GraphQL endpoint.
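
A hedged sketch of such a route is shown below. It relies on Workbox's registerRoute(match, handler, method) signature, where the third argument selects the HTTP method. The wrapper function registerGraphQLRoute and the /graphql path are assumptions made for illustration; the wrapper takes registerRoute as a parameter only so the sketch stays self-contained.

```javascript
// Registers a POST-only route for the GraphQL endpoint.
// `registerRoute` is expected to be Workbox's workbox.routing.registerRoute.
function registerGraphQLRoute(registerRoute, handler) {
  registerRoute(
    // Match callback: only requests to the GraphQL path (assumed '/graphql').
    ({ url }) => url.pathname === '/graphql',
    // Handler callback: delegate to the custom caching function.
    ({ event }) => handler(event),
    // Workbox routes default to GET, so POST must be requested explicitly.
    'POST'
  );
}

// In sw.js, after loading Workbox with importScripts():
// registerGraphQLRoute(workbox.routing.registerRoute, staleWhileRevalidate);
```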

Nothing needs to change for the staleWhileRevalidate function and other parts in the Service Worker. You can check out the full code at https://github.com/jonchenn/sw-graphql/blob/master/public/sw.js

Handling cache expiration

As in most caching strategies, we want to handle the expiry of the cached POST responses. Here, we use the max-age directive of the Cache-Control HTTP header to define how long a response stays fresh. Since we store a timestamp for each POST response, we can quickly check whether a cached response has expired.
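
A minimal sketch of that check follows. It assumes each stored entry keeps the Cache-Control header value in a cacheControl field next to the timestamp; both the field name and the helper names are hypothetical.

```javascript
// Extracts max-age (in seconds) from a Cache-Control value, or null if absent.
function parseMaxAge(cacheControl) {
  const match = /\bmax-age=(\d+)/i.exec(cacheControl || '');
  return match ? Number(match[1]) : null;
}

// Returns true when the entry is older than its max-age.
function isExpired(entry, now = Date.now()) {
  const maxAge = parseMaxAge(entry.cacheControl);
  if (maxAge === null) {
    return false; // Assumption: entries without max-age never expire.
  }
  return now - entry.timestamp > maxAge * 1000;
}
```

An expired entry can then be treated as a cache miss, so the caller waits for the network response instead of receiving stale data.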

Gotchas / Caveats

By now, you have a GraphQL-friendly PWA. However, it is still far from perfect. Here are some drawbacks to pay attention to.

Duplicated data with different keys might get out of sync

GraphQL’s query interface provides a great level of flexibility. It is possible to use different query strings to get the same result in different formats. For example, we may have two queries:

{ restaurants(type: null, stars: 3) { name type map stars } }
{ restaurants(type: null, stars: 3) { name type map } }

These two queries have the same query filters but with different requested fields. When the service worker updates one entry in IndexedDB, the other cache entry won’t get updated automatically. Hence, when the frontend JS fetches the restaurant list that is out of sync in IndexedDB, it ends up showing wrong data to users.

A quick way to limit this type of staleness is to maintain a small set of unique queries and avoid storing duplicate data in IndexedDB. A more advanced solution is to replicate the server-side data structure in IndexedDB, which reduces inconsistency and keeps a single source of truth. However, it takes more effort to implement.

Additional latency of hashing

For each cache key, we generate an MD5-based hash for quick lookup. This additional hashing introduces extra latency. In most cases with small query strings, this is not a significant issue. However, the cost grows with the size of the query string, so it may become noticeable for very large and complex queries.

The suggestion is to make sure your query string is compact with necessary conditions and data fields, and maintain a reasonable query string size.

Conclusion

Service Worker is the most critical component in building a PWA. When it comes to GraphQL, caching POST requests in Service Worker takes a bit more effort.

In short, here is a quick guide of what to use when implementing Service Worker with GraphQL:

  • Enable GET requests with the GraphQL backend if you can.
  • If POST requests are the only option in GraphQL, use IndexedDB to store cached responses instead of the default Cache Storage API. Take care of the request parsing and response serialization as in this example.
  • Alternatively, use a custom caching handler with Workbox’s routing.

You can check out the code at https://github.com/jonchenn/sw-graphql. Please feel free to file an issue on GitHub if you have any questions.