Introducing @defer in Apollo Server

Optimizing time-to-interactive with incremental data loading

Clarence Ngoh

Published in

Apollo GraphQL

4 min read · Jul 26, 2018

Streaming a large GraphQL query incrementally in small parts

Optimizing data loading for data-driven applications

Many web applications today offer data-rich views that fetch data from many microservices and databases. To create a great user experience, developers must strive to minimize the time-to-interactive of each page, so that everything feels snappy and responsive despite expensive data requirements. One common optimization technique is to prioritize fetching data needed to render the most important content as fast as possible, and then loading the rest of the page in the background.

As an example, take a look at this GraphQL query for a NewsFeed page:

query NewsFeed {
  newsFeed {
    stories {
      text
      comments {
        text
      }
    }
    recommendedForYou {
      story {
        text
        comments {
          text
        }
      }
      matchScore 
    }
  }
}

Question: Does this page load as quickly as possible to give us a satisfactory time-to-interactive? If not, what is holding it back?

It is very common for the data we request to have different latencies and cache characteristics. For example, stories is highly public data that we can cache in CDNs (fast), while recommendedForYou is personalized and might need to be computed for every user (slooow). Furthermore, we might not need comments to be displayed immediately, so slowing down our query to wait for them to be fetched is not the best idea.

Your page loads as fast as its slowest piece of data

A common but suboptimal solution to this data loading conundrum is to make multiple GraphQL requests to the server. The first loads minimal data to provide an initial render, before secondary requests get fired to load the rest of the page. This approach creates unnecessary performance and developer overhead:

Figuring out how to chop up your query into different pieces and coordinating those requests on the client.
Additional roundtrip cost of making separate GraphQL queries.
Sharing data between queries may be problematic. For example, you first have to fetch stories to get the story IDs, and then pass them to the second query to fetch comments. This is painful if you have complex data dependencies.
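To make the coordination cost concrete, here is a minimal sketch of the two-request approach. The names `fetchStories`, `fetchComments`, and `loadNewsFeed` are hypothetical stand-ins that answer from in-memory data instead of a real GraphQL endpoint, so read it as an illustration of the shape of the problem rather than as Apollo code.

```typescript
// Hypothetical stand-ins for two separate GraphQL requests. They answer from
// in-memory data so the sketch runs without a server; none of this is Apollo API.
type Story = { id: string; text: string };
type StoryComment = { storyId: string; text: string };
type StoryWithComments = Story & { comments: StoryComment[] };

let roundTrips = 0; // counts simulated network round trips

async function fetchStories(): Promise<Story[]> {
  roundTrips += 1;
  return [
    { id: "s1", text: "First story" },
    { id: "s2", text: "Second story" },
  ];
}

async function fetchComments(storyIds: string[]): Promise<StoryComment[]> {
  roundTrips += 1;
  const wanted = new Set(storyIds);
  const all: StoryComment[] = [
    { storyId: "s1", text: "Nice!" },
    { storyId: "s2", text: "Interesting" },
    { storyId: "s2", text: "Agreed" },
  ];
  return all.filter((c) => wanted.has(c.storyId));
}

async function loadNewsFeed(
  render: (stories: StoryWithComments[]) => void,
): Promise<StoryWithComments[]> {
  // Request 1: minimal data for the initial render.
  const stories = await fetchStories();
  render(stories.map((s) => ({ ...s, comments: [] })));

  // Request 2 cannot start until request 1 returns, because it needs the IDs.
  const comments = await fetchComments(stories.map((s) => s.id));

  // Manual stitching that the client now has to own.
  const merged = stories.map((s) => ({
    ...s,
    comments: comments.filter((c) => c.storyId === s.id),
  }));
  render(merged);
  return merged;
}
```

Calling `loadNewsFeed(render)` invokes `render` twice: once with stories only, and once more after the second round trip with the comments stitched in by hand.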

This just doesn’t feel like the smooth GraphQL experience we want 😢.

Enter the @defer directive

⚠️ The @defer directive is an experimental Apollo feature. ⚠️
While we’re excited about this feature, it’s not implemented in a stable Apollo release. Interested users can review the respective Apollo Client or Apollo Server pull requests, but those implementations will be reworked, pending other changes, prior to their final release.

The @defer directive is intended to be a way for developers to mark parts of a query as being expensive. Instead of holding back the response, those fields would be resolved asynchronously and get streamed as patches to the client. This concept was first described in Lee Byron’s talk at GraphQL Europe 2016, and has generated lots of interest in the community since.

There are 3 scenarios where this is super useful:

Field is expensive to load. This includes private data that is not cached (like user progress), or information that requires more computation on the backend (like calculating price quotes on Airbnb).
Field is not on the critical path for interactivity. This includes the comments section of a story, or the number of claps received.
Field is expensive to send. Even if the field resolves quickly and is ready to send back, users might still choose to defer it if the cost of transport is too high.

Adding @defer to queries will help deliver a great user experience without all the boilerplate code needed for managing and combining multiple requests. To see its benefits in action, look at a visual comparison of the UX:

Improved user experience with experimental @defer support on Apollo
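For a rough feel of the numbers, here is a back-of-the-envelope latency model comparing a single query, split queries, and a single query with @defer. Every number in it is made up, and it assumes one fixed round-trip cost, that comments only start resolving once stories have resolved, and that streaming a patch adds no extra delay. It is a toy model, not a benchmark.

```typescript
// Made-up numbers for illustration only; real latencies depend on your backend.
const ROUND_TRIP_MS = 50;
const RESOLVE_MS = { stories: 20, comments: 120, recommendedForYou: 400 };

type Timing = { firstRenderMs: number; completeMs: number };

// One query without @defer: nothing renders until the slowest field is ready.
// comments are nested under stories, so those two resolve one after the other.
function singleQuery(): Timing {
  const slowest = Math.max(
    RESOLVE_MS.stories + RESOLVE_MS.comments,
    RESOLVE_MS.recommendedForYou,
  );
  const done = ROUND_TRIP_MS + slowest;
  return { firstRenderMs: done, completeMs: done };
}

// Two sequential requests: a fast first render, but an extra round trip
// before the page is complete.
function splitQueries(): Timing {
  const first = ROUND_TRIP_MS + RESOLVE_MS.stories;
  const second =
    ROUND_TRIP_MS + Math.max(RESOLVE_MS.comments, RESOLVE_MS.recommendedForYou);
  return { firstRenderMs: first, completeMs: first + second };
}

// One query with @defer: the first render happens as soon as stories resolve,
// and the slow fields arrive later over the same response.
function deferredQuery(): Timing {
  return {
    firstRenderMs: ROUND_TRIP_MS + RESOLVE_MS.stories,
    completeMs: singleQuery().completeMs,
  };
}
```

With these assumed numbers, the single query renders at 450 ms, split queries render first at 70 ms but finish at 520 ms, and the deferred query renders first at 70 ms and finishes at 450 ms.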

A future with @defer, powered by Apollo

Future versions of Apollo Server and Client aim to provide this as a first-class feature, allowing fields marked with @defer directives to be automatically deferred, with their eventual fulfillment streamed to the client.

Let’s consider what the above NewsFeed query would look like with @defer:

query NewsFeed {
  newsFeed {
    stories {
      text
      comments @defer {
        text
      }
    }
    recommendedForYou @defer {
      story {
        text
        comments @defer {
          text
        }
      }
      matchScore 
    }
  }
}

This would be sent as a single query, but the execution phase of Apollo Server will know not to wait for deferred fields before sending the initial response:

// Initial response
{
  "data": {
    "newsFeed": {
      "stories": [{"text": ..., "comments": null}],
      "recommendedForYou": null
    }
  } 
}
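One way to picture "not waiting" is an async generator that emits the initial response first and then emits each deferred value as it settles. The sketch below is only an assumption about how such an execution loop could look. `executeWithDefer`, `DeferredField`, and `delay` are hypothetical names, and this is not how Apollo Server is implemented.

```typescript
// Hypothetical shapes for illustration; not Apollo Server internals.
type Patch = { path: (string | number)[]; data: unknown };
type DeferredField = {
  path: (string | number)[];
  resolve: () => Promise<unknown>;
};

// Yields the initial response right away, then one patch per deferred field,
// in whatever order the fields finish resolving. Error handling is omitted:
// a rejected resolver would make the generator throw.
async function* executeWithDefer(
  initialData: unknown,
  deferred: DeferredField[],
): AsyncGenerator<{ data: unknown } | Patch> {
  // Start every deferred resolver immediately, without awaiting any of them.
  const pending = new Map<number, Promise<{ key: number; patch: Patch }>>();
  deferred.forEach((field, key) => {
    pending.set(
      key,
      field.resolve().then((data) => ({ key, patch: { path: field.path, data } })),
    );
  });

  // The initial response goes out before any deferred field has resolved.
  yield { data: initialData };

  // Emit patches in completion order.
  while (pending.size > 0) {
    const { key, patch } = await Promise.race(pending.values());
    pending.delete(key);
    yield patch;
  }
}

// Small helper to simulate a slow resolver.
const delay = <T>(ms: number, value: T): Promise<T> =>
  new Promise<T>((done) => setTimeout(() => done(value), ms));
```

A consumer can read the chunks with `for await (const chunk of executeWithDefer(initialData, fields))`. The first chunk has a `data` key only, and every later chunk also carries a `path`.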

As the rest of the deferred fields get resolved, they would be sent as patches to Apollo Client, causing the UI to update automatically.

// Patch for "recommendedForYou"
{
  "path": ["newsFeed", "recommendedForYou"],
  "data": [
    {
      "story": {
        "text": ...
      }, 
      "matchScore": ...
    }, 
    {...}
  ]
}

// Patch for "comments", sent for each story
{
  "path": ["newsFeed", "stories", 1, "comments"],
  "data": [{
    "text": ...
  }]
}
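On the client side, merging a patch amounts to walking the path and writing the data at that location. The following is a minimal sketch of that idea under the patch shape shown above. `applyPatch` and the `Json` type are hypothetical, and the real Apollo Client cache does considerably more than this.

```typescript
// Hypothetical helper for illustration; not the Apollo Client implementation.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type Patch = { path: (string | number)[]; data: Json };

// Returns a copy of `data` with `patch.data` written at `patch.path`.
// `data` is the contents of the response's "data" key, since patch paths
// in the examples above start at "newsFeed".
function applyPatch(data: Json, patch: Patch): Json {
  if (patch.path.length === 0) return patch.data; // patch replaces the whole tree

  // Deep copy so the previous result is left untouched; fine for plain JSON.
  const copy: Json = JSON.parse(JSON.stringify(data));
  const parentPath = patch.path.slice(0, -1);
  const lastKey = patch.path[patch.path.length - 1];

  // `any` keeps the traversal short; a stricter version would narrow each step.
  let parent: any = copy;
  for (const key of parentPath) parent = parent?.[key];
  if (parent === null || typeof parent !== "object") {
    throw new Error(`Cannot apply patch: path ${JSON.stringify(patch.path)} does not exist`);
  }

  parent[lastKey] = patch.data;
  return copy;
}

// Example: fill in the comments of the second story.
const initialData: Json = {
  newsFeed: {
    stories: [
      { text: "Story A", comments: null },
      { text: "Story B", comments: null },
    ],
    recommendedForYou: null,
  },
};
const patchedData = applyPatch(initialData, {
  path: ["newsFeed", "stories", 1, "comments"],
  data: [{ text: "First!" }],
});
```

Returning a copy instead of mutating in place is a design choice made here so that a UI layer could compare old and new results by reference. It is not something the original proposal prescribes.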

Let us know your thoughts!

@defer exemplifies the many benefits of taking a declarative approach to data loading, allowing future development to adopt valuable patterns for building a world-class user experience with minimal extra work.

Support for @defer is not production-ready today, but we welcome feedback on the Apollo Spectrum. The CodeSandbox embedded below is a great starting point to experiment with the NewsFeed example you’ve seen earlier.