Long term caching using Webpack records

Dennis Johnson
4 min read · Sep 25, 2017

This post is an exploration prompted by the following note in another article:

“Arguably there is an easy answer, you could use the RecordsPlugin (given the documentation on this is a little sparse).”

I was curious to see what that approach actually looked like. As the article states, there are many factors that go into getting consistent filenames. Using Webpack records helps generate longer-lasting filenames (cacheable for a longer period of time) by reusing metadata, including module and chunk information, between successive builds. This means that from build to build, modules are less likely to be reordered or moved to another chunk, which leads to less cache busting.

There are three pieces needed to start using records:

  • output records.json file after a build
  • persist that file somewhere retrievable (shared filesystem, HTTP, etc.)
  • retrieve the file by branch before executing a build

The first step is achieved by a Webpack configuration setting:

recordsPath: path.resolve(__dirname, './records.json')

This configuration setting instructs Webpack to write out a file containing build metadata to a specified location after a build is completed.

Here is an example of what the records file looks like:

It keeps track of a variety of metadata, including module and chunk ids, which help ensure that modules do not move between chunks on successive builds when the content has not changed.
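As a rough illustration of the kind of data involved, the object below mimics the general shape of a records file as I understand it for Webpack versions of this era. The module paths and the vendor-one chunk name are invented, and the exact keys vary by Webpack version and plugins, so treat it as a sketch rather than real output.

```javascript
// Illustrative records object. Real files are generated by Webpack;
// the paths, names, and ids here are made up for the example.
const exampleRecords = {
  modules: {
    // Maps a module identifier to the numeric id it was given.
    byIdentifier: {
      'src/index.js': 0,
      'src/app.js': 1,
    },
    usedIds: { 0: 0, 1: 1 },
  },
  chunks: {
    // Maps a chunk name to the numeric id it was given.
    byName: { main: 0, 'vendor-one': 1, 'vendor-two': 2 },
    usedIds: { 0: 0, 1: 1, 2: 2 },
  },
};

// Look up the id recorded for a module, or undefined if the module is new.
function recordedModuleId(records, identifier) {
  const byIdentifier = (records.modules && records.modules.byIdentifier) || {};
  return byIdentifier[identifier];
}
```

A module that already appears in the records keeps its id on the next build, while a new module gets an id that is not already in use.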

In the output of a Webpack build, there are statements like this amongst the code:

If the module ids required here change, the content of this built file changes even though the functionality hasn’t changed.
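To make that concrete, here is a small sketch that hashes two versions of the same emitted module, differing only in the id passed to `__webpack_require__`. The module body and the `contentHash` helper are hypothetical, and MD5 is used only as a stand-in for whatever hash Webpack computes.

```javascript
const crypto = require('crypto');

// Content hash of a source string, similar in spirit to the hash
// Webpack embeds in a filename.
function contentHash(source) {
  return crypto.createHash('md5').update(source).digest('hex');
}

// The same module as emitted by two builds, where a dependency's id
// shifted from 3 to 4 because another module was added.
const buildA = 'var helper = __webpack_require__(3);\nmodule.exports = helper.run();';
const buildB = 'var helper = __webpack_require__(4);\nmodule.exports = helper.run();';

// The behavior is identical, but the bytes differ, so the hashes differ.
console.log(contentHash(buildA) === contentHash(buildB)); // false
```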

Once we have the records file, we need to persist it somewhere to be retrieved later on. This step can be achieved in many ways and depends on how your build is set up. In this particular example, we will be using Circle CI’s build artifacts feature — https://circleci.com/docs/2.0/artifacts.

Once we configure Circle CI to persist the records file output from a build, we can now start retrieving the most recent file for a particular branch by using the artifacts REST API.

Here is an example of configuring Circle CI to persist the records file:

To achieve the third step, we’ll need to write a script that queries the artifacts API for the most recent artifact for that branch and places it at the location given by the recordsPath configuration option we used in step one.

The following is a simple script that retrieves the most recent records artifact for the master branch via Circle CI’s REST API and writes it to the filesystem:

https://gist.github.com/songawee/f8d2618b28c293b7c8b61941630892ca
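For readers who want a feel for the moving parts, the following is a rough sketch of how such a script could be structured. It is not the contents of the gist. It assumes the "artifacts of the latest build" endpoint of Circle CI's v1.1 API and that each artifact entry has `path` and `url` fields; the function names are my own, and the download helper does not follow redirects.

```javascript
const fs = require('fs');
const https = require('https');

// Build the URL for the artifacts of the latest successful build on a branch
// (Circle CI API v1.1 is assumed here).
function latestArtifactsUrl({ vcs, user, repo, branch, token }) {
  const base = `https://circleci.com/api/v1.1/project/${vcs}/${user}/${repo}/latest/artifacts`;
  const query = new URLSearchParams({ branch, filter: 'successful', 'circle-token': token });
  return `${base}?${query}`;
}

// Pick the records file out of the artifact list returned by the API.
function findRecordsArtifact(artifacts, fileName = 'records.json') {
  return artifacts.find((artifact) => artifact.path.endsWith(fileName));
}

// GET a URL and resolve with the response body as a string.
function get(url) {
  return new Promise((resolve, reject) => {
    https.get(url, { headers: { Accept: 'application/json' } }, (res) => {
      if (res.statusCode !== 200) {
        res.resume(); // discard the body so the socket is released
        reject(new Error(`Request failed with status ${res.statusCode}`));
        return;
      }
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => resolve(body));
    }).on('error', reject);
  });
}

// Download the most recent records file to `destination`.
// Resolves to false when no records artifact exists yet.
async function fetchRecords(options, destination) {
  const artifacts = JSON.parse(await get(latestArtifactsUrl(options)));
  const records = findRecordsArtifact(artifacts);
  if (!records) return false; // e.g. first build on this branch
  const contents = await get(`${records.url}?circle-token=${options.token}`);
  fs.writeFileSync(destination, contents);
  return true;
}
```

In a build, `fetchRecords` would be called with values taken from environment variables, for example `process.env.CIRCLE_TOKEN` for the token and `'master'` for the branch, with `destination` matching the recordsPath from step one. Returning false rather than throwing when no artifact is found lets the very first build proceed without records.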

A note about setting up Circle CI API access…

You must create an API token before issuing requests to retrieve those artifacts. To create a token, navigate to the Circle CI project settings and then API Permissions. Here you can create a token. Make sure to configure the scope to be “Build Artifacts.” Once you have the token, you can make it available to your build script by setting it as an environment variable for that project. In this case, we’ll use CIRCLE_TOKEN as the environment variable name.

We also have access to other information, such as the branch we’re on (CIRCLE_BRANCH), through Circle CI’s built-in environment variables.

Now we can update the Circle CI configuration to invoke this script before building.

With the configuration in place, we can now enjoy consistent file hashes across builds!

Example:

Here is a link to the repo I used for the following example.

Without Records:

In the following example, we are adding a dependency (superagent) to the vendor-two chunk.

We can see that all of the chunks change. This is due to the module ids changing. This is not ideal as it forces users to re-download content that has not changed.

The following example adds the same dependency, but uses Webpack records to keep module ids consistent across the builds. We can see that only the vendor-two chunk and the runtime change. The runtime is expected to change because it contains a map of all the chunk ids. Changing only these two files is ideal.

With Records:

Conclusion

Overall, I’m happy with the workflow of using records as well as the improved caching results. Using Webpack records was fairly easy to configure, as it required only a one-line configuration change. The setup for retrieving the existing records file could be complicated, depending on what infrastructure you have. Circle CI provides all you need to get started, which is kind of neat.

If I missed anything, let me know in the comments. Happy to make any corrections :)
