Caching Assets Long Term with Webpack

I recently converted a Gulp + Browserify build for a large production app over to Webpack. While there were many pitfalls and a few failed production deploys along the way, it all came together in the end. One of the biggest sources of frustration was getting long term asset caching to work.

Long term asset caching is accomplished by adding a content hash to the filename of each public asset. This hash should only change when the content of the file changes. This enables browsers to download the assets on the user’s first visit and cache them for future visits. A file will only need to be downloaded again if its filename, and thus its content, has changed.
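
To make the idea concrete, here is a minimal sketch in plain Node, independent of webpack. The hashedFilename helper is hypothetical, and the choice of MD5 truncated to 8 characters is an assumption made to resemble the filenames shown later in this post, not a statement of what webpack does internally.

```javascript
const crypto = require('crypto');

// Hypothetical helper: derive a cache-busting filename from the file content.
function hashedFilename(name, ext, content) {
  const hash = crypto
    .createHash('md5')
    .update(content)
    .digest('hex')
    .slice(0, 8); // keep 8 hex characters, similar to [chunkhash:8]
  return `${name}.${hash}.${ext}`;
}

const v1 = hashedFilename('main', 'js', 'console.log("hello");');
const v1Again = hashedFilename('main', 'js', 'console.log("hello");');
const v2 = hashedFilename('main', 'js', 'console.log("hello, world");');

console.log(v1 === v1Again); // same content gives the same filename
console.log(v1 === v2);      // changed content gives a different filename
```

As long as the content is unchanged the URL is unchanged, so the browser can keep serving the file from its cache.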

The first successful long term caching setup I arrived at involved a combination of the chunk-manifest-webpack-plugin and webpack-md5-hash plugins. Here is a great blog post on getting that to work. When upgrading to Webpack 2 beta, I was unable to get chunk-manifest-webpack-plugin to work due to some changes in the webpack API. Rather than use a fork of the plugin, I decided to pursue other approaches and ended up finding a more straightforward way. I’ve created a repo on GitHub to demonstrate how to accomplish long term caching on top of a React app. The awesome create-react-app project was used to generate the base app. Generally, you’re only concerned with long term caching for production, so this post will ignore the dev build and focus only on production.

Splitting the Bundle

A related optimization is splitting the JavaScript bundle into separate vendor and application chunks. The vendor chunk will contain npm packages such as React, Angular, lodash, etc. The application chunk is just your app code. The idea is that the vendor chunk should change less frequently and in many cases is going to be larger than your app chunk. So if the vendor chunk is separated out, returning users can keep using their cached copy of it even after the app code changes.

If we run npm run build, we get a single .js asset that includes a hash in the filename:

File sizes after gzip:

46.58 KB build/static/js/main.fa1ac7bc.js
289 B build/static/css/main.9a0fe4f1.css

This hash will change whenever the code for our app changes or if we install/uninstall packages from npm. The hash is the result of using [chunkhash] for filename and chunkFilename in the webpack config output:

output: {
  path: paths.appBuild,
  filename: 'static/js/[name].[chunkhash:8].js',
  chunkFilename: 'static/js/[name].[chunkhash:8].chunk.js',
  publicPath: publicPath
}

To split our bundle into separate vendor and app chunks, let’s first create a vendor.js file that exports an array of vendor module names. The reason for this is just to be DRY. Since we’ll need to do this in both our development and production builds, having one file that is required by both configs is more maintainable.

// config/vendor.js
module.exports = [
  'react',
  'react-dom'
];

Next, add the CommonsChunkPlugin to our webpack config:

new webpack.optimize.CommonsChunkPlugin({
  names: ['vendor'],
  minChunks: Infinity
})

Finally, change the entry point in our webpack config to be an object with vendor and app keys instead of an array:

var vendor = require('./vendor');
...
entry: {
  vendor, // Use ES6 object literal property value shorthand
  app: [path.join(paths.appSrc, 'index')]
}

Running npm run build will now generate separate app and vendor chunks:

File sizes after gzip:

46.28 KB build/static/js/vendor.f3034dda.js
944 B build/static/js/app.9fa6cb3f.js
289 B build/static/css/app.9a0fe4f1.css

Extracting Webpack Runtime

At first glance, this looks great. However, when we make a change to our app code we expect the app chunk’s hash to change but the vendor chunk’s hash to stay the same. This is not the case. But why could that be? The answer lies inside the vendor chunk. If we peek into this file we’ll find the culprit:

{ 1: "app" }[e] || e) + "." + { 1: "9fa6cb3f" }[e] + ".chunk.js"

The vendor chunk is our entry point and therefore includes the webpack runtime, which maps chunk ids to chunk hashes. So whenever any module changes, the side effect is that the entry point’s hash will change too. This invalidates our cached vendor chunk every time we make a change to our app.

To solve this, we can extract the webpack runtime into a separate manifest chunk. Here is the one line change to our webpack config:

new webpack.optimize.CommonsChunkPlugin({
  names: ['vendor', 'manifest'], // add manifest chunk
  minChunks: Infinity
}),

Now the build output becomes:

File sizes after gzip:

45.8 KB build/static/js/vendor.c5e4b74b.js
945 B build/static/js/app.ba4d0d56.js
715 B build/static/js/manifest.26d5bd51.js
289 B build/static/css/app.9a0fe4f1.css

Let’s change some app code and rebuild and check the output:

File sizes after gzip:

45.8 KB build/static/js/vendor.c5e4b74b.js
948 B build/static/js/app.db92bf79.js
716 B build/static/js/manifest.e7800a99.js
289 B build/static/css/app.9a0fe4f1.css

Success! The app chunk’s hash changed and the vendor asset hash did not change. Also, our manifest asset changed, as you would expect. Since create-react-app uses html-webpack-plugin, the manifest automatically gets injected into our html. We now have long term caching working!
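
The behavior seen in these builds can be reproduced with a small model in plain Node. This is only a sketch under simplifying assumptions: the runtime is reduced to a single line that embeds the app chunk’s hash, a chunk hash is taken to be an MD5 of the chunk’s text, and the names build, runtimeFor and vendorCode are hypothetical.

```javascript
const crypto = require('crypto');

const hash = (text) =>
  crypto.createHash('md5').update(text).digest('hex').slice(0, 8);

// Simplified runtime: a map from chunk id to chunk hash, as in the snippet above.
const runtimeFor = (appHash) => `var chunkHashes = { 1: "${appHash}" };`;

const vendorCode = '/* react, react-dom */';

// Returns the hash of every emitted chunk for a given version of the app code.
function build(appCode, extractManifest) {
  const appHash = hash(appCode);
  const runtime = runtimeFor(appHash);
  if (extractManifest) {
    // The runtime lives in its own manifest chunk.
    return { vendor: hash(vendorCode), app: appHash, manifest: hash(runtime) };
  }
  // The runtime is part of the vendor (entry) chunk.
  return { vendor: hash(runtime + vendorCode), app: appHash };
}

const before = build('render("v1")', false);
const after = build('render("v2")', false);
console.log(before.vendor === after.vendor); // false: vendor follows the app hash

const beforeSplit = build('render("v1")', true);
const afterSplit = build('render("v2")', true);
console.log(beforeSplit.vendor === afterSplit.vendor); // true: vendor is stable
console.log(beforeSplit.manifest === afterSplit.manifest); // false: manifest changes instead
```

Once the map of chunk hashes is moved out of the vendor chunk, nothing inside the vendor chunk depends on the app code any more.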

Inlining manifest.js

The manifest.js file is very small once minified and compressed, but it is still an additional network request the browser has to make. We can optimize further by inlining this small file in a script tag in our html. We can use inline-manifest-webpack-plugin to accomplish the inlining.

Add the plugin to our webpack config:

var InlineManifestWebpackPlugin = require('inline-manifest-webpack-plugin');
...
new InlineManifestWebpackPlugin({ name: 'webpackManifest' })

We’ll also need to rename create-react-app’s index.html template to index.ejs and add the following line to it:

<%= htmlWebpackPlugin.files.webpackManifest %>

Deterministic Builds

It would seem we’re in good shape for long term caching in production now. But there’s still one more problem. Webpack, by default, assigns modules integer ids, based on order. So when a module is added or removed, the ids of other modules can shift, which changes the content of chunks we never touched and invalidates their cache. Luckily, there is a way to prevent ids from changing. We can configure webpack to choose ids in a deterministic way by using NamedModulesPlugin, which uses each module’s path as its id. This just takes one addition to the plugins array in webpack config:

new webpack.NamedModulesPlugin()

A couple of points to keep in mind with this plugin:

  • Because filenames are longer than ids, file sizes will be larger, although the difference is minimal after gzip.
  • File paths of modules are leaked, which could be a security issue.
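
To see why order-based ids are fragile, here is a toy model in plain JavaScript. It is not webpack’s actual id assignment algorithm, only an illustration of the idea; the integerIds and namedIds helpers and the module paths are made up for the example.

```javascript
// Order-based ids: each module gets its position in the list.
const integerIds = (modules) =>
  Object.fromEntries(modules.map((path, index) => [path, index]));

// Path-based ids: each module is identified by its own path.
const namedIds = (modules) =>
  Object.fromEntries(modules.map((path) => [path, path]));

const react = './node_modules/react/react.js';

const before = ['./src/index.js', './src/App.js', react];
// A new module is added ahead of modules that did not change.
const after = ['./src/index.js', './src/Header.js', './src/App.js', react];

console.log(integerIds(before)[react], integerIds(after)[react]); // 2 3
console.log(namedIds(before)[react] === namedIds(after)[react]); // true
```

In the order-based scheme the React module gets a new id even though it did not change, so the vendor chunk’s content and hash would change as well. With path-based ids it keeps the same id.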

HTTP Cache Headers

All the work configuring webpack for long term caching is useless without setting the proper HTTP Cache-Control header for our assets on the server side. Here is some more reading on HTTP caching patterns. Because we are using filenames that include a hash that changes whenever the file contents change, we can consider the assets to be immutable content. Therefore we can use a long max-age for Cache-Control. Here is an example where max-age is set to 1 year:

Cache-Control: public,max-age=31536000
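
How the header gets set depends on your server. As a rough sketch, here is one way to do it with Node’s built-in http module. The cacheControlFor helper and the filename pattern are assumptions based on the build output shown earlier. Using no-cache for everything else, such as index.html, is my own addition: the HTML file keeps the same name between deploys, so the browser has to revalidate it to discover the new asset filenames.

```javascript
const http = require('http');

const ONE_YEAR_SECONDS = 365 * 24 * 60 * 60; // 31536000

// Assumption: hashed assets look like name.<8 hex chars>.js, .chunk.js or .css.
const HASHED_ASSET = /\.[0-9a-f]{8}\.(chunk\.)?(js|css)$/;

function cacheControlFor(url) {
  const path = url.split('?')[0]; // ignore any query string
  return HASHED_ASSET.test(path)
    ? `public,max-age=${ONE_YEAR_SECONDS}`
    : 'no-cache';
}

// Hypothetical server that only demonstrates the header; serving the file
// contents is omitted. Call server.listen(<port>) to try it out.
const server = http.createServer((req, res) => {
  res.setHeader('Cache-Control', cacheControlFor(req.url));
  res.end();
});

console.log(cacheControlFor('/static/js/vendor.c5e4b74b.js'));
console.log(cacheControlFor('/index.html'));
```

In practice this logic usually lives in the web server or CDN configuration rather than in application code.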

Final Thoughts

Webpack is an amazing and powerful tool! The concepts and configuration can seem complex at first, but learning the tool is well worth the time. Webpack has an awesome core team, and the docs are continuing to improve. Since introducing webpack at thredUP.com, we’ve been able to optimize our JavaScript assets, not to mention improve our build, leading to a better developer experience. I’m looking forward to sharing some more tips we’ve learned along the way.