Predictable long term caching with Webpack
Getting long-term caching right with Webpack is a problem that never really got a final answer.
There is, however, this open issue on GitHub:
- it has 162 comments
- it is soon to have its 2nd birthday
- it has a lot of suggestions that often make things worse
- and it probably gets quite a few Google hits.
Arguably there is an easy answer: you could use the RecordsPlugin (granted, the documentation on this is a little sparse). But that requires you to keep track of every build you have. I like to not rely on state, so I will give the quest for a good answer a try.
tl;dr;
- Use NamedModulesPlugin
- Use NamedChunksPlugin
- Sprinkle a little bit of magic
- and then a bit more
But first things first. What stops a default unoptimised Webpack build from being long-term cacheable? It’s a long story. Let’s take it step by step.
We will set up a small app with Webpack, let it grow as time goes by and thereby run into all kinds of problems. And as with any good quest, we will slay them all, or something like that...
The basics
This is our initial unoptimised Webpack config:
and this is what we find in foo.js
Building this gives us a Webpack output along these lines:
So far so good.
Vendor chunks
The first thing we may want to do is pull preact out of our main entry file, as it will hopefully change less often than the rest of our app. Therefore we add the CommonsChunkPlugin to our Webpack config.
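Roughly, the config could now look like this. It is a sketch against the Webpack 2/3 API this article deals with, and the src/foo and dist paths are assumptions.

```javascript
// webpack.config.js -- sketch, Webpack 2/3 era API
const path = require('path');
const webpack = require('webpack');

module.exports = {
  entry: {
    main: './src/foo',
    // everything listed here is meant to end up in the "vendor" chunk
    vendor: ['preact'],
  },
  output: {
    path: path.join(__dirname, 'dist'),
    filename: '[name].[hash].js',
  },
  plugins: [
    // the name matches the "vendor" entry above
    new webpack.optimize.CommonsChunkPlugin({ name: 'vendor' }),
  ],
};
```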
We build our app again and get an output similar to this:
Don’t fall asleep. Bear with me, the fun is about to begin.
The correct hash
You may have noticed that we encountered our first issue here. The hashes of the main and the vendor chunk are the same. Any change to the main file would now also invalidate our vendor chunk.
To fix this we must switch from hash to chunkhash in our filename. This is because hash generates a single global hash for all the assets we built, while chunkhash only hashes whatever is in its chunk.
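To make the difference tangible, here is a toy model. It is not Webpack's actual hashing code, and the chunk contents are made up; it only mimics the idea of one build-wide hash versus one hash per chunk.

```javascript
const crypto = require('crypto');

// Toy stand-in for hashing; Webpack's real implementation differs.
const digest = (text) =>
  crypto.createHash('md5').update(text).digest('hex').slice(0, 8);

// Hypothetical chunk contents
const chunks = {
  main: 'console.log("app code v1");',
  vendor: '/* preact source */',
};

// [hash]: one value derived from the whole build, shared by every file
const buildHash = (all) => digest(Object.values(all).join('\n'));

// [chunkhash]: one value per chunk, derived from that chunk only
const chunkHashes = (all) =>
  Object.fromEntries(
    Object.entries(all).map(([name, source]) => [name, digest(source)])
  );

const before = { build: buildHash(chunks), perChunk: chunkHashes(chunks) };
chunks.main = 'console.log("app code v2");'; // only the app code changes
const after = { build: buildHash(chunks), perChunk: chunkHashes(chunks) };

// The build-wide hash moves, so every file name would change with it
console.log('build hash changed:', before.build !== after.build);
// The per-chunk hash of vendor stays put
console.log('vendor hash stable:', before.perChunk.vendor === after.perChunk.vendor);
```

In the config itself the switch is a one-word change of the output filename to '[name].[chunkhash].js'.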
Running our build again we now see two different hashes.
Very nice.
You have runtime issues
Changing anything in the main chunk should now leave the vendor chunk untouched. Let’s add a new bogus line to it:
Running the build again, however, shows us that everything is lost, yet again:
But why? The problem here is in the detail:
The way Webpack behaves, extracting preact out of the main chunk also moves the Webpack runtime into the vendor chunk. The runtime is the part of Webpack that resolves modules at runtime, handles async loading and more. Looking into it, we see a reference to our main chunk in it:
Luckily we can fix this. If you add a CommonsChunkPlugin with a name that does not match any entry point, Webpack will extract the runtime, create a chunk by that name and put the runtime in there. Sounds like magic? Well yeah, I guess?
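As a sketch, the trick could look like this. The chunk name runtime is my choice here, any name that is not an entry point should do; paths are assumptions as before.

```javascript
// webpack.config.js -- sketch, Webpack 2/3 era API
const path = require('path');
const webpack = require('webpack');

module.exports = {
  entry: {
    main: './src/foo',
    vendor: ['preact'],
  },
  output: {
    path: path.join(__dirname, 'dist'),
    filename: '[name].[chunkhash].js',
  },
  plugins: [
    new webpack.optimize.CommonsChunkPlugin({ name: 'vendor' }),
    // "runtime" is not an entry point, so Webpack creates the chunk
    // and moves its runtime code out of vendor and into it
    new webpack.optimize.CommonsChunkPlugin({ name: 'runtime' }),
  ],
};
```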
Swinging our magic wand and running our build again now yields this:
Changing something in the main chunk now will only change the runtime and the main chunk, the vendor chunk will remain untouched.
Adding more dependencies
However, the story does not end there. As our project grows we add more dependencies:
We run our build once again, expecting only the main and runtime chunk to change. But you guessed it, that’s not what happens:
Even though nothing in the vendor chunk changed, its hash changed yet again. The reason is once again a detail. Every chunk gets a numerical chunk id. These are given in order, and as the order can change with every new import added, the chunk ids may change with it.
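A toy model of why position-based ids are fragile. The orderings below are invented for illustration; the way Webpack really orders chunks is more involved.

```javascript
// Toy model: ids are handed out by position in a list.
const assignNumericIds = (chunkNames) =>
  Object.fromEntries(chunkNames.map((name, index) => [name, index]));

// Hypothetical order before the change ...
const before = assignNumericIds(['vendor', 'main', 'runtime']);
// ... and a hypothetical order after a new import shuffled things around
const after = assignNumericIds(['main', 'vendor', 'runtime']);

// vendor's code is identical in both builds, yet its id differs,
// and the id is part of what ends up in the emitted file
console.log('vendor id before:', before.vendor);
console.log('vendor id after:', after.vendor);
```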
Name your chunk
Enter NamedChunksPlugin. This is a recent addition to the Webpack source (2.4) and allows us to have names rather than numbers for our chunks:
This will use the unique chunk name instead of its id to identify a chunk.
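In config terms this is one more entry in the plugins list. A sketch, with the same assumed paths as before:

```javascript
// webpack.config.js -- sketch, needs Webpack 2.4 or later (2.x/3.x API)
const path = require('path');
const webpack = require('webpack');

module.exports = {
  entry: {
    main: './src/foo',
    vendor: ['preact'],
  },
  output: {
    path: path.join(__dirname, 'dist'),
    filename: '[name].[chunkhash].js',
  },
  plugins: [
    // identify chunks by name ("main", "vendor", ...) instead of by number
    new webpack.NamedChunksPlugin(),
    new webpack.optimize.CommonsChunkPlugin({ name: 'vendor' }),
    new webpack.optimize.CommonsChunkPlugin({ name: 'runtime' }),
  ],
};
```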
We can run our build again with and without the addition of bar.js and see the vendor chunk hash stays the same. Well, except it doesn't. Looking into the two vendor chunks we see something like this:
Name your modules — sorry no pun here :(
For some reason, Webpack adds the ids of all the modules that exist to our vendor chunk. Let’s not care too much about the why. Luckily there is an easy solution. Enter the NamedModulesPlugin.
It does very much the same as the named chunk equivalent. Instead of using numerical ids it uses a unique path to map our request to a module.
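Again a sketch of how that could look in the config, with the same assumed paths:

```javascript
// webpack.config.js -- sketch, Webpack 2/3 era API
const path = require('path');
const webpack = require('webpack');

module.exports = {
  entry: {
    main: './src/foo',
    vendor: ['preact'],
  },
  output: {
    path: path.join(__dirname, 'dist'),
    filename: '[name].[chunkhash].js',
  },
  plugins: [
    // module ids become paths instead of numbers
    new webpack.NamedModulesPlugin(),
    new webpack.NamedChunksPlugin(),
    new webpack.optimize.CommonsChunkPlugin({ name: 'vendor' }),
    new webpack.optimize.CommonsChunkPlugin({ name: 'runtime' }),
  ],
};
```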
Thanks to this change the vendor hash will now always stay the same:
Is that it? Please tell me that’s it, it’s boring as hell and everything changes all the time.
Well, guess what? haha. Nope.
Love me some async
As our app grows it gets heavy. To prevent loading all the code at once we break it up using some async split points. First, we add one:
and shortly thereafter another async dependency:
WTF WEBPACK. Why is my async-chunk suddenly named differently? And why do they have numbered ids again? I thought cache invalidation was hard, but you just invalidate everything all the time :(.
Well, turns out that the NamedChunksPlugin only handles chunks that have a name. That is not the case for our async chunks. Stupid lazy OSS devs, pff.
Let’s fix this.
The NamedChunksPlugin accepts one parameter. This parameter has to be a function that receives the chunk as its own parameter and must return an id for it. We change our plugin to something like this:
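A sketch of such a function. It assumes that a chunk exposes its modules as an array on chunk.modules and that each module carries context and request, which is how Webpack 2 looks to me; other versions may expose this differently. The name resolveChunkName and the file async-bar.js are made up for the example.

```javascript
const path = require('path');

// Resolver to hand to NamedChunksPlugin: returns a stable id for a chunk.
function resolveChunkName(chunk) {
  if (chunk.name) {
    // entries and commons chunks already have a name, keep it
    return chunk.name;
  }
  // unnamed async chunk: derive a name from the modules it contains
  // (assumption: chunk.modules is an array, see the note above)
  return chunk.modules
    .map((m) => path.relative(m.context, m.request))
    .join('_');
}

// In webpack.config.js this would be wired up as:
//   new webpack.NamedChunksPlugin(resolveChunkName)

// Quick check with hand-made stand-ins for real chunk objects
const vendorChunk = { name: 'vendor', modules: [] };
const asyncChunk = {
  name: null,
  modules: [{ context: '/app/src', request: '/app/src/async-bar.js' }],
};
console.log(resolveChunkName(vendorChunk));
console.log(resolveChunkName(asyncChunk));
```

Since the name is built from file paths only, adding another async chunk does not touch the names of the ones that already exist.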
Running our build again, we can now add as many async chunks as we want. Previously added ones will keep their name and hash untouched:
Side note: Feel free to change the aesthetics of these async chunk names to whatever. Me too lazy.
Okeeeeh is that it now?
Well how about NO
Legacy before it was hipster — external dependencies
There is this one thing that won’t die and we have to support it. Also, for some reason, we still use it somewhere in our app. And as we don’t want to load it twice, we want to get it from the global context. But being good developers we also want to make the dependency explicit. So we define our jQuery as an external dependency.
What could possibly go wrong? Well, everything.
Err yeah, thanks jQuery for killing my vendor chunk? What are you... WHAT?
Well, turns out that, just as with the chunks, the NamedModulesPlugin also only works for normal modules. As a result, the external module stole the id 0 from the multi preact module. Let’s fix this once and for all.
Give everyone a name
Besides the normal modules, there are a bunch of other modules Webpack uses. Not all of them are covered by the NamedModulesPlugin, however. Therefore we need to roll our own (Maybe I should make this a package?*):
We add this to our Webpack config:
This works pretty much like the NamedModulesPlugin itself, except it uses the module#identifier method for all modules that do not have an id at this point.
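A sketch of such a plugin. It assumes the Webpack 2/3 plugin API, meaning compiler.plugin(...) and the before-module-ids compilation hook; Webpack 4 and later use compiler.hooks instead. The name nameAllModulesPlugin is made up, and the dry run at the bottom uses hand-made stand-ins rather than real Webpack objects.

```javascript
// Gives every module that has no id yet an id based on its identifier.
const nameAllModulesPlugin = {
  apply(compiler) {
    compiler.plugin('compilation', (compilation) => {
      compilation.plugin('before-module-ids', (modules) => {
        modules.forEach((module) => {
          if (module.id !== null) {
            return; // already has an id, leave it alone
          }
          module.id = module.identifier();
        });
      });
    });
  },
};

// In webpack.config.js it goes into the plugins list after NamedModulesPlugin.

// Dry run against stand-ins (not real Webpack objects)
const handlers = {};
const fake = { plugin: (name, fn) => { handlers[name] = fn; } };
nameAllModulesPlugin.apply(fake); // registers the "compilation" handler
handlers['compilation'](fake); // registers the "before-module-ids" handler
const modules = [
  { id: './src/foo.js', identifier: () => '/abs/src/foo.js' },
  { id: null, identifier: () => 'multi preact' },
];
handlers['before-module-ids'](modules);
console.log(modules.map((m) => m.id));
```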
Running our build once more we do get a new vendor hash, but this one is persistent no matter how many external modules you may add!
As you can see, the id of the multi preact module switched from [0] to [multi preact].
So that’s it? No. One last step!
Adding more entry points
The app grows further and we have a second entry point. So we separate this out into a new entry point and add it to our Webpack config:
Its content is something like this:
Running the build makes us (╯°□°)╯︵ ┻━┻.
The build now yields this:
Everything changed. The vendor, the main file, everything, but why?
Turns out that our vendor chunk does not only take what we specify but also everything we use in both of our entry points. While this might sometimes be desired to share our own commonly used modules, these should not end up in the vendor chunk. Maybe, like now, we don’t even want this at all. Adding minChunks: Infinity to our vendor commons chunk tells Webpack we really only want what we specified in the entry:
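As a sketch, with the same hypothetical second entry point and assumed paths as in the rest of the examples:

```javascript
// webpack.config.js -- sketch, Webpack 2/3 era API
const path = require('path');
const webpack = require('webpack');

module.exports = {
  entry: {
    main: './src/foo',
    other: './src/other', // hypothetical second entry point
    vendor: ['preact'],
  },
  output: {
    path: path.join(__dirname, 'dist'),
    filename: '[name].[chunkhash].js',
  },
  plugins: [
    new webpack.optimize.CommonsChunkPlugin({
      name: 'vendor',
      // no number of sharing chunks is ever enough, so nothing besides
      // the modules listed in the vendor entry is pulled in
      minChunks: Infinity,
    }),
    new webpack.optimize.CommonsChunkPlugin({ name: 'runtime' }),
  ],
};
```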
And if we run our build again:
We have exactly what we want. Our final Webpack config looks like this:
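Putting the pieces of this article together, the end result could look roughly like this. It is a sketch against the Webpack 2/3 API; the paths, the second entry name and the reliance on chunk.modules being an array are assumptions.

```javascript
// webpack.config.js -- sketch of the end result (Webpack 2/3 era API)
const path = require('path');
const webpack = require('webpack');

module.exports = {
  entry: {
    main: './src/foo',
    other: './src/other', // hypothetical second entry point
    vendor: ['preact'],
  },
  externals: {
    jquery: 'jQuery',
  },
  output: {
    path: path.join(__dirname, 'dist'),
    filename: '[name].[chunkhash].js',
  },
  plugins: [
    new webpack.NamedModulesPlugin(),
    new webpack.NamedChunksPlugin((chunk) => {
      if (chunk.name) {
        return chunk.name;
      }
      // assumption: chunk.modules is an array (Webpack 2)
      return chunk.modules
        .map((m) => path.relative(m.context, m.request))
        .join('_');
    }),
    new webpack.optimize.CommonsChunkPlugin({
      name: 'vendor',
      minChunks: Infinity,
    }),
    new webpack.optimize.CommonsChunkPlugin({ name: 'runtime' }),
    {
      // name every module that the NamedModulesPlugin left without an id
      apply(compiler) {
        compiler.plugin('compilation', (compilation) => {
          compilation.plugin('before-module-ids', (modules) => {
            modules.forEach((module) => {
              if (module.id !== null) {
                return;
              }
              module.id = module.identifier();
            });
          });
        });
      },
    },
  ],
};
```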
That’s it. At least as far as I was able to think of edge cases. I do hope this helps. If you still find quirks, feel free to comment and I will try to find a solution for them.
Now go out into the world and enjoy. And stop using weird hashing plugins that do more harm than good. Please.
One last important note: You may now actually have to take care of cache invalidation. As Webpack may allow plugins to change assets after the chunkhash is calculated, and those plugins may not properly update the chunkhash, this could cause an asset to keep its hash even though it was actually changed. I personally haven’t hit that, but it’s always good to be careful.
That’s all for now, have fun! :)