webpack 4: Code Splitting, chunk graph and the splitChunks optimization

webpack 4 made some major improvements to the chunk graph and added a new optimization for chunk splitting (which is a kind of improvement over the CommonsChunkPlugin).

Let’s take a look at some of the drawbacks of the old chunk graph.

In the old graph, chunks were connected to other chunks via parent-child relationships, and chunks contained modules.

When a chunk has parents, it can be assumed that at least one parent is already loaded when the chunk is loaded. This information can be used by optimization steps. For example, when a module of the chunk is available in all parents, it can be removed from the chunk, because it’s already available in any case.
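A minimal sketch of that optimization, using a hypothetical plain-object model of the old graph (the parents and modules fields and the chunk names are made up for illustration, this is not webpack’s internal API):

```javascript
// Hypothetical model of the old chunk graph: each chunk has a set of
// modules and a list of parent chunks. Not webpack's internal API.
function removeModulesAvailableInParents(chunk) {
  if (chunk.parents.length === 0) return; // no parents: nothing is preloaded
  for (const mod of [...chunk.modules]) {
    // We only know that at least ONE parent is loaded, so a module is
    // guaranteed to be available only if EVERY parent contains it.
    const inAllParents = chunk.parents.every((p) => p.modules.has(mod));
    if (inAllParents) chunk.modules.delete(mod);
  }
}

const pageA = { name: "page-a", parents: [], modules: new Set(["react", "a"]) };
const pageB = { name: "page-b", parents: [], modules: new Set(["react", "b"]) };
const dialog = {
  name: "dialog",
  parents: [pageA, pageB],
  modules: new Set(["react", "a", "dialog-ui"]),
};

removeModulesAvailableInParents(dialog);
// "react" is removed (in both parents), "a" stays (missing in page-b).
console.log([...dialog.modules]);
```

This also shows why inexact parent information hurts: one wrong parent link is enough to keep a module duplicated or, worse, to remove one that is not actually available.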

At an entrypoint or an async splitpoint, a list of chunks is referenced. These chunks are loaded in parallel.

This kind of graph makes it difficult to express “splitting” of chunks. For example, this happens when using the CommonsChunkPlugin: modules are removed from one or more chunks and put into a new chunk. This new chunk needs to be connected into the chunk graph. But how? As a parent of the old chunk(s)? As a child? The CommonsChunkPlugin added it as a parent, but that’s technically wrong and negatively affects other optimizations (the parent information becomes inexact).

The new chunk graph introduces a new object: the ChunkGroup. A ChunkGroup contains Chunks.

At an entrypoint or an async splitpoint, a single ChunkGroup is referenced, which means all contained Chunks are loaded in parallel. A Chunk can be referenced in multiple ChunkGroups.

There are no longer parent-child relationships between Chunks. Instead, this relationship now exists between ChunkGroups.

Now “splitting” of Chunks can be expressed. The new Chunk is added to all ChunkGroups which contain the origin Chunk. This doesn’t affect parent relationships negatively.
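To make this concrete, here is a small sketch of such a split. The object shapes and the splitChunk helper are hypothetical and only mirror the description above, they are not webpack’s internal classes:

```javascript
// Hypothetical model of the new chunk graph, not webpack's internal classes.
// A ChunkGroup lists the Chunks that are loaded in parallel.
function splitChunk(chunkGroups, originChunk, newChunkName, modulesToMove) {
  const newChunk = { name: newChunkName, modules: new Set() };
  for (const mod of modulesToMove) {
    // Move the module only if the origin Chunk actually contains it.
    if (originChunk.modules.delete(mod)) newChunk.modules.add(mod);
  }
  // Add the new Chunk to every ChunkGroup that contains the origin Chunk.
  // Relationships between ChunkGroups are not touched at all.
  for (const group of chunkGroups) {
    if (group.chunks.includes(originChunk)) group.chunks.push(newChunk);
  }
  return newChunk;
}

const shared = { name: "shared", modules: new Set(["lodash", "widgets"]) };
const a = { name: "a", modules: new Set(["a-code"]) };
const routeA = { name: "route-a", chunks: [a, shared] };
const routeB = { name: "route-b", chunks: [shared] };

const vendors = splitChunk([routeA, routeB], shared, "vendors~shared", ["lodash"]);

console.log(routeA.chunks.map((c) => c.name)); // a, shared, vendors~shared
console.log(routeB.chunks.map((c) => c.name)); // shared, vendors~shared
```

Both ChunkGroups still load everything they need in parallel, and no parent or child had to be invented for the new Chunk.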


Now with this problem fixed, we can start using chunk splitting more. We can split any Chunk without the risk of breaking the chunk graph.

The CommonsChunkPlugin has a lot of problems:

  • It can result in more code being downloaded than needed.
  • It’s inefficient on async chunks.
  • It’s difficult to use.
  • The implementation is difficult to understand.

So a new plugin was born: SplitChunksPlugin.

It automatically identifies modules which should be split off from chunks, using heuristics based on module duplication count and module category (e.g. node_modules). And splits the chunks…

There is a paradigm shift here. The CommonsChunkPlugin was like: “Create this chunk and move all modules matching minChunks into the new chunk”. The SplitChunksPlugin is like: “Here are the heuristics, make sure you fulfill them”. (imperative vs declarative)

The SplitChunksPlugin also has some great properties:

  • It never downloads unneeded modules (as long as you don’t enforce chunk merging via name)
  • It works efficiently on async chunks too
  • It’s on by default for async chunks
  • It handles vendor splitting with multiple vendor chunks
  • It’s easier to use
  • It doesn’t rely on chunk graph hacks
  • Mostly automatic

Here are a few examples of what the SplitChunksPlugin would do for you. These examples only show the default behavior. There are more possibilities with additional configuration.

Note: You can configure it via optimization.splitChunks. The examples talk about chunks in general. By default the optimization only applies to async chunks, but with optimization.splitChunks.chunks: "all" the same holds for initial chunks.
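As a sketch, a config file enabling this for initial chunks too could look like the following. The entry path and mode are placeholders, only the optimization.splitChunks.chunks setting is taken from the note above:

```javascript
// webpack.config.js (sketch; entry path and mode are placeholders)
module.exports = {
  mode: "production",
  entry: "./src/index.js",
  optimization: {
    splitChunks: {
      // By default only async chunks are considered.
      // "all" applies the same splitting to initial chunks as well.
      chunks: "all",
    },
  },
};
```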

Note: We assume every external library used here is bigger than 30kb, because the optimization only kicks in after that threshold.

Vendors

chunk-a: react, react-dom, some components

chunk-b: react, react-dom, some other components

chunk-c: angular, some components

chunk-d: angular, some other components

webpack would automatically create two vendors chunks, with the following result:

vendors~chunk-a~chunk-b: react, react-dom

vendors~chunk-c~chunk-d: angular

chunk-a to chunk-d: Only the components

Vendors overlapping

chunk-a: react, react-dom, some components

chunk-b: react, react-dom, lodash, some other components

chunk-c: react, react-dom, lodash, some components

webpack would also create two vendors chunks, with the following result:

vendors~chunk-a~chunk-b~chunk-c: react, react-dom

vendors~chunk-b~chunk-c: lodash

chunk-a to chunk-c: Only the components
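The grouping in the two vendors examples can be reproduced with a few lines of code. This is only an illustration of the idea (group vendor modules by the exact set of chunks that use them), not the plugin’s actual algorithm. It ignores sizes and request limits, and the groupVendors helper is made up:

```javascript
// Illustrative only: groups vendor modules by the exact set of chunks that
// contain them. Sizes and request limits are ignored.
function groupVendors(chunks, vendorModules) {
  const users = new Map(); // module -> names of the chunks that contain it
  for (const [chunkName, modules] of Object.entries(chunks)) {
    for (const mod of modules) {
      if (!vendorModules.has(mod)) continue;
      if (!users.has(mod)) users.set(mod, []);
      users.get(mod).push(chunkName);
    }
  }
  const result = {};
  for (const [mod, chunkNames] of users) {
    // The new chunk is named after all of its origin chunks.
    const key = "vendors~" + chunkNames.join("~");
    (result[key] = result[key] || []).push(mod);
  }
  return result;
}

const chunks = {
  "chunk-a": ["react", "react-dom", "components-a"],
  "chunk-b": ["react", "react-dom", "lodash", "components-b"],
  "chunk-c": ["react", "react-dom", "lodash", "components-c"],
};
const vendorChunks = groupVendors(
  chunks,
  new Set(["react", "react-dom", "lodash"])
);
console.log(vendorChunks);
// vendors~chunk-a~chunk-b~chunk-c: react, react-dom
// vendors~chunk-b~chunk-c: lodash
```

Because every module ends up in a chunk that is shared by exactly the chunks using it, nobody downloads a vendor module they don’t need.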

Shared modules

chunk-a: vue, some components, some shared components

chunk-b: vue, some other components, some shared components

chunk-c: vue, some more components, some shared components

Assuming the size of the shared components is bigger than 30kb, webpack would create a vendors chunk and a commons chunk, with the following result:

vendors~chunk-a~chunk-b~chunk-c: vue

commons~chunk-a~chunk-b~chunk-c: some shared components

chunk-a to chunk-c: Only the components

When the size of the shared components is smaller than 30kb, webpack intentionally duplicates the modules in chunk-a to chunk-c. We think the reduced download size is not worth the extra request needed for a separate chunk load.
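If you disagree with that trade-off for your app, the threshold can be changed. A sketch, assuming the minSize option of optimization.splitChunks (a size in bytes); the value 10000 is an arbitrary example, not a recommendation:

```javascript
// webpack.config.js (sketch; 10000 is an arbitrary example value)
module.exports = {
  optimization: {
    splitChunks: {
      // Minimum size in bytes a new chunk must have to be created.
      // Shared modules below this size stay duplicated in their chunks.
      minSize: 10000,
    },
  },
};
```

A lower value means less duplicated code but more requests, so it is worth measuring before changing it.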

Multiple shared modules

chunk-a: react, react-dom, some components, some shared react components

chunk-b: react, react-dom, angular, some other components

chunk-c: react, react-dom, angular, some components, some shared react components, some shared angular components

chunk-d: angular, some other components, some shared angular components

webpack would create two vendors chunks and two commons chunks, with the following result:

vendors~chunk-a~chunk-b~chunk-c: react, react-dom

vendors~chunk-b~chunk-c~chunk-d: angular

commons~chunk-a~chunk-c: some shared react components

commons~chunk-c~chunk-d: some shared angular components

chunk-a to chunk-d: Only the components


Note: Since the chunk name includes all origin chunk names, it’s recommended for production builds with long term caching to NOT include [name] in the filenames, or to switch off name generation via optimization.splitChunks.name: false. Otherwise files will be invalidated, e.g. when more chunks with the same vendors are added.
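A sketch of such a production config. It shows both measures at once, although either one should be enough. The hash-only filename templates are an assumption for illustration, use whatever naming scheme fits your setup:

```javascript
// webpack.config.js (sketch for long term caching; templates are examples)
module.exports = {
  mode: "production",
  output: {
    // No [name] here, so the list of origin chunks is not part of the
    // filename and adding another chunk does not rename existing files.
    filename: "[chunkhash].js",
    chunkFilename: "[chunkhash].js",
  },
  optimization: {
    splitChunks: {
      chunks: "all",
      // Alternative: keep [name] in the templates above and switch off
      // the generated names instead.
      name: false,
    },
  },
};
```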