Vendor and code splitting in webpack 2

Webpack is an ambitious, powerful tool for bundling modern web applications. Unfortunately, its complexity can make it daunting to learn. While the team has made incredible strides improving the docs, there are still a few places which remain counter-intuitive. In this post I’ll introduce and discuss those features which I found most difficult to learn, namely code splitting, and managing bundle sizes and contents. Then I’ll finish with a smattering of webpack tips, and solutions to some of the things which tripped me up when first getting started.

This post will only superficially cover things like minification, transpilation, the webpack development server, etc. These areas have been covered in depth elsewhere, in particular the fantastic SurviveJS book.

Bundling and code splitting

Preliminary setup

The site listens for changes to the URL hash, and loads the part of the application the user is on; for example, the user might click from the books-list section, to the section where she can enter new books, etc. As the user browses to these different sections of the application, the required code is loaded via System.import. It looks like this
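A minimal sketch of that hash-driven loading, assuming hypothetical section names and module paths (`./modules/books/books` and friends are placeholders, not the application’s real layout, and `renderModule` is an invented callback). Each `System.import` call sits inside a function, so it only runs, and its split point only loads, when the user navigates to that section.

```javascript
// Hypothetical section table: names and paths are placeholders.
// Wrapping each System.import in a function defers the load until it is needed.
const sectionLoaders = {
  home: () => System.import('./modules/home/home'),
  books: () => System.import('./modules/books/books'),
  scan: () => System.import('./modules/scan/scan'),
};

// Turn "#books/edit?id=1" into "books"; an empty hash falls back to "home".
function sectionFromHash(hash) {
  const name = hash.replace(/^#\/?/, '').split(/[\/?]/)[0];
  return name || 'home';
}

// Find the loader for the current hash (unknown sections go home) and start the import.
function loadSection(hash, loaders = sectionLoaders) {
  const name = sectionFromHash(hash);
  const known = Object.prototype.hasOwnProperty.call(loaders, name);
  const load = known ? loaders[name] : loaders.home;
  return load();
}

// In the browser this would be wired up roughly as:
// window.addEventListener('hashchange', () => loadSection(location.hash).then(renderModule));
```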

Note that System.import was recently changed to just import(), but since much of the tooling hasn’t caught up, and since webpack still supports either, I’m sticking with the deprecated form for now.

When webpack builds the application, it’ll search out all calls to System.import and create a split point for each; each split point will only be loaded when the corresponding System.import is executed in code. In addition to these split points, webpack will of course also start at the application’s entry point and walk all its dependencies, creating a single, bundled entry point. This will be the primary code for your application, which will need to be loaded with a regular script tag in the main HTML page.

Before I start with the webpack code, note that I won’t go over every path to every npm utility I use. Check out the package.json file in the react-redux directory to see everything that’s needed.

Basic build

Let’s break that down:

  • entry — this is the entry point of the application. It’s where the application starts. It’s what pulls everything else in, and gets the application doing things. For me, that’s reactStartup.js.
  • output — this tells webpack where to create the resulting bundle. Again, you’ll need to create a script tag in your site to load this script.
  • resolve — this is some basic housekeeping. My simple-react-bootstrap has a main field in its package.json with un-transpiled files with .es6 extensions. I should probably fix this, since it confuses some tooling, but for now I’ll alias that away. And I’m also aliasing an old — not-on-npm — JavaScript color picker called jscolor.
  • resolve.modules — this tells webpack where to go searching to resolve import statements it finds while parsing the application. node_modules is of course for npm utilities, but more interesting is that I needed to add ./. This allows webpack to find modules with an absolute path. For example, rather than link to ../../applicationRoot/components/button I just link to applicationRoot/components/button. Unfortunately I was unable to find a single configuration item that would simply tell webpack “this is the base for all absolute paths.”
  • module.loaders — this sets up my Babel transpilation.

What does the output look like?

We can see bundle.js at the very bottom, and the files 0–5.bundle.js representing the 6 split points I added before. Ignoring for the moment that the main bundle is huge (remember there’s no minification), it’s not really clear what’s in our bundles.

Improving the output

(note the new require at the top, and the new plugins section at the bottom)
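A sketch of what that addition could look like, assuming the visualization comes from the webpack-bundle-analyzer package (the plugin is not named here, so treat that choice as a guess):

```javascript
// Assumption: the visualization is produced by the webpack-bundle-analyzer package.
const BundleAnalyzerPlugin = require('webpack-bundle-analyzer').BundleAnalyzerPlugin;

module.exports = {
  // ...entry, output, resolve and module stay as they were...
  plugins: [
    // 'static' writes a standalone HTML report; openAnalyzer opens it in a
    // browser tab once the build finishes.
    new BundleAnalyzerPlugin({ analyzerMode: 'static', openAnalyzer: true }),
  ],
};
```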

Now, just like that, after we run webpack, a new browser tab should open showing this

This is a fully interactive data visualization of our build. It shows the bundles in order of size, with our massive main bundle up top, followed by our async chunks 1, 0, 2, and then 3–5 are sized too small to see. The visualization lets you zoom in as close as you want to see every individual bundle anywhere, and displays info for each, such as size, gzipped size, path, etc.

Breaking up our main bundle
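One way to pull npm code out of the main bundle is a CommonsChunkPlugin whose minChunks callback checks whether a module lives under node_modules. The sketch below assumes webpack 2’s function form of minChunks; the options are built as a plain object so the predicate can be read on its own, and the commented line shows where it would go in the config.

```javascript
// Options for a static commons chunk holding everything found under node_modules.
const nodeStaticOptions = {
  name: 'node-static',
  // webpack calls this once per module; module.context is the directory
  // the module was resolved from.
  minChunks(module, count) {
    const context = module.context;
    return typeof context === 'string' && context.indexOf('node_modules') !== -1;
  },
};

// In webpack.config.js:
// plugins: [new webpack.optimize.CommonsChunkPlugin(nodeStaticOptions)]
```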

Which turns the build into this

So now the node-static bundle is the largest, by far. Admittedly this is because we’re running an un-minified version of React. Things would look much better with minification, but since this isn’t preventing us from seeing how webpack works, I’ll just press on.

Why are there still npm items in the code split bundles?

CommonsChunk will pick up either chunks from the initial bundle, or chunks from the code split bundles, but never both. This makes perfect sense when you think about it: 0.bundle.js uses things like react-dnd. There’s absolutely no reason to have this library loaded initially, before that code split bundle is even loaded; in fact, the user may never choose to load this part of the application, so preloading it could turn out to be a complete waste.

So how DO we use CommonsChunk with code split modules?

The code below creates an async commons chunk with react-dnd, and its helpers.
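A hedged sketch of such a plugin configuration: the list of helper packages is an assumption (adjust it to whatever react-dnd companions the application actually uses), and `isFromPackage` is a hypothetical helper written for this example. The `async` option names the on-demand commons chunk.

```javascript
// Assumed package list; the real set of react-dnd helpers may differ.
const dndPackages = ['react-dnd', 'react-dnd-html5-backend', 'dnd-core'];

// True when a module's directory is node_modules/<pkg> or somewhere beneath it.
function isFromPackage(context, pkg) {
  if (typeof context !== 'string') return false;
  const normalized = context.replace(/\\/g, '/'); // tolerate Windows separators
  return (normalized + '/').indexOf('/node_modules/' + pkg + '/') !== -1;
}

const reactDndChunkOptions = {
  async: 'react-dnd', // name of the async commons chunk
  minChunks(module, count) {
    return dndPackages.some(pkg => isFromPackage(module.context, pkg));
  },
};

// In webpack.config.js:
// plugins: [new webpack.optimize.CommonsChunkPlugin(reactDndChunkOptions)]
```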

Which produces

Excellent. The react-dnd chunk was created with what we wanted. Notice how I chose to build the react-dnd chunk from specific npm utilities. Lastly, I’ll add a catch-all async chunk, which bundles everything that’s used in two different code-split bundles by simply checking the count argument passed to minChunks.
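The catch-all can be sketched in a few lines. The chunk name `used-twice` matches the bundle discussed next; the rest assumes the same function form of minChunks, where `count` is the number of chunks that contain the module.

```javascript
// Async commons chunk for anything shared by at least two code-split bundles.
const usedTwiceChunkOptions = {
  async: 'used-twice',
  // count: how many chunks contain this module.
  minChunks(module, count) {
    return count >= 2;
  },
};

// In webpack.config.js:
// plugins: [new webpack.optimize.CommonsChunkPlugin(usedTwiceChunkOptions)]
```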

Which produces

Which works as expected. Our react-dnd bundle has what we asked for, and the used-twice bundle has everything that would otherwise sit in two separate code split bundles.

What’s especially interesting here is that these same results are obtained no matter the order the async CommonsChunk plugins are listed. Specifically, if a module is used in two locations, and one CommonsChunk instance grabs it via context-path, other CommonsChunks will be smart enough to, conceptually at least, treat that module as having count 1, no matter the order these CommonsChunks are listed. What’s even more interesting is that if you manually create two async CommonsChunks containing the same module, which you pull in by path, webpack will automatically de-dupe it, and leave that module in the last CommonsChunk that asked for it.

This means, as far as I can tell, that webpack runs the list of these CommonsChunks in order, resets the count to 1 on each module it adds, and yanks qualifying modules from any prior CommonsChunk that currently has it.

Refining some things

So each static bundle will have a file name of [name]-bundle.js, where name is the name we provide in CommonsChunk. Async splits will be given the name [name]-chunk.js. Unfortunately, for these async chunks the name will be an auto-generated number, so you’ll have 0-chunk.js, etc. There are currently open issues to improve on this.

Also note the publicPath property. This tells webpack what path to look for async chunks in, at runtime. For example, the 0-chunk.js chunk will be requested from react-redux/dist/0-chunk.js.
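As a sketch, the output section behind those names might look like this. `chunkUrl` is a hypothetical helper that only illustrates how publicPath and chunkFilename combine into the requested URL; it is not part of webpack.

```javascript
const outputOptions = {
  filename: '[name]-bundle.js',      // static bundles: entry and named commons chunks
  chunkFilename: '[name]-chunk.js',  // async chunks created at split points
  publicPath: 'react-redux/dist/',   // prefix used when chunks are requested at runtime
};

// Illustration only: roughly how the URL for an async chunk is formed.
function chunkUrl(options, chunkName) {
  return options.publicPath + options.chunkFilename.replace('[name]', String(chunkName));
}

// In webpack.config.js these would be merged into the existing output section,
// alongside its path setting.
```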

Splitting out the webpack runtime

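Adding a final CommonsChunk whose name matches no entry point (manifest here) makes webpack move its runtime and chunk manifest into that file, so changes to the runtime no longer touch the other static bundles. It looks like this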

new webpack.optimize.CommonsChunkPlugin({ name: 'manifest' }),

Be sure to actually load your static build files
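Every static bundle needs its own script tag, and order matters: the runtime (manifest) has to load first, then the shared static chunks, then the entry bundle. A small sketch that keeps the order explicit, assuming the entry chunk is named `main` and the file naming and publicPath discussed above; `scriptTags` is a hypothetical helper.

```javascript
// Build the script tags for the static bundles in the order they must load.
function scriptTags(publicPath, chunkNames) {
  return chunkNames
    .map(name => '<script src="' + publicPath + name + '-bundle.js"></script>')
    .join('\n');
}

// Runtime first, then npm code, then the application entry ('main' is an assumption).
const tags = scriptTags('react-redux/dist/', ['manifest', 'node-static', 'main']);
```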

Where to, from here?

As this application grows, I’ll periodically just run the build and analyze the bundles visualization. As my static node_modules bundle grows (RxJS, D3, react-router, who knows) I’ll likely break it apart further using the same approach as above. The same goes for the async chunks, particularly the catch-all used-twice chunk. I imagine more and more things will wind up in there until it, too, needs to be split up further. Webpack and the nice visualization plugin make this simple.

Odds and ends

babelHelpers is undefined

webpack-dev-server

Be sure to set up proxies for it in your webpack config. The dev server will serve your (webpack-created) js assets from port 8080 by default, but you’ll still want all your ajax requests, requests for css files, etc., to be processed as normal. If your “real” application is running on, say, port 3000, just proxy the relevant requests like this.
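A minimal sketch of such a proxy section, using webpack-dev-server’s shorthand of mapping a path prefix to a target. The prefixes `/api` and `/static` are placeholders; use whatever paths your real server handles.

```javascript
const devServerOptions = {
  proxy: {
    // Placeholder prefixes: requests starting with these are forwarded to the real app.
    '/api': 'http://localhost:3000',
    '/static': 'http://localhost:3000',
  },
};

// In webpack.config.js:
// module.exports = { /* ...rest of the config... */ devServer: devServerOptions };
```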

Bundling for production

For some reason, on Windows 10 at least, process.env.NODE_ENV was always undefined in my webpack.config file, so I needed to give it a nudge. My Windows-based production npm build script looks like this

"react-redux-build": "cd react-redux && rm -rf dist && set NODE_ENV='production' && webpack -p"

Why did I want to read process.env.NODE_ENV in my webpack config file? For now it was just to prevent the bundle visualization from running. How did I do that? …

Remember, the webpack.config file is “just JavaScript”
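A sketch of the idea, with hypothetical helper names. One wrinkle worth handling: when cmd.exe runs `set NODE_ENV='production' && ...`, the quotes and the trailing space end up in the value, so the check normalizes before comparing.

```javascript
// Strip surrounding whitespace and quotes before comparing, since cmd.exe's
// `set` stores them literally.
function isProduction(env) {
  const raw = env.NODE_ENV || '';
  return raw.trim().replace(/^['"]|['"]$/g, '') === 'production';
}

// Keep the always-on plugins and append the dev-only ones unless this is a production build.
function pickPlugins(always, devOnly, env) {
  return isProduction(env) ? always : always.concat(devOnly);
}

// In webpack.config.js:
// plugins: pickPlugins([/* CommonsChunk plugins */], [/* visualization plugin */], process.env)
```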

I think the define plugin may automate this, but for a simple one-off, don’t be afraid to just use some simple JavaScript. Besides, in the end, my check wound up being more complex than this: I don’t want the visualization if NODE_ENV is production, or if there’s a -p command line argument, or if I’m running from webpack-dev-server. I don’t know if there are native webpack configuration tools that can handle all this, but even if so, for me, a few lines of JavaScript are simpler and cleaner.
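Those three conditions fit in one small predicate. This is a sketch rather than the actual check: in particular, spotting webpack-dev-server by looking for its name in the command line arguments is an assumption about how the process was started.

```javascript
// Decide whether to add the visualization plugin.
// nodeEnv: the (already normalized) NODE_ENV value; argv: the process arguments.
function shouldVisualize(nodeEnv, argv) {
  const productionEnv = nodeEnv === 'production';
  const productionFlag = argv.indexOf('-p') !== -1;
  // Assumption: the dev server's script path or name appears somewhere in argv.
  const devServer = argv.some(arg => arg.indexOf('webpack-dev-server') !== -1);
  return !(productionEnv || productionFlag || devServer);
}

// In webpack.config.js:
// const visualize = shouldVisualize(process.env.NODE_ENV, process.argv);
```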

This advice also applies to the rest of the config file. If you have repetitive bits of configuration — webpack-dev-server proxy entries, CommonsChunk plugins, etc — there’s nothing stopping you from creating helper functions which generate these bits of configuration, and calling them inline, just as you’d refactor any other bit of repetitive code. It’s just JavaScript.
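For instance, a list of proxy entries that all point at the same backend can be generated rather than typed out. `proxyAll` is a hypothetical helper and the path prefixes are placeholders.

```javascript
// Build a devServer.proxy object that sends each listed path prefix to the same backend.
function proxyAll(prefixes, target) {
  return prefixes.reduce((proxy, prefix) => {
    proxy[prefix] = target;
    return proxy;
  }, {});
}

const devServer = {
  proxy: proxyAll(['/api', '/static', '/auth'], 'http://localhost:3000'),
};
```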

Final Webpack Config

Compared with the configs above, some changes I’ve already made were to add stage 1 and above transpilation, add a separate babel entry just for my simple-react-bootstrap so I could integrate the raw ES6 code from the project rather than the bloated transpiled version (which included all the babel helpers discussed above), and refine the static node build to just have react in it.

Conclusions

That build was my best attempt at creating bundles for all split points, while keeping shared utilities in their own, on-demand bundles. How did all this manual work compare to webpack?

The 30–40K figure admittedly includes the 18K for systemjs itself (which is no longer needed).

Still though, without webpack I was pushing down 18K for a script loader just for the privilege of pushing down an extra 12–22K of code to my users.

What was that 12–22K? Part of it was the more efficient builds that webpack creates. Because of how SystemJS works, it would create bundles that looked like this

System.registerDynamic('full/path.js',['full/path/A','full/path/B']

while webpack would create something more like

webpack=webpackJsonp([4,10],

With lots of small modules, that can add up. Also, my hand-crafted attempt to pull out shared code into on-demand utilities was terribly naive, and resulted in far too much code being prematurely sent down.

However frustrating webpack can be, I promise you the alternatives are worse.

Did I miss anything?

If you have questions about how something works, or why your code won’t work, you’ll be far better off asking on Stack Overflow. I’m still a beginner at webpack, so I likely won’t even know the answer, to say nothing of how much faster you’ll get a response there.

Happy coding!