Efficient Tree-Shakable lodash-es with a Namespace

Lost in Space
@tomchentw/software
6 min read · Jul 20, 2022

… and what I’ve learned debugging the issue of bundle sizing.

Background

Recently, our team wanted to migrate the JS transpiler in our Next.js project from Babel.js to SWC, since Next.js 12 has both built in. Our babel.config.js isn’t complex. The only blocking piece is babel-plugin-lodash. It does the following transformation (at the transpilation level):

// Given src:
import _ from "lodash";
_.map([a, b, c], _.toString);

// Output dist:
import _map from "lodash/map";
import _toString from "lodash/toString";
_map([a, b, c], _toString);

By doing so, we get rid of the monolithic lodash object and cherry-pick only the functions that are actually used, reducing the bundle size from the whole 73kB (24.5kB gzipped) to only a few bytes.

`modularizeImports` to the rescue

Obviously, the next step would be finding an alternative plugin in the SWC ecosystem to replace babel-plugin-lodash. Thanks to the amazing talent at Vercel, Next.js exposes the modularizeImports config for SWC: https://nextjs.org/docs/advanced-features/compiler#modularize-imports
They even use lodash as an example, so I thought: jackpot, easy job! Let me just remove our babel.config.js and configure experimental.modularizeImports for lodash.
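
The option takes an object keyed by package name. A minimal sketch, modeled on the lodash example in the Next.js docs linked above; the exact option location depends on the Next.js version, so treat this as an assumption to verify rather than a drop-in config:

```javascript
// next.config.js
// Sketch of the Next.js 12 experimental option. Newer Next.js versions may
// place or name this option differently, so check the docs for your version.
const nextConfig = {
  experimental: {
    modularizeImports: {
      // Rewrites `import { map } from "lodash"` to `import map from "lodash/map"`.
      lodash: {
        transform: "lodash/{{member}}",
      },
    },
  },
};

module.exports = nextConfig;
```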

Needless to say, it didn’t go well, or you wouldn’t be seeing this article. After the SWC migration, all pages functioned correctly, but the production bundle size increased by about 25kB. A good guess would be that the monolithic lodash module was included. A further bundle analysis verified this assumption. experimental.modularizeImports didn’t work as expected! Why?

Pitfall A: Did we accidentally use _ as a value (instead of functions’ namespace)?

_ itself is a function that creates a Seq instance. If someone uses it this way, it pulls in nearly all methods of lodash and makes the modularizeImports config useless. Luckily, I’m certain that our codebase does NOT use the Seq interface or the _.chain function. We only use _ as a namespace of functions for easier identification. So what went wrong that defeated modularizeImports, when babel-plugin-lodash had actually worked in the past?

The next question I immediately asked myself was:

Did we accidentally use _ as a JS identifier? Such as passing it as an argument to a function call? Or using it as the initializer of an object property?

A quick full-text search of the codebase for _ = , (_ , _) and _, suggested that only _, is used. Where? As the placeholder argument passed to the _.partial function, e.g.

const joinWithPlus = _.partial(_.join, _, " + ");
expect(joinWithPlus([1, 2, 3])).toEqual("1 + 2 + 3");

Obviously, we should have taken care of this when we migrated from the monolithic build of lodash to lodash-es. I then switched all these usages to _.partial.placeholder according to the documentation. (FYI, here’s the full list of methods that have a .placeholder property: _.bind, _.bindKey, _.curry, _.curryRight, _.partial, _.partialRight)
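
The point of the switch is that the placeholder becomes a property of the function itself, so the call no longer passes the namespace object around as a value. The sketch below uses a simplified, hypothetical `partial` stand-in (it is not lodash's implementation) so that it runs without dependencies; with lodash-es you would use the real `partial` and its `.placeholder` property instead.

```javascript
// Simplified stand-in for a partial-application helper with a placeholder.
// This is NOT lodash's implementation, only an illustration of the idea.
function partial(fn, ...preset) {
  return function (...later) {
    const queue = [...later];
    // Fill each placeholder slot with the next late argument, in order.
    const filled = preset.map((arg) =>
      arg === partial.placeholder && queue.length > 0 ? queue.shift() : arg
    );
    // Any remaining late arguments are appended after the preset ones.
    return fn.apply(this, [...filled, ...queue]);
  };
}
// The sentinel lives on the function, not on a namespace object.
partial.placeholder = {};

const join = (array, separator) => array.join(separator);

const joinWithPlus = partial(join, partial.placeholder, " + ");
const joined = joinWithPlus([1, 2, 3]);
```

Here `joined` holds the same string as in the `_.partial(_.join, _, " + ")` version, but nothing in the call refers to `_` as a value.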

Did it solve the issue? No, the final bundle still pulled in the whole lodash-es package; modularizeImports was still defeated. I then opened the final bundled and minified JS in VSCode and tried to get my head around it.

Pitfall B: Implicit use of _ due to transpilation

I might have spoiled it in the title already, but could you tell what’s going on here? Why did the following code cause the full import of the whole lodash-es package?

Could you tell what’s going on here?

If you think this is nothing but a standard spread operator usage on a function invocation, you’re right and it definitely is. The tricky part lies within the transpilation step. Here’s the final bundled & minified JS:

Here, 7835 is the ID of the lodash-es index file, which contains all functions. The module is evaluated and stored in the h variable. The _.join function is then pulled out of it, and .apply is called with the this argument set to e (the same as the h variable). Ah, this is presumably how a spread call on a member function is defined: it tries to preserve the this context when the function is called. .apply is used to pass the variable number of arguments produced by the spread operator.
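
This matches how a spread call on a member expression is typically lowered for targets without spread syntax: `obj.fn(...args)` becomes `obj.fn.apply(obj, args)`, so that `fn` still sees `obj` as `this`. A minimal sketch with a hypothetical stand-in object (`ns` is not lodash) shows the equivalence, and why the lowered form keeps a reference to the whole object:

```javascript
// Hypothetical stand-in for a module namespace object.
const ns = {
  separator: " + ",
  // Regular method: depends on `this`, so the receiver must be preserved.
  joinWithOwnSeparator(...items) {
    return items.join(this.separator);
  },
  // Standalone utility: never looks at `this`.
  join(array, separator) {
    return array.join(separator);
  },
};

const items = [1, 2, 3];

// Modern syntax: spread on a member call.
const modern = ns.joinWithOwnSeparator(...items);

// Lowered form for older targets: the receiver is passed as `this`.
const lowered = ns.joinWithOwnSeparator.apply(ns, items);

// For a utility that ignores `this`, the receiver is irrelevant at runtime,
// but the lowered code still mentions `ns` as a value.
const args = [[1, 2, 3], " + "];
const viaNamespace = ns.join.apply(ns, args);
const viaNothing = ns.join.apply(undefined, args);
```

A transpiler cannot know that `join` ignores `this`, so it has to emit the receiver-preserving form in both cases.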

But we don’t want the index module of lodash-es to be used as the this argument. We only intended to use _ as a namespace of functions. _.join() should not be treated as invoking a member function named join on the object _, but it seems to be. Why is this happening?

Modern JavaScript source code goes through three steps to become the compiled bundles:

  1. Transpilation: transform modern JavaScript syntax into widely-supported syntax like ES5. e.g. SWC, Babel.js
  2. Bundling: resolve module dependencies and group them together into giant bundle(s). e.g. Webpack
  3. Minification: where dead-code elimination/tree-shaking happens

Each step takes an input, and its output is passed to the next step. The above input is passed to the Transpilation step, and the output pseudo-code (which is the input for the Bundling step) is:

As you can see, the _ is already being used as a function argument here.

Passing this straight into the Bundling step results in the following pseudo-code:

Look at the last line, the _ is still being used as the first argument of the apply function, the this argument. Since it’s still being referenced, it cannot be tree-shaken by the final Minification step.

The solution for this is simple: pull the _.join function out of the namespace and store it in a local variable. Then, call the local variable as a function. The call will then be transpiled to .apply with null as the this argument.
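
A sketch of the fix, using a hypothetical stand-in namespace so that it runs without dependencies (in the real code `_` would be the lodash-es import). The comments describe the lowered form a transpiler would be expected to produce; the exact output depends on the tool and its target settings:

```javascript
// Hypothetical stand-in for `import * as _ from "lodash-es"`.
const _ = {
  join: (array, separator) => array.join(separator),
};

const args = [[1, 2, 3], " + "];

// Before: spread call through the namespace.
// Expected lowered form: _.join.apply(_, args)
// `_` is used as a value, so the whole namespace stays referenced.
const before = _.join(...args);

// After: pull the function into a local variable first.
// Expected lowered form: join.apply(<null or undefined>, args)
// Only the `join` member of the namespace is referenced.
const join = _.join;
const after = join(...args);
```

Both calls return the same string; only the shape of the transpiled output differs. Importing the function by name, `import { join } from "lodash-es"`, should have the same effect.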

Potential Next Steps?

The first thing that came to mind would be checking the ECMAScript spec (TC39) for function invocations: https://tc39.es/ecma262/multipage/ordinary-and-exotic-objects-behaviours.html#sec-ecmascript-function-objects-call-thisargument-argumentslist, and also how the this argument is bound: https://tc39.es/ecma262/multipage/ordinary-and-exotic-objects-behaviours.html#sec-ordinarycallbindthis

Does invoking an imported function require its this argument to be bound to the module object itself? This relates to how the this argument is defined when the transpiler expands the array spread operator into an .apply function call. The night is late and I don’t have the answer yet.

The second thought would be writing a new ESLint rule. It should error when an array spread operator is used on an imported function call. Unfortunately, people sometimes ignore ESLint errors/warnings, so this might be error-prone.
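
A rough sketch of what such a rule could look like, using ESLint's standard rule object shape (`meta` plus a `create` function returning AST visitors). The rule name and message are hypothetical, and the sketch only covers calls made through a namespace or default import; it ignores shadowed variables and other edge cases:

```javascript
// Sketch of an ESLint rule object. Name and message are hypothetical.
// To use it, it would be exported from a plugin's `rules` map.
const noSpreadOnNamespaceCall = {
  meta: {
    type: "problem",
    docs: {
      description:
        "disallow spread arguments in calls made through an imported namespace",
    },
    schema: [],
    messages: {
      spreadOnNamespace:
        "Spread call through imported namespace '{{name}}' may keep the whole module in the bundle; pull the function into a local variable first.",
    },
  },
  create(context) {
    // Local names bound by `import * as x from ...` or `import x from ...`.
    const namespaces = new Set();
    return {
      ImportDeclaration(node) {
        for (const specifier of node.specifiers) {
          if (
            specifier.type === "ImportNamespaceSpecifier" ||
            specifier.type === "ImportDefaultSpecifier"
          ) {
            namespaces.add(specifier.local.name);
          }
        }
      },
      CallExpression(node) {
        const callee = node.callee;
        const hasSpread = node.arguments.some(
          (arg) => arg.type === "SpreadElement"
        );
        if (
          hasSpread &&
          callee.type === "MemberExpression" &&
          callee.object.type === "Identifier" &&
          namespaces.has(callee.object.name)
        ) {
          context.report({
            node,
            messageId: "spreadOnNamespace",
            data: { name: callee.object.name },
          });
        }
      },
    };
  },
};
```

The rule relies on import declarations being visited before the calls that use them, which holds when imports sit at the top of the file.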

Last, could we build an analysis tool for compiled bundles that runs on CI? Something like screenshot diffing for UI, but for compiled bundles, that could look deep into each function and module of the dependencies.
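
The core of such a check could be as small as a per-module size diff between two builds. The sketch below assumes a hypothetical report shape (module path mapped to bytes) and uses made-up numbers purely for illustration; real bundler stats are richer and would need to be flattened into this shape first:

```javascript
// Compare two per-module size reports (module path -> bytes) and list the
// modules that grew by more than `thresholdBytes`. The report shape is hypothetical.
function diffModuleSizes(baseline, current, thresholdBytes = 1024) {
  const regressions = [];
  for (const [modulePath, size] of Object.entries(current)) {
    const previous = baseline[modulePath] ?? 0; // new modules count from zero
    const delta = size - previous;
    if (delta > thresholdBytes) {
      regressions.push({ modulePath, previous, size, delta });
    }
  }
  // Largest growth first, so the worst offender is at the top of the CI log.
  return regressions.sort((a, b) => b.delta - a.delta);
}

// Illustrative, made-up sizes: one module newly pulled in, one small change.
const baseline = { "lodash-es/join.js": 300, "src/app.js": 5000 };
const current = {
  "lodash-es/join.js": 300,
  "lodash-es/lodash.js": 25000,
  "src/app.js": 5200,
};
const regressions = diffModuleSizes(baseline, current);
```

A CI job could fail the build whenever `regressions` is non-empty and print the offending module paths.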

Conclusion

What have I learned from this? Modern JavaScript (and yes, TypeScript as well) adds complexity to the ecosystem, and I’m not sure how quickly we can get rid of it. We’re not getting there soon, so the ability to dive into compiled bundles and look for root causes is important. I hope this article gives you some inspiration.

If you’re looking for someone with hands-on experience to join your team, please reach out :)

I would love to offer my professional experience to your team and help optimize the product. My email is on my GitHub profile: https://github.com/tomchentw

<Tom Chen> Aspie. Introvert. Remoter. Blogger. "From the 26 letters of code to the 26 letters of prose, I began to discover the charm of writing."