Efficient Tree-Shakable lodash-es with a Namespace
… and what I’ve learned debugging a bundle size issue.
Background
Recently, our team wanted to migrate the JS transpiler in our Next.js project from Babel.js to SWC, since Next.js 12 ships with both built in. Our babel.config.js isn’t complex. The only blocking piece is babel-plugin-lodash. It does the following transformation (at the transpilation level):
// Given src:
import _ from "lodash";

_.map([a, b, c], _.toString);

/**
 * Output dist:
 */
import _map from "lodash/map";
import _toString from "lodash/toString";

_map([a, b, c], _toString);
By doing so, we get rid of the monolithic lodash object and cherry-pick only the functions that are actually used, reducing the bundle size from the whole 73kB (24.5kB gzipped) to a small fraction of that.
`modularizeImports` to the rescue
Obviously, the next step was to find an alternative plugin in the SWC ecosystem to replace babel-plugin-lodash. Thanks to the amazing talents at Vercel, Next.js ships the modularizeImports config for SWC: https://nextjs.org/docs/advanced-features/compiler#modularize-imports
They even use lodash as an example, so I thought: jackpot, easy job! Let me just remove our babel.config.js and configure experimental.modularizeImports for lodash.
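For reference, here is a minimal sketch of what that configuration could look like, adapted from the lodash example in the Next.js docs linked above. The placement of the option is an assumption tied to the Next.js version (it sat under experimental in Next.js 12), and the transform string would need adjusting for lodash-es:

```javascript
// next.config.js (sketch; option placement depends on the Next.js version)
module.exports = {
  experimental: {
    modularizeImports: {
      lodash: {
        // Rewrites `import { map } from "lodash"` into `import map from "lodash/map"`.
        transform: "lodash/{{member}}",
      },
    },
  },
};
```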
Needless to say, it didn’t go well, or you wouldn’t be seeing this article. After the SWC migration, all pages functioned correctly but the production bundle size increased by about 25kB. A good guess was that the monolithic lodash module had been included, and a further bundle analysis verified this assumption. experimental.modularizeImports didn’t work as expected! Why?
Pitfall A: Did we accidentally use _ as a value (instead of a namespace of functions)?
_ itself is a function that creates a Seq instance. If someone uses it this way, it pulls in nearly all methods of lodash and makes the modularizeImports config useless. Luckily, I’m certain that our codebase does NOT use the Seq interface NOR the _.chain function. We only use _ as a namespace of functions for easier identification. So what defeated modularizeImports, when babel-plugin-lodash had actually worked in the past?
The next question I immediately asked myself was:
Did we accidentally use _ as a standalone JS identifier? Such as passing it as an argument to a function call? Or using it as the initializer of an object property?
A quick full-text search of the codebase for "_ =", "(_", "_)" and "_," suggested that only "_," is used. Where? As the placeholder argument passed to the _.partial function, e.g.:
const joinWithPlus = _.partial(_.join, _, " + ");
expect(joinWithPlus([1, 2, 3])).toEqual("1 + 2 + 3");
Obviously, we should have taken care of this when we migrated from the monolithic build of lodash to lodash-es. I then switched all these usages to _.partial.placeholder according to the documentation, so the call above becomes _.partial(_.join, _.partial.placeholder, " + "). (FYI, here’s the full list of methods that have the .placeholder property: _.bind, _.bindKey, _.curry, _.curryRight, _.partial, _.partialRight)
Did it solve the issue? No, the final bundle still pulled in the whole lodash-es package, and modularizeImports was still defeated. I then opened the final bundled & minified JS in VSCode and tried to get my head around it.
Pitfall B: Implicit use of _ due to transpilation
I might have spoiled it in the title already, but can you tell what’s going on here? Why did the following code cause the whole lodash-es package to be imported?
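The snippet this paragraph refers to did not survive in this copy. Judging from the description that follows, it was a spread call on _.join. Below is a minimal stand-in with the same shape; the `_` object is a hypothetical substitute for the lodash-es namespace import (so the sketch runs without dependencies), and `args` is a made-up input:

```javascript
// Hypothetical stand-in for `import * as _ from "lodash-es"`.
const _ = {
  join: (array, separator) => Array.prototype.join.call(array, separator),
};

// Made-up arguments: the array to join and the separator.
const args = [["a", "b", "c"], "-"];

// A spread call on a member of the namespace.
const joined = _.join(...args);
```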
If you think this is nothing but a standard use of the spread syntax in a function call, you’re right: it is. The tricky part lies in the transpilation step. Here’s the final bundled & minified JS:
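The minified output itself is also missing from this copy. The sketch below only reconstructs its shape from the description that follows: `n` is a hypothetical stand-in for the bundler's module loader, the module body is a stub, and the arguments are made up.

```javascript
// Hypothetical stand-in for the bundler runtime's module table and loader.
const modules = {
  // 7835: the lodash-es index module, stubbed here with a single function.
  7835: () => ({
    join: (array, separator) => Array.prototype.join.call(array, separator),
  }),
};
const n = (id) => modules[id]();

var e, h = n(7835);
// The whole index module `h` is kept alive as the `this` value of the call.
var result = (e = h).join.apply(e, [["a", "b", "c"], "-"]);
```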
Here 7835 is the ID of the lodash-es index file, which contains all functions. The module is evaluated and stored in the h variable. The _.join function is pulled out of it, and .apply is called with the this argument set to e (the same object as the h variable). Ah, this follows from how a spread call on a member function is defined: it preserves the this context when the function is called. The .apply is used to handle the variable number of arguments produced by the spread.
But we don’t want the index module of lodash-es to be used as the this argument. We only intended to use _ as a namespace of functions. _.join() should not be treated as invoking a member function named join on the object _, but it seems to be. Why is this happening?
Modern JavaScript source code takes three steps to be transformed into compiled bundles:
- Transpilation: transforms modern JavaScript syntax into widely supported syntax like ES2015, e.g. SWC, Babel.js
- Bundling: resolves module dependencies and groups them together into giant bundle(s), e.g. Webpack
- Minification: where dead-code elimination/tree-shaking happens
Each step takes an input, and its output is passed to the next step. The source above is the input to the Transpilation step, and the output pseudo-code (which is the input for the Bundling step) is:
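The original pseudo-code is not reproduced in this copy. As a sketch of what a transpiler has to emit for a spread call on a member, the example below compares the source form with its down-leveled .apply form. The `_` object is a hypothetical stand-in for the lodash-es namespace that records the `this` value it receives; it is not the real library.

```javascript
// Hypothetical stand-in for the lodash-es namespace; records `this` for demonstration.
var seenThis = [];
var _ = {
  join: function (array, separator) {
    seenThis.push(this);
    return Array.prototype.join.call(array, separator);
  },
};
var args = [["a", "b", "c"], "-"];

// Source form: a member call with spread arguments.
var viaSpread = _.join(...args);

// Down-leveled form: to keep the same `this`, the namespace itself
// has to be passed as the first argument of apply.
var viaApply = _.join.apply(_, args);
```

Both calls see `_` as their `this` value, which is why the transpiler cannot simply drop the reference to the namespace.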
As you can see, _ is already being used as a function argument here.
Passing this straight into the Bundling step results in the following pseudo-code:
Look at the last line: _ is still being used as the first argument of the apply function, the this argument. Since it’s still being referenced, it cannot be tree-shaken by the final Minification step.
The solution for this is simple: pull the _.join function out of the namespace and store it in a local variable. Then, call the local variable as a function. As a result, the .apply call will be transpiled with null as the this argument.
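A minimal sketch of that fix, again using a hypothetical stand-in for the lodash-es namespace that records the `this` value it receives:

```javascript
// Hypothetical stand-in for the lodash-es namespace; records `this` for demonstration.
const receivers = [];
const _ = {
  join: function (array, separator) {
    receivers.push(this);
    return Array.prototype.join.call(array, separator);
  },
};
const args = [["a", "b", "c"], "-"];

// Pull the function out of the namespace first...
const join = _.join;

// ...so the spread call has no receiver. Down-leveled, this becomes
// join.apply(null, args) (or void 0) and no longer passes `_` along.
const joined = join(...args);
```

With the real library, a named import such as `import { join } from "lodash-es"` should have the same effect, since the call no longer goes through the namespace object.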
Potential Next Steps?
The first thing that came to mind was checking the ECMAScript spec (TC39) for function invocations: https://tc39.es/ecma262/multipage/ordinary-and-exotic-objects-behaviours.html#sec-ecmascript-function-objects-call-thisargument-argumentslist, and also how the this argument is bound: https://tc39.es/ecma262/multipage/ordinary-and-exotic-objects-behaviours.html#sec-ordinarycallbindthis
Does invoking an imported function require its this argument to be bound to the module object itself? This relates to how the this argument is defined when the transpiler expands the array spread operator into the .apply function call. The night is late and I don’t have the answer yet.
The second thought was writing a new ESLint rule. It should error when the array spread operator is used in a call to an imported function. Unfortunately, people sometimes ignore ESLint errors/warnings, so this might be error-prone.
Last, could we build an analysis tool for compiled bundles that runs on CI? Something like screenshot diffing for UIs, but for compiled bundles, which could look deep into each function/module in the dependencies.
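One possible starting point for such a check is sketched below. It compares per-module sizes between a baseline build and the current build. The input shape (a plain object mapping module path to bytes), the function name and the threshold are all assumptions; a real tool would derive the sizes from the bundler's stats output.

```javascript
// Sketch: report modules that grew by at least `thresholdBytes` between two builds.
// Both inputs are hypothetical maps of { [modulePath]: sizeInBytes }.
function diffModuleSizes(baseline, current, thresholdBytes = 1024) {
  const paths = new Set([...Object.keys(baseline), ...Object.keys(current)]);
  const regressions = [];
  for (const path of paths) {
    const before = baseline[path] ?? 0; // module absent from the baseline build
    const after = current[path] ?? 0; // module removed in the current build
    const delta = after - before;
    if (delta >= thresholdBytes) {
      regressions.push({ path, before, after, delta });
    }
  }
  // Largest regressions first.
  return regressions.sort((a, b) => b.delta - a.delta);
}

// Usage with made-up numbers: a newly included module shows up as a regression.
const regressions = diffModuleSizes(
  { "pages/index.js": 4000 },
  { "pages/index.js": 4100, "vendor/whole-library.js": 25000 }
);
```

A CI job could fail the build whenever the returned list is non-empty.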
Conclusion
What have I learned from this? Modern JavaScript (and yes, TypeScript as well) adds complexity to the ecosystem, and I’m not sure how quickly we can get rid of it. We’re not getting there soon, so the ability to dive into compiled bundles and look for root causes is important. I hope this article gives you some inspiration.
If you’re looking for someone with hands-on experience to join your team, please reach out :)
I would love to offer my professional experience to your team and help optimize the product. My email is on my GitHub profile: https://github.com/tomchentw
Tom Chen is an open source contributor, a Toptal member and an Arc.dev member.