What’s really wrong with node_modules and why this is your fault
I was never really concerned about the size of node_modules; my thinking was that you should not care too much about the tools you need to do the job. If you need a 20 kg hammer to drive a nail, you just take it. The same goes for node_modules: it may weigh a few kilobytes or a few megabytes because our imaginary hammer comes with a set of heavy nails, right? Well, maybe, in theory.
Let’s drop that fancy analogy and look at real-world examples. I will examine some @angular/cli dependencies, but only because it’s quite a big library. I don’t want to make it look bad; it’s just a good representation of an average package. I installed it in an empty directory using npm@5.5.1. After installation npm reported “added 976 packages in 107.13s” (that’s 141 megabytes on disk).
Okay, so the cli package is quite a robust library and its list of dependencies is a bit too long, but perhaps all of them are needed. Let’s focus on the first selected package, common-tags. A quick look at its documentation tells you it’s a utils library with a number of common methods for working with text. So far so good: general methods, easy to reuse.
Just one little flaw: in the deps of common-tags we see babel-runtime. A bit surprising, since we just need some common text functions, but hey, it’s 2017 JS for you. Oh wait, it turns out that babel-runtime wants core-js and regenerator-runtime. Fortunately it ends here and, what’s more, core-js is also a utils library, quite a big one honestly! It has so many functions inside I bet a lot of other packages will be using it!
Not really. Only babel-runtime has it in its deps. Oopsie.
And returning to the starting point, cli uses only 3 (trivial) methods from common-tags: stripIndents, stripIndent and oneLine. Oopsie daisy.
In order to use these 3 methods, node_modules needs 1826 files. And that’s just 4 of the 976 installed packages mentioned above.
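To give a sense of how little code such helpers need, here is a rough sketch of three template tags with similar behavior. It is not the common-tags source: the helper name interpolate is made up for illustration, and the real library likely handles edge cases that this sketch ignores.

```javascript
// Hypothetical stand-ins for illustration only, NOT the common-tags implementation.

// Rebuild the full string from the pieces a tagged template call receives.
function interpolate(strings, values) {
  return strings.reduce(
    (out, chunk, i) => out + chunk + (i < values.length ? String(values[i]) : ''),
    ''
  );
}

// Collapse every run of whitespace (including newlines) into a single space.
function oneLine(strings, ...values) {
  return interpolate(strings, values).replace(/\s+/g, ' ').trim();
}

// Remove all leading spaces and tabs from every line.
function stripIndents(strings, ...values) {
  return interpolate(strings, values).replace(/^[ \t]+/gm, '').trim();
}

// Remove only the smallest common indentation, keeping relative nesting.
function stripIndent(strings, ...values) {
  const lines = interpolate(strings, values).split('\n');
  const indents = lines
    .filter((line) => line.trim().length > 0)
    .map((line) => line.match(/^[ \t]*/)[0].length);
  const min = indents.length > 0 ? Math.min(...indents) : 0;
  return lines.map((line) => line.slice(min)).join('\n').trim();
}
```

Each function can be used as a tag in front of a template literal, for example oneLine`...`, in the same way the cli uses the originals.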
The next dependency is core-object. It downloaded a total of 8 packages and 45 files, so not so bad. And other packages use these files too, mostly chalk.
The real bummer is that 6 of these 8 are dependencies of chalk, and chalk is used only once in core-object, to paint a yellow deprecation message.
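For comparison, a yellow message needs only two ANSI escape sequences. The sketch below is an assumption about what such a call site could look like without a color library. The names yellow and warnDeprecated are hypothetical, and the sketch skips the finer terminal capability detection that a dedicated library provides.

```javascript
// ANSI SGR codes: 33 selects a yellow foreground, 39 resets the foreground color.
const YELLOW_OPEN = '\u001b[33m';
const YELLOW_CLOSE = '\u001b[39m';

// Wrap text in yellow, but only when stderr is an interactive terminal
// (a deliberately crude check; hypothetical helper, not part of any library).
function yellow(text, useColor = Boolean(process.stderr.isTTY)) {
  return useColor ? YELLOW_OPEN + text + YELLOW_CLOSE : text;
}

// Hypothetical call site: print a deprecation warning to stderr.
function warnDeprecated(message) {
  console.warn(yellow('DEPRECATION: ' + message));
}
```

Whether a few lines like these are an acceptable trade for dropping a dependency is a judgment call, but it shows the scale of the feature being imported.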
Other random findings:
- a few packages to tackle the topic of “querystring”
- some attempts at assert methods, varying in complexity from minimalistic-assert to assert-plus
- dozens of various is-* packages
- a lot of packages to “prettifyfifafiying-whatever” errors and console prints
- hundreds of polyfills/shims or reimplementations of native methods
- of course the previous asserts sit in node_modules next to full lodash
- … and some partial methods from lodash as separate deps
And by random I mean just picking some packages and briefly searching for similar ones, which was often really easy because the packages had similar names.
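That kind of search can be roughly automated. The sketch below groups package names by the text before the first dash or dot. The function name groupByPrefix and the grouping rule are illustrative assumptions, not an npm feature.

```javascript
// Group package names by the text before the first "-" or "." so that
// near-duplicates stand out. Hypothetical helper for illustration.
function groupByPrefix(names) {
  const groups = new Map();
  for (const name of names) {
    // Drop an npm scope such as "@angular/" before looking for a prefix.
    const bare = name.startsWith('@') ? name.slice(name.indexOf('/') + 1) : name;
    const prefix = bare.split(/[-.]/)[0];
    if (!groups.has(prefix)) groups.set(prefix, []);
    groups.get(prefix).push(name);
  }
  // Keep only prefixes shared by at least two packages, largest group first.
  return [...groups.entries()]
    .filter(([, members]) => members.length > 1)
    .sort((a, b) => b[1].length - a[1].length);
}
```

Feeding it the result of `require('fs').readdirSync('node_modules')` would list the most crowded prefixes first. Note that scoped packages sit one directory deeper, so they would need an extra pass.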
Let’s stop here. I bet the other dependencies are necessary and well thought out.
That is nothing new for JS devs; it has been like that for a while now, and this situation shouldn’t be accepted. The size of node_modules is a topic of jokes, and the removal of packages like “left pad” is the cause of disasters.
So how can it be fixed? By creating the proper Standard Library.
Proper means that it should be complete, containing a variety of common functions to operate on text, numbers, collections and more, so that 99.99% of the time you won’t need any other library. Is that fantasy? I don’t think so. Taking some of the mentioned utils libraries as the base and merging them with others would be a great start.
There is only one problem, and it’s not even a technical one: creating such a package would need someone to take the position of leader. I’m thinking of absolute power rather than democracy; if you want to know how democracy handles this problem, look again at your node_modules. It needs a powerful leader because it needs a solid plan, not just months of discussions. We already have all the packages implemented, we just need to glue them together in a logical way.
Right now, in pursuit of reusability and “keeping it DRY”, the typical node_modules directory has ended up being completely WET, just because we thought that a dozen packages with overlapping functionality were better than one well-planned library.
Just a small note about jQuery — not so long ago, jQuery was in almost every project. Why? There were a number of solid reasons:
- It provided a set of commonly used functions in a convenient form. jQuery methods were easily chainable with each other (a result of being developed by one organization).
- It was widely known by everyone, so joining a project was easy, as there was no extra learning curve.
- Although some people complained about the size of jQuery, it was mostly irrelevant as it was often loaded from a CDN, so 90% of the time it was already present on the user’s computer.
The last point is very important: it really doesn’t matter how big a library is if you don’t need to download it. I’m thinking of lodash: it really has tons of functions that can replace a lot of unnecessary dependencies. With its modular structure and lack of dependencies, it is a great library to choose if you need something that is missing in JavaScript.
Picking a smaller library only because of its size is the same kind of evil as premature optimization: you end up with no performance gain and confusing code. Same here: smaller, unorganized packages lead to redundancy, incompatibility and an overall much bigger node_modules size.
At the beginning I said that I don’t care about the size of node_modules, and this is partly true: I don’t care so much about the space it takes, but I do care about the number of files. In the case of @angular/cli, almost 70% of disk space is taken by the 20 biggest packages (the Pareto principle works everywhere!).
What’s more, if you take a look at WinDirStat’s report below you will see that a lot of packages contain big files (for example sourcemaps, shown in green). And when it comes to copying files, computers handle a few big files better than thousands of tiny ones.
A large number of packages has one more drawback: potential version incompatibility. Let’s assume each package has two versions, 1.0 and 2.0. In the worst scenario some packages may need 1.0 and the others 2.0. The more packages there are in our app, the more possible combinations you get. Resolving those combinations takes time, CPU and space to keep every occurrence of the old version needed by some equally old package.
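As a back-of-the-envelope sketch of how fast this grows: if every package could be required in any of its versions independently (a simplifying assumption, as real dependency trees are far more constrained), the number of combinations is the number of versions raised to the number of packages. The function name versionCombinations is made up for illustration.

```javascript
// Upper bound on distinct version assignments when each of `packageCount`
// packages can independently be required in one of `versionsPerPackage` versions.
// BigInt is used because the result quickly exceeds Number.MAX_SAFE_INTEGER.
function versionCombinations(packageCount, versionsPerPackage = 2) {
  return BigInt(versionsPerPackage) ** BigInt(packageCount);
}

console.log(versionCombinations(10)); // 1024n
```

With two versions each, 10 packages already allow 1024 combinations. For the 976 packages installed earlier, the same upper bound is a number 294 digits long.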
Epilogue
This is the world we live in. The worst thing is that “it kinda works”, so it’s not going to change anytime soon. Creating a JS Standard Library would help, but the real change is needed in developers’ mindset.
So if you can remember one thing from my looong article, let it be “use lodash”. And if you can, also take “use popular packages that are already used by others”. Programming is not an Individuality Contest, so choose tested, war-seasoned libraries and do not increase JavaScript entropy.