The truth about package-lock.json
After 14 years of Node.js, the meme that shows how the node_modules
directory is the heaviest object in the universe is a hallmark. But jokes aside — do not miss the subtext:
The lion's share of your shippable software is code you did not write.
Your software stands on the shoulders of giants and, from time to time, these giants need care.
What care do Packages need?
Consider the causes that bring software authors to release a new version:
- Be competitive: do more, add new features
- Do better: fix bugs, adhere to spec
- Stay safe: close recently discovered security issues
Bug or feature — both are things known during development: It’s clear what to verify before publication.
However, the last one means that a new security issue might be discovered after publication; you could not have checked it during coding and testing, even if you wanted.
The solution depends only on your ability to respond with a new version.
Dependencies you use are also software, whose authors may need to release new versions for the same 3 reasons. Maybe the first is not relevant, but the latter two should be alarming enough.
Let’s dig deeper.
Dependencies in Node.js projects
When you add a dependency to a Node.js project, your package manager installs it and updates it in your package.json.
e.g. in this script:
3rd line yields:
"dependencies": {
  "pino": "^8.7.0"
}
Note the caret (^): this is not a concrete version but a version policy, expressed in SemVer.
I’ll give the highlights of SemVer later in this post, but for now, just note that this version policy means: any version that is greater than or equal to 8.7.0 but smaller than 9.0.0, i.e. a range.
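To make the range concrete, here is a minimal sketch of the caret check for a major of 1 or higher. The function names are my own and hypothetical. Real package managers rely on a full range parser that also handles prereleases and the zero-major rules, which this sketch deliberately leaves out.

```javascript
// Parse "8.7.0" into [8, 7, 0]. Prerelease labels are out of scope here.
function parseVersion(version) {
  const parts = version.split(".").map(Number);
  if (parts.length !== 3 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error(`Not a plain M.m.p version: ${version}`);
  }
  return parts;
}

// Negative when a < b, zero when equal, positive when a > b.
function compareVersions(a, b) {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// True when `version` falls inside the caret range of `base`,
// i.e. base <= version < (major + 1).0.0, for a major of 1 or higher.
function satisfiesCaret(version, base) {
  const v = parseVersion(version);
  const b = parseVersion(base);
  if (b[0] === 0) throw new Error("Major 0 follows different caret rules");
  return compareVersions(v, b) >= 0 && v[0] === b[0];
}

console.log(satisfiesCaret("8.7.0", "8.7.0"));  // true
console.log(satisfiesCaret("8.21.3", "8.7.0")); // true
console.log(satisfiesCaret("8.6.9", "8.7.0"));  // false
console.log(satisfiesCaret("9.0.0", "8.7.0"));  // false
```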
White Hat / Black Hat
Consider this:
An exploit is the use of a vulnerability, such as XSS or code injection.
Black hats are malicious hackers that seek exploits to use them.
White hats are protective hackers that seek exploits to close them.
On top of that, exploits are getting more and more sophisticated. Hackers have learned to chain minor vulnerabilities that each seem harmless on their own but, when found together, enable a breach.
The vicious cycle is: a version is released to the world, hackers find its exploits, security fixes are worked out — which brings us back to a new version, and so on.
The often-missed obvious fact is:
Security fixes come only in new versions.
Somebody moved my cheese!
Before package-lock.json, installing dependencies a second time using the same version policies might not yield the same result.
Meaning, you developed and tested your code with the latest & greatest dependencies available at the time you installed them, and it all worked great. Meanwhile, a new dependency version that you never tested with your code was released, and that can sometimes break a project in CI.
Also, when you use a package manager to install dependencies and your node_modules already holds some version that satisfies the policy range, the package manager will not check whether a newer version exists. This creates an ever-increasing discrepancy between what’s on your developer’s disk and the CI.
The simplest solution was: package-lock.json
Whenever you add or update dependencies, by default your package manager maintains, in addition to your node_modules and the package.json, another file: package-lock.json. It documents the exact versions that landed on the disk with which you developed and tested.
Whenever this file is found, and you ask your package manager to install a project without modifying any dependency, it will use these exact versions, no questions asked.
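To see what "exact versions" looks like in practice, here is a small sketch that lists what a lock file pins. It assumes the shape used by npm's lockfileVersion 2 and 3, where a `packages` map is keyed by install path. The lock content itself is a trimmed, made-up example, and `lockedVersions` is a hypothetical helper name.

```javascript
// A trimmed, made-up lock file in the shape of npm's lockfileVersion 3.
// Real files also carry fields such as "resolved" and "integrity".
const lock = {
  name: "my-app",
  lockfileVersion: 3,
  packages: {
    "": { name: "my-app", dependencies: { pino: "^8.7.0" } },
    "node_modules/pino": { version: "8.7.0" },
    "node_modules/example-dep": { version: "1.2.3" },
  },
};

// Collect the exact versions the lock file pins, grouped by package name.
// A name may appear more than once when nested copies are installed.
function lockedVersions(lockFile) {
  const marker = "node_modules/";
  const pinned = {};
  for (const [path, entry] of Object.entries(lockFile.packages || {})) {
    const at = path.lastIndexOf(marker);
    if (at === -1) continue; // the root project or a workspace folder
    const name = path.slice(at + marker.length);
    pinned[name] = pinned[name] || [];
    pinned[name].push(entry.version);
  }
  return pinned;
}

console.log(lockedVersions(lock));
```

Note that the root entry still holds the policy (^8.7.0), while the entry under node_modules holds one single version: that is the version an install will reproduce for as long as the file stays unchanged.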
However, by doing that, it signs you up for a security technical debt in your name.
The ease of using package-lock.json to produce deterministic builds led to the practice of committing it as part of the project sources, producing the same frozen dependency versions on every CI run and fixating the versions, and with them the state of the security technical debt.
That’s because locking your versions also means you do not get security fixes.
So why is package-lock.json the default behavior?
Because too many projects are owned by teams that do not have a CI setup strong enough to verify their software in every build.
This is especially true for a very important driving factor of our industry: kick-off projects and start-ups, that need to kick ass and score quickly.
SemVer violations are also a factor, i.e. packages that do not communicate compatibility properly.
Thus, this is the default behavior of package-managers, assuming that projects will opt-out of it as they mature (see here).
However, until then, this puts them in a constant state of technical debt, or worse: fixed versions that are updated only after the boom...
Kicking the can down the road
Each time you update your dependencies and your package-lock.json, you close this technical debt. However, as you update it in your SCM, you sign on a new one that will wait for you until the next update, kicking the can down the road.
How to evolve from that?
Level 1: Adopt a routine by which you periodically try to update your dependencies and test your software, and if all works well — commit it.
Level 2: Semi-automated. There are good bots that hang on your SCM and open periodic pull requests to commit such updates (e.g. Dependabot, Renovate). However, these code changes and the builds they produce are inspected by humans.
Level 3: Given sufficient automated tests, these version updates can be accepted automatically.
…So now you have a mechanism that kicks the can for you.
But then again, if you managed to obtain such coverage, why wait for the periodic run of the bot and its pull request? I daresay you do not need the package-lock.json and all the rest of the premature parasite mechanisms around it…
So, Level 4: Mature CI setup.
Mature CI setup: you can get rid of the can
Continuous Integration (CI) is a process whose core objective is to produce the most thoroughly verified software distribution. CI runs as many tests as we can afford to execute and maintain, starting from static code analysis like lint and ending with integration tests (more depth here).
A mature CI means that every build of your software meets all its required criteria — including security and dependency scans.
When the entire software distribution is verified, including its latest & greatest dependencies, with sufficient coverage, it means ZERO security technical debt.
Fixating versions without package-lock.json file
When to fixate? When a dependency version is breaking and/or is announced with a security issue and a fix is yet to be provided.
How to fixate? By modifying the SemVer policy (see the section about SemVer below).
How to not forget? A ticket in your issue-tracker, a story on your board, a recurring reminder in your calendar, a ceremony — whatever works for your team…
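One more way to not forget is to let a script remind you. The sketch below scans a package.json object and reports every dependency whose policy is an exact version, which is what an explicit fixation looks like. The manifest content and the `findFixations` name are hypothetical, and the sketch assumes fixations are written as plain exact versions. A fixation expressed as a bounded range would need a richer check.

```javascript
// Matches a policy that names exactly one version, e.g. "8.7.1" or "=8.7.1".
const EXACT_VERSION = /^=?\s*\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?$/;

// List every dependency that is fixated to a single version.
function findFixations(manifest) {
  const found = [];
  for (const section of ["dependencies", "devDependencies"]) {
    for (const [name, policy] of Object.entries(manifest[section] || {})) {
      if (EXACT_VERSION.test(policy.trim())) {
        found.push({ section, name, policy });
      }
    }
  }
  return found;
}

// A made-up manifest: only "example-lib" is fixated.
const manifest = {
  dependencies: { pino: "^8.7.0", "example-lib": "2.4.1" },
  devDependencies: { "example-tool": "~1.0.0" },
};

console.log(findFixations(manifest));
```

Running something like this in CI and printing the result keeps every explicit fixation visible in each build log.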
Is it a form of technical debt?
Totally YES. Period.
Fixating a version is signing on a technical debt, no matter if you do it in your package.json or using package-lock.json.
You should undo the fixation as soon as a version that fixes the issue is released, whether it is a bug or a security fix.
The thing is that package-lock.json fixates ALL versions, and commits you to a routine of periodic updates. However, when you work with sufficient coverage in CI and without package-lock.json, you commit to revisiting only the versions that you chose to fixate explicitly, which, in my opinion, is safer and more manageable.
The version-age fallacy
Projects should prefer the latest & greatest versions of their dependencies in order to include any security fix the white hats bring.
On the other hand, if a vulnerability in a dependency evaded detection and got published just now — the race is on between the black hats to find and use it, and the white hats to report it and close it.
The latter concern has led some teams to consider a policy that forbids using versions that are “too young”.
However, this also means that fixes are not accepted until they are old enough — so the same policy also means that when a problem is found, it stays in your software while the fix itself is waiting to be “mature”…
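Here is a minimal sketch of such a minimum-age policy and of how it backfires. All the data is invented for illustration: the versions, the dates, the `vulnerable` flag and the 14-day threshold are assumptions, and `oldEnough` is a hypothetical helper.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Keep only the releases published at least `minAgeDays` before `now`.
function oldEnough(releases, minAgeDays, now) {
  return releases.filter((r) => now - r.published >= minAgeDays * DAY_MS);
}

// Invented timeline: the fix was published two days before "now".
const now = Date.UTC(2024, 0, 31);
const releases = [
  { version: "8.7.0", published: Date.UTC(2023, 10, 1), vulnerable: true },
  { version: "8.7.1", published: Date.UTC(2024, 0, 29), vulnerable: false },
];

const allowed = oldEnough(releases, 14, now);
console.log(allowed.map((r) => r.version)); // [ '8.7.0' ]
```

With a 14-day threshold, the only version the policy lets through is the vulnerable one, and the fix stays out for twelve more days.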
Sometimes you can go back to a version without the issue, just until the fix is “ripe”. But what if the exploit is there for a long while in versions you depend on, and has been detected and fixed just now?
One can implement a custom mechanism full of exceptions and cascading values, but that’s another dangerous adventure. It may become something whose how and why nobody really understands. Instead, use a proven documented solution, like Snyk.
So, no. The best practice is to use AND test your software with the latest & greatest backward-compatible versions. Continuously.
SemVer — Semantic Versioning Highlights
There is no care-free solution that can automate everything — but we can get close to that. While there will be times we’ll need human attention, SemVer is designed to make them rare.
SemVer, i.e. semantic versioning, came up with a way to minimize the care dependencies demand, delegating as much as possible to the inherent automatic processes. Here are the core principles.
Immutable
A version cannot be published twice. Once published, it’s immutable.
This allows a dependency cache and is a basis of the ecosystem.
Semantic
The version is not an opaque hash nor just a running number or date. It contains logical information:
The play of numbers conveys what changed, and how significant it is for compatibility
<M> . <m> . <p>[ - <pr>]
Let’s start with the easiest:
<p> for Patch. A patch increase conveys:
This version did not change any of its APIs or make a significant change to its behavior-by-specs. A patch build may refine how it upholds its promises, solve bugs, fix security issues, but is:
💚 fully backward compatible.
- if you’re expecting a fix in a given patch — indicate it as the patch lower bound in your version policy.
<m> for Minor. A minor increase conveys:
This version adds new APIs and/or functionality (e.g. supporting new feature flags, overloading existing APIs, exposing more methods, etc.), but it still upholds the promises of all of its older APIs and behaviors. Meaning:
💚 fully backward compatible.
- An increase in minor is expected to reset the patch counter.
- If you’re expecting new features — indicate it as the minor lower bound in your version policy.
<M> for Major. A major increase conveys:
This version does not uphold ALL the promises of previous versions, i.e. there is at least one API, behavior or method that is:
⚠️ breaking compatibility.
- An increase in Major is expected to reset the minor and patch counters.
[ - <pr>] for an optional prerelease:
Prerelease versions end with a label that comes after the 3 numbers, and do not support any dynamic version-resolving.
This means that an npm client will not get them unless it asks for them explicitly. They are simply filtered out from the (~) and (^) searches (see below).
This allows developers to publish versions that are served only to those who ask for them explicitly — for “preview” or “evaluation”.
e.g: "package": "1.2.3-rc.1"
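The anatomy above can be sketched in a few lines. This is a simplified illustration: `parseSemVer` and `bumpType` are hypothetical names, the pattern covers only the `<M>.<m>.<p>[-<pr>]` form shown in this post, and it ignores build metadata and the finer ordering rules for prerelease labels.

```javascript
// Split a version such as "1.2.3-rc.1" into its semantic parts.
function parseSemVer(text) {
  const match = /^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$/.exec(text);
  if (!match) throw new Error(`Not a SemVer version: ${text}`);
  return {
    major: Number(match[1]),
    minor: Number(match[2]),
    patch: Number(match[3]),
    prerelease: match[4] || null,
  };
}

// Name the most significant number that grew between two versions.
function bumpType(fromText, toText) {
  const from = parseSemVer(fromText);
  const to = parseSemVer(toText);
  if (to.major !== from.major) return to.major > from.major ? "major" : "none";
  if (to.minor !== from.minor) return to.minor > from.minor ? "minor" : "none";
  return to.patch > from.patch ? "patch" : "none";
}

console.log(bumpType("8.7.0", "8.7.1")); // patch: fully backward compatible
console.log(bumpType("8.7.1", "8.8.0")); // minor: fully backward compatible
console.log(bumpType("8.8.0", "9.0.0")); // major: breaking compatibility
console.log(parseSemVer("1.2.3-rc.1").prerelease); // rc.1
```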
Version Policies
Projects provide in package.json a policy that expresses a range of one or more versions that can satisfy the policy, in correspondence with their tolerance for changes.
The most common examples are:
- Accept any version, e.g. "pino": "*".
  ⚠️ This is the most promiscuous form, i.e. the least protective ⚠️
- Accept backward compatible new versions, e.g. "pino": "^8.7.1" (package managers default to this policy), i.e. any version with 8. for Major, whose minor & patch are 7.1 or higher.
- Loose fixation: accept only patch increments, e.g. "pino": "~8.7.1", i.e. any version that starts with 8.7. whose patch is 1 or higher.
- Strict fixation: allow only one specific version, e.g. "pino": "8.7.1", i.e. only exactly 8.7.1.
There are more forms.
The SemVer spec is an easy one page read with a good FAQ.
You cannot consider yourself a pro without reading at least once the highlights and the FAQ. Pro-bonus if you survive the geeky parts too!
However, the best resource for the policy syntax is here.
It lets you do a lot, like blocking an offending range.
The major exception: 0 (zero)
This marks early versions of unstable APIs.
⚠️Not unstable code — the code is expected to work!
It only means the form of the API MAY yet change in a non-compatible manner.
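When the major is zero, the caret (^) becomes as conservative as the tilde (~): ^0.7.1 accepts only 0.7.x patches from 0.7.1 up, exactly like ~0.7.1. With a 0.0.x version the caret is stricter still and accepts that single version only.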
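To compare the policies side by side, including the zero-major case, here is a sketch that spells out the bounds each policy stands for. It follows the range rules as I understand them from npm's semver documentation, covers only the four forms listed in this post, and ignores prereleases. `policyBounds` is a hypothetical name.

```javascript
// Translate a version policy into explicit lower and upper bounds.
function policyBounds(policy) {
  if (policy === "*") return ">=0.0.0";
  const match = /^([\^~]?)(\d+)\.(\d+)\.(\d+)$/.exec(policy);
  if (!match) throw new Error(`Unsupported policy in this sketch: ${policy}`);
  const operator = match[1];
  const major = Number(match[2]);
  const minor = Number(match[3]);
  const patch = Number(match[4]);
  const lower = `${major}.${minor}.${patch}`;

  if (operator === "") return `=${lower}`; // strict fixation
  if (operator === "~") return `>=${lower} <${major}.${minor + 1}.0`;

  // Caret: the leftmost non-zero number must not change.
  if (major > 0) return `>=${lower} <${major + 1}.0.0`;
  if (minor > 0) return `>=${lower} <0.${minor + 1}.0`;
  return `>=${lower} <0.0.${patch + 1}`;
}

console.log(policyBounds("^8.7.1")); // >=8.7.1 <9.0.0
console.log(policyBounds("~8.7.1")); // >=8.7.1 <8.8.0
console.log(policyBounds("8.7.1"));  // =8.7.1
console.log(policyBounds("^0.7.1")); // >=0.7.1 <0.8.0
console.log(policyBounds("~0.7.1")); // >=0.7.1 <0.8.0
console.log(policyBounds("^0.0.3")); // >=0.0.3 <0.0.4
```

For a 0.x version with a non-zero minor, the caret and the tilde resolve to the same bounds. The one edge worth knowing is 0.0.x, where the caret accepts only that single patch.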
Conclusion
The package-lock.json is a simple mechanism to provide deterministic builds at the expense of future-compatibility and security resilience. It is usually a symptom of immature code-bases, because it answers the needs of projects that do not have sufficient test coverage. It works by overriding the entire SemVer version-policy mechanism on which the Node.js ecosystem is based, and as such it requires developer attention, which is too often snoozed or, alas, ignored.
While a big part of this attention can be automated, once sufficient coverage & maturity are accomplished, the mechanism around package-lock.json can be dropped altogether, resulting in better security.