Welcome to JS Dependency Hell
Lockfiles, resolutions, and deduplication, oh my! 🔥
This post has been a collaboration between Noviny, who has provided the deep lore around how this all works, and Sarah Federman, who has helped craft it into something understandable and readable.
So you have a project, and you’ve just realized you aren’t depending on the versions of dependencies that you thought were. Maybe you’ve realized you have 4 versions of one package installed, or your bundle size is 10x what you think it should be. It’s time to learn about how lockfiles affect your dependency resolution!
A light warning before we begin: Quick fixes to these problems always have the potential for unintended side-effects, and thoroughly solving these problems is slow, manual, and requires a large amount of systemic knowledge. The costs are often higher than the rewards. However, it is good to learn these things so that if and when a problem arises, you have the tools to help.
Note: For simplicity, this article uses the yarn package manager for its examples, but most of them apply to any package manager. If you are using npm, read package-lock.json wherever we say yarn.lock.
The Scenario
We have a project that depends on pkg-a and pkg-c. Both pkg-a and pkg-c depend on pkg-b.
Following normal best practices, we are using a yarn.lock.
When we first install pkg-a and pkg-c into our project, both of their package.json files specify that they depend on "pkg-b": "^2.0.0". At the time of our install, the latest version of pkg-b on npm is 2.1.0. To understand what happens next, we need to understand a little of how a yarn.lock works.
Yarn locks 101
The basic premise of a yarn lockfile is that every package used in our repository, including dependencies of dependencies, has an entry in our yarn.lock. An entry tells us, for a given semver range, exactly which package version will end up being used.
An entry can also specify multiple version ranges if packages specify different (but compatible) ranges:
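As a rough sketch, such an entry might look like the following in the yarn v1 lockfile format. The package name and version numbers are illustrative, and the resolved and integrity fields that a real entry carries are omitted for brevity:

```text
pkg-b@^2.0.0, pkg-b@^2.1.0:
  version "2.2.0"
```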
If there are packages that depend on the same packages but at incompatible ranges, we will install multiple versions, and it will look something like this:
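A sketch of that situation, with illustrative names and versions (two majors of a hypothetical pkg-b) and with the resolved and integrity fields left out:

```text
pkg-b@^1.0.0:
  version "1.4.0"

pkg-b@^2.0.0:
  version "2.1.0"
```

Each range gets its own entry here because no single version can satisfy both ^1.0.0 and ^2.0.0.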
How yarn.lock works on install
So back to our example. We’ve cloned a fresh repo and run yarn install, which generated a yarn.lock file. Our project depends on pkg-a and pkg-c, which both depend on pkg-b. Our yarn.lock now looks like this:
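Based on the scenario described so far, the relevant entry would look roughly like this. The entries for pkg-a and pkg-c, along with the resolved and integrity fields, are omitted from this sketch:

```text
pkg-b@^2.0.0:
  version "2.1.0"
```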
What this says is that when yarn encounters a package that asks for pkg-b@^2.0.0, the version that will be used will be exactly 2.1.0.
What happens to your yarn.lock when you update a package
Exciting! A new version of pkg-a is available and we want to update our repository to use it. We use yarn upgrade or yarn upgrade-interactive to update to the new version of pkg-a. There is no new version of pkg-c available, so we leave it as is.
As part of this new version, the author of pkg-a updated its dependency on pkg-b from ^2.0.0 to ^2.1.0. So pkg-a now depends on pkg-b@^2.1.0, while pkg-c stays the same and still depends on pkg-b@^2.0.0. At this point in time, the newest version of pkg-b on npm is 2.2.0, not 2.1.0 as it was when we originally installed.
The semver ranges of both dependencies on pkg-b are still compatible, as it was only a minor update: any 2.x version from 2.1.0 upwards satisfies both. What do you think will happen in our yarn.lock?
If you expect pkg-a and pkg-c to continue to share the same version of pkg-b, you’d expect the yarn.lock to look something like this:
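A sketch of that expected (but, as we will see, not actual) result, assuming both ranges end up sharing the newest version of pkg-b, with the resolved and integrity fields omitted:

```text
pkg-b@^2.0.0, pkg-b@^2.1.0:
  version "2.2.0"
```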
What actually happens
There is no existing yarn.lock entry for pkg-b@^2.1.0, which is what the new pkg-a specifies in its package.json, so yarn creates a new entry for ^2.1.0 that uses the latest compatible version on npm. Yarn won't change the version that the previous entry was using, because yarn wants to protect us from bugs arising from consuming new updates. Our yarn.lock now looks something like this:
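Following the behaviour just described, the pkg-b portion of the lockfile would look roughly like this (a sketch, with the resolved and integrity fields omitted):

```text
pkg-b@^2.0.0:
  version "2.1.0"

pkg-b@^2.1.0:
  version "2.2.0"
```

Both 2.1.0 and 2.2.0 are now installed, even though 2.2.0 alone would satisfy both ranges.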
This is normally fine, and helps to protect us against bugs. However, it can mean that bundle sizes increase over time, especially in the case of larger dependencies (such as with a heavy editor package).
What if we delete our yarn.lock and then yarn install?
If we delete our yarn.lock and then run yarn install, every package that depends on pkg-b will get the latest compatible version available on npm. However, the same will happen for every other dependency across our repository, possibly introducing unintended bugs.
You keep saying “unintended bugs”
Lockfiles are designed to keep us safe. Every time we change what version of a dependency is installed, we are running different code. If everyone has followed semver correctly, changing what is installed within semver-compatible ranges should be fine. However, mistakes are frequent enough that a lockfile, which ensures the exact same dependencies are installed every time, will in practice prevent us from encountering bugs caused by dependencies changing underneath us.
What deduplication does
Deduping, or deduplicating, is the practice of minimizing the number of versions of the same package that we ship, leading to smaller bundle sizes. If we were deduplicating, we would expect our yarn.lock to share versions of dependencies wherever their semver ranges are compatible. Yarn (v1) does this by default on install, but not on upgrades.
A powerful tool for accomplishing this is yarn-deduplicate. When we run yarn-deduplicate, it will change our yarn.lock to look like this (as we may have originally expected):
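A sketch of the deduplicated entry, assuming the tool settles both ranges on 2.2.0, the highest version already present in the lockfile (resolved and integrity fields omitted):

```text
pkg-b@^2.0.0, pkg-b@^2.1.0:
  version "2.2.0"
```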
This will only update entries for packages where we are currently shipping multiple compatible versions, leaving the rest of the entries as is. This is in contrast to deleting the yarn.lock, which will update all entries to the latest compatible version from npm in addition to the deduping that we want. Using yarn-deduplicate instead means we have a smaller surface area for introducing unintended bugs. You can also use yarn-deduplicate with the --packages or --scopes flags to dedupe specific packages or scopes (like @yourdesignsystem).
If you’re using yarn 2.2+, there is a command built in to the yarn CLI that is similar to yarn-deduplicate. The docs for that live here: https://yarnpkg.com/cli/dedupe. I haven’t played with this due to the non-backwards compatibility of yarn 2, so I can’t comment on how well it works, but it looks very close to yarn-deduplicate.
What about using yarn resolutions?
Sometimes you will depend on a package that depends on another package, and when that second package resolves to the highest version in its semver range, your code breaks.
Here’s an example:
- You depend on babel@^1.0.0
- Babel depends on foo@^1.0.0
The latest version of foo is 1.5.0, and when it is installed, our code breaks. But we know that if we install 1.4.9 instead, our code will run.
We can’t simply set the dependency on foo to foo@1.4.9 ourselves, as babel controls its own dependencies. We can, however, use a yarn resolution in our package.json to force babel to use the version we want:
This will resolve babel’s copy of foo to 1.4.9 regardless of the range babel specifies, separately from any other packages that depend on foo.
What if there are multiple packages using foo and we want to force all of them to resolve to 1.4.9?
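One way to do that, sketched here under the assumption that a bare package name in the resolutions field applies to every occurrence in the tree, is:

```json
{
  "resolutions": {
    "foo": "1.4.9"
  }
}
```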
This method would resolve every instance of foo anywhere in our dependency tree to 1.4.9, and may introduce many other bugs. The only time we would want to do this is if we are trying to enforce a repository-wide singleton (such as react) and there is no other way to do so.
Do not use resolutions to deduplicate packages. Doing so can force two packages that depend on incompatible semver ranges of a package (such as two different majors) to get the same version. This is very likely to cause hard-to-debug problems, often the kind that lead to painful node_modules spelunking.
There is, however, one exception to the above rule, and it is an interesting hack.
The temporary yarn resolutions hack
If we want to force a dependency’s dependency to resolve to a newer or specific version, there is an interesting way to do it using yarn resolutions. This is not the intended usage of yarn resolutions, but it technically can be used in this way. Let us continue with the previous babel/foo example. Say we've set:
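The text that follows describes every version of foo being forced to 1.4.9, so the setting is presumably the repository-wide form. A sketch under that assumption:

```json
{
  "resolutions": {
    "foo": "1.4.9"
  }
}
```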
And we then run yarn install to update our yarn.lock. This, as we said above, is risky: if different packages depend on different majors of foo, things will likely break, because we force every one of them to use exactly 1.4.9 no matter what they’ve specified in their package.json.
If we then delete the above resolutions field from our package.json and run yarn install again, any dependency on foo whose range is incompatible with 1.4.9 will be re-resolved to a compatible version. However, anything that was compatible with 1.4.9 will continue to install 1.4.9 instead of reverting to the version we didn’t want.
Like any time you are updating the package versions in your yarn.lock, this can create unintended consequences for other packages that depend on foo at a compatible range, so some caution and testing is still needed.
Thank you to Nicolas Ronsmans for bringing this alternate approach to our attention!
Each approach has trade-offs:
Yarn upgrade or yarn upgrade-interactive:
- Pro: This strategy has the lowest likelihood of introducing bugs when upgrading a package.
- Con: Version spread. You will more than likely end up bundling multiple copies of packages that could have been deduplicated, and your bundle size may grow over time.
Delete your yarn.lock and then yarn install to regenerate it
- Pro: You’ll get the latest versions that satisfy semver ranges from npm, and packages that can share a range will do so.
- Con: This is basically the same as not having a yarn.lock, so your packages will end up running different code when you reinstall, which may introduce unintended bugs.
Yarn-deduplicate
- Pro: You will end up with fewer versions being bundled, and it also includes flexible options. It does not require a network connection as it uses the highest versions you already have, and it can be used in a CI step if desired.
- Con: Packages you did not upgrade will use new versions, which can create a higher burden for testing if those new packages have failed at using semver correctly. It also involves adding a new CLI tool to your repo, if that’s something you care about.
- It’s also important to note that your results will depend on how it is used. You can run the library automatically (on precommit, for example) or choose to dedupe specific packages or scopes, leaving it as a more manual process. There are also different deduping strategies to choose from. Read the docs for more information.
* These things probably apply to yarn dedupe in yarn 2, which I have not tried.
Temporary yarn resolutions hack
- Pro: You can target deduping on package-per-package basis without installing additional libraries.
- Con: This is not an intended use of yarn resolutions, and may still create unintended consequences. You must also remember to delete the resolution after running yarn install, or else everything is a mess. It’s a manual process, so you’ll need to pay attention to your bundle size and yarn.lock to know if deduping is necessary and where.
Bonus round!
What about using Peer dependencies?
Peer Dependencies are incredibly difficult, and as a whole should be avoided unless there is no alternative. Using them to help with deduplication will lead you down a rabbit hole of future problems.
What is a peer dependency?
A peer dependency is a dependency that a package requires, but that is not installed by the package that requires it. So if we have a package that has a peer dependency on react at ^16.0.0, it will not work unless react is also installed by the consumer. If we wish to use a package that has a peer dependency on react, our package.json will need to look something like this:
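A sketch of the consumer's package.json, showing only the dependencies field. The version range for myPackageThatDependsOnReact is an assumption made for illustration:

```json
{
  "dependencies": {
    "myPackageThatDependsOnReact": "^1.0.0",
    "react": "^16.0.0"
  }
}
```

The point is that react is listed by us, the consumer, at a version that satisfies the package's peer dependency range.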
If we are using 5 packages with a peerDependency on react, we still only get one copy, as none of them will install their own version of react.
The problem with peerDependencies is that you can only install one version, even if the packages you depend on have peerDependencies on the same package at different version ranges. Say myPackageThatDependsOnReact updates its peerDependency requirement to react@^17.0.0, and relies on things that changed in that major release. We have installed react 16.0.0 to satisfy the other packages' peerDependencies, so we are now stuck. We cannot upgrade to the newest version of myPackageThatDependsOnReact until all four other packages also update their peerDependency and code to be compatible with react 17.0.0. With a peerDependency, everything is beholden to the lowest major version of the peerDependency in our repository.
React is good to have as a peerDependency as it is a “singleton”. This means that loading multiple copies of the react library within a react app will cause it to crash. It is required that only one version of react be resolved within an app, so using a peer dependency enforces this.
Conclusion
Congratulations, you have walked into Dependency Hell and made it out alive! We hope you enjoyed learning some deep lore around npm dependencies and how to manage them. Now people will ask you questions about it forever. Good news, you can now send them this article!
TL;DR: avoid these issues if you can, but if you can’t, make sure that you understand the tradeoffs.
Cheers! ✨✨