PubGrub: Next-Generation Version Solving

Image by @Th3RoadVirus

If you’d peeked in on me at work five months ago you’d find me with my nose in a textbook reading up on cutting-edge academic research on constraint-solving problems. Three months ago I was applying that knowledge to code, drafting and tweaking a new algorithm that united that research with a real-world tool. One month ago I was adding the last few features and polishing it up.

Today, PubGrub goes live: my new version solving algorithm that uses fancy-sounding techniques like “unit propagation”, “logical resolution”, and “conflict-driven clause learning” to run faster and explain errors more clearly than any other version solver. It’s part of the pub package manager bundled with the Dart 2.0.0-dev.44.0 release, and I hope one day it will find its way into package managers for all sorts of programming languages.

What is Version Solving, Anyway?

The package manager is an important part of every modern programming language. It’s the tool that fetches an application’s dependencies, its dependencies’ dependencies, and on and on until everyone’s dependencies are satisfied. And the heart of any package manager is its version solving algorithm, which figures out which versions of each package to install while ensuring that every package’s dependencies are satisfied.

Unfortunately, coming up with a good version solving algorithm is hard. Really hard. NP-hard. Which, for anyone who’s not an algorithms nerd, means it’s so difficult that it’s a major unanswered question in computer science whether an efficient algorithm even exists¹.

Suppose your app depends on a menu package whose latest version, 1.5.0, depends on dropdown ≥2.0.0. For some reason dropdown 2.3.0, the most recent version, can’t be selected (let’s say it depends on icons ≥2.0.0 while your app depends on icons <2.0.0). So the solver tries the next-best version, dropdown 2.2.0. Same problem. And same with dropdown 2.1.0 and 2.0.0. Dang.

The package graph including the root package, menu, dropdown, and icons.

The next step is to try menu 1.4.0 and see if it’s any better. The solver goes through all the versions of dropdown again to see if they work, but none of them do. Same with menu 1.3.0, 1.2.0, and 1.1.0. Finally the solver reaches menu 1.0.0 which allows dropdown 1.8.0. This, it turns out, is compatible with the app. But getting there took a lot of iterations: one for each version of menu × each version of dropdown. And it would have taken even more if there were more packages involved.

It’s tempting to say, “we already figured out that we can’t use dropdown ≥2.0.0, so just don’t check again”. But it’s not that easy. Any version of any package might have a dependency on any other package, which means that in general every time you choose a different version of one package you have to rethink all your logic for every other package. That’s a recipe for an exponential explosion of search time.

So What Does PubGrub Do?

Buckle in, because we’re about to get academic! We’ll start with a couple definitions: the most basic unit that PubGrub works with is called a term, which is a range of versions of a single package that’s either required or forbidden. For example, menu ≥1.1.0 is a term, and so is not dropdown ≥2.0.0. A term is satisfied if it matches the version of the package that’s selected. menu ≥1.1.0 is satisfied if menu 1.2.0 is selected, and not dropdown ≥2.0.0 is satisfied if dropdown 1.8.0 is selected (or if no version of dropdown is selected at all).

We use terms to define incompatibilities, which is just a set of terms that may not all be satisfied at once—that is, if a set of packages satisfies all the terms at once, we know it isn’t part of a valid solution. We use incompatibilities to represent (among other things) dependencies between packages. For example, the incompatibility {menu ≥1.1.0, not dropdown ≥2.0.0} means that it’s invalid to select a version of menu that’s ≥1.1.0 and also not to select a version of dropdown that’s ≥2.0.0. In other words, it means that menu ≥1.1.0 depends on dropdown ≥2.0.0.

The terms in an incompatibility are incompatible regardless of any outside context, which means they can represent facts that PubGrub can use to avoid re-doing the same work over and over again. The trick is figuring out how to generate incompatibilities that are both true and likely to be useful. It’s easy to generate incompatibilities that represent packages’ dependencies like the example above, but we also want to derive new incompatibilities from existing ones. For example, if we could derive the incompatibility {dropdown ≥2.0.0} (indicating that dropdown ≥2.0.0 should never be selected) the first time we look at dropdown, we could skip a lot of redundant work.

That’s the clause learning in “conflict-driven clause learning”. But before we dive into that, we have to get to the “conflict” part, which comes from running PubGrub’s core loop.

The Core Loop

Most of the time it’s running, PubGrub alternates between two phases while building up a tentative list of package versions that might be used in the final set of versions.

The first phase is unit propagation, which combines the list of selected package versions with the set of incompatibilities that represent facts we know are always true to find terms that must be true given the package versions we have selected. For example, if we’ve selected menu 1.4.0, we can look at {menu ≥1.1.0, not dropdown ≥2.0.0} and see that dropdown ≥2.0.0 must also be true (because otherwise all the terms would be satisfied, which isn’t allowed).

These derived terms also feed back into unit propagation. Once we’ve derived dropdown ≥2.0.0, we can use it along with the incompatibility {dropdown ≥2.0.0, not icons ≥2.0.0} to derive icons ≥2.0.0. This continues until there are no more derivations to make, at which point PubGrub moves on to the second phase.

The second phase, decision making, happens when there’s no more unit propagation to be done but there are still dependencies left unsatisfied. We choose some version of some package and add it to the list of package versions. Any version will work as long as it matches all the terms we’ve derived from unit propagation, but typically a package manager will want to pick the most recent version of some package (chosen by a heuristic; pub prioritizes packages with the fewest total versions).

This is also where PubGrub adds incompatibilities representing a new package’s dependencies. It’s important to add them lazily the first time we encounter a package rather than adding all dependencies for all packages at the very beginning, since we don’t know ahead of time which packages will end up being used or which versions will be relevant.

If everything goes well, the core loop will find a set of package versions that satisfies every package’s dependencies and the user will set off to work running their code. But we don’t live in a world where everything always goes well, and that’s why we have conflict resolution.

Conflict Resolution

Sometimes during unit propagation, PubGrub will derive a term that satisfies all the terms in an incompatibility. For example, when it derives icons ≥2.0.0 that combined with the root package’s own version would satisfy {root any, not icons <2.0.0} (the root package’s dependency on icons). This is a conflict, and it triggers a special third phase to resolve it².

In contemporary version solvers, the equivalent of conflict resolution just involves jumping back to the most recently selected package version that might have caused the conflict and trying a different version instead. PubGrub does the same thing, but it also finds the root cause of the conflict and records that as an incompatibility. This incompatibility helps future unit propagation phases avoid running into the same conflict again and again.

The root cause is generated by iteratively combining incompatibilities using a logical rule of derivation known as resolution. This rule says that if both “P or Q” and “R or not Q” are true, “P or R” is true as well. In terms of incompatibilities, this means that given {S, T} and {U, not T} you can derive {U, S}.

We start by combining the incompatibility that triggered the conflict ({root any, not icons <2.0.0}) with the incompatibility from which we derived the term (icons ≥2.0.0, derived from {dropdown ≥2.0.0, not icons ≥2.0.0}). These combine to give us {root any, dropdown ≥2.0.0}, which means that the root package isn’t compatible with dropdown ≥2.0.0.

We continue by combining it with the cause of the most recently-derived term that satisfied it—in this case, dropdown ≥2.0.0 was caused by {menu ≥1.1.0, not dropdown ≥2.0.0}. This combination gives us {root any, menu ≥1.1.0}. Since this was satisfied by a package version we chose during decision making rather than a derived term, we take it as our root cause, jump back before we made the decision to select menu 1.4.0, and go back to unit propagation.

Given this new incompatibility, unit propagation is able to derive not menu ≥1.1.0 and decision making selects menu 1.0.0 without having to look at incompatible versions of dropdown ever again. That’s the power of PubGrub.

But What If It Fails?

Efficiently finding a solution is great, but perhaps even greater is PubGrub’s ability to explain to a user why it couldn’t find a set of packages that satisfy one another’s dependencies. The reason version solving fails can get very complicated, but no matter how tangled it is PubGrub is able to produce a clear step-by-step explanation of exactly what went wrong.

Back to our example: you’re trying to add intl ≥5.0.0 to your app but the dropdown versions <2.0.0 all depend on intl <4.0.0. That’s bad news, because it means falling back to dropdown 1.8.0 won’t work after all!

The package graph, now including intl.

Contemporary version solvers would only be able to report this as a problem with intl, when the real issue is that your app needs to update its dependency on icons. Here’s what PubGrub has to say:

Because dropdown >=2.0.0 depends on icons >=2.0.0 and root depends
on icons <2.0.0, dropdown >=2.0.0 is forbidden.
And because menu >=1.1.0 depends on dropdown >=2.0.0, menu >=1.1.0
is forbidden
.
And because menu <1.1.0 depends on dropdown >=1.0.0 <2.0.0 which
depends on intl <4.0.0, every version of menu requires intl
<4.0.0
.
So, because root depends on both menu >=1.0.0 and intl >=5.0.0,
version solving failed.

This makes it clear exactly why a more recent version of dropdown couldn’t be selected, as well as why selecting the older version didn’t work. From this output, the user can figure out that they can solve the issue either by supporting a newer version of icons or an older version of intl.

This detailed error output is made possible by PubGrub’s conflict resolution. Every time we use resolution to combine two incompatibilities, we add pointers from the new incompatibility to its parents. By the time we reach an incompatibility like {root any} that indicates that there aren’t any solutions at all, we can use those parent pointers to form a directed acyclic graph of incompatibilities which represents a proof that the root package’s dependencies can’t be satisfied.

The derivation graph for the solver failure.

The graph for this example, pictured above, is just a straight line. But more complex failures can produce more complex graphs. They can have branches and sometimes even multiple derivations from the same incompatibility.

To produce user-friendly output, PubGrub does a depth-first traversal of this graph and converts it into plain English. Since this conversion is all about making things readable by humans, it involves a lot of fuzzy heuristics—things like combining incompatibility descriptions in clever ways, folding together steps that would seem too obvious, and adding line numbers when a purely linear argument won’t suffice.

I Want PubGrub!

So far, PubGrub has only been implemented in Dart’s package manager, pub. But it’s designed to be a general-purpose algorithm that could be implemented anywhere that’s solving the same problem. So file an issue on your favorite package manager’s GitHub repo and ask them (very politely) to add support for PubGrub.

One major benefit of the algorithm is its flexibility. I’ve only written here about the most basic sort of version solving, but PubGrub is capable of working with any relationships between packages that can be represented as logical relationships between package ranges. Pub supports features like locked versions, forced upgrades, SDK version constraints, and dependency overrides. It’s easy to customize the behavior by tweaking what incompatibilities are supplied for each package.

If you work on a package manager yourself and you’re interested in giving PubGrub a shot, I recommend reading the full documentation. It dives much deeper into the details of the algorithm than I can do in one blog post, including both how it works and why we can be confident in its answers. If you have any questions, feel free to file an issue so that I can improve the documentation after I answer it. You can also reach out to me on Twitter or by email.

Acknowledgements

I put plenty of work into PubGrub, but I couldn’t have done it on my own. Thanks to Martin Gebser, Roland Kaminski, Benjamin Kaufmann, and Torsten Schaub for writing Answer Set Solving in Practice, the book detailing the constraint-solving algorithm that I adapted for use with package ranges. Thanks to Bob Nystrom for reviewing both the algorithm and the code that went into implementing it. And thanks to my brother Jeremy Weizenbaum, whose job is literally cooking pub grub, for suggesting the name.


  1. Most researchers believe there is no universally efficient algorithm for problems like version solving, but proving it is so difficult that no one’s managed it despite a million dollar bounty.
  2. In practice, PubGrub is clever enough to handle this situation entirely using unit propagation and backtracking, without actually resorting to conflict resolution. But the situations where it does recover using conflict resolution are too complex to make for a nice example, so I’m fudging a bit here.