
Scaling our dependency graph

Aurélien Didier
Published in Inside Doctrine
6 min read · Mar 24, 2022


Like many projects, ours at Doctrine rely on long lists of dependencies. Since in Python you need to guarantee compatibility between each and every package, maintenance quickly becomes time-consuming and unmanageable.

The exercise is not about choosing a dependency manager, but rather reaching a state where each of our installs is stable and developers feel safe updating or adding dependencies. The expected result is a dependency graph that will scale smoothly with more dependencies, more services, and more developers.

We’ve monitored the evolution for four years, during which we’ve added tools and increased the number of dependencies.

Reaching stability

Early 2020 context

  • no dependency manager
  • non-deterministic installation
  • any update is a pain

At that point (see charts below) our dependency graph was unstable and every developer would fear having to change any part of it. Failures and reverts were frequent. On top of that, we were limited to spot upgrades, trying to upgrade packages one at a time.

We decided to invest time in solving the graph, aiming for stability first and leaving the discussion about richer package-manager features for later.

Our definition of stability

  • installations are deterministic, meaning we have an exhaustive list of the packages we’re installing and their exact versions
  • a dependency compiler is running so we can guarantee compatibility
  • any upgrade targets the whole graph and not just isolated dependencies

Tooling

We selected pip-tools to reach stability quickly, keeping in mind that once stability was reached we could reassess our needs and look for richer features when choosing a package manager.

Why pip-tools? Among its capabilities, its pip-compile command provides a stable dependency graph compilation (as of August 2020, we had stability issues with pipenv and poetry). While remaining simple, it helps us tackle two of our biggest pains: 1. conflicts between packages and 2. deterministic, exhaustive installations. We went straight to that option based on experience in order to get our stack stable; later on we could benchmark more tools such as poetry and pipenv.

At this point requirements.txt becomes a lock file. It contains exactly the dependencies that are going to be installed and guarantees they are fully compatible: all of them, the ones we request directly and the transitive ones they pull in.
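As an illustration, here is a minimal sketch of the resulting workflow with pip-tools (package names and versions are purely illustrative, not our actual stack):

    # requirements.in — only the dependencies we request directly
    flask>=2.0,<3
    requests

    # compile the full, pinned graph into the lock file
    $ pip-compile requirements.in --output-file requirements.txt

    # requirements.txt (excerpt) — everything pinned, including transitive deps
    flask==2.0.3
        # via -r requirements.in
    jinja2==3.0.3
        # via flask
    requests==2.27.1
        # via -r requirements.in

The lock file then becomes the only input to the actual installation (pip install -r requirements.txt, or pip-sync from pip-tools).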

Generating a first stable version

Once the tooling was set up, we transformed our requirements.txt into a requirements.in file and ran pip-compile. Our first set, at that time, had lots of incompatibilities and conflicts, so the dependency compilation failed, and we followed the steps below to resolve the incompatible requirements.

The operations performed:

1. isolate the dependencies you’re not directly using in your code. Say you’re not directly importing jinja2 in your code, but you’re using flask, which requires jinja2. In that case jinja2 should not appear in requirements.in, the list of requirements requested by the developers (see the sketch after this list).

2. clean all the arbitrary versions. Ideally requirements.in contains only loose version specifiers, or even none at all. See the official PEP about version specifiers: https://www.python.org/dev/peps/pep-0440/#version-specifiers. Some packages may need to be pinned to an arbitrary version, but these should be kept to a strict minimum.

3. solve dependencies on internal code. The code should never rely on a package’s internals. Typical example: nothing outside of PyTorch should ever import https://github.com/pytorch/pytorch/blob/master/torch/_classes.py. This module is internal to PyTorch and should stay that way.

4. compile again, follow pip-compile’s output, and loop back to step 1 if needed.
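To make steps 1 and 2 concrete, here is a hedged before/after sketch of a requirements.in (names and versions are illustrative):

    # before — transitive deps pinned by hand, arbitrary exact versions
    flask==1.1.2
    jinja2==2.11.3      # only here because flask needs it
    requests==2.24.0

    # after — direct dependencies only, loose specifiers where possible
    flask>=1.1,<2
    requests

Running pip-compile on the cleaned file still pins jinja2 in requirements.txt, but as a consequence of flask rather than as a hand-maintained line.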

As cumbersome as these 4 steps may look, they’re a leap towards stability. And a great refactoring exercise.

The impact

pip-tools was put in place around August 2020 and we tracked a list of metrics before and after the switch.

About our dependency graph

  • the number of conflicts: we went from a way-too-high number to 0. A typical conflict arises when package A needs version 2 of package C, package B needs version 1 of package C, and your code needs both A and B.
  • the share of late majors: the ratio of packages for which we don’t install the latest major version (minors and patches are not counted). The goal is not to chase the most recent versions, but to guarantee we’re working with the most maintained versions of our dependencies. We went from ~75% to ~10%.
  • the number of undetermined packages and versions: this is illustrated below. The measure compares the requirements.txt committed at the time with the output of pip freeze; the difference is the number of packages installed without being declared anywhere. As innocent as it sounds, this prevents reproducibility, since the same pip install would lead to different installations on different systems at different times. We reached almost 300 undetermined packages (200 for one service alone) before the shift; we now have a stable 0. That was roughly 300 undetermined dependencies for about 100 desired ones: in other words, we had no knowledge of 75% of our installations back in 2020.
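As an aside, that measure is simple to reproduce. A rough sketch (file names are placeholders, and it ignores package name normalization subtleties) compares what is actually installed with what the lock file declares:

    # list what is actually installed in the environment
    $ pip freeze | sort > installed.txt
    # list what the committed lock file declares
    $ grep -E '^[A-Za-z0-9._-]+==' requirements.txt | sort > declared.txt
    # packages installed but declared nowhere: the "undetermined" count
    $ comm -23 installed.txt declared.txt | wc -l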

About the developer experience:

  • protocol: from a cumbersome review process with a high revert rate to an automated flow. We used to go through the infra team to test a new package or upgrade an existing one, with frequent reverts and unexpected collateral damage. Today updates are painless: any developer can perform them and they’re automatically compiled.
  • update frequency: illustrated below through the number of times requirements.txt got updated. It tripled in 2020 alone and grew sixfold from 2019 to 2022. We expect it to double or even triple again in 2023.
  • time spent on solving dependency issues and conflicts: illustrated below through the share of dependency-management commits that aim specifically at solving issues, such as conflicts between dependencies, with the Python version, or with the system. In 2019 and 2020 we were at 40%: updating dependencies was unquestionably expensive, since 4 times out of 10 we’d need to solve issues. After dropping sharply within 2020, we reached a good 3% in 2021 and 1.5% so far in 2022, and we aim to stay at 1% or less from now on.

These last measures are key for the team:

  • developers spend less time on technical chores and more time on product features
  • we’ve reached a satisfying level of stability and we can guarantee reproducible installs

Paving the way for internal packages

The work above was accompanied by a collective desire to move towards building private libraries and packages. Independent programs or services are (most of the time) isolated in their own repositories and rely on private packages. Although this reorganization simplifies each service’s local dependencies (there are fewer of them, since each service is specific to its function and therefore very specific about its dependencies), the overall dependency graph could turn into hell very quickly.

We’re still monitoring the above metrics and we can see the effect of the added complexity of the graph that’s being generated.

The cumulative relationship count, or added complexity, or coupling, is expected to follow an n² curve in the future: with n packages there are up to n(n-1)/2 possible relationships between them. The 2021 and 2022 figures show that we remained stable through a noticeable growth in complexity: still 0 undetermined versions, and very little time spent on dependency solving.

The capability to produce internal libraries was accelerated by setting up a private package index. Before that, we had to install our internal libraries directly from GitHub. As GitHub does not support Python package indexing yet (the roadmap item is still open: https://github.com/github/roadmap/issues/94), we had to require pre-determined versions of our private packages. This is heavy coupling: if A requires B, B requires C, and A also requires C, you need to match versions exactly. Enabling a private package index removed these constraints.
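For illustration, the shift looks roughly like this (organization name, package name and index URL are all placeholders). Before, an internal library had to be required straight from GitHub, pinned to an exact tag:

    # requirements.in — direct VCS requirement
    our-internal-lib @ git+https://github.com/our-org/our-internal-lib@v1.2.0

With a private index configured (in pip.conf, or via PIP_EXTRA_INDEX_URL), the same library resolves like any public package:

    # pip.conf
    [global]
    extra-index-url = https://pypi.internal.example.com/simple/

    # requirements.in
    our-internal-lib>=1.2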

The future

Our main objectives are met:

  1. we get an exhaustive list of packages that are going to be installed, which guarantees reproducibility,
  2. we have improved the developer experience,
  3. we can upgrade a lot faster

Some services still rely on a long list of dependencies, which can affect the build time negatively. Our explorations, which are not mutually exclusive, are splitting the services even further and leveraging layered sets of requirements:

  • split them and install them as “extras” (see https://peps.python.org/pep-0508/)
  • split them by frequency of upgrades. This is particularly relevant when you build Docker images. Some ML models are tied to a specific version of a library, so we don’t need to upgrade them as often; those requirements can be gathered in one set, and the ones that need more frequent updates in another. By installing them in two (or more) steps in a Docker build, you can leverage the Docker layer cache and get faster builds (see the sketch after this list).
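A minimal Dockerfile sketch of that second idea (file names and base image are illustrative):

    FROM python:3.10-slim

    # rarely-updated requirements first: this layer stays cached across most builds
    COPY requirements-stable.txt .
    RUN pip install -r requirements-stable.txt

    # frequently-updated requirements second: only this layer is rebuilt on change
    COPY requirements.txt .
    RUN pip install -r requirements.txt

    COPY . .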

As we’ve reached stability and we can safely scale, we can now look into richer features in package managers.
