Why you don’t need a mono-repo but should just build from source

Emmanuel Debanne
Mar 24, 2020 · 13 min read
There is a lot of hype about mono-arch bridges, but this 2,000-year-old poly-arch bridge might still deserve our attention. (By Benh LIEU SONG, own work, CC BY-SA 3.0)

Introduction

In 2017, I had the chance to attend the Devoxx France conference. I heard many interesting talks dedicated to the trendy topics of the moment, such as containerization or micro-services architecture. One thing surprised me though: several times, “mono-repos” were presented as the new good practice to adopt, making it possible to avoid issues such as having to maintain multiple versions of libraries. A typical example was the talk “Why your company should organize its codebase in a single repository?”

“Mono-repo”, a misnamed concept

The ability to build a whole codebase from source is widely advertised as requiring a single repository. For example, the well-known reference website on Trunk Based Development correctly presents the goal, which is to put “all applications/services/libraries/frameworks into one trunk” and to “force developers to commit together in that trunk, atomically”, but it does so under an entry called “mono-repos”. When promoting Continuous Integration, ThoughtWorks (well known for its Technology Radar) lists “maintain a single source repository” as a prerequisite. On Medium, the tag “Monorepo” became really popular, especially to talk about building front-end projects from source. In the English Wikipedia, an article for Monorepo was created in 2018, listing the advantages “over individual repositories”. Symmetrically, the term “poly-repo” is used to express the opposite concept of “not built from source” (e.g., the article Monorepo vs. polyrepo).

The true goal: frequently upgrading the dependencies

First, it is important to know what problem we are trying to solve by proposing to “build from source” the whole codebase.

Internal and external components

By way of illustration, let’s schematize a component (typically a library) named “YourComponent” that your team owns. This component relies on dependencies and is itself a dependency of some clients. This can be represented with the following schema:

[Schema: YourComponent, its dependencies, and its clients]

Upgrading a dependency

After adding a dependency to your component, you usually stick to a particular version of it for several days, sometimes months. Eventually you need to upgrade, typically for one of these reasons:

  • There is a bug in the version you use and you want to upgrade to the fixed version. This is often the case for internal dependencies, whether business- or infrastructure-oriented. It is usually not the case for external dependencies because their behavior is more stable. Even security patches are not that frequent and don’t always concern your code.
  • The team owning the dependency wants to change the API. This is usually not the case for external dependencies because their maintainers want to keep their users and not give them a reason to switch to other implementations. It can be the case for internal libraries if your infrastructure is changing a lot. For example, a successful company that needs to scale will regularly modify its infrastructure.

The cost of infrequent upgrades

An important thing to be aware of is that upgrading a dependency has a bigger impact than the change that motivated the upgrade (such as a new feature or a bug fix). Other changes come along with a version upgrade: new features, other bug fixes, and more importantly some API changes.

Advantages and disadvantages of “building from source”

Advantages

We came to the pattern “build from source” in order to get frequent upgrades of dependencies. So it is not a surprise that many advantages come from these frequent upgrades:

  • There is no need to manually update versions of dependencies. It is now done at each build.
  • When a bug is discovered in a dependency, you don’t need to dig into a long list of commits to find the culprit one. The issues are discovered early, while they are still easy to fix.
  • You are more eager to modify your dependencies as you don’t need to go through the process of a merge request, waiting for a released artifact, and finally bumping the version of the dependency in your component.
    Also, the code you depend on can be modified and tested at the same time as your own code.
  • It avoids the “Diamond dependency hell” problem: a component cannot depend on 2 different versions of the same dependency.
  • It is easier for a dependency to check before merging if it breaks its clients.
  • The code of your dependencies can always be rebuilt because it is continuously built and tested. You can no longer depend on libraries that were built a long time ago and whose compilation or validation is now broken.
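
To make the “diamond dependency” point more concrete, here is a minimal sketch in Python. The component names, the versions and the find_diamond_conflicts helper are all hypothetical, and no real build tool is involved: the sketch only walks a map of pinned versions and reports every dependency that is requested in more than one version.

```python
from collections import defaultdict

# Hypothetical pinned dependencies: component -> {dependency: version}.
PINNED = {
    "YourComponent": {"LibA": "1.0", "LibB": "2.0"},
    "LibA": {"Common": "1.2"},
    "LibB": {"Common": "1.5"},
    "Common": {},
}


def find_diamond_conflicts(root, pinned):
    """Return {dependency: sorted versions} for every dependency that is
    reachable from `root` and requested in more than one version."""
    versions = defaultdict(set)
    seen = set()
    stack = [root]
    while stack:
        component = stack.pop()
        if component in seen:
            continue
        seen.add(component)
        for dep, version in pinned.get(component, {}).items():
            versions[dep].add(version)
            stack.append(dep)
    return {dep: sorted(v) for dep, v in versions.items() if len(v) > 1}


print(find_diamond_conflicts("YourComponent", PINNED))
# {'Common': ['1.2', '1.5']}
```

When everything is built from source, there is only one version of “Common” (the current one), so the reported map is always empty.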

Disadvantages

The drawbacks are of a different kind. “Building from source” implies a higher cost in terms of infrastructure and support:

  • Tooling: Usual tools (such as build scripts and code review tools) are not sufficient to manage this new way to build and maintain the codebase.
  • Components are more difficult to change as all clients must be changed right away. However, this only moves the cost: it is paid in the short term instead of the mid term.
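
Part of the tooling mentioned above has to answer one question: when a component changes, which clients must be rebuilt and tested before merging? The following Python sketch illustrates the idea under simple assumptions. The DEPENDS_ON map and the affected_clients helper are hypothetical, and a real build tool would derive this information from its own build files.

```python
from collections import deque

# Hypothetical codebase: component -> components it depends on.
DEPENDS_ON = {
    "ClientApp": ["YourComponent"],
    "OtherClient": ["YourComponent", "Logging"],
    "YourComponent": ["Logging"],
    "Logging": [],
    "Unrelated": [],
}


def affected_clients(changed, depends_on):
    """Return every direct or transitive client of `changed`, sorted by name."""
    # Invert the edges: dependency -> its direct clients.
    clients_of = {name: [] for name in depends_on}
    for client, deps in depends_on.items():
        for dep in deps:
            clients_of.setdefault(dep, []).append(client)

    # Breadth-first walk from the changed component towards its clients.
    affected = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for client in clients_of.get(current, []):
            if client not in affected:
                affected.add(client)
                queue.append(client)
    return sorted(affected)


print(affected_clients("Logging", DEPENDS_ON))
# ['ClientApp', 'OtherClient', 'YourComponent']
print(affected_clients("YourComponent", DEPENDS_ON))
# ['ClientApp', 'OtherClient']
```

The lower a component sits in the graph, the more clients a change to it touches, which is why this way of working needs more infrastructure than usual build scripts provide.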

The specific case of external dependencies

We said that upgrades are usually needed more frequently on internal dependencies, and “building from source” makes them instantaneous. As for external dependencies, you generally can’t afford to move their code into your own codebase. The maintenance cost would be too high and can only be amortized in the biggest companies, such as Google, where “an area of the repository is reserved for storing open source code” (Why Google Stores Billions of Lines of Code in a Single Repository). Consequently, external dependencies can’t directly benefit from “building from source”. They still require a manual process where a commit upgrading a version number is pushed. That does not mean we should not try to reach the same goal as for internal dependencies though: to upgrade them as frequently as possible.

Benefiting from frequent upgrades

Frequent upgrades of external dependencies provide the advantages previously listed for “building from source”, with one caveat:

  • Issues related to “diamond dependencies” are not totally avoided (unless you succeed in using a single version) but just mitigated (as you tend to use recent versions only).

Realizing frequent upgrades

Unifying the versions

When a build pulls in several versions of the same dependency, each build tool settles the conflict in its own way:

  • Gradle chooses the most recent version,
  • MSBuild will warn you that it takes the highest version from “PackageReferences”, but you won’t be notified if you still use the old “References”,
  • Bazel has no strategy and forces you to declare the winning version.
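
The two extremes in this list can be sketched in a few lines of Python. This is only a loose analogy, not the actual algorithm of any of these tools: the function names are hypothetical, and versions are assumed to be purely numeric and dot-separated.

```python
def version_key(version):
    """Turn '1.10.2' into (1, 10, 2) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve_highest(requested):
    """'Most recent wins': silently pick the highest requested version."""
    return max(requested, key=version_key)


def resolve_explicit(requested, declared=None):
    """'No strategy': refuse to choose unless a winner was declared."""
    distinct = set(requested)
    if len(distinct) == 1:
        return distinct.pop()
    if declared is None:
        listed = sorted(distinct, key=version_key)
        raise ValueError(f"conflicting versions {listed}; declare the winner")
    return declared


requested = ["1.9.0", "1.10.2"]
print(resolve_highest(requested))  # 1.10.2
print(resolve_explicit(requested, declared="1.10.2"))  # 1.10.2
```

The first strategy keeps the build green but hides the conflict. The second one is noisier, yet it pushes the team to unify versions explicitly.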

Conclusion

As we have seen, the real need is not to use a single repo, but to get rid of upgrades between components.

Criteo R&D Blog

Tech stories from the R&D team
