Migrating to Bazel from Maven or Gradle? Part 4 — Can Bazel work with Manyrepo?

Natan Silnitsky
Published in Wix Engineering
Mar 23, 2019

Bazel, as a source-dependency build tool with aggressive caching mechanisms, is optimized for building one big monorepo.

The more interdependent repositories you have, the more complexity is introduced for build triggering and management of external repo dependencies.

The problem is that popular source control systems such as Git do not handle very large monorepos well.

This in turn means that with Bazel you need to manage a “virtual” monorepo made up of several repos that have source code dependencies between them.

External Repo Dependencies

Let’s say repo A has source dependencies on repo B.
On a clean build of repo A, Bazel first clones repo B’s source files at the commit hash defined in repo A’s WORKSPACE file.

Only then does it build the dependency targets found in B.
Lastly, it will build the targets from repo A.
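
One common way to express such a pin is Bazel's git_repository rule, which takes a name, a remote and a commit. The Python sketch below only renders that stanza as text, so the shape of the WORKSPACE entry is visible. The helper render_git_repository, the repo name and the remote URL are hypothetical placeholders, not taken from the Wix setup.

```python
# Hypothetical helper: renders a WORKSPACE stanza that pins an external repo
# to a commit using Bazel's git_repository rule.
LOAD_LINE = 'load("@bazel_tools//tools/build_defs/repo:git.bzl", "git_repository")'


def render_git_repository(name: str, remote: str, commit: str) -> str:
    """Return a git_repository stanza pinned to the given commit hash."""
    return "\n".join([
        "git_repository(",
        f'    name = "{name}",',
        f'    remote = "{remote}",',
        f'    commit = "{commit}",',
        ")",
    ])


if __name__ == "__main__":
    # Placeholder values, not taken from a real repository.
    print(LOAD_LINE)
    print(render_git_repository(
        name="repo_b",
        remote="https://github.com/example/repo_b.git",
        commit="<commit hash of repo B>",
    ))
```

Targets in repo A would then refer to targets in B through the external repo name, for example @repo_b//some/package:target.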

Figure: Bazel external repo build precedence

External Source Dependency Management

As long as you have more than one repo, you need to keep the “pointers” to each repo (the commit hashes) up to date.

You can instead point at the master branch rather than at a fixed commit (for example, git_repository accepts a branch attribute in place of commit).

But this makes it difficult to keep snapshots of the pointers for non-master branches.

If you want a built-in way to keep snapshot commits, you can use the bazel sync command.
This command resolves the current commit hash and stores it in a resolved.bzl file.
The file’s content is then used when fetching the repo’s source code.
To update the commit hash, the file has to be deleted first.

bazel sync is not optimized for performance (it re-downloads all of your dependencies). A faster way can be to get the head commit using git ls-remote.
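
As a rough sketch of the git ls-remote approach, the snippet below asks the remote for a single ref and parses the tab-separated "hash, ref name" lines that git prints. The function names and the default ref refs/heads/master are assumptions made for illustration.

```python
import subprocess


def parse_ls_remote(output: str, ref: str = "refs/heads/master") -> str:
    """Extract the commit hash for `ref` from `git ls-remote` output."""
    for line in output.splitlines():
        sha, _, name = line.partition("\t")
        if name.strip() == ref:
            return sha.strip()
    raise ValueError(f"{ref} not found in ls-remote output")


def head_commit(remote: str, ref: str = "refs/heads/master") -> str:
    """Ask the remote for the current commit of `ref` without cloning it."""
    result = subprocess.run(
        ["git", "ls-remote", remote, ref],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_ls_remote(result.stdout, ref)
```

Calling head_commit with a repo's remote URL returns the hash that would go into the pinned commit, and it transfers only the ref listing rather than the repo contents.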

As the repo list can evolve over time, it is better to have a server that keeps track of the latest set of repos and their versions. This server can poll the GitHub API for the latest head commit and/or use GitHub’s webhook notifications.
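
A minimal in-memory sketch of the webhook side of such a server might look like the following. It assumes the standard GitHub push event payload, which carries ref, after (the new head commit) and repository.full_name. The PinRegistry class and its method names are hypothetical, and a real service would also need persistence and request verification.

```python
from typing import Dict


class PinRegistry:
    """Tracks the latest known head commit for each repo (illustrative only)."""

    def __init__(self, tracked_ref: str = "refs/heads/master") -> None:
        self.tracked_ref = tracked_ref
        self._pins: Dict[str, str] = {}

    def handle_push(self, payload: dict) -> bool:
        """Update the pin from a GitHub push event; return True if it changed."""
        if payload.get("ref") != self.tracked_ref:
            return False  # Ignore pushes to other branches.
        repo = payload["repository"]["full_name"]
        commit = payload["after"]
        if self._pins.get(repo) == commit:
            return False
        self._pins[repo] = commit
        return True

    def pins(self) -> Dict[str, str]:
        """Return a copy of the current repo -> commit mapping."""
        return dict(self._pins)
```

New repos are picked up automatically the first time a push event for them arrives, which matches the idea of a repo list that evolves over time.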

Triggering

The simplest configuration is to define all of the repos without working out whether each one is actually needed. Bazel will only clone a repo if local targets depend on something in it.

This mechanism has important implications for CI build triggering.

To make sure all repos’ builds remain up to date, once a new commit is pushed to one repo, its build and all other repos’ builds have to be triggered. This makes it possible to detect any repo whose build has been broken and to identify the exact commit and repo that broke it.

The more repositories you have, the bigger the chance for false triggering, where the repo does not depend on the changed target, but its build still gets triggered. If you only have a handful of repos, there is a much bigger chance of interdependence between them and a lower rate of false triggering.

For example, one of Wix’s vertical repositories, Wix Restaurants, may have only a handful of dependent repos, but a commit to it will still trigger the builds of all other repos.

Maintaining a dependency graph of the various repos and having build triggers determined according to it will certainly reduce false triggering. But this dependency graph is a moving target and is difficult to maintain.
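
To make the graph-based alternative concrete, here is a small sketch that computes which repo builds to trigger for a change, given a map from each repo to the repos it depends on. The function name, the graph shape and the repo names are assumptions for illustration; keeping the map current is exactly the hard part described above.

```python
from collections import deque
from typing import Dict, Set


def repos_to_trigger(changed: str, deps: Dict[str, Set[str]]) -> Set[str]:
    """Return the changed repo plus every repo that transitively depends on it.

    `deps` maps each repo to the set of repos it directly depends on.
    """
    # Invert the graph: for each repo, which repos depend on it directly?
    dependents: Dict[str, Set[str]] = {}
    for repo, direct_deps in deps.items():
        for dep in direct_deps:
            dependents.setdefault(dep, set()).add(repo)

    # Breadth-first walk over the inverted edges, starting at the changed repo.
    to_build = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in to_build:
                to_build.add(dependent)
                queue.append(dependent)
    return to_build


if __name__ == "__main__":
    # Placeholder repo names.
    graph = {"app": {"framework"}, "framework": {"core"}, "restaurants": {"core"}, "docs": set()}
    print(sorted(repos_to_trigger("framework", graph)))
```

With a graph like this, a commit to a repo that nothing else depends on triggers only its own build instead of every build.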

Additionally, bazel employs extensive caching, which means that build times will be very short, even on false triggers.

Figure: a quick summary of the advantages and disadvantages of each source code architecture

Wix has consolidated its backend codebase from close to 1,000 repos down to around 50, in order to keep the repo dependency mechanism simple.
In the future, we would like to consolidate the backend code into a monorepo, but only once Git gains the ability to serve very large code bases, probably using a VFS (Virtual File System).

There are several promising projects on this front, including VFS for Git by Microsoft and EdenFS by Facebook.

The previous post in the series was How to optimize local dev experience

The next and final post in this series is about How to avoid Jar dependency hell with Bazel

Thank you for reading!

Please also share on Facebook and Twitter. If you’d like to get updates, follow me on Twitter and Medium.

You can also visit my website, where you will find my previous blog posts, talks I gave at conferences and open-source projects I’m involved with.

If anything is unclear or you want to point out something, please comment down below.
