Migrating to Bazel from Maven or Gradle? Part 1 — How to choose the right build unit granularity

Natan Silnitsky
Wix Engineering
Feb 9, 2019

Bazel is very flexible when it comes to choosing the build unit granularity.

Unlike Maven and Gradle, which have a convention-based notion of a module (Maven) or a project (Gradle) where all the relevant files reside, each Bazel build unit, called a target, explicitly specifies which files are part of it.
These units can range from a single file to an entire repository's file set.

There are two popular build unit organization options:

  1. keep Maven/Gradle granularity of module/project builds
  2. use Java package granularity — the 1:1:1 approach

The trade-off here is between ease of migration and build performance.

Module Granularity

If you keep the Maven/Gradle module-level granularity, you will have a relatively easy time encoding your current pom/Gradle files in Bazel's format.

Comparing a pom.xml or Gradle build file with the equivalent BUILD.bazel file shows how easy it can be to change formats in this case.

Almost all the information you need is found in the original config files themselves.
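As a rough illustration of that mapping, here is a minimal sketch that reads a pom.xml and prints one module-level `java_library` target. It is not the output of any particular migration tool: the `//third_party:` label scheme, the `orders-service` module, and the `src/main/java` glob are assumptions made for the example, and it treats every dependency as a third-party one.

```python
import xml.etree.ElementTree as ET

POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def third_party_label(group_id, artifact_id):
    # Hypothetical label scheme; use whatever your dependency tool generates.
    name = "{}_{}".format(group_id, artifact_id)
    return "//third_party:" + name.replace(".", "_").replace("-", "_")


def pom_to_build(pom_xml):
    """Render one module-level java_library from a pom.xml string."""
    root = ET.fromstring(pom_xml)
    module_name = root.findtext("m:artifactId", namespaces=POM_NS)
    deps = []
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        scope = dep.findtext("m:scope", default="compile", namespaces=POM_NS)
        if scope == "test":
            continue  # test-only deps belong on a separate test target
        deps.append(third_party_label(
            dep.findtext("m:groupId", namespaces=POM_NS),
            dep.findtext("m:artifactId", namespaces=POM_NS),
        ))
    lines = [
        "java_library(",
        '    name = "{}",'.format(module_name),
        '    srcs = glob(["src/main/java/**/*.java"]),',
        "    deps = [",
    ]
    lines += ['        "{}",'.format(d) for d in sorted(deps)]
    lines += ["    ],", ")"]
    return "\n".join(lines)


# Made-up module used only to exercise the sketch.
EXAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <artifactId>orders-service</artifactId>
  <dependencies>
    <dependency>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>"""

print(pom_to_build(EXAMPLE_POM))
```

A real migration would also need to tell dependencies on sibling modules apart from third-party artifacts and to emit test targets, which this sketch leaves out.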

For Maven, Apache Maven Reader is a great tool that helps with deserializing the Maven model.
Bazel-deps is a tool that helps make third-party dependencies available in the Bazel workspace (binary dependencies have to be fetched before the build to adhere to Bazel's hermetic build design).

The flip side of keeping module granularity is the performance hit. Bazel is optimized for fine-grained build units. Any change you make anywhere in the module means rebuilding and retesting the entire module and potentially many of its dependent modules. In addition, because there is only one test target, all of the tests will run sequentially instead of in parallel.
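To make the cost concrete, the following sketch computes which targets are invalidated by a change: the changed target plus everything that transitively depends on it. This is a simplified model rather than Bazel's actual invalidation logic, and all target names are hypothetical.

```python
from collections import defaultdict


def affected_targets(deps, changed):
    """Return the changed target plus everything that transitively depends on it."""
    reverse_deps = defaultdict(set)
    for target, target_deps in deps.items():
        for dep in target_deps:
            reverse_deps[dep].add(target)
    affected, stack = set(), [changed]
    while stack:
        target = stack.pop()
        if target not in affected:
            affected.add(target)
            stack.extend(reverse_deps[target])
    return affected


# Module granularity: one target per Maven module (hypothetical names).
MODULE_DEPS = {
    "//common": [],
    "//orders": ["//common"],
    "//billing": ["//common"],
}

# Package granularity: the same code split into one target per Java package.
PACKAGE_DEPS = {
    "//common/json": [],
    "//common/time": [],
    "//orders/api": ["//common/json"],
    "//billing/core": ["//common/time"],
}

# A change to a time utility class touches //common in the first layout
# and only //common/time in the second.
print(sorted(affected_targets(MODULE_DEPS, "//common")))
print(sorted(affected_targets(PACKAGE_DEPS, "//common/time")))
```

In the module layout every target is rebuilt and retested, while in the package layout the orders code is left alone because it never depended on the changed package.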

Package Granularity

If you want fast build times from day one, you have the option of switching to Java package granularity, meaning 1 build target in 1 build file for 1 Java package (the 1:1:1 principle).

With this approach you gain the full benefits of Bazel, where on average there will be much higher cache hit rates and a much higher degree of test run parallelism.

You will need to work harder to prepare the build file configurations, as the pom files will not be enough. You need a source code analysis tool that finds the dependencies at the package level. There are several tools that help with analyzing Java source code (Jadep and BUILD_file_generator), but they are quite basic and limited in their abilities.

For instance, given a BUILD file that represents the code of a single Java package, running Jadep will add all the missing deps that the Java files in this package need to build successfully.
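The kind of package-level analysis described above can be sketched in a few lines. This is not how Jadep works internally; it is a deliberately naive illustration that scans `import` statements with regular expressions. The label scheme (one target per directory under `src/main/java`) and the `com.example` packages are assumptions for the example.

```python
import re

PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)
# Captures the package part of plain class imports only; static and
# wildcard imports do not match and are ignored by this sketch.
IMPORT_RE = re.compile(r"^\s*import\s+([\w.]+)\.\w+\s*;", re.MULTILINE)


def package_label(java_package):
    # Assumes a src/main/java layout with one target per package directory.
    return "//src/main/java/" + java_package.replace(".", "/")


def missing_deps(java_source, declared_deps, internal_prefix):
    """List labels for imported in-repo packages that are not declared yet."""
    own = PACKAGE_RE.search(java_source)
    own_package = own.group(1) if own else ""
    needed = set()
    for pkg in IMPORT_RE.findall(java_source):
        if pkg.startswith(internal_prefix) and pkg != own_package:
            needed.add(package_label(pkg))
    return sorted(needed - set(declared_deps))


# Made-up source file and existing deps, used only to exercise the sketch.
EXAMPLE_SOURCE = """
package com.example.orders;

import com.example.common.json.JsonMapper;
import com.example.common.time.Clock;
import java.util.List;

public class OrderService {}
"""

DECLARED = ["//src/main/java/com/example/common/json"]

print(missing_deps(EXAMPLE_SOURCE, DECLARED, "com.example."))
```

Real tools have to handle much more than this, such as nested classes, fully qualified references without an import, and third-party artifacts, which is part of why this area needs strong tooling.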

Another disadvantage is that there are many more build configuration files to keep up to date whenever the source code changes.

This area requires strong tooling, and Wix is actively working on tools that automatically add missing dependencies for Java and Scala.

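Here is a quick summary of the advantages and disadvantages of each method:

Module granularity: the migration is relatively easy because almost all the required information is already in the pom/Gradle files, but builds are slower, since any change rebuilds and retests the whole module and its tests run sequentially.
Package granularity: cache hit rates and test parallelism are much higher, but the migration requires source code analysis tooling and there are many more build files to maintain.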

At Wix, we chose package granularity, as drastically reducing build times was of paramount importance to us.

We have created an automatic migration tool called Exodus. It parses the file-level dependencies that are produced by the Zinc Maven plugin and creates a dependency graph. It then transforms the graph into package-level dependencies and creates the appropriate targets from it (one target per package, or bigger aggregate targets in case of cyclic dependencies between packages).

We have recently open sourced Exodus. You can find more information about it here:
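The graph transformation can be illustrated with a short sketch. This is not Exodus's actual code: it assumes the file-level dependencies arrive as a plain mapping from source path to the paths it depends on (a hypothetical input format), collapses them to package-level edges, and then merges packages that depend on each other cyclically by finding strongly connected components with Tarjan's algorithm.

```python
import posixpath


def package_graph(file_deps):
    """Collapse file -> files edges into package -> packages edges."""
    graph = {}
    for src, targets in file_deps.items():
        src_pkg = posixpath.dirname(src)
        graph.setdefault(src_pkg, set())
        for dst in targets:
            dst_pkg = posixpath.dirname(dst)
            graph.setdefault(dst_pkg, set())
            if dst_pkg != src_pkg:
                graph[src_pkg].add(dst_pkg)
    return graph


def strongly_connected_components(graph):
    """Tarjan's algorithm; each component is a frozenset of packages."""
    index, low, on_stack, stack, result = {}, {}, set(), [], []

    def visit(node):
        index[node] = low[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for nxt in graph[node]:
            if nxt not in index:
                visit(nxt)
                low[node] = min(low[node], low[nxt])
            elif nxt in on_stack:
                low[node] = min(low[node], index[nxt])
        if low[node] == index[node]:
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            result.append(frozenset(component))

    for node in list(graph):
        if node not in index:
            visit(node)
    return result


def package_targets(file_deps):
    """Map each target (a set of packages) to the targets it depends on."""
    graph = package_graph(file_deps)
    components = strongly_connected_components(graph)
    owner = {pkg: comp for comp in components for pkg in comp}
    return {
        comp: {owner[dep] for pkg in comp for dep in graph[pkg]} - {comp}
        for comp in components
    }


# Made-up input: com/a and com/b depend on each other, com/c stands alone.
EXAMPLE_FILE_DEPS = {
    "com/a/A.java": ["com/b/B.java"],
    "com/b/B.java": ["com/a/A2.java", "com/c/C.java"],
    "com/a/A2.java": [],
    "com/c/C.java": [],
}

for target, deps in package_targets(EXAMPLE_FILE_DEPS).items():
    print(sorted(target), "->", [sorted(d) for d in deps])
```

In this example `com/a` and `com/b` end up in one aggregate target, because Bazel does not allow cycles between targets, while `com/c` keeps its own target.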

File Granularity

For completeness, there is also the option of a build unit per class/file.

I would not recommend this option, because each build configuration would need very frequent maintenance.

Another disadvantage is the performance penalty that can result from analysis overhead and from redundant startup costs for integration test suites.

A talk I gave at JEE Conf based on this blog post series

The previous post in the series was the intro

The next post in this series is about how to decide on a CI server and remote execution / caching.

Thank you for reading

