Automate Benchmarking Android Build with Gradle Profiler & Gradle Enterprise

Published in Traveloka Engineering Blog, Sep 8, 2022

Male photo created by nikitabuida — www.freepik.com

Editor’s Note:

Today, Doni will share how his team saved significant engineering time by using build automation tools to simplify Android build benchmarking and to identify build regressions caused by code changes.

Doni Winata is an Android software engineer in the Android Infra team, whose responsibilities include maintaining and improving Android engineering efficiency, such as optimizing build speed, building pipelines, and overseeing code maintainability.

Background

In the previous article, we talked about how to troubleshoot memory issues and find the optimal configuration for our JVM settings. To know which configuration gives the best build speed improvement, we need to benchmark our Gradle builds properly.

Benchmarking Gradle builds can be a tricky, tedious, and time-consuming process. Many factors, such as caches, a warm or cold daemon, full or incremental builds, and the project’s size, affect build time.

In this article, we will share how we benchmark and monitor our daily builds using Gradle Profiler and Gradle Enterprise. On top of that, we integrate them into our CI/CD pipeline to save time and to detect regressions in our daily builds.

What is Gradle Profiler?

Gradle Profiler is a tool that automates benchmarking by running a build scenario repeatedly and calculating statistics such as the mean build time and its standard error.
The benchmark result helps us determine whether there is any performance regression, for example, whether our build finishes slower or faster after updating Gradle or changing JVM settings. To better understand how Gradle Profiler works, let’s try some basic usage below.
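To make the two statistics concrete, here is a minimal sketch of how a mean and a standard error can be computed from a handful of build times. The numbers are hypothetical, and the function is an illustration rather than Gradle Profiler's own implementation.

```python
import math
import statistics


def summarize(build_times_ms):
    """Return (mean, standard error of the mean) for a list of build times."""
    mean = statistics.mean(build_times_ms)
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    std_err = statistics.stdev(build_times_ms) / math.sqrt(len(build_times_ms))
    return mean, std_err


# Hypothetical measured build times in milliseconds.
times = [100_000, 104_000, 98_000, 102_000]
mean, std_err = summarize(times)
print(f"mean={mean:.0f} ms, standard error={std_err:.0f} ms")
```

A small standard error relative to the mean suggests the measured builds were consistent. A large one is a hint to add iterations before drawing conclusions.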

Installing Gradle Profiler is straightforward. You can install it using SDKMAN!, Homebrew, or by downloading the binary directly (refer to this doc for the details).

Run simple benchmark on a Gradle task

After installation, navigate to your Gradle project in your terminal and run a benchmark for a Gradle task (app:assembleDebug in the example below):

> $ gradle-profiler --benchmark --project-dir . app:assembleDebug

Figure 1. Gradle Profiler executing benchmark build from the command line.

Gradle Profiler will run multiple builds of the given task and record the execution time of each one (see Figure 1 above). The results are written to HTML and CSV files inside the profile-out folder.

Figure 2. Benchmark report (HTML format).

In the report, shown in Figure 2 above, we can find the mean, median, and percentiles of our benchmark builds. Notice that Gradle Profiler runs warm-up builds to warm up the daemon and caches before running the measured builds over multiple iterations. The first warm-up build is expectedly slower; build times then drop and become more consistent by the time the measured builds run.
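The reason warm-up builds are kept out of the statistics can be shown with a short sketch. The rows below are hypothetical, and the label wording ("warm-up build #N", "measured build #N") is an assumption made for illustration, not a guaranteed report format.

```python
import statistics

# Hypothetical rows: (label, duration in ms). The label wording is an assumption.
rows = [
    ("warm-up build #1", 180_000),
    ("warm-up build #2", 125_000),
    ("measured build #1", 101_000),
    ("measured build #2", 99_000),
    ("measured build #3", 100_000),
]


def measured_only(rows):
    """Drop warm-up rows so the slow first builds do not skew the statistics."""
    return [ms for label, ms in rows if label.startswith("measured")]


measured = measured_only(rows)
median_ms = statistics.median(measured)
print(f"{len(measured)} measured builds, median {median_ms} ms")
```

Including the two warm-up rows would pull the mean well above what a developer sees on a warm daemon, which is why only measured builds are summarized.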

Define Scenario file

In most cases, we want to benchmark our build with different configurations to find the best one. To benchmark multiple build scenarios and add advanced configuration, you can create a new file at <root project>/benchmark.scenarios (the file name and extension can be anything) and add the following two scenarios:

// <root project>/benchmark.scenarios
full_build_app {
   tasks = ["app:assembleDebug"]
   gradle-args = ["--max-workers=1", "--rerun-tasks"]
   warm-ups = 2
}
full_build_app_3_worker {
   tasks = ["app:assembleDebug"]
   gradle-args = ["--max-workers=3", "--rerun-tasks"]
   warm-ups = 2
}

The example above sets max workers to one and three for the two scenarios, respectively. Also notice that we pass the --rerun-tasks flag to simulate a full build, in which every Gradle task is re-executed.

Run the same command as before, but replace the task name argument with --scenario-file <scenario file path>, as follows:

> $ gradle-profiler --benchmark --project-dir . --scenario-file ./benchmark.scenarios

This time, Gradle Profiler will run those two scenarios, and run 10 builds for each of them. Gradle Profiler will then merge the benchmark results into a single report:

Figure 3. Benchmark report from multiple scenarios (HTML format).

From the report, shown in Figure 3 above, we can compare the build speed of the two scenarios at each iteration. All metrics suggest that using three max workers is the faster option, reducing build time by around 20%. This simple example gives an idea of how Gradle Profiler makes benchmarking easier.
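As a worked example of how such a percentage can be derived, the sketch below compares the mean build times of two scenarios. The build times are hypothetical values chosen for illustration; they are not taken from the report in Figure 3.

```python
import statistics


def time_saved_percent(baseline_ms, candidate_ms):
    """Percentage of build time saved by the candidate relative to the baseline."""
    base = statistics.mean(baseline_ms)
    cand = statistics.mean(candidate_ms)
    return (base - cand) / base * 100


# Hypothetical measured build times in milliseconds for each scenario.
one_worker = [200_000, 210_000, 190_000]
three_workers = [160_000, 165_000, 155_000]

saved = time_saved_percent(one_worker, three_workers)
print(f"three workers save about {saved:.0f}% of build time")
```

A positive result means the candidate scenario is faster; a negative one means it is slower than the baseline.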

Gradle Profiler also provides more options to configure your build scenarios, which you can read about here.

Run Gradle Profiler on CI/CD Pipeline

Our Android project has more than 300 modules, and a full build can take 15–20 minutes to complete. Running benchmark builds on local machines is time-consuming, so we came up with the idea of running Gradle Profiler on CI/CD. In addition, CI/CD provides a more stable and consistent environment for benchmarks than local machines.

To integrate Gradle Profiler, we created a Jenkins pipeline that downloads the Gradle Profiler binary and then runs the profiler based on these parameters:

Git Branch (which branch we want to benchmark)
Scenario Name (which scenario to benchmark, e.g., full_build_app or full_build_app_3_worker)
Number of Iterations (how many times we want to run the measured build)

Running benchmarks on CI/CD saves us a lot of time because people can now benchmark multiple scenarios without blocking their daily work on a local machine. They simply run a Jenkins pipeline, wait for the benchmark to complete, and then download the result.
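The mapping from pipeline parameters to a profiler invocation could look roughly like the sketch below. It is an assumption about how such a job might assemble the command, not our actual Jenkins script: the function name is hypothetical, the Git branch is assumed to be checked out by the pipeline beforehand, and the flags used are --scenario-file and --iterations as spelled in the Gradle Profiler README.

```python
import shlex


def profiler_command(scenario_file, scenario_name, iterations, project_dir="."):
    """Build the gradle-profiler argument list for one benchmark run."""
    return [
        "gradle-profiler", "--benchmark",
        "--project-dir", project_dir,
        "--scenario-file", scenario_file,
        "--iterations", str(iterations),  # number of measured builds
        scenario_name,                    # run only this scenario from the file
    ]


cmd = profiler_command("benchmark.scenarios", "full_build_app", 3)
print(shlex.join(cmd))
```

Keeping the arguments as a list avoids shell quoting problems when a parameter value contains spaces; the list can be passed directly to subprocess.run.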

But manually downloading the benchmark result is inconvenient, and our CI/CD builds expire, which removes all benchmark results after a few days. So, we need to record benchmark results somewhere outside of Jenkins, where we can simply open a dashboard whenever we want to review them. For this, we use the Gradle Enterprise features that we already rely on to monitor our daily builds.

Evaluate Benchmark Result on Gradle Enterprise

All of our Gradle builds are uploaded to the Gradle Enterprise server, including benchmark builds. Gradle Enterprise gives us visibility into key performance metrics and trend reports for our daily builds (see Figure 4 below). We can also filter our builds by requested tasks, user, tags, or branch.

Figure 4. Gradle Enterprise dashboard for Build Trends.

To learn more about how to extend Gradle Enterprise’s features, check this doc.

To separate benchmark builds from other builds, we need to add build scan tags for the scenario name and the profiler phase (warm-up or measured build).

First, we pass a new project property, scenarioName, via -PscenarioName in our scenario file:

// <root project>/benchmark.scenarios
full_build_app {
   tasks = ["app:assembleDebug"]
   gradle-args = ["-PscenarioName=full_build_app", "--max-workers=1", "--rerun-tasks"]
   warm-ups = 2
}

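Then, in our build.gradle file, we add tags based on that property and the profiler phase:

//build.gradle
// Tag the build scan only when the build was started from a benchmark scenario.
if (project.hasProperty("scenarioName")) {
    buildScan.tag project.scenarioName
    // The profiler phase (warm-up or measured build) is read from a system property.
    buildScan.tag System.getProperty("org.gradle.profiler.phase")
}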

With those tags added, we can now filter our builds by scenario name and profiler phase (see Figure 5 below).

Figure 5. Filtering Build trends with tags from benchmark builds.

We also send other tags, such as branch name, machine OS, and CI/CD information, using the Common Custom User Data Gradle plugin. So, after engineers run benchmarks on Jenkins, they just open the Gradle Enterprise dashboard and filter by the benchmark id to check the result. Each build execution is recorded in a Gradle Build Scan, which provides insight into why a build was slower or faster.

Benchmark Latest Development Changes

The integration of Gradle Profiler and Gradle Enterprise into our CI/CD pipeline is very useful for determining whether new changes affect our Android build speed.

Figure 6. Process comparing build speed between branches with Gradle Profiler & Gradle Enterprise.

Let’s assume we want to compare the build speed between two different release versions on different branches. Engineers run multiple Gradle Profiler scenarios on those branches on CI/CD, and the results are uploaded to Gradle Enterprise. After all benchmarks are completed, engineers can filter the builds by scenario name, branch, and the time the benchmark was executed. They can then decide whether there is any build regression between the two versions.
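The filter-then-compare step can be sketched in a few lines. The records below are hypothetical stand-ins for builds seen on a dashboard: the field names, branch names, and the phase tag values ("WARM_UP", "MEASURE") are all assumptions made for illustration.

```python
import statistics

# Hypothetical build records; field names and tag values are assumptions.
builds = [
    {"branch": "release/1.0", "tags": {"full_build_app", "WARM_UP"}, "ms": 900_000},
    {"branch": "release/1.0", "tags": {"full_build_app", "MEASURE"}, "ms": 790_000},
    {"branch": "release/1.0", "tags": {"full_build_app", "MEASURE"}, "ms": 800_000},
    {"branch": "release/1.0", "tags": {"full_build_app", "MEASURE"}, "ms": 780_000},
    {"branch": "release/1.1", "tags": {"full_build_app", "MEASURE"}, "ms": 860_000},
    {"branch": "release/1.1", "tags": {"full_build_app", "MEASURE"}, "ms": 850_000},
    {"branch": "release/1.1", "tags": {"full_build_app", "MEASURE"}, "ms": 869_000},
]


def median_for(builds, branch, scenario, phase="MEASURE"):
    """Median build time of the builds matching a branch, scenario tag, and phase tag."""
    times = [b["ms"] for b in builds
             if b["branch"] == branch and {scenario, phase} <= b["tags"]]
    return statistics.median(times)


old = median_for(builds, "release/1.0", "full_build_app")
new = median_for(builds, "release/1.1", "full_build_app")
change_percent = (new - old) / old * 100
print(f"median went from {old} ms to {new} ms ({change_percent:+.1f}%)")
```

Filtering on the phase tag keeps warm-up builds out of the comparison, so the two medians describe like-for-like measured builds.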

Here are four benchmark scenarios that we usually run on our pipeline:

Full Build

//<root project>/benchmark.scenarios
full_build_app {
   tasks = ["app:bundleDebug"]
   gradle-args = ["--rerun-tasks"]
   warm-ups = 1
}

The Full Build scenario simulates a condition where there is no cache and no up-to-date task, which means all tasks need to be re-executed. It runs bundleDebug because we use Dynamic Feature Modules, and we pass --rerun-tasks to force every Gradle task to run again. This build takes about 13 minutes, so we run only one warm-up build and three measured builds.
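The trade-off behind that choice is simple arithmetic, sketched below. The 13-minute figure comes from the text above; the second configuration (2 warm-ups, 10 measured builds) is only a comparison point, and the estimate ignores any overhead between builds.

```python
def benchmark_minutes(minutes_per_build, warm_ups, iterations):
    """Rough duration of one scenario: every warm-up and measured build runs in full."""
    return minutes_per_build * (warm_ups + iterations)


chosen = benchmark_minutes(13, warm_ups=1, iterations=3)
larger = benchmark_minutes(13, warm_ups=2, iterations=10)
print(f"1 warm-up + 3 measured builds: about {chosen} minutes")
print(f"2 warm-ups + 10 measured builds: about {larger} minutes")
```

Fewer iterations give a noisier estimate, so this is a compromise between statistical confidence and CI/CD machine time.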

This scenario is useful for spotting issues in our full build, for example, poor parallelization, bottleneck tasks, non-optimal JVM settings, or high GC time.

Incremental Build

incremental_build {
   tasks = ["app:bundleDebug"]
   apply-abi-change-to = "library/src/main/java/com/traveloka/android/util/CommonUtil.java"
   warm-ups = 1
}

An incremental build re-executes only the tasks affected by code changes and skips tasks that are already up-to-date or cached. In this scenario, the warm-up build runs a full build, and each subsequent build adds a dummy method to the file specified by the apply-abi-change-to parameter. This simulates the most common scenario in our engineers’ daily work when developing a product. The scenario helps us detect incremental build issues, for example, if the kapt compiler plugin does not work as expected or someone adds a new annotation processor that does not support incremental builds.

Incremental build speed depends on which file is changed and what kind of change is made compared to the previous build. You can simulate other kinds of changes, such as Android resource changes or non-ABI changes to Kotlin/Java files, with the corresponding scenario parameters.

Clean Build

clean_build {
   tasks = ["app:bundleDebug"]
   cleanup-tasks = ["clean"]
   warm-ups = 1
}

The Clean Build scenario runs the clean task after each iteration. During the warm-up build, our CI/CD pushes the build cache to our remote cache, and subsequent builds are expected to reuse it. This scenario helps us determine whether something is breaking build cache reusability.
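One way to check cache reuse outside of a build scan is to count task outcomes in the plain console log. The sketch below assumes the build was run with --console=plain, where each task line looks like "> Task :path OUTCOME"; the sample log and task names are hypothetical, and "EXECUTED" is a label introduced here for task lines that carry no outcome suffix.

```python
import re
from collections import Counter

# Matches "> Task :module:taskName" with an optional outcome such as FROM-CACHE.
TASK_LINE = re.compile(r"^> Task (\S+)(?: (\S+))?$")


def outcome_counts(log_text):
    """Count task outcomes; tasks without a suffix are counted as EXECUTED."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TASK_LINE.match(line.strip())
        if match:
            counts[match.group(2) or "EXECUTED"] += 1
    return counts


# Hypothetical excerpt of a plain console log.
sample_log = """\
> Task :app:preBuild UP-TO-DATE
> Task :library:compileDebugKotlin FROM-CACHE
> Task :library:compileDebugJavaWithJavac FROM-CACHE
> Task :app:mergeDebugResources
"""

counts = outcome_counts(sample_log)
print(dict(counts))
```

If the share of FROM-CACHE tasks drops between two runs of this scenario, some task input has likely become non-reproducible and is causing cache misses.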

Minify/R8 Build

r8_task {
   tasks = ["app:bundleRelease"]
   apply-abi-change-to = "app/src/main/java/com/traveloka/android/util/Application.java"
   warm-ups = 1
}

In the Minify/R8 Build scenario, shrinking, obfuscation, and optimization are enabled by setting minifyEnabled to true. Each iteration makes an incremental change in the application module so that only the R8 task is re-executed. R8 is a very memory-hungry process that can easily hit an OutOfMemoryError, so we monitor this scenario to ensure there are no build issues with it.

For each of those scenarios, we add a scheduler on Jenkins and run it daily on our development branch. Figure 7 below explains how we monitor it daily.

Figure 7. Daily monitoring of build speed with Gradle Profiler and Gradle Enterprise.

Every working day, the scheduler runs all benchmarks on the development branch at 12 AM (when the CI/CD workload is at its lowest). All results are recorded on Gradle Enterprise, and we simply add tag filters to analyze those scenarios.

Figure 8. Build trends from Benchmark Build (minified scenario).

The example in Figure 8 above shows the build trend for the Minify scenario. We noticed that build time began to increase in August. When we find a build regression, we check the build scan that corresponds to the spike and compare it to those from previous days.
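Spotting such a spike does not have to be done by eye. As a minimal sketch, assuming we had one build time per day for a scenario, each day can be compared against the average of the preceding days. The window size, threshold, and data below are hypothetical choices for illustration.

```python
import statistics


def find_spikes(daily_values, window=3, threshold_percent=10.0):
    """Return indexes of days that exceed the mean of the previous `window` days."""
    spikes = []
    for i in range(window, len(daily_values)):
        baseline = statistics.mean(daily_values[i - window:i])
        change = (daily_values[i] - baseline) / baseline * 100
        if change > threshold_percent:
            spikes.append(i)
    return spikes


# Hypothetical daily build times in seconds for one scenario.
daily = [600, 610, 590, 605, 700, 710]
spikes = find_spikes(daily)
print(f"days to investigate: {spikes}")
```

The first flagged index is the day whose build scan is worth comparing against the previous day's. Later days stay flagged until the rolling baseline catches up with the new level.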

We discovered that our R8 build time increased because we had changed our debuggable setting to false on Android Gradle Plugin 7.2. We also found that some machines were doing garbage collection at the time, and that our CI/CD machine had leaked processes occupying memory.

Closing

Benchmarking Android build speed can be very challenging and time-consuming for large projects. We covered how we solved this problem by automating the benchmark process with Gradle Profiler on our CI/CD pipeline and collecting the results on Gradle Enterprise. This allows us to compare changes easily and monitor our daily builds to catch any regression.

As a further improvement, we could run benchmarks on a Pull Request before it is merged into the development branch (with the caveat of stressing the CI/CD workload for large projects). If a preset build time threshold is exceeded, for example, engineers could be notified through Slack or email.

We are constantly exploring new technologies to build scalable systems. Check out Traveloka’s career page and join us on our adventure!