Accelerate Your Android Development: Essential Tips to Minimize Gradle Build Time (Part II of II)

rolgalan
The Glovo Tech Blog
10 min read · Nov 6, 2023

Introduction

In the previous part of this article, we emphasized how reducing build time can enhance developer productivity and business value.

We highlighted that caching the output of previous tasks for reusability and leveraging parallel builds are the most impactful actions.

Let’s now review some other techniques and configurations to keep improving your build times. Even if some of these might not be as effective as the ones outlined in the first article, they are still quite relevant: once you have applied all of the previous actions, these additional ones will help you shave off even more build time.

As mentioned previously, all the learnings shared here were acquired from Android projects, but all of the Gradle techniques discussed can be applied to any other Gradle project unrelated to mobile.

The hardware

While it may seem obvious, upgrading the machines that build your app should be one of your first considerations to reduce build time. This means both the remote agents from your CI/CD and your local development laptop. (Are your engineers already using M2s? 👀).

Given that building an application is a CPU- and memory-intensive process, it’s crucial to understand the machines on which the project builds. The number of cores determines how many tasks can execute in parallel, and their clock rate determines how fast each one runs. At the same time, you are going to need a lot of memory available to run the whole process (especially if you parallelize). We mentioned in the previous article how important parallelization is; if you are investing in it, it makes sense to ensure your machines can actually support it. In the next section we’ll discuss how this parallelization also impacts memory.

Although often overlooked, disk I/O throughput is critical, as building an app involves constant disk reads and writes. We learnt this the hard way! Quite recently we detected huge penalties during a migration of our CI agents to different runners, especially during Gradle task fingerprinting: the time spent reusing tasks from the cache increased 3x when changing from Fargate to EC2, because the default disk used in the latter had worse capabilities. If you are building your projects in AWS, make sure your disk is NVMe.

While disk space may seem trivial in today’s context and often goes unmentioned, we encountered issues when using CI agents with only 20GB of disk space (this was the limit in AWS Fargate at some point). One particular thing to look at is the transitive R class, which duplicates the resources of every dependency in each module. New projects now get a non-transitive R class by default, but if you are working with a project older than a few years, make sure to enable this flag, as it also improves build speed.
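
For reference, this is the flag in question, which new projects created with recent versions of AGP already ship enabled:

    # gradle.properties
    # Each module's R class only contains its own resources, instead of
    # also merging the resources of all of its dependencies
    android.nonTransitiveRClass=true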

The JVM memory settings

As previously mentioned, the build process demands a significant amount of memory, making memory configuration the most important setting for your project. Since Gradle executes in a JVM process, this should be done through the org.gradle.jvmargs property in the gradle.properties file.

By default, Gradle sets org.gradle.jvmargs=-Xmx512m -XX:MaxMetaspaceSize=384m, which is arguably quite small for developing any Android application nowadays.

There are many things playing together here and it’s important to take all of them into account, especially if your system is constrained and you cannot have all the RAM you would like. Let’s go one by one:

The heap is the most important part: a large enough heap reduces the time spent in Garbage Collection, maximizing throughput, so make sure to set a high enough Xmx value in org.gradle.jvmargs. The initial heap size also helps avoid wasting cycles dynamically growing the heap (which requires the GC to run), so you should set a reasonable Xms value as well (maybe half of your Xmx, or matching it).

But you have to be careful, because the Gradle Daemon will spin up a separate process to compile the Kotlin code: the Kotlin Compiler Daemon. By default this process inherits the jvmargs settings from the main Gradle Daemon, unless you add an extra kotlin.daemon.jvmargs Gradle property in the gradle.properties file. I recommend doing so, and you can probably limit it to a lower heap than the main Gradle Daemon’s.
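
As a rough sketch, here is how both properties could look together. The concrete numbers are placeholders to tune for your machines and project size, not recommendations:

    # gradle.properties
    # Main Gradle Daemon: generous max heap, initial heap at half of it
    org.gradle.jvmargs=-Xmx6g -Xms3g -XX:MaxMetaspaceSize=1g
    # Kotlin Compiler Daemon: explicit settings instead of inheriting
    # the ones above, with a somewhat lower heap
    kotlin.daemon.jvmargs=-Xmx4g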

If you still have some Java code, Gradle will spawn separate workers for it. These used to be disposable, but since Gradle 8.3 they are promoted to long-lived daemons. Keep an eye on these as well when configuring the memory.

Please note that setting any value for org.gradle.jvmargs overrides the Gradle defaults mentioned above, so if you increase the heap, you will lose the existing limit on the JVM Metaspace, which the JVM does not cap by default. At some point we had issues with Metaspace growing uncontrollably for some unknown reason, and we needed to cap it with -XX:MaxMetaspaceSize to prevent it from cannibalizing our available memory; it has not been a problem recently and we no longer need this setting. If you are using SonarQube, note that it lists Metaspace in its troubleshooting guide, so keep an eye on it.

Unit tests are also executed in a separate JVM process, usually one Gradle worker per test module. By default these test workers get a maximum heap of 512m (regardless of your gradle.properties settings). If you keep your modules small, this should be enough, but you can increase the value with maxHeapSize = "1024m" inside a test { } block in your Gradle script (see the sketch after this list).

  • What is important here is that all these test workers are launched in parallel, one per core, spawning separate JVM processes that can easily occupy a big chunk of your machine’s available memory if you have many cores. In our case we now run with 16 cores, so this can easily add up to 16 GB just for these parallel workers (for big applications, engineers rarely run the whole suite locally, so this mostly impacts the CI).
  • It is true that you can set the org.gradle.workers.max property to limit the number of workers executing in parallel… but why would you do that? You are paying for all of these extra cores precisely to maximize what you can run in parallel, so use this only as a last resort.
  • It is also worth highlighting that maxHeapSize can be configured for the whole app or overridden in a specific module. You might have an outlier that requires much more memory than you can afford to give all modules (due to their parallelization), so you could set extra memory for that specific one and leave the rest with a smaller default.
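
Here is a minimal sketch of both ideas in the Gradle Kotlin DSL (the 2g value for the outlier module is a hypothetical example):

    // Convention applied to every module's build.gradle.kts
    tasks.withType<Test>().configureEach {
        // Raise the 512m default that test workers get
        // regardless of org.gradle.jvmargs
        maxHeapSize = "1024m"
    }

    // build.gradle.kts of a single memory-hungry outlier module
    tasks.withType<Test>().configureEach {
        maxHeapSize = "2g"
    }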

And remember, the heap is not everything. The heap usually represents around 70% of the memory used by each Java process, Metaspace around 20%, and the remaining ~10% is spread across various areas of native memory not really relevant for us at this point. If the available memory in your machines is limited, you will need to take this into account when choosing your memory settings, so you ensure there is some headroom left for all the processes.

Update dependencies

Even if it may seem obvious to some, I frequently encounter queries in public forums from individuals struggling with outdated versions of the basic tooling. The truth is that Gradle, the JDK, AGP, Kotlin… all are constantly introducing performance improvements, so keeping your dependencies up to date is usually a good way to keep your build times under control “for free”.

One of the latest and most relevant examples is Hilt/Dagger, the most common Dependency Injection framework in Android. It is one of the top contributors to slow builds in big projects, since it makes heavy use of annotation processing. It was based on KAPT for a long time, and only a few weeks ago were the required changes finally made to support KSP instead, which is much faster because it skips the intermediate Java stub generation that KAPT requires. So… are you on the latest Dagger version already?
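
The migration essentially boils down to swapping the annotation processing plugin. The versions below are the ones current around the time of writing, so double-check the latest ones:

    // build.gradle.kts — before, with KAPT
    plugins {
        kotlin("kapt")
    }
    dependencies {
        kapt("com.google.dagger:hilt-compiler:2.48.1")
    }

    // build.gradle.kts — after, with KSP
    plugins {
        id("com.google.devtools.ksp") version "1.9.10-1.0.13"
    }
    dependencies {
        ksp("com.google.dagger:hilt-compiler:2.48.1")
    }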

The best you can do is introduce tooling to automatically update your dependencies, such as Renovatebot or Dependabot, which will regularly open PRs in your repos to update to the latest versions and run all the CI checks.

Other Minor optimizations

Everything mentioned so far will introduce really noticeable improvements in your build times.

Once the major improvements are implemented, you can consider minor optimizations to further reduce build time and address edge cases. Let’s see some examples.

Pre-cache dependencies

Building a project usually requires downloading several dependencies to the system. This can easily add a couple of minutes (or sometimes more) to your builds, and it is a rather erratic delay, as it depends on network variability. In general it is good to have all or some of them already accessible, so you should apply the advice from Gradle’s documentation on Dealing with ephemeral builds.

Naturally, this extends beyond typical build dependencies to include Gradle itself, the Android SDK tools, and the Android framework artifacts that Robolectric downloads for your tests. These should all be pre-downloaded on your CI agents.

If you use the official gradle-build-action with GitHub Actions, this is done by default, and it even caches many other elements that accelerate your builds further (particularly many contents of the ~/.gradle home folder, such as compiled build scripts, and more).

Different settings for CI and local

The structure of your CI builds, and how tasks are launched, influences this. In our case, we aim to validate multiple aspects in each PR, including unit tests, linting, release builds, and UI test artifacts (even if not executed). This allows us to ensure that PR changes do not impact other necessary tasks in different stages.

Since we had some powerful CI agents with many cores, we previously launched a single execution requesting all tasks simultaneously, letting Gradle parallelize everything with its internal workers. This has completely different memory requirements from the day-to-day of engineers’ local builds, which usually launch tasks one by one. For this reason we tweaked the memory settings for the CI, overriding the jvmargs and other gradle.properties values on our CI agents.

Remember that anything declared in your home directory (~/.gradle/gradle.properties) overrides the project settings, making it easy to modify the configuration for many of the other settings mentioned in this section.
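
As an illustration, a hypothetical ~/.gradle/gradle.properties provisioned only on the CI agents could look like this, leaving the project’s own file tuned for local builds (the values are placeholders):

    # ~/.gradle/gradle.properties on the CI agent — overrides the project file
    # Bigger heap for the single execution that requests every task at once
    org.gradle.jvmargs=-Xmx20g -Xms10g -XX:MaxMetaspaceSize=2g
    # File system watching is not useful on ephemeral CI agents (see below)
    org.gradle.vfs.watch=false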

Avoid daemons duplication from the IDE

If the JAVA_HOME environment variable points to a different JVM than the one configured in IntelliJ IDEA (or Android Studio), and you usually run tasks both from the IDE and the terminal, you may end up with duplicated Gradle Daemons consuming extra CPU and memory. Make sure both are configured to use the same JVM.

There is a nice plugin maintained by Gradle engineers that flags this misconfiguration, along with other minor optimization opportunities: check out the Gradle Doctor plugin.
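
Applying it is a one-liner in the root build script (0.9.1 is just an example version; pick whatever is latest when you read this):

    // Root build.gradle.kts
    plugins {
        // Gradle Doctor: warns about daemon duplication, among other checks
        id("com.osacky.doctor") version "0.9.1"
    }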

Stop watching file system in CI

Gradle has a nice feature, enabled by default, that significantly accelerates incremental builds: file system watching. It allows Gradle to keep what it has learned about the file system in memory between builds instead of polling the file system on each build, reducing the amount of disk I/O needed. You should keep this enabled for your local development.

However, this is probably overkill for the CI, as nothing is expected to change (and you probably only run a single execution), so you can disable it explicitly by adding org.gradle.vfs.watch=false to your gradle.properties. Make sure you disable it only for the CI.
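
You can either set the property on the CI agents or pass the equivalent command-line flag on the CI build step; both forms below achieve the same:

    # In the CI agent's ~/.gradle/gradle.properties
    org.gradle.vfs.watch=false

    # Or as a flag on each invocation
    ./gradlew build --no-watch-fs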

We haven’t quantified the impact of this (as we applied many other changes at the time), but intuitively this setting seems unnecessary in the CI. I would love to hear from anyone with data around the impact of this setting.

Garbage collector

Since the release of JDK 9, G1 has been the default garbage collector; however, the Android documentation encourages you to use ParallelGC instead. In most cases the difference might not be big, but in others it can be huge.

For reasons still unclear to us, some complex clean builds targeting several tasks at once failed with GC overhead errors or took over an hour with ParallelGC, despite allocating substantial extra memory to the heap. However, we managed to reduce this to around 35 minutes simply by switching to the G1 collector.

So I am not telling you to change your garbage collector, but encouraging you to test different settings. Do not hesitate to try this (or other, newer GCs) if you’re having memory issues, as it might help, even if the reasons why are not clear.
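
Switching is a single JVM flag inside your memory settings, so it is cheap to benchmark both collectors against each other (heap sizes below are placeholders):

    # gradle.properties — ParallelGC, as suggested by the Android documentation
    org.gradle.jvmargs=-Xmx6g -XX:+UseParallelGC

    # gradle.properties — G1 (the JDK default), which worked better for our heavy CI builds
    org.gradle.jvmargs=-Xmx6g -XX:+UseG1GC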

Fork test execution

I mentioned at the beginning that Gradle runs a separate parallel JVM process for each module when running the test suite. You can also split a single module’s tests across several parallel forked processes.

In our experience, this approach penalized our CI builds but benefited local builds, likely because the CI executes everything simultaneously using all available resources, while engineers typically run tests for a single module, which is a lighter task that leaves some cores available.
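
In Gradle this is controlled with maxParallelForks on the Test task. Here is a sketch that enables it only outside the CI; detecting the CI through an environment variable is an assumption, so adapt it to your setup:

    // build.gradle.kts
    tasks.withType<Test>().configureEach {
        // Split this module's tests across several forked JVMs, but only locally:
        // on our CI every module already runs in parallel and saturates the cores
        if (System.getenv("CI") == null) {
            maxParallelForks = (Runtime.getRuntime().availableProcessors() / 2)
                .coerceAtLeast(1)
        }
    }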

Experiment

There are numerous other minor aspects that you should test and measure to ensure they suit your needs. The last few points are good examples of settings that you can evaluate to decide if they are making any improvements for you. The build process is quite complex and depends on so many different pieces that some configurations need to be tested before making a decision.

Do not hesitate to experiment with different settings in order to fine tune your build configuration and keep reducing the build time. The Gradle Profiler is a really good tool that could help you with this.
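
As an illustration, gradle-profiler can benchmark the same task under different settings and report the results. A minimal scenario file and invocation could look like this:

    # performance.scenarios
    assemble_debug {
        tasks = ["assembleDebug"]
    }

    # Runs several measured builds and produces an HTML/CSV report
    gradle-profiler --benchmark --scenario-file performance.scenarios assemble_debug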

Conclusion

As demonstrated, there are numerous strategies you can employ to enhance your build times. There might be many others, but these are the most relevant ones that helped the Mobile Platform team to significantly reduce the build times both locally and in the CI for the Glovo mobile apps.

The most important ones are introducing a remote cache (especially for your CI), ensuring that task cacheability works correctly, making sure you leverage parallelization properly, and having the right hardware and memory settings. Review the first part of this article for more details about cacheability and parallelization.

After addressing the major improvements, begin exploring other strategies to further reduce your build time, and conduct regular reviews to ensure your build times remain optimal.

Last, but not least, make sure to keep monitoring your build times to ensure there is no degradation over time and to quickly catch any abnormal increase due to some misconfiguration or other changes in your projects.
