Identify & Handle Android Builds’ Memory Issues

Doni Winata
Traveloka Engineering Blog
16 min read · Sep 29, 2021

Editor’s Note:

As software projects’ requirements and dependencies grow, encountering memory problems as an Android (or any) developer is a matter of when, not if. In this article, Doni shares some exploration, optimization, and rectification techniques you can use to prepare for (and face) the inevitable.

Doni Winata is an Android Software Engineer with the Android Infra team, which maintains and improves Android engineering efficiency and agility, such as optimizing build speed and the build pipeline, along with overseeing code maintainability.

During an Android developer’s journey, projects get larger and take progressively longer to build. Sometimes, builds halt and eventually fail because of an out-of-memory (OOM) error.

In response, the developer naturally keeps increasing the Gradle build’s maximum heap size (with the -Xmx argument) to keep up with the ever-increasing memory footprint. But, in hindsight, that does not solve the problem. The computer becomes less responsive and eventually freezes. Constantly.

If this scenario sounds familiar to you, this article may help identify your build’s actual underlying memory problem.

Profiling Memory Usage

To determine the underlying problem, we use VisualVM, an open-source tool, to profile JVM (Java Virtual Machine) activity during the build by visualizing CPU and memory usage along with garbage collection activity. You can also install the VisualGC plugin as a VisualVM add-on to collect and graphically display garbage collection, class loader, and HotSpot compiler performance data.

To add VisualGC Plugin to VisualVM:

  1. Run VisualVM.
  2. Click ‘Tools’ > ‘Plugins’ > ‘Available Plugins’.
  3. Check ‘VisualGC’ on the list.
  4. Wait for the installation to complete.
  5. Relaunch VisualVM.

Review JVM / Daemon Instances

It is important to know how many resources a single build is using. The JVM process that executes your Gradle build is called a daemon. Every daemon uses memory and CPU resources that impact your build performance. To view the running daemons, open VisualVM while (or after) building your Android project and you will see the JVM instances in the left pane, as shown in Figure 1 below:

Figure 1. A list of running daemons or JVM instances during a build.

Let me add additional context to each of those running daemons (in the order displayed):

  1. AndroidStudio: The daemon used by Android Studio (JetBrains). This daemon does not execute your Gradle build but handles processes used by Android Studio itself, such as indexing, code completion, and code analysis. If Android Studio seems sluggish, you can try adjusting its JVM settings.
  2. GradleDaemon: The main JVM/daemon that executes Gradle tasks to build your application. Most of the time, we profile this daemon to fix memory issues.
  3. GradleWrapperMain: The process spawned by the Gradle wrapper script (gradlew); it usually consumes very little memory (< 100MB).
  4. KotlinCompileDaemon: Used by the Kotlin compiler (if the project uses Kotlin/KAPT). This daemon also has an impact on build performance.

We can also see the daemon processes in macOS’s Activity Monitor (or Windows’ Task Manager), as shown in Figure 2 below:

Figure 2. The three related processes in MacOS’ Activity Monitor.

Each of those `Java` processes represents a Gradle or KotlinCompile daemon instance. As we can see, each consumes a huge amount of memory and CPU. Allocating a large amount of memory to those daemons can be a problem because other apps, like Android Studio and the Chrome browser, also usually consume a lot of memory. On the other hand, allocating too little memory can lead to out-of-memory errors in your build. Below are two recommendations to tackle this problem.

Avoid Multiple Instances of the same JVM/Daemon

Ideally, each Gradle build uses only a single instance of GradleDaemon and KotlinCompileDaemon. Those two daemons stay in memory to anticipate the next build and are killed automatically after 3 hours of inactivity. Reusing those daemons for subsequent builds is necessary because 1) spawning a new daemon is costly (the OS needs to allocate enough RAM, load JVM data and classes from disk, warm up the JIT compiler, and so on) and 2) reusing the cached JVM from the previous build saves time and resources.
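Besides VisualVM, a quick way to check which Gradle daemons exist (and whether they are busy or idle) is Gradle’s own `--status` command; note that it lists Gradle daemons only, not the Kotlin or Android Studio daemons:

```bash
# Shows running and recently stopped Gradle daemons with their PIDs and status.
./gradlew --status
```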

A few factors can cause daemons not to be reusable, ending up with duplicate daemons running simultaneously that reserve an unnecessarily huge amount of memory, slow down the machine, and significantly reduce build performance. Here is an example of multiple daemons running for the same project (Figure 3):

Figure 3. Multiple Instances of the same daemon

To figure out if those instances are the same daemon, you can click the JVM instance and see the Overview tab (Figure 4):

Figure 4. The Overview tab shows the details of GradleDaemon.

This tab contains detailed information about a daemon, including whether it belongs to the same project or not. A project may spawn multiple daemon instances if builds use different properties (JDK version, Android Gradle Plugin version, Gradle version, or JVM arguments) from the previous build.

In Android builds, this issue mostly happens when Android Studio and the terminal build with different daemons. The solution is to set both of them to use the same JDK path. In this case, it is recommended to use the JDK embedded in Android Studio to avoid a bug in JDK 8 that prevents Room’s incremental annotation processing from working.

Follow these steps to change your terminal’s JDK path to the Android Studio’s version:

  1. In Android Studio, go to File → Project Structure → SDK Location (Figure 5).
Figure 5. Configure JDK path from Android Studio

2. Under JDK location (Figure 5), copy the specified path: /Applications/Android Studio.app/Contents/jre/jdk/Contents/Home

3. Set the JAVA_HOME environment variable to the JDK path from Android Studio. For macOS users, open your terminal app and edit your shell’s config file by typing: nano ~/.bash_profile.

4. At the top of the .bash_profile file, add the following line using the path copied in step 2 above: export JAVA_HOME='/Applications/Android Studio.app/Contents/jre/jdk/Contents/Home'

5. Save the .bash_profile file and restart your terminal (or open a new tab).

6. Type echo $JAVA_HOME at your terminal prompt and you should see the path you just set.

You can now run a build from the terminal and from Android Studio to verify that both use the same daemon/JVM by checking the JDK information in the Overview tab. If a daemon instance from the old settings still exists, you can run this command to kill all daemons (then rerun the build with the new config):
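```bash
# Stops every Gradle daemon started with this project's Gradle version
# (the same command used in the reproduction steps later in this article).
./gradlew --stop
```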

Forked Daemon

Figure 6. Forked daemon example

In some cases, our build may spawn forked daemons, similar to the image in Figure 6 above. A forked daemon/JVM executes tasks in a child process separate from the main daemon to reduce the main daemon’s workload and memory usage. Unlike the other daemons (the Android Studio daemon, Gradle wrapper daemon, Kotlin daemon, or main Gradle daemon), a forked daemon is removed immediately after a build finishes.

To use a forked daemon (for example, to move Java compilation to it, allocate its memory size, or set the number of parallel processes), you can add configuration like the following to your build.gradle file:
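The exact configuration is project-specific; as a minimal sketch (Groovy DSL, with placeholder values you should tune by profiling), forking Java compilation and capping the forked JVM’s heap might look like this:

```groovy
// build.gradle (apply to the modules whose compilation you want to fork)
tasks.withType(JavaCompile).configureEach {
    // Run javac in a separate, short-lived worker process instead of the main Gradle daemon.
    options.fork = true
    // Cap the forked compiler JVM's heap; tune this value by profiling your own build.
    options.forkOptions.memoryMaximumSize = "1g"
}
```

How many of these workers can run at once is governed by the max-workers setting, discussed next.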

The maximum number of forked daemon instances in a single build depends on the value of the max-workers setting in the gradle.properties file (see the gradle.properties example after the pros and cons below). For example, six max workers may allow up to five forked daemons to work in parallel simultaneously. However, be aware that there are some pros and cons to using a forked daemon:

Pros:

  1. It executes JavaCompile on a separate daemon to reduce overhead, resulting in less garbage collection on the main daemon.
  2. It disappears after the task is completed. So, if JavaCompile has a memory leak, it will not affect the main daemon for subsequent builds.

Cons:

  1. It takes time to start a new daemon worker, and some processes cannot be shared across the isolated processes, which may hurt overall build performance.
  2. It takes additional memory. If you assign many max workers, more daemon instances run in parallel and occupy a lot of memory and CPU resources.
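As mentioned before the pros and cons, the number of parallel forked workers is capped by the max-workers setting. A gradle.properties sketch matching the six-worker example above:

```properties
# gradle.properties
# Caps Gradle's parallel workers (same effect as the --max-workers command-line flag).
org.gradle.workers.max=6
```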

For unit tests, the test tasks are executed on a separate forked daemon. You can increase the maximum number of forks for them by passing this argument:
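As a hedged sketch using Gradle’s standard maxParallelForks property on Test tasks (the maxTestForks project property name here is purely illustrative):

```groovy
// build.gradle
tasks.withType(Test).configureEach {
    // Number of test JVMs to fork in parallel; "maxTestForks" is an illustrative project
    // property, e.g. ./gradlew test -PmaxTestForks=4
    maxParallelForks = (project.findProperty("maxTestForks") ?: "2").toString().toInteger()
    // Optionally cap each forked test JVM's heap as well.
    maxHeapSize = "1g"
}
```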

Please note that not all builds benefit from a forked daemon. You should always profile and benchmark your build routine first to evaluate its usefulness. At Traveloka, it helped us avoid OOM errors; but since the library owner fixed their memory leak, we have disabled the forked daemon for normal builds. Currently, only our unit tests use forked daemons.

Reduce GC Time on the JVM

Too much garbage collection (GC) stalls program execution and slows down your Gradle build significantly. The most common solution to reduce GC time is to increase the max heap size (the -Xmx value in the gradle.properties file). However, if you are working on a large project with lots of modules or classes, or even a small project whose daemon happens to have a memory leak, a larger heap may not solve the problem; it can make it worse. A bigger heap means more objects need to be wiped out during a major GC. Machines with limited RAM suffer from this, especially when other memory-intensive programs like Android Studio or Chrome are open, forcing the OS to constantly use swap memory (which resides on the hard drive) and slowing the build (along with other computing activities) significantly.

So, how do you find out if your build spends so much time on GC?

These are the three steps we use to identify memory issues in our builds.

Monitor Buildscans

At Traveloka, we use Gradle Enterprise to monitor builds from engineers and CI. From each build scan, we can quickly check GC time on the Performance tab:

Figure 7. Build performance stats

From this tab, we can see that garbage collection takes just a little over 1 minute. Please note that this is only the garbage collection from the main GradleDaemon (Gradle build scans don’t provide Kotlin daemon stats yet). Ideally, lower GC time is better; but if GC takes more than 5% of build time (for example, 3 minutes of GC in a 17-minute build), we continue to investigate memory problems with VisualGC (the VisualVM plugin).

Deeper Analysis with VisualGC

Once we know there are memory problems on specific build types, we use the VisualGC plugin to profile the builds. VisualGC tells us exactly when the JVM starts to run GC, so we know which task uses extensive memory on every build. It also provides more details about GC activities and the total time spent in old space (major GC) and young generation space (minor GC). This information helps us focus on the task that causes the extended GC process. Some of the most common culprits are Gradle, the Android Gradle Plugin, or a specific annotation processor library. If it is potentially a bug, we can report it to the issue tracker. If deeper analysis is required, we can generate a heap dump to find the dominator objects.

Profiling .hprof File to Find Dominator Objects

It is important to know the biggest objects (dominator objects) that retain most of the memory by dumping the JVM heap and generating a .hprof file (right-click on a daemon and click ‘Heap Dump’; see Figure 8 below).

Figure 8. Generate Heap Dump from a daemon

I also recommend Eclipse Memory Analyzer (MAT) to find memory leaks, dominator objects, or heap-dominating annotation processors / processes.

In the next section, we will explain how we use the tools to identify memory issues from our build.

Demos…

Now, let’s go through an example of how we reduce our GC time and solve some memory issues.

Find Optimal Value for Max Heap Size (-Xmx)

When building your project, open the Monitor tab on GradleDaemon or KotlinCompileDaemon:

Figure 9. VisualVM Graph shows CPU and GC time during build

The graph shows CPU usage (yellow) and GC time (blue) while tasks are executed, and whether they run on GradleDaemon or KotlinCompileDaemon. For example, when a Kotlin Annotation Processor (KAPT) task runs and the graph for KotlinCompileDaemon starts to spike, we know that the task is executed inside KotlinCompileDaemon. It is important to know where your task is being executed and whether it spends too much time on GC.

Assume we set 5GB for the -Xmx value in the gradle.properties file:
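The daemon heap is set through the org.gradle.jvmargs property; a minimal sketch (keep any other JVM flags you already use on the same line):

```properties
# gradle.properties
# Max heap for the Gradle daemon; the Kotlin daemon uses the same value unless overridden.
org.gradle.jvmargs=-Xmx5g
```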

In this case, GradleDaemon and KotlinCompileDaemon will each have a 5GB max heap, for a total of 10GB. In some cases, KotlinCompileDaemon doesn’t need that much heap. If you enable the kapt.use.worker.api=true flag, most Kotlin annotation processing moves to GradleDaemon; you will notice in VisualVM that most CPU activity also moves to GradleDaemon, relieving KotlinCompileDaemon from needing more memory than it should. Therefore, let’s reduce the heap size just for KotlinCompileDaemon from 5GB to 2GB (the total max heap size for both daemons is now 7GB):
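A sketch of the adjusted gradle.properties; the kotlin.daemon.jvm.options system property is one common way to size the Kotlin daemon separately, and the kapt.use.worker.api flag is the one mentioned above:

```properties
# gradle.properties
# Gradle daemon keeps 5GB; the Kotlin daemon is capped at 2GB via its own JVM options.
org.gradle.jvmargs=-Xmx5g -Dkotlin.daemon.jvm.options=-Xmx2g
# Run KAPT through Gradle workers so most Kotlin processing happens in the Gradle daemon.
kapt.use.worker.api=true
```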

To determine whether the -Xmx value is optimal, you can analyze the VisualVM graph:

Figure 10. Heap size of the daemon

The graph shows the heap conditions when building a sample project, where we can see heap usage between 1.5GB and 3.5GB out of a 6GB maximum heap. After the build finishes, the heap’s memory footprint shrinks to below 1GB. Since memory usage never exceeds 3.5GB, we can reduce the max heap space to 4GB instead of 6GB. But before doing this, you also need to consider different build types, such as minified builds with R8 enabled, that require more heap space.

Specify -Xms Value (If Needed)

In Figure 10, you can see the heap size (the orange area) dynamically changes depending on the used heap size (the blue area), a process called heap expansion that might affect your build performance. Here is the result after we specify -Xms to use the same amount as -Xmx:

Figure 11. Heap size with -xms 6GB

As we can see, the heap size (the orange area) remains flat from the start to the end of the build, which means no additional cost (time or resources) for heap expansion, and that may improve your build performance.
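To apply this, a gradle.properties sketch matching the 6GB heap of Figures 10 and 11 might be:

```properties
# gradle.properties
# -Xms equal to -Xmx pre-allocates the full heap up front, avoiding heap expansion during the build.
org.gradle.jvmargs=-Xms6g -Xmx6g
```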

Inspect GC Activity with GC Plugin

GC activity significantly affects build performance, and it is hard to figure out why it takes so long. To check the details of GC activity, we can use a VisualVM plugin called Visual GC.

Figure 12. Visual GC plugin

This plugin helps us inspect garbage collection activity in your build. You can read about garbage collection in Java for context on how HotSpot JVM garbage collection works.

Total GC time = Eden space GC (Minor GC) + Old Gen GC (Major GC)

From Figure 12, the total GC time is 1m 49s (row three in the graph section), which comes from 24s of minor GC (row four) + 1m 25s of major GC (row seven). Major GC occurs when old space is full, while minor GC occurs when Eden space is full. The maximum sizes of Eden space and old space are themselves derived from the JVM’s -Xmx argument. The values in those graphs change in real time, so they help us know exactly which process/task causes a spike in memory. Ideally, after the build finishes, the majority of the space should be empty again. Otherwise, it indicates that the heap contains memory leaks that will badly affect the next build.

Identify Memory Leak on the Compiler

A memory leak in the compiler is very costly for your build. Leaked objects cannot be garbage-collected, which causes higher memory pressure and may eventually lead to an out-of-memory error.
For Android builds, leaks usually come from the Gradle compiler, the Kotlin Annotation Processor (KAPT), the Android Gradle Plugin (AGP), build scripts, or third-party Gradle plugins and annotation processor libraries that we use.

To understand in detail how we identify memory leaks, let’s take a look at the following case:

Parceler Leak

One of our annotation libraries caused a leak on the main daemon. It held a lot of objects from the Parceler library that GC could not remove. To identify this issue, we generated a memory heap dump while building the project (before the build could finish).

The steps we used to reproduce the issue:

1. Kill the daemon: ./gradlew --stop

2. Run a full build: ./gradlew app:assembleDebug --rerun-tasks

3. Before the build finishes (when the application module is being built), right-click on the daemon instance and choose ‘Heap Dump’. It should produce a .hprof file. Do this for both the Gradle and Kotlin daemons.

4. Open the .hprof file in MAT and run the Leak Suspects report.

5. Review the leak suspects report to see whether there is a memory leak, and report it to the issue tracker or the library owner.

MAT will tell us the leak suspects and how much of the daemon’s heap each instance occupies.

Figure 13. Memory leaks Report from MAT analyzer

In this case, the leak happened because the Transfuse library (a dependency injection library used by the Parceler compiler) holds a huge static object referencing the generated Parceler objects. Figure 14 shows the impact of this leak on the heap size:

Figure 14. The heap size on parceler 1.1.12 that contains Memory Leak

After 10:30 AM, we can see the heap size (the orange area) increase significantly until it reaches the 5GB max heap around 10:36 AM. The major GC kicks in at this point and reduces memory usage to 2.5GB (the blue area). Afterward, major GCs keep happening because the remaining free heap keeps getting smaller. From the GC plugin (Figure 15), we can see there are 5 major GCs (row seven) and 612 minor GCs (row four), totaling 3m 9s (row three) during that build.

Figure 15. GC stats on parceler 1.1.12

Let’s compare the graph after the library authors fixed the issue in Parceler 1.1.13:

Figure 16. The heap size on parceler 1.1.13 after fixing a memory leak

The GC process can now reclaim most of the memory and keeps the heap size between 1.5GB and 3GB, which makes sense because around 2GB of previously leaked objects are now removed. Total GC time was reduced from 3m 9s to 1m 39s, a huge improvement from 5 major GCs to 1 and from 612 minor GCs to 510.

Figure 17. GC stats on Parceler 1.1.13

Dagger requestKind leak & Google service

We have also found a memory leak in our build caused by the Dagger library. This leak is different from the Parceler leak because the objects cannot be reclaimed even after the build finishes. To identify such cases, you need to run the build multiple times and dump the memory after the builds finish.

1. Kill the daemon: ./gradlew --stop

2. Run a full build, e.g. ./gradlew app:assembleDebug --rerun-tasks, and run it multiple times.

3. After the build finishes, right-click on the daemon instance and choose ‘Heap Dump’. It should produce a .hprof file. Do the same for both the Gradle and Kotlin daemons.

4. Open the .hprof file in MAT and run the Leak Suspects report.

5. Review the leak suspects report to see whether there is a memory leak, and report it to the issue tracker or the library owner.

Figure 18. Memory Leak on Dagger object

In this case, the leak accumulates on every build (depending on which task is executed). So those 3.5GB of leaks accumulated over 5 consecutive builds on the same daemon running Dagger. Eventually, you may hit an OOM build, or the daemon will get stuck because the existing old generation space is already full (the green bar on the left side in Figure 19):

Figure 19. Old gen spaces unable to reclaim memory

At this point, the build will freeze for a very long time trying to reclaim the memory. After some time, it will throw an Out-Of-Memory error & the build will fail. Most memory leak problems are not easy to solve. Make sure to create an issue and report it to the library owner.

In case you get an unsolvable leak, the following tricks may help to reduce the memory usage on your build:

  1. Enable G1GC as the GC algorithm (if you use JDK 11, G1GC is enabled by default). G1GC can be enabled by passing a JVM argument in the gradle.properties file, as in the sketch after this list. In my personal experience, G1GC performs better on unstable daemons or under high memory pressure. It may be slower than the default collector, but it can be a temporary solution if you find your build always getting stuck when the heap is full. You can apply this to both the main Gradle daemon and the Kotlin daemon.
  2. More workers in parallel consume more memory. Consider reducing your max workers if the daemon is spending too much time on GC.
  3. Forking a daemon for Java compilation may reduce the overhead on the main Gradle daemon. Profile it properly, and consider the fork only if you can see an improvement in your build speed.
  4. Consider reducing the maximum heap size of the KotlinCompile daemon. By default, it uses the same max heap size as Gradle’s main daemon, but in most cases it needs much less.
  5. Review the versions of your annotation libraries, Gradle plugins, the Android Gradle Plugin (AGP), Gradle, and Kotlin, and follow updates in their release notes and issue trackers.
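A sketch of point 1 in gradle.properties (the heap value is a placeholder; combine the flag with whatever settings you already use):

```properties
# gradle.properties
# Switch the Gradle daemon to the G1 collector (already the default on JDK 9+).
org.gradle.jvmargs=-Xmx5g -XX:+UseG1GC
# The Kotlin daemon can be given the same flag through its own JVM options,
# e.g. via the kotlin.daemon.jvm.options system property shown earlier.
```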

Closing

Memory issues are among the most common problems in a relatively large Android project. They become very complex to fix with the common methods found on the internet (for example, setting the max heap size) because each project has a different code size, libraries, architecture, and machine specs. We need to understand and identify the root cause, and go through trial and error to find the optimal configuration.

We have covered some of the methods we use to identify and optimize memory usage. They help us anticipate build speed regressions caused by memory issues when updating compilers and annotation libraries. Memory is only one of the issues we face when optimizing our build speed. Please look forward to the second part of this article, covering how we improve our Android build speed at Traveloka.

We are constantly exploring new technologies to build scalable systems. Check out Traveloka’s career page and join us on our adventure!
