Illustration by Claudia Sanchez

Profile guided optimization for native Android applications

Published in Android Developers, Jul 15, 2020

Posted by Pirama Arumuga Nainar, Software Engineer

Profile-guided optimization (PGO) is a well-known compiler optimization technique. In PGO, the compiler uses runtime profiles from a program’s executions to make better-informed choices about inlining and code layout. This typically improves performance and can reduce code size. Developers can now use Google’s toolkit to easily deploy PGO and improve their native Android apps.

On selected Android system components, enabling PGO improved performance by 6–8%. PGO also provided code-size improvements in one component while slightly increasing the code size of the other two components.

Figure: Benefits of PGO for Android system components

PGO can be deployed to your application or library with the following steps:

  1. Identify a representative workload.
  2. Collect profiles.
  3. Use the profiles in a Release build.

Step 1: Identify a Representative Workload

First, identify a representative benchmark or workload for your application. This is a critical step as the profiles collected from the workload identify the hot and cold regions in the code. When using the profiles, the compiler will perform aggressive optimizations and inlining in the hot regions. The compiler may also choose to reduce the code size of cold regions while trading off performance.
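To make the hot/cold distinction concrete, here is a small sketch with hypothetical names (parse_record and run_workload are not from any Android API). The workload drives the digit-parsing path heavily and the malformed-input branch rarely, so a profile collected from it would likely mark the former as hot and the latter as cold. That is only useful if real inputs have a similar mix.

```cpp
#include <string>
#include <vector>

// Hypothetical parser used only for illustration.
// The digit loop is the common (hot) path; the rejection of
// empty or comment-style records is the rare (cold) path.
int parse_record(const std::string& record) {
    if (record.empty() || record[0] == '#') {
        return -1;  // cold: rarely reached by the workload below
    }
    int value = 0;
    for (char c : record) {
        if (c < '0' || c > '9') return -1;  // malformed input
        value = value * 10 + (c - '0');
    }
    return value;
}

// Hypothetical workload driver: replays a set of records several times
// and accumulates the valid values so the work cannot be optimized away.
long run_workload(const std::vector<std::string>& records, int repetitions) {
    long checksum = 0;
    for (int i = 0; i < repetitions; ++i) {
        for (const std::string& r : records) {
            int v = parse_record(r);
            if (v >= 0) checksum += v;
        }
    }
    return checksum;
}
```

If the workload fed mostly malformed records instead, the profile would steer the compiler toward the wrong branch, which is why the choice of workload matters.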

A good workload is also useful for tracking performance in general.

Step 2: Collect Profiles

The profiles are collected by running the workload from step 1 on an instrumented build of the application. To generate an instrumented build, add -fprofile-generate to the compiler and linker flags. This flag should be controlled by a separate build variable since the flag is not needed during a default build.

Profiles get collected when the instrumented binary is run and get written to a file at exit. However, functions registered with atexit are not called in an Android app — the app just gets killed. The application/workload has to explicitly trigger a profile write by calling the __llvm_profile_write_file function.

Example for triggering profile write at end of workload
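A minimal sketch of such a trigger follows. It assumes a hypothetical APP_PGO_INSTRUMENT macro that the build defines together with -fprofile-generate, and a hypothetical run_workload function standing in for the real workload. The two __llvm_profile_* functions come from LLVM's profiling runtime, which is linked in only for instrumented builds. The output path is a placeholder and must be a location the app can write to.

```cpp
#include <cstdio>

// Provided by LLVM's profiling runtime in instrumented builds only.
// APP_PGO_INSTRUMENT is a hypothetical macro, assumed to be defined by the
// build whenever -fprofile-generate is passed.
#ifdef APP_PGO_INSTRUMENT
extern "C" void __llvm_profile_set_filename(const char* path);
extern "C" int __llvm_profile_write_file(void);
#endif

// Hypothetical stand-in for the representative workload from step 1.
static void run_workload() {
    // ... exercise the application's typical code paths here ...
}

// Runs the workload, then writes the profile explicitly, because atexit
// handlers do not run when an Android app is killed.
// Returns true if a profile was written, false otherwise.
bool run_workload_and_write_profile(const char* profraw_path) {
    run_workload();
#ifdef APP_PGO_INSTRUMENT
    __llvm_profile_set_filename(profraw_path);
    if (__llvm_profile_write_file() != 0) {  // nonzero means failure
        std::fprintf(stderr, "failed to write profile to %s\n", profraw_path);
        return false;
    }
    return true;
#else
    (void)profraw_path;  // non-instrumented build: nothing to write
    return false;
#endif
}
```

Guarding the calls with a macro keeps default builds free of any dependency on the profiling runtime, in line with controlling instrumentation through a separate build variable.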

Writing the profile file is simpler if the workload is a standalone binary — just set the LLVM_PROFILE_FILE environment variable before running the binary.

The profile files are in the .profraw format. Use the llvm-profdata utility in the NDK to convert from .profraw to .profdata, which can then be passed to the compiler.

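Command to convert .profraw files to .profdata (a typical invocation; both file names are placeholders): llvm-profdata merge -o pgo_profile.profdata default.profraw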

Use the llvm-profdata and clang from the same NDK release to avoid version mismatch of the profile file formats.

Step 3: Use the Profiles to Build the Application

Use the profile from the previous step during a release build of your application by passing -fprofile-use=<>.profdata to the compiler and linker. The profiles can be used even as the code evolves: the Clang compiler tolerates slight mismatches between the source and the profiles.

Case Study

dex2oat is Android’s on-device AOT compiler. To get a representative workload for dex2oat, we randomly selected 25 of the top 100 most-installed apps in the Play Store. We also randomly generated dex2oat’s compilation options.

To generate PGO profiles, we built a PGO-instrumented dex2oat binary and used it to compile the workload. We then generated a release build of dex2oat that uses these PGO profiles and evaluated the performance gains on the remaining 75 of the 100 most-installed apps.
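The split between profiling inputs and held-out evaluation inputs can be sketched as follows. This is not the Android team's tooling; WorkloadSplit and split_workload are hypothetical names, and the fixed seed is an assumption made so the split is reproducible.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical container for the two disjoint sets of inputs.
struct WorkloadSplit {
    std::vector<std::string> profiling;   // used to collect PGO profiles
    std::vector<std::string> evaluation;  // held out to measure the gains
};

// Shuffles the candidate list with a seeded generator and takes the first
// profiling_count entries for profiling; the rest are kept for evaluation.
WorkloadSplit split_workload(std::vector<std::string> apps,
                             std::size_t profiling_count,
                             unsigned seed) {
    std::mt19937 rng(seed);
    std::shuffle(apps.begin(), apps.end(), rng);

    profiling_count = std::min(profiling_count, apps.size());
    auto middle = apps.begin() + static_cast<std::ptrdiff_t>(profiling_count);

    WorkloadSplit split;
    split.profiling.assign(apps.begin(), middle);
    split.evaluation.assign(middle, apps.end());
    return split;
}
```

Called with 100 app names and a profiling count of 25, this yields the 25/75 division described above. Evaluating on inputs that were not used for profiling guards against measuring gains that only apply to the profiled inputs.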

We used the test infrastructure available to the Android team to automate the collection of these PGO profiles so they can be easily kept up to date.

Conclusion

PGO is a very useful performance optimization technique. After an initial setup of workloads and integration in the build process, it delivers impressive performance improvements with minimal upkeep.

Here are a few other topics that can help improve performance of Android apps:

  1. Link-time Optimization: LTO + PGO is better than each individually.
  2. Cloud Profiles for Java apps
