Jetpack Microbenchmark: Code Performance Testing

I’ll explain how the Microbenchmark library works and show examples of how to use it. Perhaps this will help you evaluate performance and settle disputes during code review.

If you need to check the execution time of some code, the first thing that comes to mind is to record a timestamp before and after the call and subtract one from the other.
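That naive measurement can be sketched in plain Java as follows. The `doWork` method is a hypothetical stand-in for the code under test, and the class name is an assumption made for illustration only.

```java
// Naive timing: a single measurement around the code under test.
final class NaiveTiming {

    // Hypothetical stand-in for the code whose speed we want to know.
    static long doWork() {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += i;
        }
        return sum;
    }

    // Runs the work once and returns the elapsed time in nanoseconds.
    static long measureOnceNs() {
        long start = System.nanoTime();
        doWork();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        System.out.println("Elapsed: " + measureOnceNs() + " ns");
    }
}
```

Running it several times in a row usually prints noticeably different numbers, which is exactly the problem with a single measurement.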

This approach is simple, but it has several drawbacks:

  • It does not consider the “warm-up” of the code under study.
  • It does not take into account the state of the device, for example, Thermal Throttling.
  • It produces only one result, with no idea of the variance in execution time.
  • It makes it harder to isolate the code under test.

Therefore, estimating the execution time is not as trivial as it may seem at first glance. There are existing solutions, such as Firebase Performance Monitoring, but it is better suited to monitoring performance in a production environment than to measuring isolated parts of the code.

Google’s Microbenchmark library is better suited to that task.

What is Microbenchmark

Microbenchmark is a Jetpack library that lets you quickly estimate the execution time of Kotlin and Java code. To some extent, it can remove the influence of warm-up, throttling, and other factors from the final result, and it can generate reports in the console or as a JSON file. The tool can also be used in CI, which lets you catch performance problems at an early stage.

This library provides the best results when profiling code that is used repeatedly. Good examples would be RecyclerView scrolling, data conversion, and so on.

It is also advisable to exclude the influence of caching, if there is any, for example by generating unique data before each run. In addition, performance tests require specific settings (for example, debuggable disabled), so the right solution is to put them in a separate module.

How Microbenchmark Works

Let’s see how the library works.

All benchmarks run inside the IsolationActivity (the AndroidBenchmarkRunner class is responsible for the first launch), where the initial configuration occurs.

The configuration consists of the following steps:

  1. Check whether another IsolationActivity already exists. If a duplicate is found, the test fails with the following error: Only one IsolationActivity should exist.
  2. Check for Sustained Mode support. This is a mode in which the device can maintain a constant performance level, which improves the consistency of the results.
  3. In parallel with the test, a BenchSpinThread thread is started with THREAD_PRIORITY_LOWEST, so that at least one core is kept constantly busy. This approach only works in combination with Sustained Mode.
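The idea behind the third step can be illustrated with a small plain-Java sketch. It is not the library’s BenchSpinThread: the class name is hypothetical, and `Thread.MIN_PRIORITY` is used only as a desktop stand-in for Android’s THREAD_PRIORITY_LOWEST.

```java
// Background thread that keeps one core busy at the lowest priority.
final class SpinThreadSketch {
    private volatile boolean running;
    private Thread thread;

    // Starts the busy loop on a daemon thread.
    void start() {
        running = true;
        thread = new Thread(() -> {
            while (running) {
                // Busy loop: intentionally does nothing useful.
            }
        }, "spin-sketch");
        // Assumption: plain-Java stand-in for THREAD_PRIORITY_LOWEST.
        thread.setPriority(Thread.MIN_PRIORITY);
        thread.setDaemon(true);
        thread.start();
    }

    // Asks the loop to finish and waits for the thread to exit.
    void stop() {
        running = false;
        try {
            thread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    boolean isAlive() {
        return thread != null && thread.isAlive();
    }
}
```

Because the thread has the lowest priority, it should give way to the benchmark itself while still keeping the core from idling.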

In general terms, the job of a benchmark is to run the code from a test a number of times and measure the average time it takes to run. But there are certain subtleties. For example, with this approach, the first runs will take several times longer. This is because the code under test may have a dependency that spends a lot of time initializing. In some ways, this is similar to a car engine, which needs some time to warm up.

Before the control runs, you need to make sure that everything is working normally, and that the warm-up is completed. In the library code, the end of warm-up is the state when the next run of the test gives a result within a certain error range.

All the basic logic is contained in the WarmupManager class, and that’s where all the magic happens. The onNextIteration method contains the logic to determine whether the benchmark is stable. The fastMovingAvg and slowMovingAvg variables store moving averages of the benchmark run time, which converge to the mean within some error (the error is stored in the THRESHOLD constant).
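To make the idea more concrete, here is a simplified plain-Java sketch of warm-up detection with two moving averages. It is an illustration under assumptions, not the library’s source: the class name is hypothetical, the constant values are illustrative, and the real WarmupManager applies additional conditions.

```java
// Simplified warm-up detector based on two exponential moving averages.
final class WarmupSketch {
    // Illustrative values (assumptions), not necessarily the library's constants.
    static final float FAST_RATIO = 0.1f;
    static final float SLOW_RATIO = 0.005f;
    static final float THRESHOLD = 0.04f;
    static final int MIN_SIMILAR_ITERATIONS = 40;

    private float fastMovingAvg;
    private float slowMovingAvg;
    private int iteration;
    private int similarIterationCount;

    // Feeds one run duration; returns true once the timings look stable.
    boolean onNextIteration(long durationNs) {
        iteration++;
        if (iteration == 1) {
            // The first run seeds both averages.
            fastMovingAvg = durationNs;
            slowMovingAvg = durationNs;
            return false;
        }
        // The fast average reacts quickly to new values, the slow one lags behind.
        fastMovingAvg = FAST_RATIO * durationNs + (1 - FAST_RATIO) * fastMovingAvg;
        slowMovingAvg = SLOW_RATIO * durationNs + (1 - SLOW_RATIO) * slowMovingAvg;

        // When both averages agree within THRESHOLD, the run counts as "similar".
        float ratio = fastMovingAvg / slowMovingAvg;
        if (ratio > 1 - THRESHOLD && ratio < 1 + THRESHOLD) {
            similarIterationCount++;
        } else {
            similarIterationCount = 0;
        }
        return similarIterationCount > MIN_SIMILAR_ITERATIONS;
    }
}
```

While the code is still warming up, durations keep dropping, the fast average falls below the slow one, and the counter keeps resetting. Only a long enough streak of similar runs ends the warm-up.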

In addition to warming up the code, Thermal Throttling detection is implemented inside the library. You should not allow this state to affect your tests because throttling increases the average execution time.

Detecting overheating is much simpler than detecting the end of warm-up. The isDeviceThermalThrottled method checks the execution time of a small test function within this class. Namely, the time of copying a small ByteArray is measured.
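A rough plain-Java sketch of this kind of check is shown below. The class name, the buffer size, and the 10% slowdown limit are assumptions chosen for illustration and are not taken from the library.

```java
// Throttling check: compare a fixed piece of work against an earlier baseline.
final class ThrottleSketch {
    // Illustrative values (assumptions): buffer size and allowed slowdown.
    static final int BUFFER_SIZE = 400_000;
    static final double SLOWDOWN_LIMIT = 1.10;

    // Measures how long it takes to copy a small byte array, in nanoseconds.
    static long measureCopyNs() {
        byte[] source = new byte[BUFFER_SIZE];
        byte[] target = new byte[BUFFER_SIZE];
        long start = System.nanoTime();
        System.arraycopy(source, 0, target, 0, source.length);
        return System.nanoTime() - start;
    }

    // Reports throttling when the same work is noticeably slower than the baseline.
    static boolean isThrottled(long baselineNs, long currentNs) {
        return currentNs > baselineNs * SLOWDOWN_LIMIT;
    }

    public static void main(String[] args) {
        long baselineNs = measureCopyNs(); // taken once, before the benchmark
        long currentNs = measureCopyNs();  // taken again later
        System.out.println("Throttled: " + isThrottled(baselineNs, currentNs));
    }
}
```

The work itself never changes, so a significant slowdown can be attributed to the state of the device rather than to the code.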

The above data is used when running the actual tests. It helps to exclude warm-up runs and runs affected by throttling (if any). By default, 50 significant runs are performed. This number and other constants can be changed if desired, but you need to be careful, as this can greatly affect the operation of the library.
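The overall flow of discarding warm-up runs and then collecting a fixed number of measured runs can be sketched in plain Java as follows. The class and method names are hypothetical, and the fixed warm-up count is a simplification: the library decides dynamically when warm-up is over.

```java
// Collects timings for a block of code after a fixed number of warm-up runs.
final class MeasuredRunsSketch {
    static final int MEASURED_RUNS = 50; // default mentioned in the text

    // Runs the block warmupRuns times untimed, then measuredRuns times with timing.
    static long[] measure(Runnable block, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            block.run(); // result discarded: warm-up only
        }
        long[] durationsNs = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            block.run();
            durationsNs[i] = System.nanoTime() - start;
        }
        return durationsNs;
    }

    // Median of the measured durations; less sensitive to outliers than the mean.
    static long median(long[] values) {
        long[] sorted = values.clone();
        java.util.Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    public static void main(String[] args) {
        long[] runs = measure(() -> Math.sqrt(12345.678), 100, MEASURED_RUNS);
        System.out.println("Median: " + median(runs) + " ns over " + runs.length + " runs");
    }
}
```

Keeping every individual duration, rather than a single number, is what makes it possible to judge the spread of the results.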

A Little Practice

Let’s try to work with the library as ordinary users. Let’s test the JSON read and write speed for GSON and Kotlin Serialization.

To evaluate the test results, you can use the console in Android Studio or generate a report in a JSON file. The level of detail differs considerably: in the console you can only find out the average execution time, while the file contains a full report with the time of each run (useful for plotting graphs) and other information.

Reports are configured in the Edit Run Configuration window, in the instrumentation extra params. The parameter that is responsible for saving reports is called androidx.benchmark.output.enable. Additionally, here you can configure the import of values from Gradle, which will be useful when running on CI.

Settings for Running Performance Tests with Reports Enabled

From now on, when you run tests, reports will be saved to the application directory, and the file name will correspond to the class name. You can see an example of the report structure here.

Conclusion

At FunCorp, this tool was used to find the best solution among JSON parsers. In the end, Kotlin Serialization won. At the same time, we really missed CPU and memory profiling during testing and had to collect this data separately.

It may seem that the tool offers limited functionality and that its scope of application is very specific. In general, that is true, but in some cases it can be very useful. Here are a few:

  • Evaluating the performance of the new library in your project.
  • Resolving disputes during code review, when you need to justify choosing a particular solution.
  • Collecting statistics and evaluating code quality over a long period of time when integrating with CI.

Microbenchmark also has an older sibling called Macrobenchmark, which is designed to evaluate UI operations, such as app launches, scrolling, and animations. However, that’s a topic for a separate article.

FunCorp
