Illustration by Virginia Poltrack

Testing App Startup Performance

Testing launch performance can be tricky, but it doesn’t have to be

Chet Haase
Android Developers
Published in
10 min readNov 25, 2020

--

Shell Command for Testing Startup

I wrote this article to explain more about performance, startup testing, and the reasons behind the pieces I used for testing startup. But if you just want something quick here it is:

  1. Lock the clocks if possible (see far below)
  2. Run this on the command-line (while your device is connected):

The command above loops 100 times, launching an app, outputting the startup duration, and killing the process to get ready to do it all again.

Testing startup performance is… not obvious

I needed to test the startup performance for an app recently (while playing around with the Startup library to see how that affected things — more about that in a future article). And I discovered, like I have on previous strolls down this path, that testing startup performance is… not obvious.

If you’re testing a piece of runtime code, there are various ways to go about it, from the trivial “write a tight loop and calculate the delta in System.currentTimeMillis()” to something more sophisticated and useful, like the facilities provided by the AndroidX benchmark library.

But much of the process of application startup happens, by definition, before the system gets around to calling your code. So how do you figure out how long it took to get there?

I poked around in logcat, looked at some low-level APIs, and asked some of the engineers on the platform team, and found a few things to help. Better yet, I was able to use facilities in the adb shell tool to completely automate my testing and output information in a way that made it easy to pop the results into a spreadsheet and analyze them.

I’ll describe the pieces I used that went into the command above, to give you just 1–2 simple steps you can use if you need to test startup performance.

ActivityTaskManager startup log

As I wrote before in an earlier (and unfortunately outdated and incorrect) blog, there’s a handy log that the system has been issuing since the KitKat release. Whenever an activity starts, you’ll see something like this in the logcat output:

This duration (1,380ms in this example) represents the time that it took from launching the app to the time when the system consider it “launched,” which includes drawing the first frame (hence “Displayed”).

This Displayed duration does not necessarily include everything your app needs to do before it’s ready to go. You can supply that extra information to the system by calling Activity.reportFullyDrawn() whenever your app determines it’s completely done loading and initializing. If/when you call that optional method, the system issues another log with that timestamp and duration:

I only wanted the Displayed duration, so the built-in log was good enough for my purposes.

Automating startup

Performance testing should always involve running a test many times, to reduce the inherent variability in results; the more runs you do, the more faith you can put in the averaged results. I try to run tests a minimum of ten times, but doing even more is even better. Depending on how variable the results are in your situation, and how small the timings are (since variability would have larger impact on smaller durations), doing many more runs might be necessary.

Insanity is doing the same thing over and over and expecting different results.
- Albert Einstein

Performance testing Corollary:

Insanity is doing the same thing only once and expecting the results to be definitive.

- Not Albert Einstein

Tapping on an app icon to launch it many times in a row is… pretty tedious. And doing so in a predictable and consistent way (to avoid introducing variability, like if you happened to launch another app by mistake, or otherwise made the system do extra work throwing off timing results) can be a problem.

So what I really wanted was some way to launch the app from the command line. Then I could run that command to do the same thing over and over, and avoid the variability (and tediousness) of manually launching the app by hand.

adb (Android Debug Bridge, a tool probably familiar to anyone who’s read this far) provides just the thing I needed. More specifically, adb shell provides a command-line interface to launch an app: adb shell am start-activity. The command should also block until the app has finished launching, so we’ll use the -W argument as well (this is necessary for the next step, where we’ll tack on a follow-up command to kill the app post-launch). Here’s the complete launch command:

The last argument is the package+component information for your app. You can see that they are the same here as in the ActivityTaskManager log output in the section above.

Running this command launches the app (unless the app is already in the foreground, which is not what you want; we’ll deal with that next), and then outputs the following information:

Check out that TotalTime result: this turns out to be exactly the same information that we see in the log:

This means that we don’t have to troll through logcat to get this information. Instead, we can get it directly from the console where we run the launch command. Better yet, we can strip the extraneous text and leave just the launch result, making it easier to extract this data for use elsewhere.

To convert the output above to just the launch duration, I pipe the output through grep and cut shell commands (there are various ways to do this, I just picked one at random):

Now when I run this command, I get a single number as a result, as desired:

Startup Is a Dish Best Served Cold

When examining startup performance, it’s good to understand the difference between “cold start” and “warm start.”

Cold start happens when your application launches for the first time after installation, reboot, or when it is not otherwise in the background.

Warm start, on the other hand, is what you get when the application has already been launched and is running (but has been paused) in the background.

Both scenarios are valid to test and understand. But in general, testing your startup performance in a cold start scenario is a better place to start, for two main reasons:

  • Consistency: Cold start ensures that your app is going through the same set of operations every time it starts. If your app is being launched in a warm start scenario, it is not as obvious which steps are being performed and which are being skipped, so it’s not clear what you are actually timing (and whether you are testing things consistently between repeated runs).
  • Worst case: By definition, cold start is the worst-case scenario; it is the longest startup duration that your users will see. You need to keep your focus on that worst-case statistic, rather than the best-case warm start situation. You can’t fix the big problems if you ignore them.

In order to force a cold-start situation for every run, you need to kill the app between runs. Again, doing this on the screen (by, say, swiping it out of the Overview list in launcher) would be tedious and error-prone. Once again, adb shell comes to the rescue.

There are a couple of different shell commands that can be used to kill activities. The most obvious is adb shell am kill… but it doesn’t actually do the trick. After you’ve launched your application, it’s in the foreground and kill won’t kill the foreground app. Instead, you need the force-quit command:

where you use your app’s package name to tell it which app to stop.

I Like to Loop It, Loop It

Now you have commands that you can run together to launch the app, output the launch duration data, and quit the app, making it ready to launch again. You could type this in the console over and over, but hey, we’re in a shell; let’s put it in a loop and run it repeatedly with just one command.

In doing this, you want to avoid running the app too soon after it’s killed, in case there are side-effects associated with the app being torn down (such as the system pulling the Launcher to the foreground as the app is torn down). To do this, I added a second of sleep to insert a small buffer between operations.

Here’s the final command I used, which includes killing the app, waiting a second, then launching it. I looped it for 100 iterations, which provided a reasonable sample size for my situation:

When I run this, I get the launch duration output onto the console as each launch completes, which is exactly I wanted to track and analyze.

Note: There’s actually a much simpler way to loop start-activity using -S (which force-stops the activity first) and -R COUNT (which runs thestart-activity command COUNT times), so I could have used this instead:

But to ensure a buffer of inactivity between teardown and startup, I wanted that sleep 1 command, so I went with the more verbose loop approach. Besides, shell script code is sooooo elegant, right?

Lock Your Clocks… When Possible

One of the factors in mobile device performance is the CPU architecture, and the CPU frequency in particular. Specifically, one of the main ways that mobile devices preserve battery life, and avoid major problems of overheating, is by throttling the CPU speed.

CPU throttling is great for battery life, but not so great for performance testing, where consistent results are critical.

Ideally, you should have control over the CPU frequency when running performance tests. Unfortunately, your ability to do this depends on the device(s) that you have; you need to have root access to a device to control the CPU governor, which controls the frequencies, and different devices may have different ways of altering this behavior.

The following information only pertains to you if you happen to have access to a device which allows it, and for which you can get root access. On the device front, I know that Pixel devices allow this access; I can’t speak for other devices.

In any case, if you can lock your clocks, I suggest you do so. It may not matter significantly for your particular testing situation (and in fact the system generally runs the clocks at a high frequency specifically when starting up apps, so that might already provide the consistency you need). But it’s good to at least eliminate CPU frequency as a variability factor.

Unfortunately, locking CPU frequency manually can be tricky. Fortunately, the AndroidX benchmark library makes it simple. In fact, you don’t even need to write code for the benchmark APIs; you can use the library just to get the handy lockClocks and unlockClocks utilities that it provides.

First, add the benchmark library as a dependency in your project-level build.gradle file:

Next, apply the benchmark plugin by adding it to your app-level build.gradle file:

Now you can sync your project (Android Studio is probably nagging you to do this already), after which the locking tasks will be available to use from gradlew.

Now you can lock your clocks by running this on command-line (I ran it from the Terminal tool inside of Android Studio, but you can run it outside the IDE as well):

When I did this, I saw this output in the console:

That output was a pretty good indication that it worked on my Pixel 2. An even better indication was that my startup tests now took significantly longer than they did before. But wait, why are locked clocks slower?

The benchmark utility locks clocks at an easily sustainable level, not a high-performance level. If the clocks were running as high as possible instead, you might get better performance, but:

  • You want test results with realistic, or even poor, performance, like many of your users will see in the field. You do not want to only see best-case performance, which is not what people would typically see in reality.
  • Running the clocks at high frequencies for too long tends to make them overheat. I don’t know how the system would respond to this (hopefully it would down-clock them or shut down the system automatically before there were serious problems), but I don’t really want to find out.

Note that you will want to unlock the clocks when you are done with your testing. The device will unlock them on reboot, but you can also unlock them by running the opposite gradle task:

… which simply reboots the device to perform that reset.

(If you want more information on benchmark’s locking facilities, check out the user guide).

And… Done!

After locking the clocks, I had everything in place: a system that could reproduce my startup situation with reliable consistency, and a simple command-line I could execute that would return a stream of results. I was able to copy the results, paste them into a spreadsheet, and analyze them (by comparing my startup averages to various before/after situations I wanted to experiment with).

Ideally, I wouldn’t need an article to explain how to do all of this. And honestly, you don’t need all of the explanation above. (But it’s always more interesting to know how and why things work, isn’t it?) All you really need is the single for() loop shell command, along with the optional approach of locking the clocks.

  1. Lock the clocks if possible (see far below)
  2. Run this on the command-line (while your device is connected):

In the interests of simpler performance testing and analysis, and better application performance in general, the team is investigating ways to simplify this process. Stay tuned for more on that when we have anything to share. But in the meantime, I hope the commands and information above are helpful for your startup testing needs.

--

--

Chet Haase
Android Developers

Android and comedy. Not necessarily in that order.