Practical Android Profiling

Derek Gelormini
Published in WW Tech Blog
Mar 28, 2022


Photo by Claude Gabriel on Unsplash

Intro

This article is intended to provide beginners with some high-level knowledge of how to diagnose and resolve common performance issues using the standard tools available to all Android developers. It assumes some knowledge of things like threads, memory, and garbage collection, but is not intended to be a deep dive into any of these areas.

Tools for Profiling

While there are a number of third-party tools, libraries, and services available for performance monitoring, this article relies mainly on the tools provided by Android Studio. It assumes basic knowledge of the Android Studio Profiler; if you haven’t used it before, I’d highly recommend reading the official documentation for an overview.

The rest of this article will focus on common issues I’ve encountered in production apps and what I typically do to troubleshoot them. Without further ado, let’s take a look at some common issues!

Janky/Slow UI

Are you tapping a button, and noticing that it takes a considerable amount of time to bring up the next screen or UI element? Are you seeing messages like “I/Choreographer: Skipped 115 frames! The application may be doing too much work on its main thread.” in Logcat?
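To put that Logcat message in perspective, the skipped-frame count can be converted into wall-clock time. The sketch below assumes a 60 Hz display (a budget of roughly 16.7 ms per frame); the class and method names are made up for illustration.

```java
// Rough conversion from skipped frames to stalled time.
// Assumption: the display refreshes at 60 Hz; high-refresh devices have a smaller frame budget.
class FrameBudget {
    // Time available to produce a single frame, in milliseconds.
    static double frameBudgetMs(double refreshRateHz) {
        return 1000.0 / refreshRateHz;
    }

    // Approximate time the UI was stalled for a given number of skipped frames.
    static double stalledMs(int skippedFrames, double refreshRateHz) {
        return skippedFrames * frameBudgetMs(refreshRateHz);
    }

    public static void main(String[] args) {
        System.out.printf("%.0f ms%n", stalledMs(115, 60.0)); // prints "1917 ms"
    }
}
```

Under that assumption, 115 skipped frames means the UI was unresponsive for almost two seconds.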

Suggestion:

  1. Use the CPU profiler to record a trace for the duration of the issue.
    I personally prefer the Java/Kotlin Method Sample Recording (legacy) option when recording a trace on more recent Android Studio versions.
    I’d recommend saving the trace file after it’s captured — you can share it with other team members or re-import it if Android Studio happens to crash.
  2. Analyze the trace and look for methods taking a long relative time to complete (that is, look at percentages rather than absolute time).
    Wider bars indicate more time spent in the associated method, so focusing on those is usually a good starting point.
  3. Filter the output based on method or package name to quickly identify code you own rather than framework/third-party code. While it may very well be a third-party library causing problems, you have the most control over code you own, so starting there is usually a good choice.
    I like to first look at the main thread, then view the “Flame Chart” to get a rough idea of where the time-consuming work is happening. It is then useful to switch between the “Top Down” and “Bottom Up” views in order to help pinpoint the problematic areas.
  4. Refactor, optimize, repeat. This often involves moving CPU-intensive work to a background thread, but it could also mean rearranging code such that any expensive initialization happens lazily rather than each time a method is invoked.
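As an illustration of the lazy-initialization idea in step 4, here is a minimal plain-Java sketch. The EmailValidator class and its regular expression are hypothetical; the point is only that the expensive setup runs once, on first use, instead of on every call.

```java
import java.util.regex.Pattern;

// Hypothetical example: a validator whose expensive setup (compiling a regex)
// runs once, on first use, rather than each time isValid() is invoked.
class EmailValidator {
    static int compileCount = 0; // only here to make the effect observable

    private static Pattern cached; // built lazily, then reused

    private static synchronized Pattern pattern() {
        if (cached == null) {
            compileCount++;
            cached = Pattern.compile("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$");
        }
        return cached;
    }

    static boolean isValid(String input) {
        return pattern().matcher(input).matches();
    }
}
```

Calling isValid() repeatedly leaves compileCount at 1, which is the kind of change that should show up as a narrower bar when you re-profile.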
Main thread CPU trace and Flame Chart

In the screenshot above, I’ve selected a time slice from the main thread (left) and opened the Flame Chart view (right) for that time period. The Flame Chart shows an aggregated view of the CPU time spent in various methods, so in this case, you can see that a large percentage of the time here involved Picasso (third party), a CoachItem (our code), and the Palette util (Android framework).
Drilling into the CoachItem code (right-clicking on a bar and selecting “Jump to Source”) reveals that this is actually a RecyclerView item which loads an image, then uses the Palette utility to overlay a color from the loaded image. In this instance, we can see that the Palette API actually provides an “async” option that our code is not utilizing. Switching to the async version would be my first optimization here, and I would then re-profile with those changes to see the impact.
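The general shape of that fix (compute off the main thread, then hand the result back) can be sketched in plain Java. This is not the Palette API: MAIN is a single-threaded executor standing in for Android’s UI thread, and dominantColor() is a hypothetical stand-in for the expensive call.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

// Plain-JVM sketch: MAIN stands in for the UI thread, BACKGROUND for a worker pool.
class AsyncColorLoader {
    // Daemon threads so the sketch never keeps the JVM alive.
    private static ThreadFactory daemon(String name) {
        return r -> {
            Thread t = new Thread(r, name);
            t.setDaemon(true);
            return t;
        };
    }

    static final ExecutorService MAIN = Executors.newSingleThreadExecutor(daemon("fake-main"));
    static final ExecutorService BACKGROUND = Executors.newFixedThreadPool(2, daemon("worker"));

    // Hypothetical expensive work: average the pixel values.
    static int dominantColor(int[] pixels) {
        long sum = 0;
        for (int p : pixels) sum += p;
        return (int) (sum / pixels.length);
    }

    // Compute on a background thread, then deliver the result on the "main" thread.
    static CompletableFuture<String> loadColor(int[] pixels) {
        return CompletableFuture
                .supplyAsync(() -> dominantColor(pixels), BACKGROUND)
                .thenApplyAsync(
                        color -> Thread.currentThread().getName() + " got " + color,
                        MAIN);
    }

    public static void main(String[] args) {
        System.out.println(loadColor(new int[] {10, 20, 30}).join()); // prints "fake-main got 20"
    }
}
```

The “main” thread only handles the cheap final step, so it stays free to draw frames while the calculation runs elsewhere.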

If this method doesn’t resolve your problem, try another suggestion. There is some overlap in these types of problems.

App Takes a Long Time to Start

Does your app’s splash screen stay visible for a long time and you’re not sure why?

Suggestion: The first thing to try would be to profile your app on startup. To do so, you can edit your Run Configuration settings and make sure Start this recording on startup is checked (under the Profiling tab), then run the app using the Profile toolbar icon. This will allow the profiler to connect to your app as early as possible so you don’t miss anything. Once your app has gotten past the splash screen (for example), you can stop the profiler and follow the steps laid out previously for Janky/Slow UI.

Run configuration settings for profiling CPU usage at startup.
The right-most icon is the Profile option to use.

Another option is to instrument your app in order to generate traces for specific sections of code. The nice thing about this option is that you have full control over when the trace begins and ends. On the other hand, it may involve more trial and error, and can become time consuming if your build times are slow (not to mention forgetting to remove the trace code when you’re done).
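To show the shape of manual instrumentation, here is a plain-JVM stand-in that times named sections with System.nanoTime(). It is only a sketch: on Android you would wrap the same code with the platform tracing calls (such as android.os.Trace.beginSection() and endSection()) so the sections appear in the profiler. SectionTimer and the section name are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// JVM stand-in for manual instrumentation: records how long each named section took.
// Not thread-safe; intended for a quick single-threaded experiment.
class SectionTimer {
    static final Map<String, Long> elapsedNanos = new LinkedHashMap<>();

    interface Section extends AutoCloseable {
        @Override
        void close(); // no checked exception, so try-with-resources stays tidy
    }

    // Starts timing; the elapsed time is recorded when the returned handle is closed.
    static Section begin(String name) {
        long start = System.nanoTime();
        return () -> elapsedNanos.merge(name, System.nanoTime() - start, Long::sum);
    }

    public static void main(String[] args) throws InterruptedException {
        try (Section s = begin("initAnalytics")) { // hypothetical section name
            Thread.sleep(20); // placeholder for real startup work
        }
        elapsedNanos.forEach((name, ns) ->
                System.out.println(name + ": " + ns / 1_000_000 + " ms"));
    }
}
```

Using try-with-resources guarantees the section is closed even if the wrapped code throws, which avoids the unbalanced begin/end pairs that are easy to introduce by hand.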

The “fix” for slow startup times largely depends on your codebase, but identifying the problematic code is key to improving it. You may achieve some startup speed improvements by switching to the App Startup library and implementing baseline profiles; however, deferring expensive initialization code will likely make the most noticeable difference.

App Freezes and Is Killed (ANR)

Is your app freezing and showing ANR dialogs under certain circumstances? Have you noticed an increase in ANRs under Android Vitals > Crashes and ANRs in the Play Console?

Suggestion: You may need to try a few things. It’s definitely easier if you can reproduce these problems locally.

First, if you didn’t have Logcat available at the time of the ANR, and you didn’t catch the crash, you may be able to recover some details from the system by running adb shell dumpsys dropbox against the connected device. If you see an entry tagged data_app_anr or data_app_crash alongside your package name, you can often view the stack trace by passing the entry’s timestamp with the --print option (e.g. adb shell dumpsys dropbox --print 2021-07-09 10:39:15).

Example of dropbox entries. If it says “(contents lost),” you won’t be able to see data for that entry.

If you were able to pull the log from dropbox, logs, or elsewhere, look for references to your package name. This will help you figure out what was happening with your code at the time of the crash.

If the ANR is reproducible on debuggable builds, you can leverage the debugger to help you out.

First, connect the debugger and then perform whichever actions are necessary to get your app to freeze. Once frozen, perform a thread dump from the Debugger pane (that’s the Camera icon).

Thread Dump option in the Debug window

This will essentially take a snapshot of all the threads so you can see which lines of code each one was executing at that moment. If your app is frozen, there’s a good chance you’ll see some questionable lines of code highlighted there. Investigate those lines of code and look for any funny business (e.g. blocking calls, semaphores, mutexes, etc.). Repeat for additional threads.
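If attaching the debugger isn’t practical, a similar snapshot can be produced in code. The sketch below uses only the standard Thread API; the ThreadDump class name is an assumption, and the output format is just one readable layout.

```java
import java.util.Map;

// Programmatic thread dump: one header per live thread followed by its stack frames.
class ThreadDump {
    static String capture() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            out.append('"').append(t.getName()).append("\" state=")
               .append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(capture());
    }
}
```

When reading the result, threads in the BLOCKED or WAITING state whose top frames are in your own package are usually the first place to look.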

Another option is just to set breakpoints and step through your code until you see problems. Keep an eye on other threads as well, because they may be deadlocking with the main thread.

We recently ran into an instance where a coroutine’s runBlocking {} and an RxJava blockingGet() led to an ANR related to deep-link handling in our app. The Thread Dump option in the debugger helped us to quickly identify the issue, which would have otherwise required a lot of guesswork and experimentation to resolve.
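The sketch below is not the code from that incident. It is a simplified plain-Java analogue of the same class of bug: a task running on a single-threaded executor blocks while waiting for work that can only run on that same thread. A timeout is used so the example terminates; all names are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BlockingOnOwnThread {
    // Returns true if the blocking call timed out, i.e. the thread was stuck waiting on itself.
    static boolean demo() {
        ExecutorService main = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "fake-main");
            t.setDaemon(true);
            return t;
        });
        Future<Boolean> outer = main.submit(() -> {
            // The inner task is queued behind the very task that is waiting for it.
            Future<String> inner = main.submit(() -> "deep link resolved");
            try {
                inner.get(200, TimeUnit.MILLISECONDS); // without a timeout this never returns
                return false;
            } catch (TimeoutException stuck) {
                return true;
            }
        });
        try {
            return outer.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            main.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("blocked on own thread: " + demo());
    }
}
```

In a thread dump, this pattern shows up as the affected thread parked inside a blocking get-style call, which is what makes the dump so effective for this kind of problem.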

OutOfMemoryErrors

Is your app noticeably slower on some devices (typically the more resource-constrained, older devices), or crashing randomly after using the app for a bit?

Suggestion: Use the Memory Profiler to identify relatively large object allocations. While some devices may be able to handle a bunch of 5 MB bitmaps on screen at once, some older devices with less RAM may struggle or even crash under these circumstances.
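A quick back-of-the-envelope calculation helps when judging whether an image is “large”. The sketch below assumes 4 bytes per pixel (a 32-bit ARGB configuration) and hypothetical image dimensions; what matters is the decoded pixel size, not the size of the compressed file.

```java
// Estimates the in-memory size of a decoded bitmap.
class BitmapMemory {
    // Assumption: bytesPerPixel is 4 for a 32-bit ARGB configuration.
    static long bytes(int width, int height, int bytesPerPixel) {
        return (long) width * height * bytesPerPixel;
    }

    // Decoding at 1/sampleSize of each dimension cuts memory by sampleSize squared.
    static long sampledBytes(int width, int height, int bytesPerPixel, int sampleSize) {
        return bytes(width / sampleSize, height / sampleSize, bytesPerPixel);
    }

    public static void main(String[] args) {
        System.out.println(bytes(1280, 1024, 4));           // prints 5242880 (5 MiB)
        System.out.println(bytes(4000, 3000, 4));           // prints 48000000
        System.out.println(sampledBytes(4000, 3000, 4, 4)); // prints 3000000
    }
}
```

A 4000×3000 photo therefore costs about 48 MB once decoded, while decoding it at a quarter of each dimension brings that down to 3 MB.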

If the app isn’t crashing but becomes noticeably slower, large allocations may be triggering frequent garbage collection events, each of which can briefly pause the main thread. If you also have memory leaks, the garbage collector cannot reclaim the leaked objects, so available memory keeps shrinking until the app crashes with an OutOfMemoryError.

You can spot GC events in Logcat or in the memory profiler as shown below. The Logcat message corresponding to one of these might show up as something similar to: Background young concurrent copying GC freed 341905(20MB) AllocSpace objects, 119(2644KB) LOS objects, 39% free, 34MB/57MB, paused 117us total 377.330ms
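If you want to tally these messages from a saved log, a small parser is enough. The regular expression below is shaped after the sample message above and is only a sketch; GC messages from other Android versions or collector types may be worded differently. GcLogLine is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcLogLine {
    // Captures: (1) freed object count, (2) total duration, (3) duration unit.
    private static final Pattern GC = Pattern.compile(
            "freed (\\d+)\\(\\d+(?:KB|MB)\\) AllocSpace objects.* total ([0-9.]+)(us|ms|s)\\b");

    // Returns {freedObjects, totalMillis}, or null if the line is not a matching GC message.
    static double[] parse(String line) {
        Matcher m = GC.matcher(line);
        if (!m.find()) return null;
        double total = Double.parseDouble(m.group(2));
        switch (m.group(3)) {
            case "us": total /= 1000.0; break;
            case "s":  total *= 1000.0; break;
            default:   break; // already in milliseconds
        }
        return new double[] { Long.parseLong(m.group(1)), total };
    }
}
```

Summing the total durations over a session gives a rough sense of how much time is going to garbage collection, and whether an optimization reduced it.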

The two trash can icons at the bottom show when garbage collection events occurred.

Another nicety provided by the Memory Profiler is the ability to filter memory allocations by package, and to automatically point out memory leaks. The ability to identify memory leaks this way is especially important if you aren’t using a library like LeakCanary in your day-to-day work.

App Consumes a Lot of Energy (i.e. Kills Battery)

This typically shows up in customer reviews and isn’t as noticeable during development since your device is usually plugged in, or you’re using an emulator.

Suggestion No. 1: This can be a difficult one to pinpoint. The easiest way to start is to interact with the app, and then send it to the background, while connected to the Energy Profiler. It can give you some direction, but typically isn’t as informative as the CPU or Memory profilers. It can show you when certain events are occurring, and which type of energy usage is draining the most (e.g. Network may be high when your app is making network requests). High CPU usage while the screen appears idle may indicate that you’re performing work needlessly (possibly on a background thread) and causing your users to hate you.

Even if your UI isn’t changing, your app may be processing instructions on different threads. These threads still consume energy and can contribute to battery drain, as you can see with the CPU profiler.

Idle UI, but the coroutine is “busy-waiting” indefinitely on a DefaultDispatcher thread.
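The difference between busy-waiting and blocking is easy to see in plain Java. This is a sketch of the general idea rather than the coroutine code from the screenshot, and the class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

class WaitStyles {
    // Busy-waiting: the thread stays runnable and keeps polling; returns the number of polls.
    static long busyWait(AtomicBoolean ready) {
        long polls = 0;
        while (!ready.get()) {
            polls++;
        }
        return polls;
    }

    // Blocking: the thread is parked until countDown() is called and does not spin meanwhile.
    static void blockingWait(CountDownLatch ready) throws InterruptedException {
        ready.await();
    }

    // Signals both waiters after delayMs and reports how often the busy loop polled.
    static long demo(long delayMs) {
        AtomicBoolean flag = new AtomicBoolean(false);
        CountDownLatch latch = new CountDownLatch(1);
        Thread signaller = new Thread(() -> {
            try {
                Thread.sleep(delayMs);
            } catch (InterruptedException ignored) {
                // fall through and signal anyway
            }
            flag.set(true);
            latch.countDown();
        });
        signaller.start();
        long polls = busyWait(flag);
        try {
            blockingWait(latch);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return polls;
    }

    public static void main(String[] args) {
        System.out.println("polls while busy-waiting: " + demo(50));
    }
}
```

The busy loop typically polls a very large number of times during a short delay, keeping a core occupied the whole time, whereas the blocked thread wakes exactly once. In the CPU profiler, the former appears as a thread that is constantly running even though nothing visible is happening.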

Suggestion No. 2: Use Battery Historian for a system-wide audit. This is more complicated and involves setting up Docker. It provides a lot of technical info (like CPU usage, JobScheduler events, WiFi/radio usage, etc.), in addition to details of running processes. It typically won’t help you pinpoint the exact lines of code in your app, but will give a high-level view of what’s going on at a system level. The official documentation walks you through the steps to get going.

Some of the info Battery Historian will show you about your device.

Closing Thoughts

Not all users have fast, high-end devices, so it is important to always write code with performance in mind. This article just scratches the surface of performance profiling on Android, but there are plenty of other resources out there dedicated to this topic. Some important ones are:

Writing performant apps can be challenging. A good way to deliver the best experience is to stay cognizant of performance implications as you write code, before it ships to your users. Your team should always be profiling new features, optimizing inefficient code, and tracking performance regressions. That’s all for now. I hope this article has sparked some interest in performance profiling your Android apps.

Interested in joining the WW team? Check out the careers page to view technology job listings as well as open positions on other teams.
