Guide to Identify and Solve ANR Issues: Uncover the Hidden Culprits

Mickco Lai
10 min read · Jul 13, 2024

When your app grows in size and incorporates more features, it also runs resource-intensive tasks in the background. Concurrently, your app expands its user base, including users from developing countries. However, one day, you receive a warning email from the Google Play Team that raises an alarm.

The email informs you about a new anomaly in your app: the ANR (Application Not Responding) rate is higher than that of your peers and exceeds Google’s Bad Behavior threshold. This negatively affects the discoverability of your app, which is definitely not desirable for stakeholders.

Google Play defines bad behavior thresholds on these metrics. If your app exceeds these thresholds, it’s likely to be less discoverable on Google Play. In some cases, a warning could be shown on your app’s store listing to set user expectations.

Debugging ANR is a pain

What makes it particularly difficult?

  1. The root cause is unclear.
  2. The issue is hard to reproduce, so you cannot verify whether a fix actually worked.

Why? Even experienced developers find ANRs tricky to debug. The logs can be overwhelming and easily cause you to lose focus. Fixing ANRs is not as straightforward as fixing crashes. When you check the ANR issue logs on Google Play Console, you may encounter a list of issues that seems endless. What’s worse, the top 10 issues you see represent only a small fraction of the total ANR issues, and most of the issue names are unfamiliar to you. The insights and potential fixes suggested by Google are not helpful either. It becomes unclear what you should be looking at.

If you find yourself in a similar situation, congratulations! This article is meant for you. If not, it’s likely that you have already identified and resolved the issues.

Digging more insights out of the logs

To be honest, the ANR issues dashboard provided by Google and Firebase is not very helpful for developers in situations like these. The list of issues with similar root causes is not intelligently grouped together; instead, many of them are separated. Moreover, many of the issues involve low-level native Android messages, which can make developers feel even more helpless. But don’t worry! Here are some tips and tricks to help you extract more useful information from the logs.

#1: Investigating background ANR

There are two types of ANRs:

  1. Foreground ANR / User-perceived ANR
  2. Background ANR

To check if a particular issue is a Foreground ANR or a Background ANR:

On Google Play Console > Android vitals > Crashes and ANRs > Add filter > Type > All ANRs > open the details of an ANR > search for the keyword “By issue visibility”.

If you find that most of your background ANRs involve keywords like “FirebaseInstanceIdReceiver,” “push notification,” “Broadcast of Intent,” or “Application,” chances are your application class is heavily bloated.
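One common way to slim down a bloated application class is to defer heavy initialization until first use, so that a process started only to handle a broadcast does not pay for everything up front. Here is a minimal plain-Java sketch of that idea. `LazyInit` and `HeavySdk` are hypothetical names invented for this illustration, and in a real app the `get()` call would sit wherever the feature is first needed rather than in `Application.onCreate()`.

```java
import java.util.function.Supplier;

final class LazyInitDemo {
    public static void main(String[] args) {
        // Creating the holder is cheap; nothing heavy runs yet.
        LazyInit<HeavySdk> sdk = new LazyInit<>(HeavySdk::new);
        System.out.println("inits after startup: " + HeavySdk.initCount);

        // The first get() pays the cost once; later calls reuse the instance.
        sdk.get();
        sdk.get();
        System.out.println("inits after two uses: " + HeavySdk.initCount);
    }
}

/** Thread-safe holder that runs an expensive initializer once, on first use. */
final class LazyInit<T> {
    private final Supplier<T> initializer;
    private volatile T value; // stays null until the first get()

    LazyInit(Supplier<T> initializer) {
        this.initializer = initializer;
    }

    T get() {
        T result = value;
        if (result == null) {
            synchronized (this) {
                result = value;
                if (result == null) {
                    result = initializer.get();
                    value = result;
                }
            }
        }
        return result;
    }
}

/** Hypothetical stand-in for a heavy third-party SDK. */
final class HeavySdk {
    static int initCount = 0; // how many times the expensive setup ran

    HeavySdk() {
        initCount++;
    }
}
```

The trade-off is that the first real use becomes slower, so this suits components that a background start never touches.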

#2: Exploring ANR issues in Reach and devices

Many people are unaware that you can investigate ANRs not only in Android Vitals but also in Reach and Devices. This allows you to check the ANR rate for different device configurations, helping you isolate the underlying issues. Steps are below:

Reach and devices -> Overview -> set the report type to User-perceived ANR rate

Scroll down, and you can explore the ANR rate by:

  • Android Version
  • CPU
  • GPU
  • RAM
  • Screen density
  • And more

#3: Analyzing Stack traces

When investigating crashes, the line of code that caused the issue is usually the most important for debugging. However, this is less often the case when dealing with ANRs. Reading the entire stack trace can provide more clues than focusing on the single line that triggered the ANR. This is especially true when the triggering line is within a native method.

Let’s consider an example: When you examine the stack traces, you might come across various packages unrelated to your codebase, and the line that triggered the ANR could be in a native method. However, if you pay attention to keywords like “chromium” or “webview,” it suggests a connection to the WebView. This could involve not only the WebView in your app but also third-party libraries such as AdMob or IMA SDK (which will be explained later).
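The keyword check described above can be sketched as a tiny scanner. This is only an illustration: the keywords are the ones named in this article, the hint text is my own wording, and the frames in `main` are made up rather than taken from a real ANR log.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class TraceHints {
    // Keyword -> hint. The mapping is illustrative, not an official classification.
    private static final Map<String, String> HINTS = new LinkedHashMap<>();
    static {
        HINTS.put("chromium", "WebView (your own, or one inside an ads/video SDK)");
        HINTS.put("webview", "WebView (your own, or one inside an ads/video SDK)");
    }

    /** Returns the distinct hints whose keyword appears anywhere in the trace. */
    static List<String> hintsFor(String stackTrace) {
        String lower = stackTrace.toLowerCase(Locale.ROOT);
        List<String> found = new ArrayList<>();
        for (Map.Entry<String, String> entry : HINTS.entrySet()) {
            if (lower.contains(entry.getKey()) && !found.contains(entry.getValue())) {
                found.add(entry.getValue());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Made-up frames for illustration only.
        String trace = String.join("\n",
                "at android.os.MessageQueue.nativePollOnce(Native method)",
                "at org.chromium.example.Renderer.draw(Renderer.java)",
                "at com.example.app.MainActivity.onResume(MainActivity.java)");
        System.out.println(hintsFor(trace));
    }
}
```

The point is that the scan covers the whole trace, not just the top frame, which is where the native method usually sits.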

#4: Search packages in Firebase

It’s likely that you have already integrated your app with Firebase Crashlytics. However, many developers are unaware that Firebase not only logs crashes but also ANR issues. Firebase offers the advantage of a search function, which can be useful when inspecting ANRs. Additionally, Firebase provides better support if you have a large number of different types of ANR issues. I recommend searching for the following:

  1. Search your third-party library packages
  2. Search your modules
  3. Search your core user flow related classes

Some common causes of inefficient code

Searching the packages suggested above can help you pinpoint the area of your code that is unoptimized. Note that I emphasized the word “area” instead of the “line” of your code that is causing the problem. When debugging ANR, developers tend to treat it like a crash issue, hoping that it is caused by a single line. In reality, it is likely that the issues are caused by a bunch of inefficient code in an area. Therefore, searching packages gives you insight into the ANR severity of that area, and you should probably focus on optimizing that.
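To make the “area, not line” idea concrete, here is a rough sketch that tallies how many traces touch each package prefix. It assumes you already have the trace texts as strings (how you obtain them is up to you), and the `com.example.app.*` package names are placeholders, not real modules.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AreaCounter {
    /** Counts how many traces mention each package prefix at least once. */
    static Map<String, Integer> countByArea(List<String> traces, List<String> prefixes) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String prefix : prefixes) {
            int hits = 0;
            for (String trace : traces) {
                if (trace.contains(prefix)) {
                    hits++;
                }
            }
            counts.put(prefix, hits);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Placeholder traces; real ones would be far longer.
        List<String> traces = Arrays.asList(
                "at com.example.app.player.A.a\nat com.example.app.player.B.b",
                "at com.example.app.feed.C.c",
                "at com.example.app.player.D.d");
        List<String> areas = Arrays.asList(
                "com.example.app.player", "com.example.app.feed");
        System.out.println(countByArea(traces, areas));
    }
}
```

An area that shows up across many different ANR issues is a better optimization target than any single frame.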

#1: AdMob

If your app integrates AdMob, it is probably your number one source of ANRs, even though this is often not obvious. As described in the stack traces section above, the title of an ANR issue may be a native Android package even when the ANR is actually triggered by AdMob-related code. Stack traces containing the keyword “chromium” also point to AdMob, since AdMob uses WebView to show ads. Furthermore, AdMob is extremely resource-intensive: in my experience, it can take 50 MB of memory right after initialization.

Luckily, the Google AdMob team is aware of the ANRs produced by their SDK and has a solution for it. The bad news is that it is still in beta, and some developers have reported that the solution is not effective in lowering ANRs. So please use it with caution.

https://developers.google.com/admob/android/optimize-initialization

Starting from Google Mobile Ads (GMA) SDK version 21.0.0, you can enable optimized SDK initialization and ad loading to improve the overall responsiveness of ads and help prevent “Application Not Responding” (ANR) errors on your app. This guide outlines the changes you need to make to enable these optimizations.

#2: IMA SDK

IMA SDK is another library that uses WebView, and it is extremely unstable. Judging from its release notes, almost every version includes crash fixes. You can also see that version 3.32.0 was deprecated only a few months after its release, even before the version that preceded it. Therefore, if your app integrates IMA SDK, I suggest always upgrading to the latest version.

#3: Misusing Reflection

I know, Reflection is so tempting to use. It makes your code look clean and avoids a lot of boilerplate. But trust me, on low-end devices, in my experience running reflection code is at least 100 times worse than running boilerplate or even badly written code. On top of that, junior developers might use Reflection without knowing how it works behind the scenes. If that is the case, refactoring the reflection away is highly recommended, and the performance gain will be obvious.

By searching the whole project for the keyword .reflect, you can find the reflection packages you import and the classes where you actually use them.

  1. For Kotlin Reflection Package: import kotlin.reflect.xxx
  2. For Java Reflection Package: import java.lang.reflect.xxx

#4: Parsing JSON/XML on Main Thread

Check whether your app is mistakenly parsing JSON, XML, or other large resources on the main thread. Also check whether you are using a parsing library that relies on the reflection mentioned above, such as Gson. Note that some libraries offer both a reflection mode and a non-reflection mode for the same format, so it is easy to end up using reflection unintentionally.
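Moving the parse off the calling thread can be sketched in plain Java as follows. The `parse` method is a hypothetical stand-in for an expensive JSON/XML parse (it just sums comma-separated integers), and the single worker thread named "parser" is an assumption of this sketch, not a recommendation for pool sizing.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class BackgroundParsing {
    // One daemon worker dedicated to parsing, so the JVM can still exit.
    private static final ExecutorService PARSER = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "parser");
        t.setDaemon(true);
        return t;
    });

    /** Hypothetical stand-in for an expensive parse: sums comma-separated ints. */
    static int parse(String payload) {
        int sum = 0;
        for (String part : payload.split(",")) {
            sum += Integer.parseInt(part.trim());
        }
        return sum;
    }

    /** Schedules the parse on the worker thread and returns immediately. */
    static CompletableFuture<Integer> parseAsync(String payload) {
        return CompletableFuture.supplyAsync(() -> parse(payload), PARSER);
    }

    /** Reports which thread runs parse jobs, to show it is not the caller. */
    static CompletableFuture<String> parsingThreadName() {
        return CompletableFuture.supplyAsync(() -> Thread.currentThread().getName(), PARSER);
    }

    public static void main(String[] args) {
        CompletableFuture<Integer> result = parseAsync("1, 2, 3");
        // The calling thread is free while the worker parses; join() is only for this demo.
        System.out.println("parsed on " + parsingThreadName().join() + ": " + result.join());
    }
}
```

In an Android app you would not call `join()` on the main thread; you would hand the result back to it instead, for example with a `Handler` on the main `Looper` or a coroutine.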

#5: Incorrectly accessing Local Storage on Main Thread

I/O mistakenly running on the main thread is easy to spot when the operation is long, such as a network API call. Other operations are harder to notice on high-end or mid-range devices, such as accessing SharedPreferences or DataStore on the main thread. For newer projects, moving this work into coroutines on a background dispatcher should do the trick. For older projects that write to SharedPreferences from the main thread, consider using apply(), which writes to disk asynchronously, instead of commit(), which blocks until the write completes.
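The difference between the two calls can be modelled with a toy store in plain Java. To be clear, `ToyPrefs` is a hypothetical class that only imitates the trade-off; it is not how Android implements SharedPreferences, and the 20 ms sleep is an arbitrary stand-in for slow storage.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Toy key-value store that imitates the commit()/apply() trade-off. */
final class ToyPrefs {
    private final Map<String, String> memory = new ConcurrentHashMap<>();
    private final Map<String, String> disk = new ConcurrentHashMap<>(); // pretend file
    private final ExecutorService writer = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "prefs-writer");
        t.setDaemon(true);
        return t;
    });

    private void slowWrite(String key, String value) {
        try {
            Thread.sleep(20); // pretend the storage is slow
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        disk.put(key, value);
    }

    /** Like commit(): the caller is blocked for the whole write. */
    boolean commit(String key, String value) {
        memory.put(key, value);
        slowWrite(key, value);
        return true;
    }

    /** Like apply(): memory is updated at once, the disk write is queued. */
    void apply(String key, String value) {
        memory.put(key, value);
        writer.execute(() -> slowWrite(key, value));
    }

    String get(String key) {
        return memory.get(key);
    }

    String persisted(String key) {
        return disk.get(key);
    }

    /** Demo helper: blocks until every queued write has finished. */
    void awaitWrites() {
        try {
            writer.submit(() -> { }).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ToyPrefs prefs = new ToyPrefs();
        prefs.apply("theme", "dark");
        System.out.println("readable right away: " + prefs.get("theme"));
        prefs.awaitWrites();
        System.out.println("on disk later: " + prefs.persisted("theme"));
    }
}
```

With the apply-style call, readers see the new value immediately while the slow part happens elsewhere, which is why it is the safer default on the main thread.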

#6: Other common causes

  1. Misusing inter-process communication (e.g. Binder calls or BroadcastReceiver work on the main thread)
  2. Lock contention (e.g. overusing synchronized so that the main thread waits for a lock)
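As an illustration of the lock contention item, the sketch below shows the same cache lookup written two ways. `slowLoad` is a hypothetical slow call, and the narrow-lock version assumes the load is idempotent, since two threads may occasionally both perform it.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class LockScopeDemo {
    private final Object lock = new Object();
    private final Map<String, String> cache = new HashMap<>();

    /** Hypothetical slow call (disk, network, heavy computation). */
    static String slowLoad(String key) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return key.toUpperCase(Locale.ROOT);
    }

    /** Contended: every caller, including the main thread, waits while one thread loads. */
    String getHoldingLock(String key) {
        synchronized (lock) {
            String value = cache.get(key);
            if (value == null) {
                value = slowLoad(key);
                cache.put(key, value);
            }
            return value;
        }
    }

    /** Narrower: the lock only guards the map; the slow work runs outside it. */
    String getNarrowLock(String key) {
        synchronized (lock) {
            String cached = cache.get(key);
            if (cached != null) {
                return cached;
            }
        }
        String loaded = slowLoad(key); // may run twice under a race
        synchronized (lock) {
            String existing = cache.putIfAbsent(key, loaded);
            return existing != null ? existing : loaded;
        }
    }

    public static void main(String[] args) {
        LockScopeDemo demo = new LockScopeDemo();
        System.out.println(demo.getHoldingLock("abc"));
        System.out.println(demo.getNarrowLock("xyz"));
    }
}
```

In the first version, a main-thread read of an already cached key can still stall behind a background load of a different key.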

#7: Native Method — android.os.MessageQueue.nativePollOnce

According to Google’s documentation, it is a mystery ANR:

Note: Ignore input dispatch ANR clusters that say “nativePollOnce” or “main thread idle.” These usually correspond to ANRs where the stack dump was taken too late. They’re generally not actionable so can be ignored. In general, the actual ANR issues will be present in other clusters, so real issues are not being hidden. See nativePollOnce for more details.

Although this specific item is not actionable, some tools can still help you uncover issues that could cause ANRs.

Tools to identify issues

If you have exhausted your search in the Play Console and Firebase logs without finding any further insights, you can utilize the following tools to uncover potential issues.

#1: Memory Leak Detection

ANRs can be caused by high memory usage, especially on low-end devices with limited resources. Memory leaks can be a major contributor to excessive memory consumption in your app. Therefore, it is advisable to use a monitoring tool to investigate and identify any memory leaks.
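For readers less familiar with what a leak looks like in code, here is one typical pattern reduced to plain Java: a long-lived registry keeps a listener, and the listener keeps its short-lived owner. `Screen` and `LISTENERS` are hypothetical names for this sketch; on Android the owner would typically be an Activity or Fragment.

```java
import java.util.ArrayList;
import java.util.List;

final class LeakDemo {
    /** Long-lived registry, comparable to a singleton or static field in an app. */
    static final List<Runnable> LISTENERS = new ArrayList<>();

    /** Hypothetical short-lived screen that holds a large buffer. */
    static final class Screen {
        private final byte[] bitmap = new byte[1024 * 1024];
        // The lambda reads a field, so it holds a reference to this Screen.
        private final Runnable listener = () -> System.out.println(bitmap.length);

        void onCreate() {
            LISTENERS.add(listener);
        }

        /** Leaky teardown: the registry still reaches the listener, and through it the Screen. */
        void onDestroyLeaky() {
            // nothing unregistered
        }

        /** Fixed teardown: unregister so the Screen can be garbage collected. */
        void onDestroyFixed() {
            LISTENERS.remove(listener);
        }
    }

    public static void main(String[] args) {
        Screen leaky = new Screen();
        leaky.onCreate();
        leaky.onDestroyLeaky();
        System.out.println("listeners still registered: " + LISTENERS.size());
    }
}
```

Each leaked screen here pins about 1 MB; repeat that across a few navigations on a low-memory device and the pressure adds up.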

LeakCanary

LeakCanary, developed by Square, is a tool that can be integrated into your app. It not only detects memory leaks, but also allows your QA team to explore new leaks and verify fixes. However, please note that, per its documentation, it only watches a specific set of objects automatically: destroyed Activity, Fragment, fragment View, and Service instances, and cleared ViewModel instances.

Android Studio Memory Profiler

This tool is another useful option for detecting memory leaks. What sets it apart is its ability to identify leaks that LeakCanary may not capture. Here are the steps to use it effectively:

Open Profiler from below -> click memory from the chart -> select Capture heap dump -> Inspect “Leaks”

#2: Strict Mode

StrictMode is a powerful tool that can be enabled in your app to detect various anomalies at runtime. With ThreadPolicy, it can detect network or disk operations running on the main thread. With VmPolicy, it can detect memory leaks, cleartext network calls, unsafe intents, and more. You can configure it to detect only the items relevant to your ANRs. I recommend adding .penaltyDialog() to the ThreadPolicy to notify the QA team when an anomaly is encountered, thus helping to safeguard app performance. Usage:

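import android.os.StrictMode;

// Call early, e.g. in Application.onCreate(), typically only in debug or QA builds.

// Thread policy: flags disk and network access on the main thread.
StrictMode.ThreadPolicy threadPolicy = new StrictMode.ThreadPolicy.Builder()
        .detectAll()
        .penaltyLog() // add .penaltyDialog() here to surface violations to QA
        .build();
StrictMode.setThreadPolicy(threadPolicy);

// VM policy: flags leaked objects, cleartext traffic, unsafe intents, and more.
StrictMode.VmPolicy vmPolicy = new StrictMode.VmPolicy.Builder()
        .detectAll()
        .penaltyLog()
        .build();
StrictMode.setVmPolicy(vmPolicy);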

#3: HWUI + Android CPU Profiler

The Android CPU Profiler allows you to inspect the details of tasks currently running on the Main Thread. However, keep in mind that CPU Profiler can be slow and resource-intensive. If your app is already resource-heavy, it may significantly impact performance, making it almost unusable at around 0.5 FPS.

In such cases, HWUI rendering profiling comes to the rescue. It shows when and where your screen experiences jank or lag. Once you have identified the timing and location of the jank using HWUI, you can then use the CPU Profiler to investigate the specific cause.

To turn on HWUI profiling, navigate to Developer options -> Profile HWUI rendering (called Profile GPU rendering on older Android versions) -> On screen as bars.

Upon enabling, you should see colored bars appearing when you interact with your app.

Now, try to locate WHEN and WHERE the long dark green colored bar shows up.

Then, you can open CPU Profiler -> Select Java/Kotlin Method Trace -> click Record -> Reproduce the long dark green bar -> Stop recording after a few seconds -> Inspect what happened in the Main Thread

In one example from our app, the CPU Profiler revealed that a function we had written triggered reflection that occupied the Main Thread. That made it a clear candidate to refactor for better performance.

Further readings:

The measures presented here are just a subset of those aimed at addressing ANRs. For more detailed information, I suggest links below.

Mickco Lai

Lead Mobile Engineer at Viu, Ex-HSBC, Ex-Manulife, Ex-GOGOX