React Native Stability Monitoring

John Bacon | Software Architect @ Words With Friends

Background

Back in the fall, we announced that Words With Friends is transitioning the majority of its codebase to React Native. In that article, we talked about how we are A/B testing the React Native pieces against their native counterparts to ensure the transition does not negatively impact game health. We also talked about how we’re using Bugsnag to monitor crashes and app stability. In this article, I want to talk about how we’re able to maintain the same quality of stability monitoring in a React Native world as we had with a fully native app. Throughout the transition, Bugsnag has been a vital part of our stability workflow, enabling us to iterate quickly on issues we were not able to catch in house.

Problem

  • Crash Analytics is a vital tool for understanding, improving, and maintaining the stability of apps
  • Symbolication of crash reports allows these tools to tell engineers where in the code the problem lies
  • React Native Crash Reports in native crash reporters contain a minified and obfuscated JavaScript call stack that is hard to act on

In a native crash reporter, a React Native crash report from an unhandled JavaScript exception looks as follows:

An unhandled JavaScript exception from React Native in a native crash reporter

We do get a call stack, but it is the call stack for the native exception thrown by React Native. JavaScript itself does not crash, but when an unhandled JavaScript exception occurs, React Native throws a native exception that causes a crash. While these reports are still useful for tracking the volume of crashes coming from JavaScript code, they are not immediately actionable like a typical native crash report. The JavaScript call stack is buried in the crash report message, and it is also minified and obfuscated. Further, all crashes from unhandled JavaScript exceptions are grouped together because these exceptions all have the same native call stack, making it hard to see how many different issues we have.

Symbolicated call stacks with source files and line numbers that are grouped by the top stack frame are table stakes when monitoring and fixing stability issues on mobile apps, so we didn’t feel like the reports we were getting were sufficient. The call stacks we really care about in these cases are the JavaScript call stacks that trigger the native exceptions.
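
To illustrate why grouping by the top stack frame matters, here is a minimal TypeScript sketch. The `CrashReport` shape and the function names are hypothetical; this is not Bugsnag's data model or its actual grouping algorithm, only the general idea.

```typescript
// Hypothetical shape of a symbolicated crash report (an assumption for illustration).
interface StackFrame {
  file: string;
  line: number;
  method: string;
}

interface CrashReport {
  message: string;
  // Frames ordered from the top of the stack (where the error was thrown) down.
  stack: StackFrame[];
}

// Build a grouping key from the top stack frame only.
function groupingKey(report: CrashReport): string {
  const top = report.stack[0];
  return top ? `${top.file}:${top.line}` : "unknown";
}

// Count reports per distinct top frame.
function groupByTopFrame(reports: CrashReport[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const report of reports) {
    const key = groupingKey(report);
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}
```

When every report carries the same native stack, a key like this collapses them all into a single group. Once the JavaScript frames are symbolicated, each distinct throw site gets its own group and its own count.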

Solution — Bugsnag

  • Official React Native Library capable of automatically symbolicating JavaScript reports via source maps
  • Native SDKs power their React Native library under the hood, allowing us to continue to capture native crashes as well
  • Robust Indexing enables powerful search functionality which improves the efficiency of resolving stability related issues

Here is what a Bugsnag crash report looks like for an unhandled JavaScript exception in React Native:

An unhandled JavaScript exception from React Native in Bugsnag

We are able to see the source file, line number, and call stack relevant to the cause of the crash, just like a typical native crash report, and not the native call stack that is common to all unhandled JavaScript exceptions in React Native.

Familiar Workflow

While the workflow for monitoring and fixing crashes with Bugsnag is fairly standard, the goal for us was making sure we could achieve the same workflow on React Native with equal or better efficiency:

  • Monitor our crash inbox daily
  • Identify trends to see crashes rising (or falling)
  • Select a crash which we intend to fix
  • Analyze the source file and line number
  • Fix the crash, or notify a partner team if not originating in our code
  • Release an update with the fix
  • Monitor to see if the crash has disappeared
You can view top errors and their frequency over the last 14 days.

An example crash from accessing a property of an undefined object

The source file at that line number shows us accessing the property in the call stack

Alternatives

Trade-offs

  • User-Based Trend Graphs are not yet part of Bugsnag’s offering, though similar functionality may be provided in the future, and integrations such as Splunk allow us to create something similar ourselves. Being able to see what percentage of users are crashing on any given day is something we find extremely valuable.
  • Shareable Public Crash Links are not supported, but Bugsnag does have a new feature to export a crash dump that can be shared. When crashes do not originate in your code, being able to share crash reports with partners without giving full access to your crash reporting tools is very valuable.
  • SDK Conflicts have been an issue, since Bugsnag has had less time than other native crash reporters on the market to become battle-tested against the huge variety of SDKs we use on Words With Friends. For example, it initially had issues running in parallel with one of our ad networks that was also reporting on crashes.
  • Pricing Model is event based, so we implemented sampling of non-fatal (handled) errors to make sure we are recording events judiciously
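
As a rough sketch of what sampling handled errors can look like, the TypeScript below keeps only a fraction of non-fatal errors. The function names, the `report` callback, and the idea of a single fixed `sampleRate` are assumptions for illustration; they are not our production code or a Bugsnag API.

```typescript
// Decide whether to report a handled (non-fatal) error.
// sampleRate is the fraction of handled errors to keep, from 0 to 1.
// random is injectable so the decision can be tested deterministically.
function shouldReportHandledError(
  sampleRate: number,
  random: () => number = Math.random
): boolean {
  if (sampleRate <= 0) return false;
  if (sampleRate >= 1) return true;
  return random() < sampleRate;
}

// Hypothetical wrapper: `report` stands in for whatever call sends the event.
// Returns true if the error was reported, false if it was sampled out.
function reportHandledError(
  error: Error,
  report: (error: Error) => void,
  sampleRate: number,
  random: () => number = Math.random
): boolean {
  if (!shouldReportHandledError(sampleRate, random)) return false;
  report(error);
  return true;
}
```

Only handled errors would pass through a gate like this. Unhandled exceptions (crashes) are left unsampled so crash counts stay exact.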

Integration Tips

There are some nuances in how to set up React Native and Bugsnag in order to get the most out of your integration. It took us a few iterations to learn how to get the most accurate stability reporting.

Get the Most Out of Stability Scores

  • Start the SDK As Soon as the App Starts rather than waiting to start the JavaScript client so you capture crashes that may occur before the React Native bridge loads.
  • Manually start a Session as Soon as the SDK Starts to make sure you capture the session before the app could crash, resulting in the most accurate stability scores
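
To see why both tips matter, here is a small worked example in TypeScript. It assumes the stability score is the share of recorded sessions that did not end in a crash; that definition is our simplification for illustration, not a formula taken from Bugsnag's documentation, and all of the numbers are made up.

```typescript
// Assumed definition: share of recorded sessions that did not end in a crash.
// Returns null when no sessions were recorded, since the score is undefined.
function stabilityScore(
  recordedSessions: number,
  recordedCrashes: number
): number | null {
  if (recordedSessions <= 0) return null;
  return 1 - recordedCrashes / recordedSessions;
}

// Illustrative numbers only: 1,000 launches, 20 of which crash,
// 10 of them before the JavaScript client would have started.
const totalLaunches = 1000;
const totalCrashes = 20;
const earlyCrashes = 10;

// SDK and session started at app launch: every launch and crash is counted.
const accurateScore = stabilityScore(totalLaunches, totalCrashes);

// SDK started late: early crashes and their sessions are never recorded,
// so the score looks better than reality.
const inflatedScore = stabilityScore(
  totalLaunches - earlyCrashes,
  totalCrashes - earlyCrashes
);
```

Under these assumptions, the late-starting setup reports a higher score than the true one, which is exactly the kind of blind spot that starting the SDK and the session early is meant to close.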

Leverage the Splunk Integration If Possible

With the Splunk integration, you can stream crashes in real time and cross-reference Bugsnag data with other data you’re already capturing to generate custom reports. We have found this integration valuable for building out some of the reporting features we miss from other native crash reporters, like the aforementioned user-based trend graphs.

Percent of users crashing on a particular release over the last week
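
The calculation behind a graph like this can be sketched in TypeScript as follows. The `CrashEvent` shape, the field names, and the source of the daily active user counts are all assumptions for illustration; in practice this kind of join would be written as a query in your analytics tool rather than in application code.

```typescript
// Hypothetical crash event as it might arrive from a crash stream.
interface CrashEvent {
  userId: string;
  day: string; // e.g. "2019-04-01"
}

// For each day, compute the percentage of active users who crashed at least once.
// activeUsersByDay is assumed to come from your own analytics data.
function percentUsersCrashing(
  crashes: CrashEvent[],
  activeUsersByDay: Record<string, number>
): Record<string, number> {
  // day -> distinct user ids that crashed (object keys act as a set)
  const crashedUsers: Record<string, Record<string, true>> = {};
  for (const crash of crashes) {
    if (!crashedUsers[crash.day]) crashedUsers[crash.day] = {};
    crashedUsers[crash.day][crash.userId] = true;
  }

  const result: Record<string, number> = {};
  for (const day of Object.keys(activeUsersByDay)) {
    const active = activeUsersByDay[day];
    const crashed = Object.keys(crashedUsers[day] || {}).length;
    result[day] = active > 0 ? (crashed / active) * 100 : 0;
  }
  return result;
}
```

Counting distinct users rather than raw crash events keeps one user stuck in a crash loop from dominating the graph.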

Upgrade JavaScriptCore on Android (or Upgrade to React Native 0.59+)

  • A bug in JavaScriptCore on Android resulted in incorrect line numbers for symbolicated crash reports, which greatly limited the usefulness of Bugsnag on Android
  • Android JSC BuildScripts is an open source library that allows developers to take a newer version of JSC on Android without having to wait for React Native to upgrade it — we used this and our line numbers started working right away (we were on React Native 0.58.6 at the time)
  • React Native 0.59 includes the new JSC, so if you’re on 0.59 or higher, line numbers should work from day one. The open source library still gives you the flexibility to pick up future JSC updates on your own schedule if the need arises

Leverage the Fact that Bugsnag Libraries Are Open Source

  • Easy to see new changes which helps identify the risk of taking a new version of their library
  • See if your issue already exists as an issue on the relevant library’s repo, and upvote an issue you too are seeing to help escalate priority
  • Understand an issue you are facing by stepping through their source code while debugging