No Cause for Concern — RxJava and Retrofit Throwing a Tantrum

Py ⚔
Nov 2, 2016

Heads up, we’ve moved! If you’d like to continue keeping up with the latest technical content from Square please visit us at our new home https://developer.squareup.com/blog

Last week, we found an interesting API design issue in the Throwable class of the JDK that led to bugs in RxJava and Retrofit. This is a write-up of how we found those bugs.

Assembly tracking

Monday morning, Nelson Osacky opens a pull request to enable RxJava Assembly Tracking in the debug build of Square Register Android.

Assembly Tracking makes Rx debugging easier by reporting where failing observables were created.

According to the RxJavaHooks.enableAssemblyTracking() Javadoc:

Sets up hooks that capture the current stacktrace when a source or an operator is instantiated, keeping it in a field for debugging purposes and alters exceptions passing along to hold onto this stacktrace.

Here’s an example (thanks David):

This Rx chain fails:

An exception is raised when we call subscribe, so the stacktrace only shows that something went wrong when subscribing. We’re left trying to understand the error message and figure out what all these Rx internal classes are about.

Let’s enable assembly tracking:

We get an extra cause at the end:

It’s now clear that the failure was caused by the creation of the single() observable. single() expects the source observable to emit exactly one item.

Let’s enable Assembly Tracking!

Build failed

Uh-oh, this one-line pull request is breaking a UI test my team wrote. I look at the failing code:

Weird, a Retrofit (1.x) call should only ever raise a RetrofitError, and we’re getting something else. Here’s the log:

The actual stacktrace of the IllegalStateException is missing.

Digging into RxJava

I look at OnSubscribeOnAssembly.java line 118:

Ugh. What is this AssemblyStackTraceException?

A RuntimeException that is stackless but holds onto a textual stacktrace from tracking the assembly location of operators.
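To make "stackless" concrete, here is a minimal sketch of how an exception can skip recording its own stack frames and carry a textual trace in its message instead. The class name CreationSiteException and the sample trace text are made up for illustration; this is not RxJava's actual class.

```java
// Hypothetical stand-in for a stackless exception; not RxJava's actual class.
final class CreationSiteException extends RuntimeException {
    CreationSiteException(String creationStackTrace) {
        // The textual trace travels in the message.
        super(creationStackTrace);
    }

    // Returning without filling in the trace leaves this exception with no frames of its own.
    @Override
    public synchronized Throwable fillInStackTrace() {
        return this;
    }
}

final class StacklessDemo {
    public static void main(String[] args) {
        CreationSiteException e =
                new CreationSiteException("at com.example.Foo.create(Foo.java:42)");
        System.out.println(e.getStackTrace().length); // 0
        System.out.println(e.getMessage());
    }
}
```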

Interesting. What does AssemblyStackTraceException.attachTo() do?

Whenever an Observable is created, OnSubscribeOnAssembly captures a stacktrace and holds on to it. Then, when an exception is thrown in the corresponding Rx chain, an AssemblyStackTraceException is added as a cause at the bottom of the exception chain, with a message string that contains the stacktrace of where the Observable was created. Neat!
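The mechanism described above can be sketched with plain JDK classes. This is a simplified illustration, not RxJava's code: the names AssemblySketch, captureCreationSite and appendCause are hypothetical, and a plain RuntimeException stands in for AssemblyStackTraceException.

```java
final class AssemblySketch {
    // Renders the current call stack as text, as a tracker might do when an operator is created.
    static String captureCreationSite() {
        StringBuilder sb = new StringBuilder("Assembly trace:");
        for (StackTraceElement frame : new Throwable().getStackTrace()) {
            sb.append("\n at ").append(frame);
        }
        return sb.toString();
    }

    // Walks to the last exception in the chain and appends one more cause carrying the trace.
    static void appendCause(Throwable error, String creationSite) {
        Throwable last = error;
        while (last.getCause() != null) {
            last = last.getCause();
        }
        last.initCause(new RuntimeException(creationSite));
    }

    public static void main(String[] args) {
        String site = captureCreationSite();
        RuntimeException failure =
                new RuntimeException("outer", new IllegalArgumentException("inner"));
        appendCause(failure, site);
        // The printed chain now ends with an extra "Caused by" holding the assembly trace.
        failure.printStackTrace();
    }
}
```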

We saw the result of this in our initial example:

Cause already initialized

Now that I understand how assembly tracking works under the hood, I look at the logs again:

I still don’t know where this IllegalStateException is coming from, since I don’t have a stacktrace for it. The message says Cause already initialized, and we just saw a piece of code trying to do just that:

I open Throwable.initCause():

Tada! I just found our mystery exception. What exactly is going on with cause and this? Let’s look at how a Throwable is constructed:

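  public class Throwable {
    // Pointing at itself marks the cause as "not initialized yet".
    private Throwable cause = this;
    public Throwable() {
    }
    public Throwable(Throwable cause) {
      this.cause = cause;
    }
  }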

A Throwable can be constructed with no cause, in which case the cause field is set to this to mark that the cause hasn’t been initialized. You can then call initCause() later… but only if the cause has not yet been initialized.

The cause of a Throwable can only be set once.
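A small sketch of the set-once rule, using only JDK classes. The helper name setCauseTwice is made up for this example.

```java
final class SetOnceDemo {
    // Sets a cause twice and returns what the second attempt threw, or null if nothing was thrown.
    static RuntimeException setCauseTwice(Throwable target) {
        target.initCause(new RuntimeException("first"));
        try {
            target.initCause(new RuntimeException("second"));
            return null;
        } catch (IllegalStateException rejected) {
            return rejected;
        }
    }

    public static void main(String[] args) {
        Exception e = new Exception("no cause yet");
        RuntimeException rejection = setCauseTwice(e);
        System.out.println(rejection != null);         // true
        System.out.println(e.getCause().getMessage()); // first
    }
}
```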

We’re crashing because we’re calling initCause() but the cause has already been set. Let’s look at the code again:

We’re going down the exception chain until we find an exception that has a null cause. Then we call initCause() on it. When is getCause() returning null?

Wait a minute. The cause can be nonexistent or it can be unknown. Those are two different things!

When the cause field is set to itself, the cause is nonexistent. When the cause field is set to null, the cause is unknown.

In terms of Java code, here’s the difference:

In both cases, getCause() returns null. However, if the cause is unknown (initialized to null), then initCause() will throw an IllegalStateException.
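The two states can be reproduced with plain JDK exceptions. This is a minimal sketch; the helper acceptsCause is hypothetical, and note that it modifies the Throwable it probes.

```java
final class CauseStatesDemo {
    // Tries to attach a cause and reports whether the Throwable accepted it.
    static boolean acceptsCause(Throwable t) {
        try {
            t.initCause(new RuntimeException("late cause"));
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Nonexistent cause: the no-arg constructor leaves the cause field pointing at itself.
        Throwable nonexistent = new RuntimeException();
        // Unknown cause: the cause field is explicitly set to null.
        Throwable unknown = new RuntimeException((Throwable) null);

        System.out.println(nonexistent.getCause());    // null
        System.out.println(unknown.getCause());        // null
        System.out.println(acceptsCause(nonexistent)); // true
        System.out.println(acceptsCause(unknown));     // false
    }
}
```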

Lost Cause

Can we fix RxJava so that it doesn’t call initCause() when the cause is unknown? Well, it turns out, there is no way to check for this.

The JDK provides no API to determine whether the cause of a Throwable is nonexistent or unknown.

The only option is to try and report a failure. I open a pull request in RxJava.
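"Try and report a failure" could look roughly like the sketch below. It is an assumption about the general shape, not the code from the pull request: SafeAttach, attachOrReport, the reporter callback and the message text are all invented for illustration.

```java
import java.util.function.Consumer;

final class SafeAttach {
    // Appends 'extra' at the bottom of the cause chain, or reports why that was impossible.
    static void attachOrReport(Throwable error, Throwable extra, Consumer<Throwable> reporter) {
        Throwable last = error;
        while (last.getCause() != null) {
            last = last.getCause();
        }
        try {
            last.initCause(extra);
        } catch (IllegalStateException e) {
            // getCause() was null, yet initCause() refused: the cause was explicitly set to null.
            reporter.accept(new RuntimeException(
                    "Could not attach extra cause: the last exception in the chain "
                            + "was created with an explicit null cause", error));
        }
    }

    public static void main(String[] args) {
        RuntimeException bad = new RuntimeException("boom", null);
        attachOrReport(bad, new RuntimeException("assembly trace"),
                report -> System.out.println("Reported: " + report.getMessage()));
    }
}
```

Reporting through a callback keeps the original exception intact, so the failure to attach metadata never hides the error that was actually being propagated.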

After reading the Javadoc, I realize that initCause() exists for backward compatibility issues where a legacy exception class lacks a constructor that takes a cause.

initCause() should only be called right after constructing an exception. It wasn’t designed to add metadata to an exception chain.

While assembly tracking seems to be built on a hack, the benefits clearly outweigh the downsides of this unknown cause edge case.

Root cause analysis

Now that RxJava has proper error reporting when assembly tracking fails, I can figure out what happened in the UI test failure that triggered this investigation:

We are testing an HTTP error scenario, so Retrofit creates the raised exception with RetrofitError.httpError():

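  public class RetrofitError extends RuntimeException {
    // httpError() calls this constructor with a null exception.
    RetrofitError(Response response, Throwable exception) {
      super(exception);
      this.response = response;
    }
  }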

Bingo! RetrofitError.httpError() creates an exception with an unknown cause, which is why assembly tracking is failing.

RetrofitError only exists in Retrofit 1.x, so I submit a pull request to fix it on the 1.x branch. There won’t be a new public release, since Retrofit 2.x has been out for two years now.
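One way an exception class can avoid the unknown cause state is to never forward a null cause to the superclass constructor. The sketch below shows that idea on a made-up HttpFailure class; it is not necessarily what the Retrofit pull request does.

```java
// Hypothetical exception type; a sketch of one possible fix, not Retrofit's actual code.
final class HttpFailure extends RuntimeException {
    HttpFailure(String message, Throwable cause) {
        super(message);       // leaves the cause uninitialized
        if (cause != null) {
            initCause(cause); // only record a cause when there really is one
        }
    }
}

final class HttpFailureDemo {
    public static void main(String[] args) {
        HttpFailure failure = new HttpFailure("500 Server Error", null);
        // The cause is still unset, so metadata can be attached later.
        failure.initCause(new RuntimeException("assembly trace"));
        System.out.println(failure.getCause().getMessage()); // assembly trace
    }
}
```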

Square Register is our last app not migrated to Retrofit 2. We will soon complete the migration work. In the meantime, we’ll use an internal release of Retrofit 1.x.

Epilogue

Looking into a UI test failure led us on an interesting journey! Today, we learned that:

  • A Throwable can have a nonexistent or an unknown cause; those are two different things, and there is no API to figure out which is which.
  • Assembly Tracking makes Rx debugging easier, and uses the Throwable API in an unintended way to append metadata to a stacktrace.
  • Passing null as a cause to a Throwable constructor seems harmless, yet leads to subtle bugs years later.

Feel free to provide more insights or ask questions!

Square Corner Blog

Buying and selling sound like simple things - and they should be. Somewhere along the way, they got complicated. At Square, we're working hard to make commerce easy for everyone.

Written by

Py ⚔

Android baker @Square. Twitter account: @Piwai
