You PUT It Where?!
Over here at JET ExperimentAPI is a service that helps backend teams to run Experiments. It also has hooks in the our mobile applications to allow reporting that an Experiment treatment has been observed. In our Android app this is implemented by two network interceptors added to the HTTP Client (OkHttp). These introspect special headers associated with the Experiment and record results. We are currently in the middle of migrating all of our country specific mobile applications onto a single globalised app. During the migration we noticed that these network interceptors should not be called if the user is from a country that ExperimentAPI was historically not deployed. ExperimentAPI will eventually be deprecated across the estate but in the meantime to save a bunch of failed requests we decided to restrict the usage to to only countries that it was historically deployed in.
Problem
A change in one area of our Android app (ExperimentAPI) to restrict which tenants can use the service seemingly resulted in breaking Basket updates.
ExperimentAPI has no dependencies on any Basket functionality, nor does any Basket functionality have any dependencies on ExperimentAPI in our Android app. Whilst testing of the release build for an upcoming release it was noted that updating anything in your Basket failed with a network error.
This didn’t happen on a debuggable build which usually signals a Proguard / R8 issue.
Debugging A Release Build
Android developers will be familiar with Proguard / R8. They are tools that automatically and thoroughly rewrites your application for security and optimisation. They can make your application significantly smaller by removing classes that they know will never be used (code shrinking), they can make it harder for rogue attackers to steal and understand what your code is doing by obfuscating code, and they can optimise dead code paths. Proguard is a tool that runs on all JVM projects and R8 is simply the Google reimplementation that works tighter within the Android toolchain — they are equivalent for the purposes of this post. Typically we see issues relating to removing classes that aren’t referenced by type and instead referenced by a raw String
.
For example before Proguard / R8:
// Will not cause Proguard / R8 issues
class Superman(val clark: Kent) {
fun transform() {
clark.takeOffGlasses()
}
}
// Will cause Proguard / R8 issues
class Gus(val gorman: Gorman) {
fun run() {
if (gorman.javaClass.simpleName == "Gorman") {
gorman.stealMoney()
} else {
EvilSuperman.activate()
}
}
}
After Proguard / R8:
// Will not cause Proguard / R8 issues
class S(val c: K) {
fun t() {
c.t()
}
}
// Will cause Proguard / R8 issues
class G(val go: Go) {
fun r() {
if (go.javaClass.simpleName == "Gorman") {
go.s()
} else {
E.a()
}
}
}
Our class Superman
has a reference to our class Kent
which has a method takeOffGlasses()
. Our classes, parameter, and method names have been changed to make it harder to reverse engineer and figure out what is happening. However the references, and thus linkages, are still intact. In contrast our class Gus
looks up the name of the type and compares it to a String which Proguard / R8 won't modify. When Proguard / R8 changes the name of Gorman
to Go
the comparison "Go" === "Gorman"
will fail……. uh oh. The answer here is to provide a Proguard / R8 Keep rule that keeps the reference to Gorman
intact.
class G(val go: Gorman) {
fun r() {
if (go.javaClass.simpleName == "Gorman") {
go.s()
} else {
E.a()
}
}
}
This is typically how to debug and solve Proguard / R8 issues. A much more thorough guide can be found under this blog post. But what does this have to do with our problem? Let’s come back to this but first see where our problem lies.
How To Create An AnnotatedConverterFactory
Communicating over a network requires sending representations of your code in a format that can be safely communicated such as JSON (other formats are also available) — this process is called Serialisation. On the other side of the wire the reverse process happens. JSON is turned back into representations in your code — this is called Deserialisation.
We use a type safe HTTP client in our Android app called Retrofit that supports multiple different Serialisation / Deserialisation engines. The engine we prefer is KotlinXSerialization however we have a legacy of using Gson. Gson
is an extremely old library that was popular in the early days of Android development but that should now not be used for new development. Migrating the entirety of our Android app over to KotlinXSerialization
is a task that needs to be done carefully and slowly to reduce risks. Because of this AnnotatedConverterFactory
was introduced that allowed a gradual migration.
It works by annotating your Retrofit interfaces with an annotation that acts as a marker and using this to determine which engine to use. These annotations are merely extra metadata that we can use to base extra logic off. Let’s look at an example using a hypothetical Retrofit based service SomeService
.
interface SomeService {
@GET("some/path")
fun someOperation(@Body someType: SomeType): SomeResponseType
@POST("some/other/path")
@KotlinXSerialization // Add a marker interface
fun someOtherOperation(@Body someOtherType: SomeOtherType): SomeOtherResponseType
}
class AnnotatedConverterFactory (
private val gson: Gson,
private val factories: Map<String, Converter.Factory>
) : Converter.Factory() {
// ... other Converter.Factory() methods ommitted
override fun responseBodyConverter(
type: Type,
annotations: Array<Annotation>,
retrofit: Retrofit
): Converter<ResponseBody, *>? {
for (annotation in annotations) {
val simpleName = annotation.annotationClass.simpleName
val factory: Converter.Factory? = factories[simpleName]
if (factory != null) {
return factory.responseBodyConverter(type, annotations, retrofit)
}
}
return GsonConverterFactory.create(gson)
.responseBodyConverter(type, annotations, retrofit)
}
}
val annotation = ... // The Annotation used as a marker interface
val factory = ... // Its associated Converter.Factory
val factories = MutableMap<String, Converter.Factory> = LinkedHashMap()
factories[annotation.annotationClass.simpleName] = factory
AnnotatedConverterFactory(gson, factories)
The other Converter.Factory
methods are implemented in a similar way to responseBodyConverter
so let's go over what is happening. First upon creation of an AnnotatedConverterFactory
we supply an instance of Gson
along with a Map
/ Dictionary of Converter.Factory
s keyed by the name of the annotation. For all of the annotations on a Retrofit interface method (@Get
/ @KotlinXSerialization
/ @POST
/ etc) we check if we have an associated Converter.Factory
. If we do then we use it to convert. If not then we create and use a Gson
based one expecting all the @POST
/ @GET
/ @PUT
/ @Headers
annotations to not match leaving only @KotlinXSerialization
. Pretty simple, right? And have you spotted the issue yet? Yeah neither did I at this stage.
When running this in a debuggable application (one where Proguard / R8 has not been applied) this code works correctly. However when running this in a release build things started to go wrong. Obviously we can’t simply run our application in a debuggable context and set breakpoints in a release build so we have needed to get information the old fashioned way: console log statements!
Adding a few choice statements around the affected network calls allowed us to capture the Exception
that was (thankfully) thrown. In the real application the Exception
s were caught but discarded and instead the friendly user facing error in the above screenshot was shown. This is what the full stack trace looked like:
java.lang.IllegalArgumentException: Unable to create @Body converter for class qo.l (parameter #2)
for method BasketService.updateBasket
at tp0.c0.n(Utils.java:47)
at tp0.c0.p(Utils.java:28)
at tp0.w$a.g(RequestFactory.java:267)
at tp0.w$a.f(RequestFactory.java:12)
at tp0.w$a.b(RequestFactory.java:82)
at tp0.w.b(RequestFactory.java:6)
at tp0.z.b(ServiceMethod.java:1)
at tp0.y.c(Retrofit.java:25)
at tp0.y$a.invoke(Retrofit.java:38)
at java.lang.reflect.Proxy.invoke(Proxy.java:1006)
at $Proxy32.updateBasket(Unknown Source)
at to.h.D(BasketRepository.kt:200)
at to.h.E(BasketRepository.kt:73)
at to.h$p.invokeSuspend(BasketRepository.kt:60)
at kotlin.coroutines.jvm.internal.a.resumeWith(ContinuationImpl.kt:12)
at jo0.z0.run(DispatchedTask.kt:119)
at kotlinx.coroutines.internal.n.run(LimitedDispatcher.kt:13)
at kotlinx.coroutines.scheduling.k.run(Tasks.kt:3)
at kotlinx.coroutines.scheduling.a.o(CoroutineScheduler.kt:1)
at kotlinx.coroutines.scheduling.a$c.d(CoroutineScheduler.kt:15)
at kotlinx.coroutines.scheduling.a$c.n(CoroutineScheduler.kt:29)
at kotlinx.coroutines.scheduling.a$c.run(CoroutineScheduler.kt:1)
Caused by: kotlinx.serialization.SerializationException: Serializer for class 'l' is not found. Mark the class as @Serializable or provide the serializer explicitly.
at wo0.w0.n(Platform.kt:34)
at so0.j.d(SerializersJvm.kt:23)
at so0.i.c(Unknown Source:1)
at hg.e.c(Serializer.kt:14)
at hg.b.c(Factory.kt:23)
at y40.a.c(AnnotatedConverterFactory.kt:45)
at tp0.y.f(Retrofit.java:41)
at tp0.y.h(Retrofit.java:2)
at tp0.w$a.g(RequestFactory.java:264)
... 19 more
The first thing to note is that even the stack trace is obfuscated (to a degree). This is to be expected given what Proguard / R8 does. The second thing to note is that at the bottom of the stack trace there is a reference to a SerialisationException
from KotlinXSerialization
which indicates that the application tried to serialise something using this rather than Gson
. Is that correct? According to the Retrofit interface, no it is not. But it must be. Oh wait, I am so confused!
interface BasketService {
@PUT(REQUEST_WITH_ID)
@Headers("$CONTENT_TYPE_HEADER $CONTENT_TYPE")
suspend fun updateBasket(@Path("basket_id") basketId: String, @Body updateBasket: UpdateBasket) : Basket
}
In the definition for BasketService::updateBasket
there are only the @PUT
and @Headers
annotations. We do not see @KotlinXSerialization
. So why did the code go down that path?
Proguard / R8 Mappings
When Proguard / R8 runs it spits out several files, the most interesting of which is mapping.txt
. This file contains references to all the classes / types / symbols that have been obfuscated by the tool and what their real names were. Most projects will send this to a crash reporting service so that it can deobfuscate your stack traces from production apps, saving you the hassle of trying to figure out what went wrong. This is something that is normally automatically uploaded by your CI when building your release application.
Here’s what it looks like for the affected build (note these obfuscated symbols might and do change build per build so you should always use the correct mapping.txt
file that matches your application code - i.e. the one that was created at the same time as your production application). The whole file is a lot larger but we've included the relevant lines relating to this post.
retrofit2.http.Headers -> xp0.k:
retrofit2.http.PUT -> xp0.p:
retrofit2.http.Body -> xp0.a:
com.justeat.basketapi.network.model.request.UpdateBasket -> qo.l:
com.justeat.network.KotlinXSerialization -> y40.p:
com.jakewharton.retrofit2.converter.kotlinx.serialization.Factory -> hg.b:
Have you spotted the issue yet? No me neither…. let's find out more!
Proguard / R8 Unscrambler
As a side note cross referencing the mapping file by hand can be quite a chore so a top tip is to install an Unscrambler in IntelliJ / AndroidStudio that can help you deobfuscate stack traces.
Then select it by clicking on Analyse Stack Trace Or Thread Dump
.
This works best with genuine stack traces but running it against arbitrary text seems to do a good enough job.
Closer To Understanding The Issue
Along with logging the stack trace that was thrown we can also log information about the various parameters involved in AnnotatedConverterFactory
such as what annotation.annotationClass.simpleName
resolves to and what factory
happens to be.
Update basket
Request factories 0: p hg.b
Request method annotation 0: xp0.k
Request method annotation 1: xp0.p
Request selected annotation: xp0.k
Request selected annotation: xp0.p
Request raw type: class qo.l
Request chosen factory: hg.b
Request parameter annotation 0: xp0.a
Error
When UpdateBasket
is called we see that we have a Map
containing a single entry: p
to hg.b
. If we check our mapping.txt
file we can see that this resolves to a Map
entry of p
to com.jakewharton.retrofit2.converter.kotlinx.serialization.Factory
. We don't actually find an entry for p
on its own. That's odd. Can you spot the issue yet? I bet you can! But for the benefit of the person reading this next to you let's continue.
Why do we get p
and what is it? Let's see how we add to the Map
of ConverterFactory
s again.
val annotation = ... // The Annotation used as a marker interface
val factory = ... // Its associated Converter.Factory
val factories = MutableMap<String, Converter.Factory> = LinkedHashMap()
factories[annotation.annotationClass.simpleName] = factory
So p
must refer to the annotation's simpleName
. What's that? Here's a good StackOverflow post on the subject but it boils down to the name of the class without reference to its package. We know for a fact that in our Android app we only ever add the com.justeat.network.KotlinXSerialization
annotation to factories
so shouldn't we be getting KotlinXSerialization
for the simpleName
? Yes we should, were it not for Proguard / R8 obfuscation. After obfuscation we end up with y40.p
whose simpleName
ends up being…… yes that's correct p
.
The Map
should actually contain the entry com.justeat.network.KotlinXSerialization
to com.jakewharton.retrofit2.converter.kotlinx.serialization.Factory
. This makes sense because it is what the original design intended. Associate the KotinXSerializationFactory
with the KotlinXSerialization
annotation. But remember we are now obfuscated so we simply have a Map
that knows nothing about what is inside it other than it has an entry of p
to hg.b
.
Looking at the rest of the log statements we see this:
Update basket
Request factories 0: p hg.b
Request method annotation 0: xp0.k
Request method annotation 1: xp0.p
Request selected annotation: xp0.k
Request selected annotation: xp0.p
Request raw type: class qo.l
Request chosen factory: hg.b
Request parameter annotation 0: xp0.a
Error
We notice that we have two method annotations, xp0.k
and xp0.p
. They unscramble to retrofit2.http.Headers
and retrofit2.http.PUT
respectively. This is correct as it relates to our Retrofit interface method updateBasket
. We then see that we attempt to find a factory for the Headers
annotation and fail (because there are no further logs). This is correct and the intended behaviour. Next we see that we attempt the same for PUT
and actually find a hit! Why is PUT
triggering KotlinXSerialization
to be attempted? That's wrong!
As you have already spotted — but let’s look at it again — the converter factories are looked up by simpleName
.
val simpleName = annotation.annotationClass.simpleName
val factory: Converter.Factory? = factories[simpleName]
Do we have a Converter.Factory
associated with the simpleName
of the annotation? In a debug build our PUT
annotation would look like this:
val factories = mapOf("KotlinXSerialization" to kotlinx.serialization.Factory)
val factory: Converter.Factory? = factories["PUT"]
// factory is null.
However in a Proguard / R8 build this happens:
val factories = mapOf("p" to hg.b)
val factory: Converter.Factory? = factories["p"]
// factory is non null!
The p
in the factories
Map
and the p
of the PUT
annotation refer to different types but because they (just so happen to) share the same simpleName
we find a match! Bollocks!
The Fix
Introspecting classes on the JVM gives you a few different options for how to refer to the type / symbol. As we have seen, using simpleName
produces a likelihood of unwanted collisions. Instead prefer to use canonicalName
/ qualifiedName
(if in Kotlin). These would have added a more sturdy key as the Map
entry that would not have been collided with even after obfuscation.
val factories = mapOf("y40.p" to hg.b)
val factory: Converter.Factory? = factories["xp0.p"]
// factory is null! Correct.
Not only did this affect the Basket update endpoint it would have also affected all PUT
endpoints that do not already use the KotlinXSerialization
annotation. We didn't see this issue before because a change in code just so happened to cycle the obfuscation letters around enough to cause a collision.
Cautionary Advice
Be careful when using any reflection based APIs in your application if you also use Proguard / R8. Always run a detailed test of your application in release mode prior to releasing. It’s not enough to run it in debug mode even though that might be quicker. Be aware of the files that Proguard / R8 outputs as they can be pivotal in understanding the issue
Oh and stop using Gson
.
Just Eat Takeaway.com is hiring! Want to come work with us? Apply today.