Reverse Engineering APIs from Android Apps — Part 1

Behind most mobile apps is some form of remote API that enables them to do all the magic we’ve grown accustomed to.

However, sometimes we like what the app does, but don’t like the interface. Or maybe we want to use the data the app provides, but in an automated way. Whatever the reason, here are some of the ways you can reverse engineer an API from an Android app.

For this series of articles I’ll be looking at a few different apps to see what different scenarios we come across.

What we’re looking for

For any API we need a few things before we can use it ourselves:

  • What is the base API URL (normally named something like api.domain.com or www.domain.com/api)?
  • What authentication method is used, and do we need to register an account to use it?
  • What are the available service calls and what parameters do they expect?

Getting the source

Our first port of call is to get hold of the app’s APK.

You can find the app’s package ID (which is also how its APK is identified) by searching for the app on the Google Play Store. The ID appears in the id parameter of the app’s URL:

https://play.google.com/store/apps/details?id=[ID]
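If you are collecting IDs for several apps, this lookup is easy to script. The following is a minimal sketch using only Python's standard library; the function name package_id_from_url is made up for illustration and the example URL uses a package ID chosen arbitrarily.

```python
from urllib.parse import urlparse, parse_qs


def package_id_from_url(url):
    """Return the value of the `id` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(url).query)
    values = query.get("id")
    return values[0] if values else None


# Example package ID; any Play Store details URL works the same way.
print(package_id_from_url(
    "https://play.google.com/store/apps/details?id=com.twitter.android"
))
```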

There are a number of websites that will help you download the APK (I’ve used both https://apps.evozi.com/apk-downloader/ and http://apk-downloaders.com/ before). All you need to provide is the ID of the app.

If the app isn’t on the Google Play Store but is installed on your phone, you can copy it to your development machine using the Android Debug Bridge CLI tool.

$ adb shell pm list packages
...
package:com.google.android.talk
package:com.twitter.android
...
$ adb pull /data/app/com.twitter.android-1.apk
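The exact path under /data/app varies between devices and Android versions, so it can help to ask the device where the APK lives before pulling it. Below is a rough sketch that wraps `adb shell pm path` from Python. It assumes adb is on your PATH and a device is connected; the helper names parse_pm_output and apk_paths are hypothetical.

```python
import subprocess


def parse_pm_output(output):
    """Strip the 'package:' prefix from each matching line of `pm` output."""
    prefix = "package:"
    return [
        line.strip()[len(prefix):]
        for line in output.splitlines()
        if line.strip().startswith(prefix)
    ]


def apk_paths(package_id):
    """Ask the device where a package's APK files live (needs adb and a device)."""
    result = subprocess.run(
        ["adb", "shell", "pm", "path", package_id],
        capture_output=True, text=True, check=True,
    )
    return parse_pm_output(result.stdout)
```

Each path returned by apk_paths can then be passed to adb pull. The same parser also works on the output of `adb shell pm list packages`, since both commands prefix their lines with package:.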

Application A

Extracting the source

Once we have the APK we just need to extract the code. An APK is just a zip file, so let’s have a quick peek inside to see what we can find.

$ unzip [application A].apk
Archive: [application A]
inflating: META-INF/MANIFEST.MF
inflating: META-INF/IEC.SF
inflating: META-INF/IEC.RSA
inflating: AndroidManifest.xml
inflating: assets/www/cordova-js-src/android/nativeapiprovider.js
inflating: assets/www/cordova-js-src/android/promptbasednativeapi.js
inflating: assets/www/cordova-js-src/exec.js
inflating: assets/www/cordova-js-src/platform.js
inflating: assets/www/cordova-js-src/plugin/android/app.js
inflating: assets/www/cordova.js
inflating: assets/www/cordova_plugins.js
inflating: assets/www/css/app/app.css
extracting: assets/www/css/app/images/ArrowLeft.png
extracting: assets/www/css/app/images/ArrowLeft22x22.png
extracting: assets/www/css/app/images/ArrowLeft44x44.png
extracting: assets/www/css/app/images/candidate.png
extracting: assets/www/css/app/images/contactus.png
extracting: assets/www/css/app/images/facebook.png
...

A quick look at the extracted files shows files and folders that mention Cordova, which means the app’s code will be a mix of CSS, HTML and JavaScript (no Java decompiling needed for this one).
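The same check can be done without extracting anything, since Python's zipfile module can list an APK's entries directly. This is only a heuristic sketch based on the file layout seen above (Cordova files under assets/www/); the function name looks_like_cordova is invented for this example.

```python
import zipfile


def looks_like_cordova(apk):
    """Return True if the archive has Cordova-named files under assets/www/.

    `apk` may be a file path or a file-like object opened in binary mode.
    """
    with zipfile.ZipFile(apk) as archive:
        return any(
            name.startswith("assets/www/") and "cordova" in name.lower()
            for name in archive.namelist()
        )
```

A True result suggests the interesting logic is in the bundled JavaScript; a False result says nothing conclusive, it only means this particular layout was not found.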

Exploring the source

Since we don’t need to do any decompiling on this code, we can fire up our favorite editor and get right into finding our endpoint.

A quick grep for api returns a potentially interesting result:

/assets/www/js/app/app.js:
132 });
133 setTimeout(function () {
134: APP.apiService.getAccessToken();
135 }, 1000);
136 // }

It looks like the APP variable holds some kind of API service object, and the getAccessToken call hints at an access token, which is probably our authentication mechanism.
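If you would rather script the search than rely on an editor or grep, a recursive search is only a few lines of Python. This is a sketch, not the tool used for the results shown here; the function name grep_tree and the default list of extensions are assumptions.

```python
import os
import re


def grep_tree(root, pattern, extensions=(".js", ".html")):
    """Yield (path, line_number, line) for every line matching `pattern`.

    The match is case-insensitive and only files with the given
    extensions are read.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(extensions):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, number, line.rstrip("\n")
```

Calling grep_tree("assets/www", "api") on the extracted folder would walk the same files as the search above, skipping images and stylesheets.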

A quick grep for where the APP variable is created shows us:

/assets/www/js/app/app.js:
126 // navigator.splashscreen.hide();
127 // },2000);
128: window.APP = new applicationUtilsInstance();
129 $.mobile.loading('show', {
130 text: 'Loading...',

And that leads us to:

/assets/www/js/app/services/services.js:
1 'use strict';
2
3: function applicationUtilsInstance() {
4
5 }
6
7: applicationUtilsInstance.prototype = {
8 apiService : {
9 apiConstants : {

Jackpot. It looks like /assets/www/js/app/services/services.js is what we’re looking for. Let’s take a quick look.

Jackpot

Well, it looks like all the credentials are hard-coded (they’ve also hard-coded their Bing Maps API key in there).

So let’s list what we’ve found:

  • We have a base API URL (baseAddress)
  • We have our usernames and passwords
  • To get an access token we need to make a POST request to https://[baseAddress]/token with grant_type=password and the credentials. A quick test verifies that this works:
$ curl --data "grant_type=password&username=[username]&password=[password]" https://[baseAddress]/token
{
"access_token": "mZn4jcUh--[snip]--7csV6GHfayqB-7kA",
"token_type": "bearer",
"expires_in": 1209599,
"userName": "[username]",
".issued": "Sun, 03 Jul 2016 18:27:49 GMT",
".expires": "Sun, 17 Jul 2016 18:27:49 GMT"
}
  • Whenever we make a request that requires authentication we need to pass an Authorization header with the value Bearer [access token]. A quick test to see whether we get a valid response with our issued token:
$ curl -G --header "Authorization: Bearer mZn4jcUh--[snip]--7csV6GHfayqB-7kA" https://[baseAddress]/faq
[
{
"Id": 1,
"Question": "[redacted]",
"Answer": "[redacted]"
},
{
"Id": 2,
"Question": "[redacted]",
"Answer": "[redacted]"
}
.. [snip] ..
]
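The two curl calls above translate fairly directly into a script. Here is a minimal sketch using Python's urllib; it mirrors the token request and the bearer header from the findings, but the function names are invented for illustration and the base URL, username and password are placeholders you would fill in yourself.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(base_url, username, password):
    """Build the POST request for the password grant shown in the curl test."""
    body = urllib.parse.urlencode({
        "grant_type": "password",
        "username": username,
        "password": password,
    }).encode("ascii")
    url = base_url.rstrip("/") + "/token"
    return urllib.request.Request(url, data=body, method="POST")


def build_api_request(base_url, path, access_token):
    """Build a GET request that carries the token in the Authorization header."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Bearer " + access_token)
    return request


def fetch_json(request):
    """Send a request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Usage would look something like token = fetch_json(build_token_request(base, user, password))["access_token"], followed by fetch_json(build_api_request(base, "faq", token)). Keeping request construction separate from sending makes it easy to inspect exactly what will go over the wire before hitting someone else's server.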

Conclusion

OK, so that was a really easy example. We didn’t have to decompile Java code, intercept traffic or use any other tricks to find out how it works.

There are, however, a couple of points to note about this application:

  1. Hard coding credentials is not A Good Idea (tm). If you need to restrict access to an API, make people create credentials or use a service provided by the platform (like OAuth) to automatically create an account whose access you can later restrict or revoke.
  2. It looks like the authentication system in use is ASP.NET’s Identity 2.0 [1, 2].
  3. This application does not do certificate pinning. This means we can use something like mitmproxy to intercept and modify any requests or responses we want (the new Pokemon GO doesn’t do certificate pinning either).
  4. None of the API endpoints appear to do any rate limiting. There are endpoints that return personally identifiable information (PII) and only require you to provide a semi-predictable identification field. This means you could potentially iterate over the entire data set.

The next installment will cover a Java application that we’ll need to decompile. Stay tuned.

Resources