Poor man’s API monitoring

https://www.flickr.com/photos/neilmoralee/26624781229

When you think about API monitoring, you think about complex tools and fancy dashboards. But sometimes you don’t need to start that big.

As a user you expect a mobile app to work. 
As developers, we want to make sure of this. 
As mobile developers, we can control what happens on the device. But most apps rely on an API. This is where the content comes from. And a lot of things can go wrong on that journey:

  • API issue
    An API could go down for many reasons. 
    This is usually checked by the backend team directly via health monitoring.
  • API change
    There might be a change in the API that should not break anything, but in the end it breaks one of your clients nevertheless. This could be due to miscommunication or simply a bug in the client code. Whoever is at fault, the user won’t care.
  • Content issue
    The API might be totally healthy but still not deliver any content, for many possible reasons. To the user, it’s still broken.

In the end, as a mobile developer, I want to get alerted when my app does not get the content it needs, for whatever reason.

Keep it simple?

At sporttotal.tv we had this problem and aimed for a simple solution. At previous companies, I used emulators to run our released apps and checked them via test automation like Espresso against the production API. This way we could tell when a change would break existing clients.

Our new app was built test-driven, so we had many unit tests, but there was no UI test setup yet. At that time we didn’t want to open that box; it would have meant finding a tool like Firebase Test Lab or similar to run those tests … 
So we were looking for something simpler that could work with the existing setup, and this is what we ended up with.

Good old JUnit

We write unit tests with JUnit, plain JVM tests. It is very easy to run some of them against the real API instead of a mocked API layer.

We added a new Gradle module as part of our project.

A simple project with some tests
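Hooking such a module into the build is a single line in the settings file; the module name `api-monitor` here is just an example, not our actual module name:

```kotlin
// settings.gradle.kts: register the extra test-only module
// (hypothetical module name).
include(":api-monitor")
```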

What do we test?

We use the same use cases that we use in our app:

@ExtendWith(InjectionExtension::class)
class GameApiTest : KoinTest {

    val useCase: LeaguesUseCase by inject()

    @Test
    fun `should return some league`() {
        with(useCase.getLeagues(FOOTBALL).blockingGet()) {
            size `should be greater than` 0
        }
    }
}

This way we test the real thing: our repositories, the API layer, JSON parsing.
If something is not available on the JVM, we can provide an empty implementation via dependency injection (as you can see, we use Koin here, which we also use in our apps). Since you are in a test, you could even provide mocks, let’s say for your Tracking component.
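The substitution trick itself does not depend on any framework. As a minimal sketch (the `Tracking` interface and `RecordingTracking` class are hypothetical names, not from our codebase): the use case depends on an interface, and the test wiring binds a harmless implementation.

```kotlin
// Hypothetical analytics abstraction the app's use cases depend on.
interface Tracking {
    fun track(event: String)
}

// Test replacement: records events locally instead of calling a real
// analytics backend, so the API tests stay side-effect free.
class RecordingTracking : Tracking {
    val events = mutableListOf<String>()
    override fun track(event: String) {
        events += event
    }
}
```

In Koin, such a binding would live in a test module, e.g. `single<Tracking> { RecordingTracking() }`.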

Speed warning

As these tests are slower than normal unit tests, we should not run them for every pull request. Many solutions could work here; JUnit 5, for example, offers tags for this. We ended up with a simple property check in the Gradle build file:

tasks.withType(Test::class) {
    exclude(if (project.hasProperty("runApiTests")) "none" else "**/*Test.class")
}

This way the tests are ignored in a normal run, but we can run them via:

./gradlew -PrunApiTests testDebugUnitTest
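For comparison, the JUnit 5 tag route mentioned above could look roughly like this, with `api` as an arbitrary tag name (a sketch, not our actual setup):

```kotlin
// Mark the slow, network-dependent tests with a JUnit 5 tag.
@Tag("api")
class GameApiTest : KoinTest { /* tests as above */ }

// build.gradle.kts: exclude tagged tests unless the property is set.
tasks.withType<Test> {
    useJUnitPlatform {
        if (!project.hasProperty("runApiTests")) excludeTags("api")
    }
}
```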

When?

Currently, our normal tests run for commits and pull requests on CircleCI. As said, that’s probably not what you want for monitoring. CircleCI supports other triggers for workflows. We decided to run a special job once per hour:

api_monitor:
  triggers:
    - schedule:
        cron: "0 * * * *"
        filters:
          branches:
            only:
              - master

With this, we stay on our normal build system; we just added another job to our yml file.

Alarm?

The remaining question is notification: how do we know when tests fail? If we have a longer-running content issue, we don’t want our builds to fail all the time. Why? Our master branch should not be marked red on GitHub just because of an API issue.

Like many other teams, we use Slack for internal communication. So we decided to post any issues to a Slack channel:

it seems The search API has an issue

All we need to do is add something to the test:

@Test
fun `should return some leagues`() {
    verifier.verify {
        with(useCase.getLeagues(FOOTBALL).blockingGet()) {
            size `should be greater than` 0
        }
    }
}

The implementation behind it runs the assertion, catches any error, and forwards it to Slack:

inline fun verify(assertion: () -> Unit) {
    try {
        assertion()
    } catch (error: Throwable) {
        uploader.upload(error)
    }
}

Slack provides an API, so we could use a normal Retrofit interface:

interface SlackApi {

    @Multipart
    @POST("/api/files.upload")
    fun upload(...)
}
For a better error message on Slack, we grab the test name:

val trace = Thread.currentThread().stackTrace[1]
"${trace.className.substringAfterLast(".")}:${trace.methodName}"
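Wrapped in a small helper (the function name `testLabel` is ours), this looks like:

```kotlin
// Build a readable "ClassName:methodName" label for the Slack message.
// Index 2 points at the caller of this helper: index 0 is
// Thread.getStackTrace itself and index 1 is this function.
fun testLabel(): String {
    val trace = Thread.currentThread().stackTrace[2]
    return "${trace.className.substringAfterLast(".")}:${trace.methodName}"
}
```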

What do we assert?

We ended up validating the values coming back in our models, one example:

@Test
fun `should search for replays`() {
    verifier.verify {
        val result = useCase.searchFor("Viktoria").blockingGet()
        with(result.replays) {
            size `should be greater than` 0
            forEach {
                it.assertValidTitle()
                it.assertValidStream()
                it.assertValidThumbnail()
            }
        }
    }
}

For example, we check that lists are non-empty and values look valid. 
For URLs like thumbnails and video stream URLs, we make sure they are https and follow them via HEAD requests:

fun Clip.assertValidStream() {
    videoUrl `should be secure url on` "$id video url"
    videoUrl `could be opened` "game $id"
}

infix fun String.`could be opened`(id: String) {
    with(URL(this).openConnection() as HttpURLConnection) {
        requestMethod = "HEAD"
        check()
    }
}

fun HttpURLConnection.check() {
    val response = responseCode
    disconnect()
    if (response != 200) {
        throw AssertionError("broken link, returns $response on $url")
    }
}

A clip stream URL was broken
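The `should be secure url on` assertion itself is not shown above; a minimal sketch, assuming it only enforces the https scheme, could be:

```kotlin
// Hypothetical implementation: fail the test unless the URL is https.
infix fun String.`should be secure url on`(description: String) {
    if (!startsWith("https://")) {
        throw AssertionError("$description is not https: $this")
    }
}
```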

Did it work?

I took the name poor man’s monitoring from “poor man’s dependency injection” by Mark Seemann. As with his approach, there is nothing wrong with a self-written tool. Sometimes, like a poor man, you should work with what you already have instead of adding something new from outside.

With this simple setup, we found out very easily when something was broken or simply missing. As the module is part of our normal project, it is always up to date with the latest changes. Our experience showed that the only thing we might forget is updating the Koin module, which would result in a failing test. With Dagger this could be checked at compile time.

This is not about replacing existing systems, but testing against your API based on your client code is a great addition. And if we add UI tests now, we can mock the API layer, as it’s already tested, and gain performance and stability for the UI automation layer.