Observability in Kotlin — with KTor and New Relic

Anders Sveen
Published in ZTL Payments
4 min read, Mar 14, 2022
Photo by Taylor Vick on Unsplash

I loooove New Relic… Really. Everyone should try it. 😃

The amount of information you get “for free” when plugging it into your JVM is just awesome. And the APM part is only a small part of what you can do.

It enables us to understand what uses time in our application and continuously monitor (and fix) the slowest endpoints.

HTTP endpoints, external HTTP calls and database queries all in one interface with numbers and analysis.

Truly incremental, with real production behavior. Not guessing, over-designing and over-planning performance.

When we started using it for Kotlin it worked just fine. Not so much when we moved over to KTor. And KTor support is nowhere on their current roadmap.

But we found a solution that actually works. Read on. 😃

Why it doesn’t work out of the box

You see, New Relic works kind of like this:

  • It hooks into your JVM where it can “see” everything (Java agent)
  • It has code for certain frameworks and app servers (supported frameworks)
  • It looks for these packages/classes on your classpath and “hooks into” them (kind of like AOP) to record information whenever they are called (instrumentation).
  • It knows which parts of the code are part of the same request (stack trace), through the thread they are running on.

If they don’t support the framework you are using, you can still instrument things through code or in XML (Segments are relevant as well, if you’re going down this route). But it requires you to understand a lot of the internals of both the framework and New Relic to get things right.
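To give a rough idea of what instrumenting “through code” looks like, here is a minimal sketch using the New Relic Java agent API (the newrelic-api artifact). The function, the transaction name and the segment name are made up for illustration. This is the manual route, not the low-config one we ended up with.

```kotlin
import com.newrelic.api.agent.NewRelic
import com.newrelic.api.agent.Trace

// Hypothetical handler. dispatcher = true asks the agent to start a
// transaction here if none is already in progress.
@Trace(dispatcher = true)
fun handlePayment(id: String): String {
    // Give the transaction one stable name instead of one name per id.
    NewRelic.setTransactionName("Custom", "/payments/{id}")

    // A segment times one piece of work inside the transaction.
    val segment = NewRelic.getAgent().transaction.startSegment("lookupPayment")
    try {
        return "payment-$id" // stand-in for the real lookup
    } finally {
        segment.end() // always close the segment, also on exceptions
    }
}
```

Calling handlePayment("42") without the agent attached still works, since the API calls are then no-ops. Now imagine doing this for every entry point and every async hop in a framework you didn’t write.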

To make matters “worse”, Kotlin coroutines frequently change threads. And they are used heavily by KTor. So the New Relic agent won’t be able to follow the execution without specific knowledge of how coroutines work.
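The thread problem is easy to reproduce without New Relic at all. In this small sketch a plain ThreadLocal plays the role of thread-bound tracing state. That is only an analogy and an assumption on my part, not how the agent is actually implemented. It needs kotlinx-coroutines on the classpath, and all the names are made up.

```kotlin
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Stand-in for thread-bound tracing state; NOT the agent's real mechanism.
val currentTransaction = ThreadLocal<String?>()

// Returns what the "transaction" looks like before and inside a dispatcher switch.
fun observeThreadHop(): Pair<String?, String?> = runBlocking {
    currentTransaction.set("tx-1")
    val before = currentTransaction.get()

    // Dispatchers.IO runs this block on a worker thread,
    // where the ThreadLocal was never set.
    val inside = withContext(Dispatchers.IO) { currentTransaction.get() }

    currentTransaction.remove() // back on the original thread: clean up
    before to inside
}
```

The first value is "tx-1" and the second is null: same coroutine, same logical request, but anything that only looks at the current thread has lost track of it.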

The solution we found requires very little configuration on your end, but certain choices will have to be made. And I always prefer a simple “default config” solution over fancy config and instrumentation.

The solution

Photo by Paul Skorupskas on Unsplash

After this your New Relic graphs should be updated with nice and fancy information about everything that takes time. 😃 Happy observing. 😃

What is not working

I am still trying to wrap my head around coroutines, and I honestly doubt I’ll ever completely get there. Maybe Project Loom will solve it? 😏 But with the changes above most of the tracing works. Except…

Readers of my previous articles might know that we do streaming for large lists. Our current implementation is based on Kotlin Flow, so the actual work is only done as items are “pulled off” the Flow (flows are “cold”).
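If “cold” sounds abstract, this small sketch shows the ordering. It needs kotlinx-coroutines, and the names and rows are made up for illustration.

```kotlin
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// Records the order in which things happen around a cold flow.
fun coldFlowOrder(): List<String> {
    val events = mutableListOf<String>()

    // Building the flow runs nothing yet; the block is only a recipe.
    val rows = flow<String> {
        events += "producer runs"
        emit("row-1")
        emit("row-2")
    }
    events += "flow returned"

    // The producer block only runs now, when someone collects.
    val collected = runBlocking { rows.toList() }
    events += "collected $collected"
    return events
}
```

The function returns the flow first and only runs the producer once someone collects. In a web handler that means the real work can happen long after the method that built the flow has returned.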

For reasons I haven’t completely understood yet, in KTor this happens “outside” the transaction. The only way we have found to solve this so far is to get a New Relic token in our KTor code, pass it around (as a parameter in the method calls) and call token.link() where the work is done.

We do this inside the flow { emit{ token.link() … } } call but also inside the ApplicationCall.respondOutputStream() { token.link() … } call.
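Here is a rough sketch of that token passing, using the New Relic agent API and kotlinx-coroutines. It is not our production code: the function names and the account lookup are made up, and runBlocking stands in for the KTor handler (which would be a suspend function writing to the response instead).

```kotlin
import com.newrelic.api.agent.NewRelic
import com.newrelic.api.agent.Token
import com.newrelic.api.agent.Trace
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.flow
import kotlinx.coroutines.flow.toList
import kotlinx.coroutines.runBlocking

// Hypothetical per-item work. @Trace(async = true) together with token.link()
// ties this call to the transaction that issued the token.
@Trace(async = true)
fun loadAccount(token: Token, id: String): String {
    token.link()
    return "account-$id" // stand-in for the real lookup
}

// The token travels as a plain parameter; the body only runs on collection.
fun streamAccounts(token: Token, ids: List<String>): Flow<String> = flow {
    for (id in ids) emit(loadAccount(token, id))
}

// Stand-in for the request handler: get a token, stream, then expire it.
@Trace(dispatcher = true)
fun handleRequest(ids: List<String>): List<String> {
    val token = NewRelic.getAgent().transaction.token
    try {
        return runBlocking { streamAccounts(token, ids).toList() }
    } finally {
        token.expire() // no more linking after this point
    }
}
```

It is not pretty to have a tracing token in your method signatures, but it keeps the work that happens during collection attached to the right transaction.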

Oh. And we’re hiring developers. If you want to come change the banking industry with us, reach out at recruitment@ztlpay.io.
