ReduKtor: How We Stopped Worrying About Bugs in Kotlin Compiler

Marat Akhin
Nov 7, 2019 · 10 min read

This article is an brief intro for the experience paper ReduKtor: How We Stopped Worrying About Bugs in Kotlin Compiler, accepted to ASE 2019.

REDUKTOR noun

re·​duk·​tor | \ -ktə(r)\

Definition of ReduKtor
: an apparatus for carrying out input reduction for Kotlin compiler inputs

TL;DR

We implemented a C-Reduce-like tool for Kotlin. Yay we!

As to why we did that…

Do compilers have bugs?

Most software projects contain bugs, and compilers are unfortunately not an exception. These bugs are usually triggered by specific inputs; for compilers such inputs are the programs being compiled. If these programs are complex, it’s harder to understand what the bug actually is, what causes it and how one can fix it.

Kotlin is a relatively new programming language from JetBrains, conceived in 2010, with version 1.0 released in early 2016. Considering its tender age, it’s of no surprise Kotlin compiler still has bugs, and in a lot of cases these bugs are triggered by quite complicated input programs.

Let’s provide an example: below is a program, which crashes Kotlin compiler, taken from the Kotlin YouTrack:

open class A1(y: String) {
val x = "[A1.x,$y]"
}
open class A2(y: String) {
val x = "[A2.x,$y]"
inner open class B1 : A1 {
constructor(p: String) : super("[B1.param,$p]")
fun foo() = x + ";" + this@A2.x + ";"
}
fun bar(): String {
// Stacktrace1
return with(A2("#bar")) {
class
C : B1("bar") {}
C().foo()
}
}
fun foo() = A2("#baz").baz()
fun A2.baz(): String {
// Stacktrace2
class C : B1("baz") {}
return C().foo()
}
}
fun box(): String {
val r3 = A2("f").bar()
if (r3 != "[A1.x,[B1.param,bar]];[A2.x,#bar];") return "$r3:"
val
r4 = A2("gg").foo()
if (r4 != "[A1.x,[B1.param,baz]];[A2.x,#baz];") return "$r:"
return "OK"
}

And here is the minimal reproducing example (MRE) created by ReduKtor:

open class A1class A2 {
open inner class B1 : A1()
fun A2.baz() {
class C : B1()
}
}

As you can see, MRE is much more concise and easy to understand. If we are to use fuzzing to test Kotlin compiler (and we actually do, but it’s a topic for another time and another paper), this situation becomes even more pronounced:

inline fun <reified T : Any> uninitializedEntry(): T {
val klass = (T)?::class.java
if (klass.isInterface) {
throw RuntimeException()
} else {
return klass.newInstance()
}
}
class ItemType
object ItemTypes {
@JvmField
val IRON_SHOVEL: ItemType = uninitializedEntry()
@JvmField
val IRON_PICKAXE: ItemType = uninitializedEntry()
/* ... 1000 more lines ... */}

After the reduction, this generated test case becomes just:

val klass = (T)?::class.java

This means we have a problem: how to create a MRE from an original compiler-crashing example. To quote the famous one:

What can we do?

Of course, we are not the first people to target this problem. One of the first attempts at input reduction was the work of A. Zeller and R. Hildenbrandt on delta-debugging (DD). DD is basically a binary search on steroids: it divides the input into N parts and then finds their minimal configuration which still reproduces the bug. The great thing about such an approach is it’s language-agnostic: it does not care what kind of input we are working with, as long as you can divide it into parts.

DD was originally applied to HTML pages which crush Firefox browser and performed marvelously, in most cases producing a minimal failure-inducing HTML page. Over the years there has been a lot of research into improving DD and applying it to other, more complex inputs, such as hierarchical delta-debugging (HDD), Perses or Pardis; but DD was The One approach which kick-started this research area.

On the other hand, we have language-specific approaches such as program slicing by M. Weiser. Slicing is an technique which takes some “point of interest” in the program and removes everything irrelevant to it. In case of compiler-crashing input reduction via slicing, a “point of interest” is a set of expressions or a program line relevant to the crash. For Kotlin, luckily, we can usually extract this information from the error message.

If we are to discuss practical applications of input reduction for compilers, it would be impossible to not mention the work of J. Regehr et al on C-Reduce which performs compiler input reduction and is a companion tool to the Csmith code generator. It is a hybrid input reduction tool, utilizing both language-agnostic and language-specific approaches to great success, and was one of the important inspirations behind our work on ReduKtor.

C-Reduce + Kotlin = ReduKtor

So, we wanted to make our own C-Reduce, but for Kotlin, code named “ReduKtor”. After the preliminary research, we decided to combine both language-specific and language-agnostic techniques, and the reason for this is two-fold. First, we believe such combination to be the best performing from the practical point of view, i.e., the easiest to make into a usable practical tool. Second, such combination allows us to measure and analyze how impactful are the individual techniques w.r.t. overall input reduction efficiency and performance.

ReduKtor has the following high-level architecture:

It takes an input Kotlin program and attempts to reduce it via one of many available transformations; a transformation is repeatedly applied until convergence, i.e., until the program stops decreasing in size. We also constantly check if the transformation preserves the error via the bug oracle, by comparing before and after error messages. If the error stops reproducing, we rollback the input and continue with the other transformations. Before we get to the interesting part (evaluation and lessons learned), let’s briefly talk about which input reduction techniques we use in ReduKtor.

Transformations

We utilized two types of input transformations: text based and syntax tree based transformations. Text based transformations do exactly what you think they do: they change the input text from one form into another, e.g., change non-empty string literals into empty string literals. Despite their obvious simplicity, in a lot of cases text transformations are the most efficient way of reducing your input, and that’s why we use them.

Another group of transformations which perform well for source code reduction are syntax tree based transformations. They are implemented as such instead of being yet another text based ones, because they either cannot be expressed or would create too many syntactically incorrect programs if performed over text. The latter may greatly influence the overall performance, as every incorrect transformation causes a rollback to the previous reduction state.

We handcrafted these transformations based on the following:

Here are some examples of said transformations, for more information we delegate you to our paper.

Slicing

Slicing is used as one of the most basic input reduction techniques, aimed at removing parts unneeded w.r.t. error. There are several different types of slicing varying in their performance and complexity; for ReduKtor we decided to implement a static backward slicing over the syntax trees, which is applied on the following levels in their respective order: intraprocedural level, function level and class level.

If we apply slicing to the following example w.r.t. line marked as “Reduction criterion”:

class Square(private val a: Double) {
fun getPerimeter(): Double = a * 4
fun getSquare(): Double = a * a
}
class Triangle(private val a: Double, private val b: Double,
private val c: Double) {
fun getPerimeter(): Double = a + b + c
fun
getSquare(): Double {
var square = 0.0
if (a * a + b * b == c * c) {
square = a * b / 2 /* Reduction criterion */
} else {
val p = getPerimeter() / 2
square = Math.sqrt(p * (p - a) * (p - b) * (p - c))
}
return square
}
}

We get this reduced example with a lot of irrelevant things removed:

class Triangle(private val a: Double, private val b: Double,
private val c: Double) {
fun getSquare(): Double {
var square = 0.0
if (a * a + b * b == c * c) {
square = a * b / 2 /* Reduction criterion */
} else { }
return square
}
}

Hierarchical Delta Debugging

HDD is the last step in the sequence of transformations and is used as a finishing tool to remove parts of the input not considered by the previous reduction steps. By running it on an already partially reduced input, we achieve better performance, as HDD speed is very dependent on the input size.

As for implementation, we actually have two. We have a classic HDD implementation which applies DD to a tree in a level-by-level fashion, and an implementation of Pardis, a state-of-the-art language-agnostic reduction technique we wanted to compare against.

A sample input, its corresponding trees before and after HDD, and the resulting file are shown below.

fun f() {
val a = 1
var b = 0
if (a != 0) {
b += 1
b += 2
}
while (b != 0) {
--b /* Reduction criterion */
}
}
fun f() {
var b = 0
while (b != 0) {
--b /* Reduction criterion */
}
}

Evaluation

For the evaluation, we run ReduKtor for Kotlin compiler version 1.3.10 on two types of input. For the first part, we used the results of Kotlin compiler fuzzer together with the code samples of the compiler bugs collected from Kotlin compiler YouTrack. For the second part, a number of real-world Kotlin projects were injected with invalid code.

We collected the two main statistics: reduction efficiency (i.e., the decrease in file size) and performance (i.e., the wall-clock time for reduction). For every test project, we ran input reduction in the following configurations.

If you’re interested in the detailed results (summarized into two not-so-easy-to-understand tables), they are available in the full paper; here we would like to highlight one of the lessons we learned from the results.

Lessons learned

A lot of work in input reduction is focused in the area of language-agnostic techniques, and they do have their undisputed advantage in being general and applicable to almost any kind of input. This may create an impression language-specific transformations are obsolete, outdated and slow.

However, we found that custom language-specific transformations outperform even the state-of-the-art language-agnostic technique (Pardis) in both performance and efficiency. They are better in reduction efficiency by up to 1.64x for 4 out of 6 project we tested (fuzzTests, bugsTests, mapdb, koin), and faster by up to 2.0x for 4 out of 6 projects (bugsTests, kotlinpoet, kfg, mapdb).

We believe this to strongly support the need for hybrid input reduction approaches, as they could best utilize the synergy between language-agnostic and language-specific techniques. If you want your tool to be good in the real world, you have to go the C-Reduce way and make it language-specific; it may not be as pretty as the researcher in you want it to be, but it will be efficient.

Real-world adoption

Unfortunately, right now ReduKtor is only a research prototype, and we do not have any solid practical adoption yet; besides the evaluation, we only used it internally (in our research group) for several quirky bugs triggered by our code in Kotlin compiler.

One of the main reasons for that is the following: to speed up reduction we use the compiler as a library, via a non-stable internal API. As this API changes between versions, one cannot currently use ReduKtor as an off-the-shelf solution for their particular compiler version and/or development environment.

We recently contacted the Kotlin compiler team and proposed them to incorporate ReduKtor in their workflow. Their feedback was positive; with their cooperation we should be able to migrate ReduKtor to a new stable API, to better support different compiler versions. If all works out, ReduKtor should be available for use as a tool sometime in 2020.

Conclusion

We do believe this is just the beginning of the ReduKtor story. With the help and feedback of the Kotlin compiler team, we plan to shape it into a practical off-the-shelf tool which can be used by regular developers, if they find themselves needing input reduction for Kotlin. Besides making our prototype into a tool, there are also a lot of research questions still left to explore.

For example, the order in which ReduKtor applies the transformations has a direct impact on both the efficiency and performance of the overall input reduction process. How can we find a quazi-optimal order for a given input, we do not know. Yet.

Another problem we only touched upon in ReduKtor was how one performs input reduction not for a single file, but for a complete project. This is very important for many modern languages which follow a “project-as-a-whole” compilation scheme, unlike C/C++ with their “file-by-file” approach. How can we do project-level simplification nice and fast, we do not know. Yet.

That’s it for now. Thank you for reading, and remember — if you don’t understand what’s happening, maybe you simply need to reduce it and try again?

Extras

ASE Conference

Research from the 34th IEEE/ACM international conference on Automated Software Engineering

Marat Akhin

Written by

ASE Conference

Research from the 34th IEEE/ACM international conference on Automated Software Engineering

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade