Java vs. Kotlin — Part 2: Bytecode

Jakub Anioła
RSQ Technologies

--

The previous part opened with an introduction describing how I came up with the idea of comparing Java and Kotlin. As mentioned there, the experiment and analysis were part of my master's thesis at Poznan University of Technology. As an everyday Android developer at RSQ Technologies and a big fan of both JVM languages, I also wanted to answer one of the most common questions asked by JVM developers:

Which language is better performance-wise: Java or Kotlin?

Of course, there is one and only one answer to that question: it depends. No one likes that answer, so I decided to conduct an experiment that may help explain the performance differences between these two languages.

The performance results and conclusions were presented in part one. Now it is time to look at the results of the static bytecode analysis. I wanted to find out why there are differences in execution time, memory usage, and CPU load. In a JVM language, if you want to understand something deeply and find the reason behind it, you usually need to do some geekery with JVM bytecode. So that is exactly what I am going to do: a static analysis of the bytecode generated from Kotlin and Java.

Research question

Icon made by Freepik from www.flaticon.com

As in the previous part, I want to begin by presenting my research question (which is pretty short this time):

What are the differences in the bytecode generated by the two languages' compilers?

In general, I wanted to learn about the JVM bytecode produced by the Java and Kotlin implementations. I assumed that understanding the differences between the languages might help explain the dynamic results. Before the experiment, I knew only that Kotlin treats objects differently (at the source level there are no primitive types; the compiler decides when to use them) and does some work for the programmer (generating getters and setters). That was it; I did not have any deeper knowledge of JVM bytecode generation. I hoped this analysis would not only help me understand the differences between the two languages but also broaden my knowledge in that area.
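To sketch what I mean by "work for the programmer": a hypothetical Kotlin class like the one below (class and property names are mine, for illustration) compiles to bytecode containing a getter and a setter that Java would require you to write by hand.

```kotlin
// Hypothetical example: a single mutable Kotlin property...
class Counter(var value: Int)

// ...compiles to roughly this hand-written Java shape:
// public final class Counter {
//     private int value;
//     public int getValue() { return value; }
//     public void setValue(int v) { this.value = v; }
// }

fun main() {
    val c = Counter(1)
    c.value = 2          // compiled to a setValue(2) call
    println(c.value)     // compiled to a getValue() call; prints 2
}
```

These generated accessors are one reason the bytecode of a Kotlin class can contain more methods than a field-for-field Java equivalent.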

Reminders

In this section, I want to remind you of the experiment's basics: benchmark selection, implementations, and language versions. If you need more information, it is probably already covered in the previous part.

Icon made by Freepik from www.flaticon.com

Benchmark selection

The presented benchmarks are part of The Computer Language Benchmarks Game. The whole suite consists of 10 different problems, with algorithmic solutions implemented in various languages to compare performance results. Not all of the benchmarks were included in this experiment; they were selected based on two factors:

  1. the best Java implementation taken from the CLBG repository has to be convertible to Kotlin
  2. the programs must manipulate data that is as diverse as possible

The table below presents the final list of used benchmarks with information about the most manipulated data.

Table 1: Selected benchmarks with information about most manipulated data

Implementations

Analyzed implementations were divided into three groups:

  1. Java (taken from the official CLBG benchmark webpage)
  2. Kotlin-converted (generated using the JetBrains Java-to-Kotlin converter)
  3. Kotlin-idiomatic (based on Kotlin-converted version with introduced changes recommended by Idioms, Coding Conventions, and IDEA default code inspections)

If you are interested in more implementation details, check out the Java vs Kotlin comparison repository.

Languages version

Just as a reminder, the table below shows the versions of the languages used in the experiment. Both languages used the newest version available at the time.

Table 2: Language versions

Remarks

  • Languages and their versions change over time. The results presented in this article may no longer be valid when new versions are released

Tool

Icon made by Freepik from www.flaticon.com

The bytecode analysis is based on the JarScan tool. It is part of the JITWatch system and statically analyses jar files, counting the bytes in each method's bytecode. After scanning the jar files, it produces reports in CSV format. These .csv summaries are then processed to produce the final bytecode analysis results.

With JarScan it is possible to define which package in the jar file should be analyzed. Thanks to that option, the results cannot be affected by packages and classes from the Java or Kotlin standard library.

If you are interested in trying out the tool, I recommend checking out the article “Statistical Analysis of Core Libs Bytecode using JarScan” by Chris Newland. It helped me with understanding the basics and running the JarScan scripts.

Static metrics

Icon made by srip from www.flaticon.com

Instruction count
The tool enables a user to gather the number of occurrences of each bytecode instruction in the selected jar file. With that information, it is possible to identify differences in bytecode generation between multiple implementations and JVM-based languages.
Information about each instruction can also help with analyzing the other collected data.
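To give a feel for how such a per-instruction report can be processed, here is a minimal Kotlin sketch. The row layout ("instruction,count") is my assumption for illustration, not JarScan's exact output format.

```kotlin
// Minimal sketch, assuming report rows of the form "instruction,count"
// (the real JarScan column layout may differ). Rows for the same
// instruction are merged into a single total.
fun tallyInstructions(csvRows: List<String>): Map<String, Int> =
    csvRows
        .map { it.split(",") }
        .filter { it.size == 2 }
        .groupBy({ it[0] }, { it[1].trim().toInt() })
        .mapValues { (_, counts) -> counts.sum() }

fun main() {
    val rows = listOf("aload_0,120", "invokevirtual,95", "aload_0,30")
    println(tallyInstructions(rows))  // the two aload_0 rows merge to 150
}
```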

Allocation count
Another JarScan mode produces a list of allocated types for each kind of allocation instruction.
The counted allocation instructions are new, newarray, anewarray, and multianewarray.
In the final report, each row contains three values: instruction, type, and count.

Method sizes
The last mode used in the bytecode analysis is called methodSizeHisto. It produces a .csv file with two values in each row: a method bytecode size and the number of methods with that size.
The output can be used to draw method bytecode size histograms for a .jar file, which may help with in-depth analysis of performance issues.
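As an illustration of what the methodSizeHisto output enables, here is a small Kotlin sketch that turns size/count pairs into coarse histogram buckets. The input shape is assumed for illustration, not taken from the tool.

```kotlin
// Sketch assuming rows of (methodSizeInBytes, methodCount). Methods are
// grouped into fixed-width buckets keyed by the bucket's lower bound,
// which is enough to plot a coarse method-size histogram.
fun sizeHistogram(rows: List<Pair<Int, Int>>, bucketWidth: Int = 10): Map<Int, Int> =
    rows.groupBy(
        { (size, _) -> (size / bucketWidth) * bucketWidth },
        { (_, count) -> count }
    ).mapValues { (_, counts) -> counts.sum() }

fun main() {
    val rows = listOf(5 to 12, 7 to 3, 23 to 4)
    println(sizeHistogram(rows))  // sizes 5 and 7 fall into the 0-byte bucket
}
```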

Results

The full list of results for each benchmark is available in the project repository.

Instruction count
The table below presents the five most frequently occurring instructions in each benchmark and each implementation.

Table 3: Instructions count results

Allocation count
Table 4 presents the most frequently allocated classes in each benchmark and each implementation. The list skips allocations with fewer than two occurrences.

Table 4: Allocation count results

Method sizes
The figure below presents the method size histograms for each benchmark and each implementation.

Table 5: Method size histograms

Conclusions

The above charts show that in five out of six cases, Kotlin produces more instructions and creates more allocations. The bytecode generated from Java has more instructions and allocations in only one benchmark, namely Fannkuch Redux. Based on that, we can assume that the compilers produce fewer allocations and shorter bytecode for Java code than for Kotlin.

The Kotlin allocation totals may be slightly overstated due to the additional static main method that invokes a method from inside the benchmark class. Because of that, every Kotlin-generated bytecode file has two additional allocations. This should not impact the dynamic metrics, but the effect is visible in the static analysis results.
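To make the point about the additional main method concrete, here is a hypothetical sketch of the shape such a Kotlin entry point takes (the class name and body are mine, not from the benchmark suite). The top-level main compiles into a static method in a synthetic *Kt class, and it allocates the benchmark object before calling into it, which shows up as extra allocation instructions in the static counts.

```kotlin
// Hypothetical benchmark shape. The top-level main below compiles to a
// static main in a synthetic class (e.g. BenchmarkKt); that main allocates
// a Benchmark instance, an extra `new` that a Java version built around
// static methods in a single class would not emit.
class Benchmark {
    fun run(n: Int): Int = (1..n).sum()
}

fun main() {
    println(Benchmark().run(10))  // the extra allocation happens here
}
```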

The method size histograms presented in the previous section show the count of each method size in the bytecode file. Here, Kotlin seems to produce extreme values more often. Kotlin-idiomatic code has the largest methods in three benchmarks (Fasta, Mandelbrot, Binary Trees), while Kotlin-converted code has the largest methods in the remaining three (N-Body, Fannkuch Redux, Spectral Norm).

The method size analysis also shows that Kotlin code tends to have a greater number of small methods than Java. The results show that only the Kotlin-idiomatic and Kotlin-converted implementations produce more than five methods of the same size. Chris Newland, in his article, presents the results of a JVM bytecode method size experiment that may help explain this. He concludes that most methods five bytes long are getters.
The results of this experiment therefore suggest that the increased number of five-byte methods is probably connected with Kotlin's automatic generation of getters and setters for class fields.
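For intuition on the five-byte figure: a trivial getter really does compile to about five bytes of bytecode. A Kotlin property like the hypothetical one below produces a getter whose body is just aload_0 (1 byte), getfield (3 bytes), and a return instruction (1 byte).

```kotlin
// Hypothetical class: the generated getter for `name` compiles to roughly
//   aload_0      // 1 byte: push `this`
//   getfield     // 3 bytes: read the backing field
//   areturn      // 1 byte: return it
// i.e. the 5-byte method shape that Newland identifies as a getter.
class Person(val name: String)

fun main() {
    println(Person("Ada").name)  // compiled to a getName() call
}
```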

The impact that allocation count and instruction count may have on dynamic metrics such as execution time and memory consumption is not clear or easily noticeable. We cannot draw direct conclusions from the given data. The trends visible in the bar charts created from the static bytecode analysis do not match the trends in the charts created from the execution time and memory usage medians.

It is possible that the bytecode differences between implementations that could impact performance are erased by the JIT compiler. The JIT compiler introduces a number of optimizations at runtime to handle inefficient bytecode instructions and structure. Multiple JIT processes may deal with unnecessary bytecode instructions and allocations. One of them is inlining, the process by which smaller methods are merged, or inlined, into the place of their callers.
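As a sketch of why inlining can erase the cost of the many tiny methods seen above: HotSpot typically inlines hot methods whose bytecode is below a small size threshold (the exact defaults are JVM-version dependent), so calls to five-byte getters like the ones below usually disappear at runtime. The class and function names here are mine, for illustration; on a suitable JVM build the decisions can be observed with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining.

```kotlin
// Sketch: after JIT warm-up, the p.x and p.y getter calls in this hot loop
// are typically inlined, so the extra small methods in Kotlin bytecode need
// not cost anything at runtime.
class Point(val x: Int, val y: Int)

fun sumCoords(points: List<Point>): Int {
    var total = 0
    for (p in points) total += p.x + p.y  // getter calls, inlining candidates
    return total
}

fun main() {
    val pts = List(1_000) { Point(it, it) }
    println(sumCoords(pts))
}
```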

And again… what’s the answer to the research question?

Icon made by Freepik from www.flaticon.com

Looking at the results of the static bytecode instruction analysis, we can say that Java code produces shorter bytecode overall. The Java implementation had the lowest total instruction count in five out of six benchmarks. In one case, the analysis showed that the compiler generated as much as 57.3% fewer instructions for the Java code than for the corresponding Kotlin-idiomatic code.

As with the previous metric, the Java implementations also seem to produce the fewest allocations in JVM bytecode. A higher number of allocations in Java occurs only in the Fannkuch Redux implementation.

The last examined metric was the size of bytecode methods. The experiment showed that Kotlin code tends to generate more extreme method size values. In all cases, the Kotlin bytecode files contained both the largest method and the highest number of small methods.

Based on the achieved results, we cannot draw the unambiguous conclusion that the total number of instructions, the total number of allocations, or more extreme bytecode method sizes significantly impact runtime performance. There is no clear trend in the bytecode analysis results that is unambiguously reflected in the dynamic metrics.

That’s it!

Icon made by surang from www.flaticon.com

Thanks for reading! These two articles summarize more than a year of my work on the master's thesis experiment: methodology, results, and the conclusions drawn from them. I would be really happy if you found it interesting, if the information was valuable for your work, or if it simply motivated you to do some more geekery with the JVM.

It is important to say that this analysis (of both performance and JVM bytecode) is not the complete truth about the differences between these two languages. It is just one step that might help with understanding those distinctions. It is always good to gather more data and knowledge in this area!

And again, if you find any errors or just want to share your thoughts and opinions on my work, reach out to me in the comments section, on Twitter, or on the official Kotlin Slack (Jakub Aniola).
