Image by Lucas Wendt from Pixabay

First Peek at Java Bytecode

Vinicius Dias da Silva
Just Eat Takeaway-tech

--

Understanding bytecode is valuable for developers because it helps explain why JVM languages that share the same platform can still behave differently from one another. It also deepens their understanding of Java internals, which is always beneficial.

The compiler produces bytecode, which allows the JVM to understand our instructions in the execution environment. Without bytecode, our code wouldn’t be executed. The existence of bytecode abstraction makes Java an exciting platform. Java is not just a language; it’s also a platform that enables us to rethink coding and explore innovative possibilities, inspiring us to push the boundaries of what we can achieve with Java.

Let’s go and see some bytecode, then!

First, let’s write a simple Java program.

package bytecodedetails;

public class SimpleJavaCode {
    public static void main(String[] args) {
        String greetings = "Hello " + "World" + "!";
        System.out.println(greetings);
    }
}

We’ll use javac to compile the code and javap to read the resulting bytecode. For example:

javac SimpleJavaCode.java      # compile the Java code
javap -v SimpleJavaCode.class  # verbose mode for our bytecode investigation

If you’re not familiar with javap, its documentation summarizes it as follows:

Disassembles one or more class files.

After compilation, we will gather extensive information about this simple application. The bytecode will be divided into several parts: header information, constants, and code representation, illustrating tasks to be executed. Let’s analyze each section.

Class Headers

Constants

Here, we have an interesting section to discuss, the Constant Pool, which provides valuable insights into how the execution will perform. The comments offer specific hints about the content of each line.

The first few entries reference the java.lang.Object constructor, which the compiler-generated default constructor of our class calls. Following that, we encounter a noteworthy reference at entry #7, a String constant that points to entry #8, which contains the actual text. It’s also worth noting that only one String is present. This is due to a compiler optimization: “Hello ” + “World” + “!” is a compile-time constant expression, so the three literals are folded into the single String “Hello World!”.
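The folding of String literals described above can also be observed from plain Java, without reading the constant pool. The following is a minimal sketch; the class and method names are made up for illustration. It relies on the language rule that constant expressions are interned, while concatenation with a run-time value produces a new object.

```java
final class ConstantFoldingDemo {
    // Compile-time constant expression: javac folds it into one literal.
    static String folded() {
        return "Hello " + "World" + "!";
    }

    // Not a constant expression: the parameter is only known at run time.
    static String concatenated(String name) {
        return "Hello " + name + "!";
    }

    public static void main(String[] args) {
        String literal = "Hello World!";
        String runtime = concatenated("World");

        System.out.println(folded() == literal);         // true: same interned constant
        System.out.println(runtime.equals(literal));     // true: same characters
        System.out.println(runtime == literal);          // false: built at run time
        System.out.println(runtime.intern() == literal); // true: interning finds the constant
    }
}
```

Comparing Strings with == is only done here to make object identity visible; regular code should keep using equals.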

Bytecode representation

Let’s focus on the main method representation.

public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: (0x0009) ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=2, args_size=1

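Here, we can see the maximum operand stack depth (stack=2), the number of local variable slots (locals=2: slot 0 holds args and slot 1 holds greetings), and the number of arguments (args_size=1, the String[] args parameter).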
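The descriptor and flags shown in the header can be reproduced with standard JDK APIs, which may help when decoding them by hand. This is a small sketch with a made-up class name; it uses java.lang.invoke.MethodType to build the descriptor string and reflection to read the access flags of its own main method.

```java
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

final class DescriptorDemo {
    // Descriptor of a method that takes String[] and returns void.
    static String mainDescriptor() {
        return MethodType.methodType(void.class, String[].class)
                .toMethodDescriptorString();
    }

    // Access flags of this class's own main method.
    static int mainFlags() {
        try {
            Method main = DescriptorDemo.class.getDeclaredMethod("main", String[].class);
            return main.getModifiers();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(mainDescriptor());                // ([Ljava/lang/String;)V
        System.out.printf("0x%04x%n", mainFlags());          // 0x0009
        System.out.println(Modifier.toString(mainFlags()));  // public static
    }
}
```

In the descriptor, [ marks an array, Ljava/lang/String; is the class type, and V is the void return type. The flag value 0x0009 is ACC_PUBLIC (0x0001) combined with ACC_STATIC (0x0008).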

Reading on, we reach the instructions themselves:

 0: ldc           #7                  // String Hello World!
 2: astore_1
 3: getstatic     #9                  // Field java/lang/System.out:Ljava/io/PrintStream;
 6: aload_1
 7: invokevirtual #15                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
10: return

First, the ldc instruction loads a constant onto the operand stack. The #7 is not the number 7 but an index into the constant pool: entry #7 is our single String with the content “Hello World!”

Next, the astore_1 instruction pops that reference from the stack and stores it in a local variable. The number in the instruction name is the local variable slot: slot 0 holds args, so slot 1 is our greetings variable.

After that, the getstatic instruction reads a static field, here System.out, and pushes the PrintStream reference onto the stack. The aload_1 instruction then pushes the greetings reference from slot 1. Finally, the invokevirtual instruction calls an instance method: constant pool entry #15 identifies PrintStream.println(String), which is invoked on System.out with “Hello World!” as its argument. The return instruction ends the method.
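To make the flow of values between the operand stack and the local variable slots more tangible, here is a hand-written simulation of the six instructions from the listing. It is only an illustrative sketch, not how the JVM is implemented: the class name is made up, a Deque stands in for the operand stack, an array stands in for the local variables, and a StringBuilder stands in for System.out so the result can be inspected.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class OperandStackSketch {
    static String run() {
        Deque<Object> stack = new ArrayDeque<>();  // operand stack (stack=2)
        Object[] locals = new Object[2];           // local variable slots (locals=2)
        StringBuilder out = new StringBuilder();   // stand-in for System.out

        locals[0] = new String[0];                 // slot 0: the args parameter

        stack.push("Hello World!");                //  0: ldc #7
        locals[1] = stack.pop();                   //  2: astore_1
        stack.push(out);                           //  3: getstatic #9
        stack.push(locals[1]);                     //  6: aload_1

        // 7: invokevirtual #15 pops the argument, then the receiver.
        Object argument = stack.pop();
        StringBuilder receiver = (StringBuilder) stack.pop();
        receiver.append(argument).append('\n');

        return out.toString();                     // 10: return
    }

    public static void main(String[] args) {
        System.out.print(run()); // prints "Hello World!" followed by a newline
    }
}
```

At no point does the simulated stack hold more than two values, which matches the stack=2 reported in the method header.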

This is a simple example of bytecode and Java compilation. You can try creating a Kotlin “Hello World!” and you will see much the same instructions. They may not always be precisely the same, but if you are already familiar with Java bytecode, you can easily understand them. However, if the bytecode is so similar, why can performance differ between Java and other JVM-based languages such as Kotlin? And why are these languages sometimes a little slower than Java? These questions are easier to answer if we turn a compiled class back into source code. Let’s try to decompile it!

Decompilers

For this demonstration, we’ll use CFR as the decompiler of choice. To begin, we will write some straightforward Kotlin code that declares a String variable called name and reassigns it multiple times.


import java.util.UUID

fun getNewName() = UUID.randomUUID().toString().replace(Regex("[^0-9]"), "")

fun main() {
    var name: String
    for (item in 0..10) {
        name = getNewName()
        println(name)
    }
}

Let’s compile and decompile the code to see what clues the Java version gives us.

We will use kotlinc to compile the code and CFR to decompile it.

java -jar my_cfr_lib.jar my_class.class  # from compiled code to Java code

After running these commands, we get a lot of output, so let’s focus on the most important differences.

@NotNull
public static final String getNewName() {
    String string = UUID.randomUUID().toString();
    Intrinsics.checkNotNullExpressionValue((Object)string, (String)"randomUUID().toString()");
    CharSequence charSequence = string;
    Regex regex = new Regex("[^0-9]");
    String string2 = "";
    return regex.replace(charSequence, string2);
}

The differences from our original source come from the conversion back to Java. Although the variable names and code structure may differ, reading the decompiled version can improve our understanding of what actually runs. One significant addition is the call to Intrinsics.checkNotNullExpressionValue, which is a result of Kotlin’s null safety feature: UUID.randomUUID().toString() comes from Java code, so Kotlin checks at run time that the result is not null before treating it as a non-null String. Checks like this are one reason why Kotlin code may not always perform as well as equivalent Java code. Kotlin’s emphasis on safety leads to additional validation, which in turn increases the number of instructions and the time required to execute the code.
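To see what that extra validation amounts to, here is a hand-written Java version of the same function. It is a sketch under assumptions: the helper below is a hypothetical stand-in that only borrows the name of the Kotlin call, it is not Kotlin’s actual implementation, and String.replaceAll replaces the Kotlin Regex class.

```java
import java.util.UUID;

final class NullCheckSketch {
    // Hypothetical stand-in for the check Kotlin inserts; not Kotlin's real code.
    static void checkNotNullExpressionValue(Object value, String expression) {
        if (value == null) {
            throw new NullPointerException(expression + " must not be null");
        }
    }

    static String getNewName() {
        String string = UUID.randomUUID().toString();
        // The extra work compared to plain Java: one call and one comparison.
        checkNotNullExpressionValue(string, "randomUUID().toString()");
        return string.replaceAll("[^0-9]", "");
    }

    public static void main(String[] args) {
        System.out.println(getNewName()); // prints only the digits of a random UUID
    }
}
```

A Java developer would normally skip the check and trust toString() to return a value. Kotlin adds it because the value crosses from Java code into a non-null Kotlin type.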

It’s important to note that comparing the performance of Java and other JVM languages is not straightforward. While some may observe a slight memory gain in certain cases, many developers, myself included, prefer using Kotlin for its safety features, even if it means producing additional bytecode. It’s essential to recognize that performance differences are just one of many factors to consider and may not always be significant.

Considerations

Java is a versatile platform that allows developers to write code in various languages, as long as the code can be compiled into bytecode. The JVM underneath is robust, secure, and reliable, with many optimizations. Understanding bytecode is valuable for seeing how your code is actually represented, and it provides clues about optimizations and safety. Additionally, it helps us understand why specific JVM languages were created and how these languages can benefit us.

Big thanks to my friends who helped me with this article: Leonardo Henrique and Andreas Hechenberger.

Want to come work with us at Just Eat Takeaway.com? Check out our open roles.
