Cracking JVM code, part I

Alexander Panman
Published in Wix Engineering
Nov 15, 2021 · 3 min read

Onboarding

Photo by Andrew Seaman on Unsplash

Cracking JVM code, part II (First experiment)
Cracking JVM code, part III (JVM runtime data structures)

Most developers working in Java and other JVM languages write code without considering how everything works under the hood. This is less common among C/C++ developers, because they generally need at least a basic understanding of memory models and computer architecture. So what about Java or Scala: is there any benefit in understanding the black box?

As it turns out, many interesting things happen between writing your code and running it on production servers: compilation to bytecode, loading into the JVM, JIT compilation and optimization, garbage collection at runtime, and so on. Let's dive into bytecode to see what's really going on. Bytecode is a helpful tool for understanding how your code works, how specific language features (exceptions, polymorphism, etc.) are implemented, and the concepts behind high-performance JVM code.

I'll demonstrate what a class looks like, along with how an object is allocated and initialized and how a class method is called. And most importantly, I'll provide everything you need to continue exploring and experimenting.

With that, let's take a look at the following simple static method.
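The original listing is not reproduced in this text. What follows is a minimal reconstruction, assuming a two-argument static sum method on int values; the class name and the method name come from the article, while the parameter names a and b are guesses.

```java
// Reconstruction for illustration only; the author's exact source may differ.
class ByteCodeExamples {

    // A static method taking two ints: the arguments occupy
    // local variable slots 0 and 1 (there is no "this" in slot 0).
    static int sum(int a, int b) {
        return a + b;
    }
}
```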

Compiling this Java code produces JVM bytecode:

$ javac ByteCodeExamples.java
$ ls ByteCodeExamples.*
ByteCodeExamples.class ByteCodeExamples.java

ByteCodeExamples.class contains all the information the JVM needs to load and initialize this class and run its methods.

To get a feel for what’s going on before diving any deeper, I opened ByteCodeExamples.class with a hex editor. Here’s the resulting bytecode:
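If no hex editor is at hand, a similar view can be produced with a few lines of Java. This is only a sketch: the HexDump class and its dump helper are hypothetical names introduced here, and the default file name assumes the class was compiled into the current directory.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

class HexDump {

    // Formats the bytes as rows of 16, each prefixed with its hex offset.
    static String dump(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (int offset = 0; offset < data.length; offset += 16) {
            sb.append(String.format("%04X:", offset));
            int end = Math.min(offset + 16, data.length);
            for (int i = offset; i < end; i++) {
                // Mask with 0xFF so bytes above 0x7F print as unsigned values.
                sb.append(String.format(" %02X", data[i] & 0xFF));
            }
            sb.append(System.lineSeparator());
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed default; pass another path as the first argument if needed.
        String file = args.length > 0 ? args[0] : "ByteCodeExamples.class";
        System.out.print(dump(Files.readAllBytes(Paths.get(file))));
    }
}
```

With 16 bytes per row, an offset such as 0xF4 falls in the row labelled 00F0, at the fifth byte.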

First of all, the class file is notably small. Everything you need is in the picture above! Compactness was a design goal: Java was created with networked applications in mind, so classes are sometimes sent over the network. As a result, engineers decided that each opcode (operation) must be one byte in size (hence the name bytecode). During program execution, the JVM asks a ClassLoader to load our class (ByteCodeExamples). The class file is located on the classpath, and its contents, also called the bytecode stream, are then loaded, verified, and so on.

For example, let’s take a look at what’s happening at 0xF4:

0xF4: 1A 1B 60 AC

This is a good time to turn to the documentation, which is always your best friend.
In the Java Virtual Machine Specification, let's open Chapter 7, Opcode Mnemonics by Opcode, and look up our bytes:
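For the four bytes in question, the lookup can be sketched as a tiny table-driven decoder. The TinyDisassembler class is a hypothetical name used for illustration, and its table holds only the four opcodes from this example; the full list is in Chapter 7 of the specification.

```java
import java.util.Map;

class TinyDisassembler {

    // Opcode values as listed in the JVM specification; only the four we need.
    private static final Map<Integer, String> MNEMONICS = Map.of(
            0x1A, "iload_0",
            0x1B, "iload_1",
            0x60, "iadd",
            0xAC, "ireturn");

    // Translates each byte into its mnemonic, one per line.
    // This works byte-by-byte only because none of these opcodes takes operands.
    static String decode(byte[] code) {
        StringBuilder sb = new StringBuilder();
        for (byte b : code) {
            int opcode = b & 0xFF; // treat the byte as unsigned
            String name = MNEMONICS.getOrDefault(opcode, "<not in this sketch>");
            sb.append(String.format("%02X %s%n", opcode, name));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] code = {0x1A, 0x1B, 0x60, (byte) 0xAC};
        System.out.print(decode(code));
    }
}
```

A real disassembler also has to handle opcodes that are followed by operand bytes, which this sketch deliberately ignores.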

Now, with a little intuition and the help of the specification, these numbers are no longer a mystery. Even without in-depth knowledge of bytecode, we can see that two integers are loaded first (iload_0 and iload_1, i for integer), then summed (iadd), and the result is returned (ireturn). This describes our sum() function from the example above. This is also how the JVM sees and executes this bytecode stream. Nothing complicated here.
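To make "loaded, summed, returned" concrete, here is a toy model of how these four instructions operate on an operand stack. It is a sketch for intuition only, with a hypothetical TinyInterpreter class; a real JVM does far more (verification, frames, JIT compilation) and does not work like this internally.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TinyInterpreter {

    // Executes the four opcodes from the example against an operand stack.
    // "locals" plays the role of the method's local variable array.
    static int run(byte[] code, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (byte b : code) {
            switch (b & 0xFF) {
                case 0x1A: // iload_0: push local variable 0
                    stack.push(locals[0]);
                    break;
                case 0x1B: // iload_1: push local variable 1
                    stack.push(locals[1]);
                    break;
                case 0x60: { // iadd: pop two ints, push their sum
                    int value2 = stack.pop();
                    int value1 = stack.pop();
                    stack.push(value1 + value2);
                    break;
                }
                case 0xAC: // ireturn: pop the result and return it
                    return stack.pop();
                default:
                    throw new IllegalArgumentException(
                            String.format("opcode %02X is not part of this sketch", b & 0xFF));
            }
        }
        throw new IllegalStateException("code ended without ireturn");
    }

    public static void main(String[] args) {
        byte[] code = {0x1A, 0x1B, 0x60, (byte) 0xAC};
        System.out.println(run(code, new int[] {2, 3}));
    }
}
```

For a static method the two arguments sit in local variables 0 and 1, which is why the sketch passes them in as the locals array.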

Now, let’s go further…

Cracking JVM code, part II (First experiment)

links:
The Java® Virtual Machine Specification
Chapter 7. Opcode Mnemonics by Opcode
