JVM internals for the Java job interview

Dalibor Plavcic
6 min read · Oct 22, 2023
High-level JVM internals overview

This post gives you just enough detail about Java Virtual Machine (JVM) internals to prepare you for a Java job interview. Understanding JVM internals also helps you become a better engineer.

### Table of contents

1. The Java Virtual Machine (JVM)
2. High-level overview
   - compiler
   - class loader
   - run-time data area
   - execution engine
3. Medium-level overview
   - class loader
     - load
     - link
     - initialise
   - run-time data area
     - method area
       - constant pool
     - program counter register
     - stack
       - frame
     - heap
     - native method stack
   - garbage collector
   - execution engine
     - interpreter
     - JIT
     - HotSpot
4. Low-level overview
5. Bonus
   1. Disassemble Java code and see how data is stored in the run-time data area
   2. Multiple JVM specification implementations
6. Resources

Four steps to increase your learning efficiency and retention

1. 🎯 Be sure your learning is goal/target oriented!

2. ❓ You can set a goal by asking a simple question that ignites a spark of curiosity in your brain. The question can be about any JVM topic you are interested in:
- e.g. “how can I replace an existing class inside a running JVM?”

3. 🤔 Answer the question. Use whatever resource is needed to find the answer.

4. 🔁 Repeat

The Java Virtual Machine

The Java Virtual Machine (JVM) is a virtual machine that provides an environment in which your Java programs run.

As per The Java® Virtual Machine Specification Java SE 21 Edition:

The Java Virtual Machine is an abstract computing machine. Like a real computing machine, it has an instruction set and manipulates various memory areas at run time.

The Java Virtual Machine knows nothing of the Java programming language, only of a particular binary format, the class file format. A class file contains Java Virtual Machine instructions (or bytecodes) and a symbol table, as well as other ancillary information.
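To make the class file format a bit more tangible, here is a minimal sketch that reads the first eight bytes of a class file: the magic number, followed by the minor and major version. The class name ClassFileHeaderSketch and the choice of java/lang/Object.class as input are assumptions made for this illustration, not something the specification prescribes.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ClassFileHeaderSketch {

    /** Reads magic, minor and major version of a class file found via the system class loader. */
    static int[] readHeader(String resourcePath) {
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new IllegalArgumentException("class file not found: " + resourcePath);
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();            // u4 magic, 0xCAFEBABE for every class file
            int minor = data.readUnsignedShort();  // u2 minor_version
            int major = data.readUnsignedShort();  // u2 major_version
            return new int[] {magic, minor, major};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader("java/lang/Object.class");
        System.out.printf("magic=0x%08X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```

The major version depends on the JDK you run this on. The javap output in the bonus section reports major version 64, which corresponds to Java 20.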

Let’s see how Java programs are represented and executed within the JVM.

High-level overview

From a high-level perspective, you need to understand the following:

  • compiler
    1. reads the source code
    2. and produces byte-code
  • class loader
    3. reads the byte-code and
    4. transforms the byte-code into the JVM's internal representation and loads it into the run-time data area
  • run-time data area
    contains data used during the program execution
  • execution engine
    5. reads data from the run-time data area and executes byte-code
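The five steps above can be sketched in a single program. This is only an illustration under a few assumptions: it needs a JDK (not just a JRE) because it uses the javax.tools compiler API, and the Greeter class, its greet method and the PipelineSketch wrapper are hypothetical names invented for the example.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

public class PipelineSketch {

    // Hypothetical demo source; the class and method names are made up for this sketch.
    static final String SOURCE =
            "public class Greeter { public static String greet() { return \"hello from bytecode\"; } }";

    static String compileLoadAndRun() {
        try {
            Path dir = Files.createTempDirectory("jvm-pipeline");
            Path sourceFile = dir.resolve("Greeter.java");
            Files.writeString(sourceFile, SOURCE);

            // Steps 1-2: the compiler reads the source and writes Greeter.class next to it.
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            if (compiler == null) {
                throw new IllegalStateException("no system compiler available, run this on a JDK");
            }
            if (compiler.run(null, null, null, sourceFile.toString()) != 0) {
                throw new IllegalStateException("compilation failed");
            }

            // Steps 3-4: a class loader reads the byte-code and defines the class inside the JVM.
            try (URLClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()})) {
                Class<?> greeter = loader.loadClass("Greeter");

                // Step 5: the execution engine runs the method's byte-code.
                Method greet = greeter.getMethod("greet");
                return (String) greet.invoke(null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(compileLoadAndRun());
    }
}
```

In everyday work javac and the application class loader do this for you. The sketch only makes the hand-over between the components visible.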

🤔 Please answer the following questions to confirm your understanding of the basic JVM concepts:

  1. How is the byte-code related to the source code?
  2. Why doesn't the JVM read and execute source code directly? What are the benefits and drawbacks of the byte-code representation of the source code?
  3. How is the byte-code represented inside the JVM run-time data area?

[WIP] Medium-level overview

Run-time data area

  • Method area
  • Constant pool
  • Program counter
  • Stack
  • Frame
  • Native stack
  • Heap
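A small sketch can make two of these areas easier to picture: every method call adds a frame to the calling thread's own stack, while objects live on the heap and can be reached from several threads. The class and method names below are assumptions chosen for this illustration.

```java
public class StackAndHeapSketch {

    // Each recursive call pushes one more frame onto the current thread's stack.
    static int stackDepthAt(int extraCalls) {
        if (extraCalls == 0) {
            return new Throwable().getStackTrace().length;
        }
        return stackDepthAt(extraCalls - 1);
    }

    // The array is a heap object, so the worker thread and the caller see the same instance.
    static int incrementOnOtherThread() {
        int[] counter = {0};
        Thread worker = new Thread(() -> counter[0]++);
        worker.start();
        try {
            worker.join(); // join() makes the worker's write visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }

    public static void main(String[] args) {
        // Five extra calls mean five extra frames on this thread's stack.
        System.out.println(stackDepthAt(5) - stackDepthAt(0));
        // The worker ran on its own stack but modified the shared heap object.
        System.out.println(incrementOnOtherThread());
    }
}
```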

The class loader performs:

  • loading
    - find the binary representation of a class (the class file)
    - check that the binary representation is well-formed
    - convert the binary representation into the JVM's internal representation of the class
  • linking
    - takes the previously created class
    - verifies its representation
    - prepares static fields by setting them to default values (not the user-set values)
    - resolves symbolic references
    - makes the class available in the run-time data area (inside the method area)
  • initialisation
    - sets static fields to their user-set values and runs static initialiser blocks
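The difference between loading a class and initialising it can be observed from plain Java code. The following is a minimal sketch, with a hypothetical nested Config class standing in for a class with static fields: referring to the class literal does not run the static initialiser, while the first read of a non-constant static field does.

```java
import java.util.ArrayList;
import java.util.List;

public class InitOrderSketch {

    // Records what happened, in order.
    static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical class used only for this illustration.
    static class Config {
        static int maxUsers = 10; // the user-set value is assigned during initialisation

        static {
            EVENTS.add("Config initialised");
        }
    }

    static void demo() {
        // A class literal makes the JVM load Config, but does not initialise it.
        Class<?> loaded = Config.class;
        EVENTS.add("loaded " + loaded.getSimpleName());

        // Reading a non-constant static field is a first active use and triggers initialisation.
        int value = Config.maxUsers;
        EVENTS.add("maxUsers=" + value);
    }

    public static void main(String[] args) {
        demo();
        System.out.println(EVENTS); // prints [loaded Config, Config initialised, maxUsers=10]
    }
}
```

Note that maxUsers is deliberately not final. A static final int with a constant value would be inlined by the compiler, and reading it would not trigger initialisation.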

The execution engine reads the byte-code and creates or reads the data it needs in the run-time data area.

  • interpreter
  • JIT
  • HotSpot
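To get a feeling for the interplay between the interpreter and the JIT compiler, you can call a small method often enough that the JVM considers it hot. The sketch below assumes a HotSpot-based JVM. The class and method names are made up for the example, and whether and when a method gets compiled is up to the JVM, so the program only reports what the standard CompilationMXBean exposes.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class JitSketch {

    // A small, frequently called method is a typical candidate for JIT compilation.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    // Calls the method many times so that the JVM may decide it is worth compiling.
    static long warmUp(int calls) {
        long last = 0;
        for (int c = 0; c < calls; c++) {
            last = sumOfSquares(1_000);
        }
        return last;
    }

    public static void main(String[] args) {
        System.out.println("result: " + warmUp(20_000));

        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            System.out.println("this JVM has no JIT compiler (interpreter only)");
        } else {
            System.out.println("JIT compiler: " + jit.getName());
            if (jit.isCompilationTimeMonitoringSupported()) {
                System.out.println("time spent compiling so far (ms): " + jit.getTotalCompilationTime());
            }
        }
    }
}
```

On HotSpot you can also run the program with -XX:+PrintCompilation to see which methods are compiled, or with -Xint to force interpreter-only execution and compare the run time.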

The garbage collector keeps the heap from exploding by reclaiming objects that are no longer reachable.
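Reachability is the key idea behind garbage collection, and a weak reference is a simple way to watch it. The following sketch uses hypothetical class and method names. Keep in mind that System.gc() is only a hint, so the second result is typical rather than guaranteed.

```java
import java.lang.ref.Reference;
import java.lang.ref.WeakReference;

public class ReachabilitySketch {

    // While a strong reference exists, the object must survive a collection.
    static boolean survivesWhileReachable() {
        byte[] payload = new byte[1024];
        WeakReference<byte[]> weak = new WeakReference<>(payload);
        System.gc();
        boolean alive = weak.get() != null;
        Reference.reachabilityFence(payload); // keeps payload strongly reachable up to this point
        return alive;
    }

    // Without a strong reference the object becomes eligible for collection.
    static boolean collectedWhenUnreachable() {
        WeakReference<byte[]> weak = new WeakReference<>(new byte[1024]);
        System.gc(); // a request only, the JVM may ignore it
        return weak.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("reachable object survived: " + survivesWhileReachable());
        System.out.println("unreachable object collected: " + collectedWhenUnreachable());
    }
}
```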

The native method interface allows us to interact with code written in other languages (like C or Rust).

<IMAGE_PLACEHOLDER>

🤔 Please answer the following questions to confirm your understanding of the basic JVM concepts:

  1. Why are some run-time data areas shared across all threads, while others are exclusive to a single thread?
  2. What causes memory leaks?

Low-level overview

A low-level overview is out of scope for this post. To learn more, please refer to the following content:

Bonus

Disassemble Java byte-code

Imagine we have the following Java source code and compile it with the javac MyProgram.java command.

public class MyProgram {

    public static int MAX_ALLOWED_USERS = 10;
    public static String APP_NAME = "CIRCUIT_BREAKER";

    public static void main(String[] args) {
        new MyProgram().customCheck();
    }

    void customCheck() {
        System.out.printf("max allowed users %d", MAX_ALLOWED_USERS);
    }
}

🕵️‍♂️ To check what the byte-code looks like, we can use javap -v -l -p -c -s MyProgram.class, which shows different areas such as the constant pool and the methods.

🔎 Please search for the usages of the customCheck method to see how it resides in the constant pool and how it is invoked within the main(String[] args) method.

Classfile /Users/user/Documents/projects/blockpit/ct-tax-app/backend/uk_tax_logic/MyProgram.class
  Last modified Oct 22, 2023; size 811 bytes
  SHA-256 checksum daa054dafc5c6bcd8f0800117b93dfddd5488c5c9caf57a525b842a70c497ee6
  Compiled from "MyProgram.java"
public class MyProgram
  minor version: 0
  major version: 64
  flags: (0x0021) ACC_PUBLIC, ACC_SUPER
  this_class: #7                          // MyProgram
  super_class: #2                         // java/lang/Object
  interfaces: 0, fields: 2, methods: 4, attributes: 1
Constant pool:
   #1 = Methodref          #2.#3          // java/lang/Object."<init>":()V
   #2 = Class              #4             // java/lang/Object
   #3 = NameAndType        #5:#6          // "<init>":()V
   #4 = Utf8               java/lang/Object
   #5 = Utf8               <init>
   #6 = Utf8               ()V
   #7 = Class              #8             // MyProgram
   #8 = Utf8               MyProgram
   #9 = Methodref          #7.#3          // MyProgram."<init>":()V
  #10 = Methodref          #7.#11         // MyProgram.customCheck:()V
  #11 = NameAndType        #12:#6         // customCheck:()V
  #12 = Utf8               customCheck
  #13 = Fieldref           #14.#15        // java/lang/System.out:Ljava/io/PrintStream;
  #14 = Class              #16            // java/lang/System
  #15 = NameAndType        #17:#18        // out:Ljava/io/PrintStream;
  #16 = Utf8               java/lang/System
  #17 = Utf8               out
  #18 = Utf8               Ljava/io/PrintStream;
  #19 = String             #20            // max allowed users %d
  #20 = Utf8               max allowed users %d
  #21 = Fieldref           #7.#22         // MyProgram.MAX_ALLOWED_USERS:I
  #22 = NameAndType        #23:#24        // MAX_ALLOWED_USERS:I
  #23 = Utf8               MAX_ALLOWED_USERS
  #24 = Utf8               I
  #25 = Methodref          #26.#27        // java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
  #26 = Class              #28            // java/lang/Integer
  #27 = NameAndType        #29:#30        // valueOf:(I)Ljava/lang/Integer;
  #28 = Utf8               java/lang/Integer
  #29 = Utf8               valueOf
  #30 = Utf8               (I)Ljava/lang/Integer;
  #31 = Methodref          #32.#33        // java/io/PrintStream.printf:(Ljava/lang/String;[Ljava/lang/Object;)Ljava/io/PrintStream;
  #32 = Class              #34            // java/io/PrintStream
  #33 = NameAndType        #35:#36        // printf:(Ljava/lang/String;[Ljava/lang/Object;)Ljava/io/PrintStream;
  #34 = Utf8               java/io/PrintStream
  #35 = Utf8               printf
  #36 = Utf8               (Ljava/lang/String;[Ljava/lang/Object;)Ljava/io/PrintStream;
  #37 = String             #38            // CIRCUIT_BREAKER
  #38 = Utf8               CIRCUIT_BREAKER
  #39 = Fieldref           #7.#40         // MyProgram.APP_NAME:Ljava/lang/String;
  #40 = NameAndType        #41:#42        // APP_NAME:Ljava/lang/String;
  #41 = Utf8               APP_NAME
  #42 = Utf8               Ljava/lang/String;
  #43 = Utf8               Code
  #44 = Utf8               LineNumberTable
  #45 = Utf8               main
  #46 = Utf8               ([Ljava/lang/String;)V
  #47 = Utf8               <clinit>
  #48 = Utf8               SourceFile
  #49 = Utf8               MyProgram.java
{
  public static int MAX_ALLOWED_USERS;
    descriptor: I
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC

  public static java.lang.String APP_NAME;
    descriptor: Ljava/lang/String;
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC

  public MyProgram();
    descriptor: ()V
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=1, locals=1, args_size=1
         0: aload_0
         1: invokespecial #1                  // Method java/lang/Object."<init>":()V
         4: return
      LineNumberTable:
        line 1: 0

  public static void main(java.lang.String[]);
    descriptor: ([Ljava/lang/String;)V
    flags: (0x0009) ACC_PUBLIC, ACC_STATIC
    Code:
      stack=2, locals=1, args_size=1
         0: new           #7                  // class MyProgram
         3: dup
         4: invokespecial #9                  // Method "<init>":()V
         7: invokevirtual #10                 // Method customCheck:()V
        10: return
      LineNumberTable:
        line 7: 0
        line 8: 10

  void customCheck();
    descriptor: ()V
    flags: (0x0000)
    Code:
      stack=6, locals=1, args_size=1
         0: getstatic     #13                 // Field java/lang/System.out:Ljava/io/PrintStream;
         3: ldc           #19                 // String max allowed users %d
         5: iconst_1
         6: anewarray     #2                  // class java/lang/Object
         9: dup
        10: iconst_0
        11: getstatic     #21                 // Field MAX_ALLOWED_USERS:I
        14: invokestatic  #25                 // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
        17: aastore
        18: invokevirtual #31                 // Method java/io/PrintStream.printf:(Ljava/lang/String;[Ljava/lang/Object;)Ljava/io/PrintStream;
        21: pop
        22: return
      LineNumberTable:
        line 11: 0
        line 12: 22

  static {};
    descriptor: ()V
    flags: (0x0008) ACC_STATIC
    Code:
      stack=1, locals=0, args_size=0
         0: bipush        10
         2: putstatic     #21                 // Field MAX_ALLOWED_USERS:I
         5: ldc           #37                 // String CIRCUIT_BREAKER
         7: putstatic     #39                 // Field APP_NAME:Ljava/lang/String;
        10: return
      LineNumberTable:
        line 3: 0
        line 4: 5
}
SourceFile: "MyProgram.java"
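One detail worth noticing in the customCheck byte-code is what the compiler did with the printf call: it creates an Object[] of length one (anewarray), boxes the int with Integer.valueOf, and stores it in the array (aastore) before invoking printf. The sketch below writes that out by hand. It uses String.format instead of printf so the result can be compared, and the class name VarargsDesugarSketch is an assumption made for this example.

```java
import java.util.Locale;

public class VarargsDesugarSketch {

    static int MAX_ALLOWED_USERS = 10;

    // What we write in the source: an int passed to a varargs parameter.
    static String sugared() {
        return String.format(Locale.ROOT, "max allowed users %d", MAX_ALLOWED_USERS);
    }

    // What the byte-code actually does: allocate Object[1], box the int, store it, then call.
    static String desugared() {
        Object[] args = new Object[1];                 // iconst_1, anewarray
        args[0] = Integer.valueOf(MAX_ALLOWED_USERS);  // getstatic, invokestatic, aastore
        return String.format(Locale.ROOT, "max allowed users %d", args);
    }

    public static void main(String[] args) {
        System.out.println(sugared());
        System.out.println(desugared());
    }
}
```

Both methods produce the same text, which shows that varargs and autoboxing are compiler conveniences rather than JVM features. In the same listing, the static {} block (named <clinit> in the constant pool) is where the user-set values of the static fields are assigned during class initialisation.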

Multiple JVM specification implementations

It is important to be aware that multiple implementations of the JVM specification exist, each with its own pros and cons.

Some of the most popular JVMs are:

Resources


Dalibor Plavcic

Delivering custom-built software solutions | Contractor | Senior Java Software Engineer (Java/Spring/AWS) | Malmö, Sweden