JVM internals for the Java job interview
This post gives you just enough detail about Java Virtual Machine (JVM) internals to prepare you for a Java job interview*. Understanding JVM internals also helps you become a better engineer.
### Table of contents
1. The Java Virtual Machine (JVM)
2. High-level overview
    - compiler
    - class loader
    - run-time data area
    - execution engine
3. Medium-level overview
    - class loader
        - load
        - link
        - initialise
    - run-time data area
        - method area
            - constant pool
        - program counter register
        - stack
            - frame
        - heap
        - native method stack
    - garbage collector
    - execution engine
        - interpreter
        - JIT
        - HotSpot
4. Low-level overview
5. Bonus
    1. Disassemble Java code and see how data is stored in the run-time data area
    2. Multiple JVM specification implementations
6. Resources
Four steps to increase your learning efficiency and retention
1. 🎯 Make sure your learning is goal-oriented.
2. ❓ Set a goal by asking a simple question that sparks your curiosity. The question can be about any JVM topic you are interested in:
    - e.g. “How can I replace an existing class inside a running JVM?”
3. 🤔 Answer the question. Use whatever resources you need to find the answer.
4. 🔁 Repeat.
The Java Virtual Machine
The Java Virtual Machine (JVM) is a virtual machine that provides an environment to run your Java programs.
As per The Java® Virtual Machine Specification Java SE 21 Edition:
The Java Virtual Machine is an abstract computing machine. Like a real computing machine, it has an instruction set and manipulates various memory areas at run time.
The Java Virtual Machine knows nothing of the Java programming language, only of a particular binary format, the class file format. A class file contains Java Virtual Machine instructions (or bytecodes) and a symbol table, as well as other ancillary information.
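To make the "particular binary format" a bit more tangible, here is a minimal sketch that reads the first eight bytes of a class file: the magic number followed by the minor and major version. The class name `ClassFileHeader` and the helper `readHeader` are made up for this illustration, and the sketch assumes the class file of `java.lang.Object` can be read as a resource, which is the case on the common OpenJDK-based runtimes.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassFileHeader {

    /** Reads the magic number, minor version and major version of a class file. */
    static int[] readHeader(Class<?> type) throws IOException {
        // Resource path of the class file, e.g. /java/lang/Object.class
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("class file not found: " + resource);
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();            // u4 magic, always 0xCAFEBABE
            int minor = data.readUnsignedShort();  // u2 minor_version
            int major = data.readUnsignedShort();  // u2 major_version
            return new int[] {magic, minor, major};
        }
    }

    public static void main(String[] args) throws IOException {
        int[] header = readHeader(Object.class);
        System.out.printf("magic=0x%08X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```

The major version printed here depends on the JDK that produced the class file, so the exact number will differ between installations.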
Let’s see how Java programs are represented and executed within the JVM.
High-level overview
From the high-level perspective, you need to understand that:
- compiler
    1. reads the source code
    2. and produces byte-code
- class loader
    3. reads the byte-code and
    4. transforms byte-code into JVM internal representation and loads it into the run-time data area
- run-time data area
    - contains data used during the program execution
- execution engine
    5. reads data from the run-time data area and executes byte-code
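The steps above can be walked through from inside a Java program. The following is only a sketch under a few assumptions: it needs a full JDK (not just a runtime) so that `ToolProvider.getSystemJavaCompiler()` returns a compiler, and the names `PipelineDemo`, `Greeter` and `greet` are invented for the example. It compiles a tiny source file to byte-code, asks a class loader to load the result, and lets the execution engine run it.

```java
import java.lang.reflect.Method;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class PipelineDemo {

    static Object compileLoadAndRun() throws Exception {
        // Steps 1 and 2: the compiler reads source code and produces byte-code (Greeter.class).
        Path dir = Files.createTempDirectory("jvm-pipeline");
        Path source = dir.resolve("Greeter.java");
        Files.writeString(source,
                "public class Greeter { public static String greet() { return \"hello from byte-code\"; } }");
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system compiler available, a JDK is required");
        }
        if (compiler.run(null, null, null, source.toString()) != 0) {
            throw new IllegalStateException("compilation failed");
        }

        // Steps 3 and 4: a class loader reads the byte-code and creates the class inside the JVM.
        try (URLClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()})) {
            Class<?> greeter = loader.loadClass("Greeter");
            // Step 5: the execution engine runs the method's byte-code.
            Method greet = greeter.getMethod("greet");
            return greet.invoke(null);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(compileLoadAndRun()); // prints "hello from byte-code"
    }
}
```

In everyday work the `javac` and `java` commands do the same thing in two separate processes. Doing it in one program just makes the hand-over between the components visible.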
🤔 Please answer the following questions to confirm your understanding of the basic JVM concepts:
- How is the byte-code related to the source code?
- Why doesn't the JVM read and execute source code directly? What are the benefits/drawbacks of the byte-code representation of the source code?
- How is the byte-code represented inside the JVM run-time data area?
[WIP] Medium-level overview
Run-time data area
- Method area
    - Constant pool
- Program counter
- Stack
    - Frame
- Native stack
- Heap
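A small, hedged sketch can help connect these areas to ordinary code. In the example below (the class `DataAreasDemo` and its methods are invented for illustration), every call to `sumTo` gets its own frame on the calling thread's stack, while the `AtomicInteger` is an object on the heap that both threads can reach.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class DataAreasDemo {

    // Every call to sumTo gets its own frame on the calling thread's stack;
    // the parameter n lives in that frame's local variables.
    static int sumTo(int n) {
        return n == 0 ? 0 : n + sumTo(n - 1);
    }

    static int runTwoThreads() throws InterruptedException {
        // The AtomicInteger object lives on the heap, which all threads share.
        AtomicInteger shared = new AtomicInteger();
        Runnable work = () -> {
            int local = sumTo(100);   // 5050, computed on this thread's private stack
            shared.addAndGet(local);  // published through the shared heap object
        };
        Thread first = new Thread(work);
        Thread second = new Thread(work);
        first.start();
        second.start();
        first.join();
        second.join();
        return shared.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runTwoThreads()); // prints 10100
    }
}
```

The local variable needs no synchronisation because no other thread can see it. The shared counter does, which is why the sketch uses an `AtomicInteger` rather than a plain `int` field.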
The class loader performs:
- loading
    - find a class binary representation
    - validate the binary representation
    - convert the binary representation to the JVM internal representation (a Class)
- linking
    - take the previously created Class
    - validate its representation
    - initialise static fields to their default values (not the user-set values)
    - resolve references
    - load it into the run-time data area (inside the method area)
- initialisation
    - initialise static fields to their user-set values
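The difference between loading and initialisation can be observed from plain Java. The sketch below uses invented names (`InitOrderDemo`, `Config`, `maxUsers`) and relies on `Class.forName(name, false, loader)`, whose second argument asks the JVM not to initialise the class. The static initialiser only runs later, at the first real use of the class.

```java
import java.util.ArrayList;
import java.util.List;

public class InitOrderDemo {

    static final List<String> events = new ArrayList<>();

    static class Config {
        static int maxUsers = 10; // not final, so reading it is a real field access

        static {
            events.add("Config initialised");
        }
    }

    static List<String> run() throws ClassNotFoundException {
        String name = InitOrderDemo.class.getName() + "$Config";
        // initialize = false: the class is loaded, but its static initialiser is not run yet.
        Class<?> config = Class.forName(name, false, InitOrderDemo.class.getClassLoader());
        events.add("loaded " + config.getSimpleName());

        int value = Config.maxUsers; // first active use triggers initialisation
        events.add("read maxUsers=" + value);
        return new ArrayList<>(events);
    }

    public static void main(String[] args) throws ClassNotFoundException {
        System.out.println(run()); // prints [loaded Config, Config initialised, read maxUsers=10]
    }
}
```

Note that the event "Config initialised" appears after "loaded Config": the user-set value `10` is only assigned during initialisation, not during loading or linking.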
The execution engine reads bytecodes and creates/reads the necessary data in the run-time data area.
- interpreter
- JIT
- HotSpot
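If you are curious which JIT compiler your own JVM reports, the standard `java.lang.management` API exposes it. This is only a small sketch (the class name `JitInfo` is made up), and the text it prints depends entirely on the JVM you run it on. A JVM running in interpreter-only mode may report no compiler at all.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class JitInfo {

    static String describe() {
        // Returns null when the JVM has no compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            return "no JIT compiler reported (interpreter only)";
        }
        String info = jit.getName();
        if (jit.isCompilationTimeMonitoringSupported()) {
            info += ", total compilation time " + jit.getTotalCompilationTime() + " ms";
        }
        return info;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```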
The garbage collector keeps the heap from filling up by reclaiming objects that are no longer reachable.
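To see the garbage collector at work, you can ask the JVM which collectors it runs and how often they have run so far. The sketch below uses the standard `GarbageCollectorMXBean` API. The class name `GcInfo` is invented, and both the collector names and the counts vary by JVM and by the garbage collector selected at start-up.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class GcInfo {

    static List<String> collectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
        return lines;
    }

    public static void main(String[] args) {
        // Create a lot of short-lived objects so the collector has something to reclaim.
        long total = 0;
        for (int i = 0; i < 1_000_000; i++) {
            byte[] garbage = new byte[1024];
            total += garbage.length;
        }
        System.out.println("allocated " + total + " bytes");
        collectors().forEach(System.out::println);
    }
}
```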
The native method interface allows us to interact with code written in other languages (like C or Rust).
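Writing your own native method needs a compiled library, which is beyond a short snippet, but you can at least spot native methods in the JDK itself through reflection. The sketch below (class name `NativeCheck` is made up) assumes an OpenJDK-based runtime, where `System.currentTimeMillis` is declared `native`. Other JVM implementations may declare it differently.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class NativeCheck {

    /** Tells whether a public method is declared with the native modifier. */
    static boolean isNative(Class<?> owner, String methodName, Class<?>... params)
            throws NoSuchMethodException {
        Method method = owner.getMethod(methodName, params);
        return Modifier.isNative(method.getModifiers());
    }

    public static void main(String[] args) throws NoSuchMethodException {
        // Assumption: on OpenJDK-based runtimes this method is implemented in native code.
        System.out.println("System.currentTimeMillis native? "
                + isNative(System.class, "currentTimeMillis"));
        // String.length is ordinary Java code.
        System.out.println("String.length native? " + isNative(String.class, "length"));
    }
}
```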
<IMAGE_PLACEHOLDER>
🤔 Please answer the following questions to confirm your understanding of the basic JVM concepts:
- Why are some run-time data areas shared across all threads, while others are “per-thread” specific/exclusive?
- What causes memory leaks?
Low-level overview
A low-level overview is out of scope for this post. To learn more, please refer to the Resources section at the end.
Bonus
Disassemble Java byte-code
Imagine we have the following Java source code, and we compile it with the `javac MyProgram.java` command.

    public class MyProgram {

        public static int MAX_ALLOWED_USERS = 10;
        public static String APP_NAME = "CIRCUIT_BREAKER";

        public static void main(String[] args) {
            new MyProgram().customCheck();
        }

        void customCheck() {
            System.out.printf("max allowed users %d", MAX_ALLOWED_USERS);
        }
    }

🕵️ To check what the byte-code looks like, we can use `javap -v -l -p -c -s MyProgram.class`, which shows different areas such as the constant pool and the methods.
🔎 Please search for usages of the `customCheck` method to see how it resides in the constant pool, and how it is invoked within the `main(String[] args)` method.
Classfile /Users/user/Documents/projects/blockpit/ct-tax-app/backend/uk_tax_logic/MyProgram.class
Last modified Oct 22, 2023; size 811 bytes
SHA-256 checksum daa054dafc5c6bcd8f0800117b93dfddd5488c5c9caf57a525b842a70c497ee6
Compiled from "MyProgram.java"
public class MyProgram
minor version: 0
major version: 64
flags: (0x0021) ACC_PUBLIC, ACC_SUPER
this_class: #7 // MyProgram
super_class: #2 // java/lang/Object
interfaces: 0, fields: 2, methods: 4, attributes: 1
Constant pool:
#1 = Methodref #2.#3 // java/lang/Object."<init>":()V
#2 = Class #4 // java/lang/Object
#3 = NameAndType #5:#6 // "<init>":()V
#4 = Utf8 java/lang/Object
#5 = Utf8 <init>
#6 = Utf8 ()V
#7 = Class #8 // MyProgram
#8 = Utf8 MyProgram
#9 = Methodref #7.#3 // MyProgram."<init>":()V
#10 = Methodref #7.#11 // MyProgram.customCheck:()V
#11 = NameAndType #12:#6 // customCheck:()V
#12 = Utf8 customCheck
#13 = Fieldref #14.#15 // java/lang/System.out:Ljava/io/PrintStream;
#14 = Class #16 // java/lang/System
#15 = NameAndType #17:#18 // out:Ljava/io/PrintStream;
#16 = Utf8 java/lang/System
#17 = Utf8 out
#18 = Utf8 Ljava/io/PrintStream;
#19 = String #20 // max allowed users %d
#20 = Utf8 max allowed users %d
#21 = Fieldref #7.#22 // MyProgram.MAX_ALLOWED_USERS:I
#22 = NameAndType #23:#24 // MAX_ALLOWED_USERS:I
#23 = Utf8 MAX_ALLOWED_USERS
#24 = Utf8 I
#25 = Methodref #26.#27 // java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
#26 = Class #28 // java/lang/Integer
#27 = NameAndType #29:#30 // valueOf:(I)Ljava/lang/Integer;
#28 = Utf8 java/lang/Integer
#29 = Utf8 valueOf
#30 = Utf8 (I)Ljava/lang/Integer;
#31 = Methodref #32.#33 // java/io/PrintStream.printf:(Ljava/lang/String;[Ljava/lang/Object;)Ljava/io/PrintStream;
#32 = Class #34 // java/io/PrintStream
#33 = NameAndType #35:#36 // printf:(Ljava/lang/String;[Ljava/lang/Object;)Ljava/io/PrintStream;
#34 = Utf8 java/io/PrintStream
#35 = Utf8 printf
#36 = Utf8 (Ljava/lang/String;[Ljava/lang/Object;)Ljava/io/PrintStream;
#37 = String #38 // CIRCUIT_BREAKER
#38 = Utf8 CIRCUIT_BREAKER
#39 = Fieldref #7.#40 // MyProgram.APP_NAME:Ljava/lang/String;
#40 = NameAndType #41:#42 // APP_NAME:Ljava/lang/String;
#41 = Utf8 APP_NAME
#42 = Utf8 Ljava/lang/String;
#43 = Utf8 Code
#44 = Utf8 LineNumberTable
#45 = Utf8 main
#46 = Utf8 ([Ljava/lang/String;)V
#47 = Utf8 <clinit>
#48 = Utf8 SourceFile
#49 = Utf8 MyProgram.java
{
public static int MAX_ALLOWED_USERS;
descriptor: I
flags: (0x0009) ACC_PUBLIC, ACC_STATIC
public static java.lang.String APP_NAME;
descriptor: Ljava/lang/String;
flags: (0x0009) ACC_PUBLIC, ACC_STATIC
public MyProgram();
descriptor: ()V
flags: (0x0001) ACC_PUBLIC
Code:
stack=1, locals=1, args_size=1
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
LineNumberTable:
line 1: 0
public static void main(java.lang.String[]);
descriptor: ([Ljava/lang/String;)V
flags: (0x0009) ACC_PUBLIC, ACC_STATIC
Code:
stack=2, locals=1, args_size=1
0: new #7 // class MyProgram
3: dup
4: invokespecial #9 // Method "<init>":()V
7: invokevirtual #10 // Method customCheck:()V
10: return
LineNumberTable:
line 7: 0
line 8: 10
void customCheck();
descriptor: ()V
flags: (0x0000)
Code:
stack=6, locals=1, args_size=1
0: getstatic #13 // Field java/lang/System.out:Ljava/io/PrintStream;
3: ldc #19 // String max allowed users %d
5: iconst_1
6: anewarray #2 // class java/lang/Object
9: dup
10: iconst_0
11: getstatic #21 // Field MAX_ALLOWED_USERS:I
14: invokestatic #25 // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
17: aastore
18: invokevirtual #31 // Method java/io/PrintStream.printf:(Ljava/lang/String;[Ljava/lang/Object;)Ljava/io/PrintStream;
21: pop
22: return
LineNumberTable:
line 11: 0
line 12: 22
static {};
descriptor: ()V
flags: (0x0008) ACC_STATIC
Code:
stack=1, locals=0, args_size=0
0: bipush 10
2: putstatic #21 // Field MAX_ALLOWED_USERS:I
5: ldc #37 // String CIRCUIT_BREAKER
7: putstatic #39 // Field APP_NAME:Ljava/lang/String;
10: return
LineNumberTable:
line 3: 0
line 4: 5
}
SourceFile: "MyProgram.java"
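The constant pool is easier to read once you follow the index references by hand. As a rough illustration (not how a JVM stores the pool internally), the sketch below copies the six entries involved in `#10` from the listing above into a map and resolves them recursively. The class `ConstantPoolWalk` and its helpers are invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

public class ConstantPoolWalk {

    /** One row of the javap "Constant pool" table: a kind plus index operands or text. */
    static final class Entry {
        final String kind;
        final int first;
        final int second;
        final String text;

        Entry(String kind, int first, int second, String text) {
            this.kind = kind;
            this.first = first;
            this.second = second;
            this.text = text;
        }
    }

    static Entry utf8(String text) {
        return new Entry("Utf8", 0, 0, text);
    }

    static Entry ref(String kind, int first, int second) {
        return new Entry(kind, first, second, null);
    }

    /** The entries reachable from #10, copied from the javap output. */
    static Map<Integer, Entry> pool() {
        Map<Integer, Entry> pool = new HashMap<>();
        pool.put(6, utf8("()V"));
        pool.put(7, ref("Class", 8, 0));
        pool.put(8, utf8("MyProgram"));
        pool.put(10, ref("Methodref", 7, 11));
        pool.put(11, ref("NameAndType", 12, 6));
        pool.put(12, utf8("customCheck"));
        return pool;
    }

    /** Follows index references until only Utf8 text is left. */
    static String resolve(Map<Integer, Entry> pool, int index) {
        Entry entry = pool.get(index);
        switch (entry.kind) {
            case "Utf8":
                return entry.text;
            case "Class":
                return resolve(pool, entry.first);
            case "NameAndType":
                return resolve(pool, entry.first) + ":" + resolve(pool, entry.second);
            case "Methodref":
                return resolve(pool, entry.first) + "." + resolve(pool, entry.second);
            default:
                throw new IllegalArgumentException("unsupported kind: " + entry.kind);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(pool(), 10)); // prints MyProgram.customCheck:()V
    }
}
```

The result matches the comment javap prints next to `#10`, and it is the same entry that `invokevirtual #10` in `main` refers to.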
Multiple JVM specification implementations
It is important to be aware that multiple implementations of the JVM specification exist, each with its own pros and cons.
Some of the most popular JVMs are:
- OpenJDK (https://openjdk.org/)
- Amazon Corretto (https://aws.amazon.com/corretto/)
- Azul JDK (https://www.azul.com/products/core/)
- GraalVM (https://www.oracle.com/java/graalvm/what-is-graalvm/)
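To find out which implementation you are actually running, you can read the standard `java.vm.*` system properties. A minimal sketch follows (the class name `WhichJvm` is made up). The values printed differ from one distribution to another.

```java
public class WhichJvm {

    static String describe() {
        return System.getProperty("java.vm.name")
                + " " + System.getProperty("java.vm.version")
                + " (" + System.getProperty("java.vm.vendor") + ")"
                + ", JVM specification " + System.getProperty("java.vm.specification.version");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```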
Resources
- The Java® Virtual Machine Specification Java SE 21 Edition (https://docs.oracle.com/javase/specs/jvms/se21/jvms21.pdf)
- James Bloom's JVM Internals (https://blog.jamesdbloom.com/JVMInternals.html)
- Inside the Java Virtual Machine by Bill Venners (https://www.artima.com/insidejvm/blurb.html)
- The JVM Architecture Explained (https://dzone.com/articles/jvm-architecture-explained)