Behind the Scene of JVM

Nisal Pubudu
Published in Nerd For Tech
6 min read · May 6, 2021
Photo by Jonas Leupe on Unsplash

Even though most Java developers have a basic idea of what the JVM is, few know what is really happening under its hood. You might already know the magic phrase, “the JVM is there to convert your bytecode into machine code”. That is not wrong, but the JVM does much more than convert bytecode into machine code. So, in this article I will discuss the JVM architecture and how it works under the hood.

Before moving on to the JVM, you need to know that there are two types of virtual machines: System based Virtual Machines (SVM) and Application based Virtual Machines (AVM).

🔹 System based Virtual Machine (SVM)

An SVM virtualizes one or more hardware components and can create multiple environments to work with. Those environments are completely independent of each other.

Examples: VMware, VirtualBox

🔹 Application based Virtual Machine (AVM)

AVMs are also known as Process based Virtual Machines. An AVM does not virtualize any hardware components; it is basically a piece of software (a process) that provides a platform for running other programs.

Examples: JVM for Java programs, CLR for .NET programs

As mentioned above, the JVM is an Application based Virtual Machine, and it is defined purely as a specification. So how do we get a JVM onto our machine? You cannot download or install a JVM by itself. Remember, a JVM instance is just a process, and it does not exist until you start running a Java program. So how does a JVM come into existence when you execute a Java program? That’s where the JRE comes in.

The Java Runtime Environment (JRE) is the minimum environment needed to run a Java application. When you download and install a JRE on your machine, it deploys the code required to create a JVM instance whenever one is needed.

Creating an Instance of JVM

So how does the JRE know when to create a JVM instance? On your machine you can write a Java program using a terminal or an IDE (e.g. IntelliJ IDEA, Eclipse). In this scenario I’ll use the terminal to create a Java file named “Student.java”.

After creating the Java file, you have to compile it into a class file using the Java compiler. Go to the directory that contains the Java file and simply type,

“javac Student.java” in the terminal and hit Enter.

This compiles your Java file into a class file. Now you can run this class file using the following command,

“java Student” in the terminal and hit Enter.

At this moment, the java command asks your OS to start a new process, which becomes the JVM instance that runs this class file. The JVM then creates a non-daemon thread and starts executing from the initial class (the class with the main method). The JVM instance lives until the last non-daemon thread exits.
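To make the thread behaviour concrete, here is a minimal sketch of what “Student.java” could contain. The class body, the `describe` helper, and the thread name "worker" are assumptions made for illustration; the article does not show the original file.

```java
// Student.java: a hypothetical body for the file used in this walkthrough.
class Student {

    // Describes a thread by its name and daemon flag.
    static String describe(Thread t) {
        return t.getName() + " (daemon=" + t.isDaemon() + ")";
    }

    public static void main(String[] args) {
        // The JVM runs main() on a non-daemon thread.
        System.out.println("Started on: " + describe(Thread.currentThread()));

        // A second non-daemon thread keeps the JVM alive after main() returns.
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            System.out.println("Worker finished; the JVM can exit now.");
        }, "worker");
        worker.start();

        System.out.println("main() is returning, but the JVM stays up.");
    }
}
```

Compile and run it with the two commands shown above. If you call `worker.setDaemon(true)` before `worker.start()`, the JVM is free to exit as soon as main() returns, so the worker's message may never be printed.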

Inside of the JVM Architecture

The JVM architecture has 3 main components: Class Loader, Memory Area, and Execution Engine.

JVM Architecture (image: https://javatutorial.net/jvm-explained)

1. Class Loader

Whenever we run a Java program, the compiled class files are loaded by the Class Loader. There are 3 built-in Class Loaders in the JVM,

  • Bootstrap Class Loader
  • Extension Class Loader
  • System Class Loader

Apart from these built-in Class Loaders, you can also create your own “User Defined Class Loader” in Java.
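The delegation chain between these loaders can be inspected from Java code. The sketch below is only an illustration; the class name `ClassLoaderDemo` and the helper `nameOf` are made up for this example. Note that on Java 9 and later the Extension Class Loader has been replaced by the Platform Class Loader, so the printed names depend on your Java version.

```java
class ClassLoaderDemo {

    // Returns a readable name; the bootstrap loader is represented by null.
    static String nameOf(ClassLoader loader) {
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        // Core classes such as String are loaded by the bootstrap class loader.
        System.out.println("String     -> " + nameOf(String.class.getClassLoader()));

        // Application classes are loaded by the system (application) class loader.
        ClassLoader loader = ClassLoaderDemo.class.getClassLoader();
        System.out.println("This class -> " + nameOf(loader));

        // Walk up the delegation chain until the bootstrap loader is reached.
        while (loader != null) {
            loader = loader.getParent();
            System.out.println("  parent   -> " + nameOf(loader));
        }
    }
}
```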

The Class Loader works in 3 phases: Loading, Linking, and Initialization.

1.1 Loading

The main responsibility of the Loading phase is to load compiled classes into the Memory Area. Usually, the loading process starts with the main class (the class with the main method). For each class it loads into the Memory Area, the Loading phase performs the following tasks,

  • Read fully qualified class name
  • Read immediate parent class information
  • Read variable information
  • Check whether it is a Class, an Interface, or an Enum

Whenever a class is loaded for the very first time, the JVM creates an object of type Class (java.lang.Class) for it. Only one such Class object is created per loaded class, and it is stored in the Heap Area.
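You can observe this single Class object, and the information gathered during loading, through reflection. This is a small sketch; `ClassObjectDemo` and its nested `Student` class are hypothetical names used only here.

```java
class ClassObjectDemo {

    static class Student { }   // hypothetical class used only for this demo

    // True when both instances report the very same Class object.
    static boolean sharesClassObject(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    public static void main(String[] args) {
        Student first = new Student();
        Student second = new Student();

        System.out.println(sharesClassObject(first, second));   // true
        System.out.println(first.getClass() == Student.class);  // true

        // The Class object exposes what was read during the Loading phase.
        Class<?> type = Student.class;
        System.out.println(type.getName());                  // fully qualified name
        System.out.println(type.getSuperclass().getName());  // immediate parent
        System.out.println(type.isInterface() + " " + type.isEnum());
    }
}
```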

1.2 Linking

Linking has 3 steps: Verification, Preparation, and Resolution.

🔹 Verification

The verification phase uses a sub-program called the “bytecode verifier” to check the class and make sure it is safe to execute. It verifies that the class file was produced by a valid compiler and follows the correct class file structure and format.

If verification fails, the JVM throws a java.lang.VerifyError, which normally terminates the Java application.

🔹 Preparation

In this phase the JVM allocates memory for the static variables of the class and assigns them default values. Every data type has a specific default value: for example, “false” for boolean variables, “0” for integer variables, and “null” for object references.

🔹 Resolution

In a program we create objects and use them very frequently, and we refer to classes, fields, and methods by name. In the class file those names are stored as symbolic references, which the machine cannot use directly. So, in this phase, the JVM replaces each symbolic reference with a direct reference to the actual location in memory.

1.3 Initialization

Initialization is the final phase of class loading. In this phase the JVM assigns the real values that we defined in the Java files to the static variables. If the class has any static blocks, they are also executed in this phase.
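The difference between the default values from Preparation and the real values from Initialization can be seen with a small experiment. The sketch below is illustrative only; `InitOrderDemo`, its nested `Student` class, and the field names are assumptions made for this example.

```java
class InitOrderDemo {

    static final StringBuilder LOG = new StringBuilder();

    static class Student {
        // Runs first during Initialization: 'count' still holds its
        // Preparation default (0) because its own initializer has not run yet.
        static int seenBeforeInit = peek();
        static int count = 42;

        static {
            // Static blocks run in source order together with field initializers.
            LOG.append("static block: count=").append(count);
        }

        static int peek() {
            return count;
        }
    }

    public static void main(String[] args) {
        System.out.println(Student.seenBeforeInit); // 0
        System.out.println(Student.count);          // 42
        System.out.println(LOG);                    // static block: count=42
    }
}
```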

2. Memory Area (Runtime Data Area)

The Memory Area has 5 sub-areas: Method Area, Heap Area, Stack, PC Register, and Native Method Area.

The first two areas (Method Area, Heap Area) are created only once per JVM. That means no matter how many threads your program has, those threads have to share these 2 areas among them. The other three areas (Stack, PC Register, Native Method Area) are created per thread.
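As a rough illustration of shared versus per-thread areas, the sketch below lets two threads update one heap object while each keeps its own local variable. The class name `SharedHeapDemo`, the method `runWorkers`, and the thread names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedHeapDemo {

    // Runs two threads that share one heap object but keep their own locals.
    static int runWorkers(int incrementsPerThread) {
        AtomicInteger shared = new AtomicInteger(); // one object on the heap

        Runnable task = () -> {
            int local = 0; // lives in the running thread's own stack frame
            for (int i = 0; i < incrementsPerThread; i++) {
                local++;
                shared.incrementAndGet();
            }
            System.out.println(Thread.currentThread().getName() + " local=" + local);
        };

        Thread a = new Thread(task, "thread-a");
        Thread b = new Thread(task, "thread-b");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println("shared=" + runWorkers(1000)); // shared=2000
    }
}
```

Each thread reports local=1000 because `local` is private to its stack, while the shared counter reflects the work of both threads.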

2.1) Method Area: The Method Area keeps all class-level information and data, such as method and variable information, including static variables.

2.2) Heap Area: The Heap Area holds all objects and their instance data.

2.3) Stack: The Stack stores method calls and their local data. As mentioned above, one stack is created per thread. Each time a thread calls a method, a new stack frame is created for that call inside the thread's stack, and the frame is removed when the method returns.
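One way to see stack frames being added is to count them at different call depths. This is a sketch under the assumption that the JVM reports every frame in the stack trace, which typical JVMs do for a simple program like this; `StackFrameDemo` and its methods are made-up names.

```java
class StackFrameDemo {

    // Number of frames currently on the calling thread's stack.
    static int depth() {
        return Thread.currentThread().getStackTrace().length;
    }

    // Each recursive call pushes one more frame before measuring.
    static int depthAfter(int extraCalls) {
        return extraCalls == 0 ? depth() : depthAfter(extraCalls - 1);
    }

    public static void main(String[] args) {
        int base = depthAfter(0);
        int deeper = depthAfter(3);
        // Three extra nested calls mean three extra frames.
        System.out.println("extra frames: " + (deeper - base));
    }
}
```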

2.4) PC Registers: Each PC Register holds the address of the instruction its thread is currently executing, as long as that is a non-native method. While the thread is executing a native method, the value of the PC Register is undefined. After the native method returns, the PC Register again tracks the execution of non-native code.

2.5) Native Method Area: If your program calls a native method, the information for that call is stored here. The JVM specification refers to this area as the native method stack.

Tip: Native methods in Java are methods that are declared in Java but implemented in other languages, such as C and C++.
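A native method looks like an ordinary declaration with the `native` keyword and no body. The sketch below only declares one and inspects it with reflection; `hypotheticalNativeSum` is an invented method with no real native implementation behind it, so it is deliberately never called.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeMethodDemo {

    // Hypothetical native method: declared in Java, implemented elsewhere
    // (for example in C). It is never called here, because no native
    // library is loaded and the call would fail to link.
    static native int hypotheticalNativeSum(int a, int b);

    static int javaSum(int a, int b) {
        return a + b;
    }

    // Reports whether the named method of this class has the 'native' modifier.
    static boolean isNative(String methodName) {
        for (Method m : NativeMethodDemo.class.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                return Modifier.isNative(m.getModifiers());
            }
        }
        throw new IllegalArgumentException("no such method: " + methodName);
    }

    public static void main(String[] args) {
        System.out.println(isNative("hypotheticalNativeSum")); // true
        System.out.println(isNative("javaSum"));               // false
    }
}
```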

3. Execution Engine

This is where the actual execution of the bytecode happens. The Execution Engine has 3 components: Interpreter, JIT Compiler, and Garbage Collector.

3.1) Interpreter

The Interpreter reads the bytecode, converts it into machine instructions, and executes it one instruction at a time. It can start interpreting bytecode quickly, but if a method is called multiple times, every call requires a new interpretation, which makes execution much slower.

3.2) JIT Compiler

The Interpreter has a problem whenever a method is called multiple times. That’s where the JIT (Just-In-Time) Compiler comes to save the day. The JIT Compiler compiles the bytecode of frequently executed code into machine code, and that machine code is then used directly for those “repeated method calls”. This makes execution much faster than interpretation alone.
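A repeatedly called method like the one below is a typical candidate for JIT compilation. This is only a sketch: `JitDemo`, `square`, and `sumOfSquares` are invented for illustration, and the timing you see will vary from machine to machine.

```java
class JitDemo {

    // A small "hot" method: called often enough, it becomes a JIT candidate.
    static long square(long x) {
        return x * x;
    }

    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += square(i);
        }
        return total;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long result = sumOfSquares(1_000_000);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("result=" + result + " in " + elapsedMs + " ms");
    }
}
```

On a HotSpot JVM you can compare “java JitDemo” with “java -Xint JitDemo”, which runs in interpreter-only mode, or add “-XX:+PrintCompilation” to see which methods get compiled. These flags are specific to HotSpot and may not exist on other JVM implementations.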

3.3) Garbage Collector

The Garbage Collector runs as a daemon thread in the background. Its main task is to check for any “unused objects” in the Heap Area and destroy them, freeing up Heap memory held by those unused/unreachable objects.
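Reachability can be explored with weak references, which do not keep an object alive on their own. The following is a hedged sketch: `GcDemo` and `makeUnreachable` are hypothetical names, and `System.gc()` is only a request, so the second line of output may differ between runs and JVMs.

```java
import java.lang.ref.WeakReference;

class GcDemo {

    // Returns a weak reference to an object that has no strong references left.
    static WeakReference<Object> makeUnreachable() {
        return new WeakReference<>(new Object());
    }

    public static void main(String[] args) {
        Object kept = new Object();                       // strongly reachable
        WeakReference<Object> keptRef = new WeakReference<>(kept);
        WeakReference<Object> droppedRef = makeUnreachable();

        System.gc(); // only a request; the JVM is free to ignore it

        // 'kept' is still in use here, so it cannot have been collected.
        System.out.println("reachable object kept: " + (keptRef.get() == kept));
        // Usually true after a collection, but not guaranteed.
        System.out.println("unreachable object collected: " + (droppedRef.get() == null));
    }
}
```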

References

Dinesh, K. (2016). Java Virtual Machine [video playlist]. YouTube. https://www.youtube.com/playlist?list=PLD-mYtebG3X-rF1hU16AC3Rf9E-mAAkXJ
