Java??? and a Deep dive into JVM

Lakshini Kuganandamurthy
SLIIT Women In FOSS Community
9 min read · Jan 31, 2022

What is Java?

Java is a programming language developed by James Gosling and his colleagues at Sun Microsystems, starting in 1991. This team of engineers called themselves the “Green Team”. Java was initially named “GreenTalk” and then “Oak”, and it was officially renamed “Java” in 1995. The first Java version, JDK 1.0, was released in 1996. In 2010, Sun Microsystems was acquired by Oracle Corporation, which maintains Java today. At the time of writing, the latest Java version is Java SE 17, released in September 2021. Java is used to develop mobile, web, desktop GUI, cloud, and gaming applications, and many more.

Ok, enough of an introduction, let’s deep dive into some of the fundamentals every Java developer needs to know!!!

JDK, JRE, JVM???

Confusing, isn’t it? OK, let’s clear it up!!!

JDK stands for Java Development Kit. It is the collection of tools (such as the javac compiler) and supporting libraries needed to create a Java program, and it includes a JRE so that you can also run what you build. You may recall installing the JDK to write your first Java program!!!

JRE stands for Java Runtime Environment. It packages an implementation of the JVM together with the core class libraries, and it is what actually runs a Java program on a specific platform. Platform here means a combination of hardware and software, i.e. a machine + OS.

JVM???

JVM stands for Java Virtual Machine. Before talking about the JVM, let’s understand what a virtual machine is.

A virtual machine is simply a machine that doesn’t exist physically; it is a program. There are two types of VMs.

  1. SVM (System-based VM): Uses the underlying hardware to create multiple environments, each able to run its own programs. The environments/instances are independent of each other. E.g. hypervisors such as Xen
  2. AVM (Application-based VM): No hardware is virtualized; it is a piece of software that enables other programs to run. E.g. JVM for Java, CLR for .NET and PVM for Perl.

It’s time for some JVM!!!

JVM is a specification that says what needs to be done to run a Java program, e.g. there needs to be a class with a main method, variables need to be declared with a type and a name, etc. The JRE contains the JVM. When the JRE is installed, it deploys all the code required to create a JVM. A JVM instance is created only when a Java program is executed. If you are not executing any Java program on your computer at a particular moment, then you don’t have any JVM instance on your computer at that moment. Simply put, a JVM instance exists on your computer only while your Java program is running. So, if you execute 4 different Java programs at the same time, there will be 4 different JVM instances, i.e. each Java program has its own JVM instance.
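One way to see that each running program gets its own JVM instance is to print the operating-system process ID from inside the program. Below is a minimal sketch, assuming Java 9 or later (for ProcessHandle). The class name JvmInstanceDemo and the method currentJvmPid() are made up for illustration.

```java
import java.lang.management.ManagementFactory;

class JvmInstanceDemo {
    // Returns the operating-system process ID of the JVM running this code.
    static long currentJvmPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        System.out.println("JVM process id: " + currentJvmPid());
        System.out.println("JVM name: " + ManagementFactory.getRuntimeMXBean().getVmName());
    }
}
```

If you run this program in two terminals at the same time, each run should report a different process ID, because each "java" command starts a separate JVM instance.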

So how does the JVM instance get created???

First, the Java program gets compiled by the compiler. In the terminal, you would use the command below, assuming your Java file is named “HelloWorld.java”.

>javac HelloWorld.java

This creates a class file named “HelloWorld.class”. To execute the class file from the terminal, you would use the command below.

>java HelloWorld

The “java” in the command tells the operating system to create a JVM instance. When a JVM instance starts, it creates a non-daemon thread.

A non-daemon thread (also called a user thread) runs in the foreground and is designed to do a specific task. The JVM keeps running as long as at least one such thread is alive.

This non-daemon thread is the main thread, which executes the main() method. Any threads created in the main method are child threads of the main thread. They are also non-daemon by default, because a newly created thread inherits the “daemon” status of the thread that created it. You can turn a non-daemon thread into a daemon thread using the setDaemon() method of the Thread class, but this must be called before the thread’s start() method, otherwise an IllegalThreadStateException is thrown.

Daemon threads are service provider threads that provide services to non-daemon threads. If no non-daemon threads are running, the JVM exits. Before exiting, the JVM terminates any remaining daemon threads, because there are no non-daemon threads left for them to serve. This implies that for daemon threads to keep running, there must be at least one non-daemon thread running.
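The daemon rules above can be checked with a small experiment. This is only a sketch: the class DaemonDemo and its run() method are hypothetical names, and a CountDownLatch is used just to keep the thread alive while setDaemon() is called too late.

```java
import java.util.concurrent.CountDownLatch;

class DaemonDemo {
    // Returns { inherited daemon status, daemon status after setDaemon(true),
    //           whether calling setDaemon() after start() was rejected }.
    static boolean[] run() throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        Runnable waitForSignal = () -> {
            try {
                done.await(); // stay alive until the demo is finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        // A new thread inherits the daemon status of the thread that creates it.
        Thread child = new Thread(waitForSignal);
        boolean inherited = child.isDaemon();

        // setDaemon() is allowed before start() ...
        Thread service = new Thread(waitForSignal);
        service.setDaemon(true);
        service.start();

        // ... but not once the thread is running.
        boolean rejected = false;
        try {
            service.setDaemon(false);
        } catch (IllegalThreadStateException e) {
            rejected = true;
        }

        done.countDown();
        service.join();
        return new boolean[] { inherited, service.isDaemon(), rejected };
    }

    public static void main(String[] args) throws InterruptedException {
        boolean[] result = run();
        System.out.println("child inherits daemon status: " + result[0]);
        System.out.println("service is daemon: " + result[1]);
        System.out.println("late setDaemon() rejected: " + result[2]);
    }
}
```

When run() is called from the main thread, the first value is false, because the main thread is non-daemon.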

Too confusing??? Refer to my article on Daemon and Non Daemon threads here, to get a better picture!!!

The JVM always looks for the main() method of the application in the .class file and runs it, i.e. public static void main(String[] args). There is a myth that the JVM dies when the main thread exits, but this is wrong. There are two ways a JVM gets destroyed or exits.

  1. No non-daemon threads exist, i.e. all the non-daemon threads created by the application have finished. Then the JVM instance dies.
  2. The application calls the System.exit() method.

Therefore, a JVM instance exists only while the application is running, i.e. while at least one non-daemon thread exists.
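To bust the "main thread exits, JVM dies" myth yourself, you can try something like the sketch below. The names MainExitDemo and startWorker() are made up for this example, and the 2 second sleep is an arbitrary choice.

```java
class MainExitDemo {
    // Starts a non-daemon worker that keeps running for roughly the given time.
    static Thread startWorker(long millis) {
        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(millis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            System.out.println("worker finished, now the JVM can exit");
        });
        worker.setDaemon(false); // explicit here; it would also be inherited from main
        worker.start();
        return worker;
    }

    public static void main(String[] args) {
        startWorker(2000);
        System.out.println("main thread is returning");
    }
}
```

The main thread returns first, yet the JVM stays alive until the worker finishes. If you change setDaemon(false) to setDaemon(true), the JVM is free to exit as soon as main returns, so the worker's message will most likely never be printed.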

Why is Java platform independent?

If you install a JRE on a Windows machine, it deploys the code required to create a JVM for a Windows environment. So the JRE is tightly platform dependent, which is why you see a different JDK package for each operating system on the Oracle site. As we saw above, when you run your Java program, a JVM instance is created; it reads the class file and converts the bytecode into machine code that the underlying platform understands. Because the same .class file can be run by the JVM of any platform, Java is platform independent.
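As a small illustration, the sketch below asks the running JVM which platform it sits on, using standard system properties. The class name PlatformDemo and the method describe() are hypothetical.

```java
class PlatformDemo {
    // Describes the platform and the JVM that is running this class.
    static String describe() {
        return System.getProperty("os.name")
                + " / " + System.getProperty("os.arch")
                + " / " + System.getProperty("java.vm.name");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

You can compile this once and copy the resulting PlatformDemo.class to a machine with a different OS. The same class file should run unchanged there and simply report a different platform, because the platform-specific part is the JVM and not your bytecode.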

Below is a high-level diagram showing how JDK, JRE, and JVM are related.

Overview of JVM, JRE and JDK

Ok, let’s further talk about the JVM architecture

JVM architecture

JVM has 3 components.

  1. Class Loader
  2. Memory Area
  3. Execution Engine

JVM Architecture

First, the .java file (i.e. your Java code) is converted to a .class file by the compiler. A JVM instance is created when the .class file is run. The .class file is then loaded into the JVM by a class loader, and the loaded class is stored in a memory area. The memory area holds objects and instructions, which are executed by the execution engine.

Class Loader

The main responsibility of a class loader is to take the .class files and load them into the memory area. This happens in the following stages.

  1. Loading
  2. Linking
  3. Initialization

There are two types of class loaders.

  1. Bootstrap Class Loader
  2. Custom defined Class Loader
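To make the two kinds of class loaders a bit more concrete, here is a rough sketch. ClassLoaderDemo, loadedByBootstrap() and LoggingLoader are hypothetical names invented for this example. Note that Java code sees the bootstrap class loader as null.

```java
class ClassLoaderDemo {
    // A very small custom class loader: it logs each request and then
    // lets the normal parent-first lookup in ClassLoader do the real work.
    static class LoggingLoader extends ClassLoader {
        LoggingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        public Class<?> loadClass(String name) throws ClassNotFoundException {
            System.out.println("asked to load " + name);
            return super.loadClass(name);
        }
    }

    // getClassLoader() returns null for classes loaded by the bootstrap class loader.
    static boolean loadedByBootstrap(Class<?> type) {
        return type.getClassLoader() == null;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        System.out.println("String loaded by bootstrap? " + loadedByBootstrap(String.class));
        System.out.println("This class loaded by bootstrap? " + loadedByBootstrap(ClassLoaderDemo.class));

        ClassLoader custom = new LoggingLoader(ClassLoaderDemo.class.getClassLoader());
        Class<?> viaCustom = custom.loadClass("java.lang.String");
        System.out.println("Same Class object as String.class? " + (viaCustom == String.class));
    }
}
```

Core classes such as String come from the bootstrap class loader, while your own classes are loaded by a loader further down the chain. The custom loader here only delegates to its parent, so it ends up with the very same Class object for String.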

Loading

First, the JVM reads the .class file and loads it. For each .class file, the JVM stores the following information in the method area of the memory area.

  1. The fully qualified class name, i.e. the package name followed by the class name. E.g. for a class Employee in the package com.lakshini.sample.firstprogram, this is com.lakshini.sample.firstprogram.Employee
  2. Instance variable information
  3. Immediate parent information
  4. Whether it is a class / interface / enum

After loading the .class file, the JVM creates an object of type “Class”. This “Class” is not a class from your own Java code, but a special type predefined in the java.lang package. The Class object can be used to obtain class-level information such as the name of the class, the parent’s name, and information about its variables and methods.

E.g. Suppose there is an Employee class. When the JVM reads its .class file, it checks whether the Employee class has an immediate parent and, if so, loads the parent class first. If no parent is specified, the parent is java.lang.Object, which every class inherits from.

E.g. Suppose there are three classes, namely Employee, Manager and Leave, each compiled into its own .class file.

Inheriting Employee Class

The very first time the JVM reads the Manager class, it checks for its parent class and loads the Employee class (because the Manager class inherits from the Employee class, as shown above). After loading, it creates an object of type Class for Employee. In the code below, calling getClass() on the “emp” object returns that Class object, which is assigned to “cls”. The Class object is stored in the heap in the memory area.

// Java code to demonstrate use of the Class object created by the JVM
public class Test {
    public static void main(String[] args) {
        Employee emp = new Employee();

        // Getting hold of the Class object created by the JVM
        Class<?> cls = emp.getClass();

        // The Class object exposes class-level information such as the name
        System.out.println(cls.getName());
    }
}

// A sample class whose information is fetched above using its Class object
class Employee {}

When the Leave class is read by the JVM, it also inherits from the Employee class, but this time the JVM will not create another Class object for Employee, because one was already created and stored in the heap.
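The "only one Class object per class" idea can be checked with reference comparisons. The sketch below reuses the Employee, Manager and Leave names from the example above. Their empty bodies and the ClassObjectDemo class are assumptions made for illustration.

```java
class Employee { }

class Manager extends Employee { }

class Leave extends Employee { }

class ClassObjectDemo {
    // True when two separate Employee instances share one and the same Class object.
    static boolean sharesOneClassObject() {
        return new Employee().getClass() == new Employee().getClass();
    }

    public static void main(String[] args) {
        System.out.println("One Class object for Employee? " + sharesOneClassObject());
        System.out.println("Parent of Manager: " + Manager.class.getSuperclass().getName());
        System.out.println("Manager and Leave share the same parent Class object? "
                + (Manager.class.getSuperclass() == Leave.class.getSuperclass()));
        System.out.println("Parent of Employee: " + Employee.class.getSuperclass().getName());
    }
}
```

Since Employee does not declare a parent, its superclass is java.lang.Object.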

Linking

The second stage in the process of loading .class files to memory area involves three different stages as follows.

  1. Verification
  2. Preparation
  3. Resolution

Verification

The JVM has a bytecode verifier. It checks whether the .class file is safe to execute or not. This is why Java is said to be safe to execute in an environment: the JVM makes sure the .class file is safe before running it.

So how does it work???

While the .class file is being loaded into the JVM, a subprogram called the bytecode verifier checks the following.

  1. Whether the .class file comes from a valid compiler
  2. The .class file is in the correct structure
  3. The .class file has correct formatting

If any of the above is not satisfied, the JVM throws a java.lang.VerifyError at run time. This error is an indication that the .class file was altered or corrupted.

Preparation

This allocates memory for the static variables of the class and assigns them default values. (Instance variables get their default values later, when an object is created.) Note, this is a default value and not the initial value.

E.g. static int var = 10;

This stage will assign var the value 0.

There are default values for each datatype.

  1. If it is an object (reference) type, it is null
  2. If it is an int type, it is 0
  3. If it is a boolean type, it is false
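It is possible to catch a static variable while it still holds its default value. The sketch below does this by reading the variable through a method before its own initializer has run. PreparationDemo and its field names are made up for this example, and "value" plays the role of "var" from the example above.

```java
class PreparationDemo {
    // Static initializers run from top to bottom, so this line runs before
    // 'value' is assigned 10 and sees the default value given during preparation.
    static int seenEarly = peek();

    static int value = 10;   // the initial value, assigned during initialization

    static String name;      // never assigned, keeps the default for object types
    static boolean flag;     // never assigned, keeps the default for boolean

    static int peek() {
        return value;
    }

    public static void main(String[] args) {
        System.out.println("value seen before its initializer ran: " + seenEarly);
        System.out.println("value after initialization: " + value);
        System.out.println("unassigned String: " + name);
        System.out.println("unassigned boolean: " + flag);
    }
}
```

The first line reports 0 and the second reports 10, which is exactly the difference between the default value and the initial value.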

Resolution

Java allows programmers to use domain-specific names in a Java program.

E.g. class Employee for a Payroll system or class Student for a Student Management system.

The machine does not understand names such as Employee or Student. Therefore, the JVM replaces these symbolic references with direct references, i.e.

Student s = new Student();
s.enroll(s);

Here, the JVM replaces the symbolic names Student and enroll with the memory addresses of the loaded class and method.

Initialization

This assigns the real (initial) values to the static variables. It also executes any static blocks in the class. According to the JVM specification, initialization must happen before each class’s first active use.

Active use means any of the following.

  1. Creating an instance of the class with the “new” keyword
  2. Invoking a static method of the class
  3. Assigning a value to a static field of the class
  4. Invoking the main method of the class
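The sketch below shows initialization being triggered by an active use, in this case invoking a static method. InitializationDemo, Payroll and the LOG list are hypothetical names used only to record the order in which things happen.

```java
import java.util.ArrayList;
import java.util.List;

class InitializationDemo {
    static final List<String> LOG = new ArrayList<>();

    static class Payroll {
        static int rate = 10;

        // Static block: executed once, when Payroll is initialized.
        static {
            LOG.add("Payroll initialized");
        }

        static int currentRate() {
            return rate;
        }
    }

    static List<String> run() {
        LOG.add("before first use");
        int rate = Payroll.currentRate(); // invoking a static method is an active use
        LOG.add("rate = " + rate);
        return LOG;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

The log shows "before first use" first and "Payroll initialized" second. The static block did not run when the program started, only when Payroll was actively used for the first time.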

Memory Area

Memory area of a JVM has 5 components as shown below. The size of the memory area depends on each JVM implementation.

Memory Area Architecture

Method Area

When JVM loads the class, all class information such as name of class, parent name, variables and methods information is stored in the method area.

Heap Area

All the objects’ data are stored in the heap area.

Note, for each and every JVM, there is only one method area and one heap area.

Stack

Method information and local variables are stored in the stack. For every thread, the JVM creates a separate stack. Every block of the stack is known as a frame. There is one frame per method call, and all local variables of that call are stored in its frame. Whenever a method exits, its frame is popped off the stack. When the thread terminates, its stack is destroyed by the JVM.
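To get a feel for "one frame per method call", the sketch below counts the entries of the current thread's stack trace at different call depths. StackFrameDemo and depthAfter() are made-up names. A JVM is allowed to leave frames out of a stack trace, so treat the numbers as illustrative only.

```java
class StackFrameDemo {
    // Calls itself 'extraCalls' more times and then reports how many frames
    // the current thread's stack trace contains at that point.
    static int depthAfter(int extraCalls) {
        if (extraCalls == 0) {
            return Thread.currentThread().getStackTrace().length;
        }
        return depthAfter(extraCalls - 1);
    }

    public static void main(String[] args) {
        int base = depthAfter(0);
        int deeper = depthAfter(3);
        System.out.println("frames at base depth: " + base);
        System.out.println("extra frames after 3 nested calls: " + (deeper - base));
    }
}
```

Each nested call pushes one more frame onto the thread's stack, so three extra calls typically show up as three extra frames.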

PC Registers

If a non-native method is being executed, the PC (program counter) register holds the address of the JVM instruction currently being executed. Every thread has its own PC register.

Note that the PC register’s value is undefined for native methods.

Native Method Stack

Holds native method information. Native methods are those written in languages other than Java, such as C/C++. Native code does not run as bytecode, so the JVM cannot use its regular stacks to store the information for these methods; therefore a separate native method stack is required.

Execution Engine

It is the central component of the JVM. It communicates with the various memory areas of the JVM. Each thread of the running Java application is a distinct instance of the JVM’s execution engine.

It executes the bytecode contained in the .class file. It reads the bytecode instruction by instruction, converts it into machine code (native code), and executes it in sequence. The execution engine has three main components for executing .class files: the interpreter, the JIT compiler and the garbage collector.
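You can ask a running JVM which JIT compiler and garbage collectors it is using through the standard java.lang.management API. The sketch below is just one way to peek at them. ExecutionEngineDemo and jitName() are hypothetical names, and the names printed depend on your JVM.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class ExecutionEngineDemo {
    // Name of the JIT compiler, if this JVM reports one.
    static String jitName() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        return jit == null ? "none reported" : jit.getName();
    }

    public static void main(String[] args) {
        System.out.println("JIT compiler: " + jitName());
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("Garbage collector: " + gc.getName());
        }
    }
}
```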

I hope this article helped you better understand these fundamentals!!!

Happy Learning!!!

Lakshini Kuganandamurthy
SLIIT Women In FOSS Community

A passionate individual eager to learn and improve. Associate Software Engineer, Virtusa.