Java Memory Management

Peter Lee
13 min read · Jun 8, 2020

Why should we care about memory management in Java? This article answers that question with a real-world example.

After reading this article, you will understand the following:

  • Java Virtual Machine Architecture (JVM Architecture)
  • Memory Model (HEAP, Non-HEAP, Other Memory)
  • Garbage Collection
  • Monitoring & GC Tuning
  • Some notes to make good performance when developing the web application

Why should we care about Memory Management?

Many developers don't think about memory management because Java has Garbage Collection (GC), the process by which Java programs manage memory automatically. The code we write in Java or Kotlin is compiled to bytecode (.class files) and runs on the Java Virtual Machine (JVM). While the application runs on the JVM, most objects are created in HEAP memory. Eventually, some objects are no longer needed (they become unreachable). The Garbage Collector reclaims that unused memory so that the application can reuse it and, in some cases, so that it can be returned to the operating system.

In other words: “Memory management is the process of allocating new objects and removing unused objects to make space for those new object allocations.” (Source: oracle.com)

In some languages, such as C, we have to manage memory manually, which makes writing applications in C harder. We have to allocate and free memory carefully, because any block we forget to free is leaked.

Put simply, when an object is allocated and stays in memory but can never be released, we call it a memory leak. Memory leaks are serious and we should avoid them. Some problems aren't strictly memory leaks, but they still waste memory and make the application slow. Let me show an example from one of my projects. Company rules don't allow me to publish the source code, and the real logic is too complex to show here, so I have reduced the problem to simple code.

Assume we need to read the metadata of images referenced by URLs. To keep things simple, the example uses a single image that has already been downloaded to the local disk.

Main program:

package com.min.memory.casestudy01.main;

import com.min.memory.casestudy01.entity.Metadata;
import com.min.memory.casestudy01.utils.ImageMetadataUtils;

public class ImageMetadataExample {

    public static void main(String[] args) {
        try {
            final String url = "/Users/daudm/Desktop/2000x2000px_keepcalm.jpg";
            for (int i = 0; i < 2000; i++) {
                Metadata metadata = ImageMetadataUtils.getMetadataLocalFile(url);
                System.out.println(String.format("Count %d URL: %s, metadata: %s", i, url, metadata.toString()));
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

ImageMetadataUtils class

package com.min.memory.casestudy01.utils;

import java.awt.image.BufferedImage;
import java.io.File;
import java.util.Map;
import java.util.Set;

import javax.imageio.ImageIO;
import com.min.memory.casestudy01.entity.Metadata;

import lombok.experimental.UtilityClass;

@UtilityClass
public class ImageMetadataUtils {
    public static Metadata getMetadataLocalFile(String url) {
        try {
            final File outputFile = new File(url);
            final BufferedImage buf = ImageIO.read(outputFile);
            final int width = buf.getWidth();
            final int height = buf.getHeight();
            final long fileSize = outputFile.length();
            return new Metadata(url, width, height, fileSize);
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println(String.format("[ERROR] Get metadata from url %s: %s", url, e.getMessage()));
            return null;
        }
    }
}

Metadata class

package com.min.memory.casestudy01.entity;

public class Metadata {
    private String url;
    private Integer width;
    private Integer height;
    private Long fileSizeInBytes;

    public Metadata(String url, Integer width, Integer height, Long fileSizeInBytes) {
        this.url = url;
        this.width = width;
        this.height = height;
        this.fileSizeInBytes = fileSizeInBytes;
    }

    @Override
    public String toString() {
        return new StringBuilder()
                .append("Width: ").append(width)
                .append(", Height: ").append(height)
                .append(", Size: ").append(fileSizeInBytes)
                .toString();
    }
}

After reading the code above, can you spot the problem? In a case like this, we need help from tools. There are many tools for monitoring Java applications, and I will talk about them later. For now, we will use JVisualVM to monitor this application.

Watch the Heap section

Can you see the HEAP section in the image above? This small program consumes 1,044,839,312 bytes (~1 GB) of HEAP memory. What the hell?

Now, I will dump the heap to find the cause. The BufferedImage object is huge: it consumes 12 MB! Each pixel takes 3 bytes, and the image is 2000x2000 pixels (3 * 2000 * 2000 = 12,000,000 bytes, about 12 MB). Every loop iteration decodes the whole image into a new BufferedImage just to read its width and height.

A BufferedImage object consumes 12 MB of HEAP memory
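The 12 MB figure can be checked with a few lines of code. This is a minimal sketch, assuming the decoded image uses the 3-bytes-per-pixel layout `BufferedImage.TYPE_3BYTE_BGR` (typical for a JPEG without transparency, but not guaranteed for every file). The class name `BufferedImageFootprint` and the helper `rasterBytes` are made up for this illustration.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferByte;

public class BufferedImageFootprint {

    // Returns the size in bytes of the pixel array that backs a
    // 3-bytes-per-pixel image of the given dimensions.
    static int rasterBytes(int width, int height) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_3BYTE_BGR);
        DataBufferByte buffer = (DataBufferByte) image.getRaster().getDataBuffer();
        return buffer.getData().length;
    }

    public static void main(String[] args) {
        int bytes = rasterBytes(2000, 2000);
        System.out.println("Pixel data of one image: " + bytes + " bytes");
        // The loop in the example decodes the image 2000 times.
        System.out.println("Allocated over 2000 iterations: " + 2000L * bytes + " bytes");
    }
}
```

The pixel array alone accounts for the size seen in the heap dump; the object headers and the other fields of the image add only a little on top.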

We found the problem! Now we need a solution. The one I will show isn't the best possible, but I think it's good enough. If you have a better one, please tell me.

We will use the com.drew.imaging.ImageMetadataReader class from the metadata-extractor library to read the image's metadata, so the pixel data no longer has to be decoded into memory. Ref: https://jar-download.com/artifacts/com.drewnoakes/metadata-extractor/2.11.0/source-code

We will update the ImageMetadataUtils class and the main program:

package com.min.memory.casestudy01.utils;

import java.awt.image.BufferedImage;
import java.io.File;
import java.util.Map;
import java.util.Set;

import javax.imageio.ImageIO;

import com.drew.metadata.Directory;
import com.drew.metadata.bmp.BmpHeaderDirectory;
import com.drew.metadata.exif.ExifIFD0Directory;
import com.drew.metadata.gif.GifHeaderDirectory;
import com.drew.metadata.jpeg.JpegDirectory;
import com.drew.metadata.png.PngDirectory;
import com.google.common.collect.ImmutableMap;
import com.min.memory.casestudy01.entity.Metadata;
import com.drew.imaging.ImageMetadataReader;

import lombok.Builder;
import lombok.Data;
import lombok.experimental.UtilityClass;

@UtilityClass
public class ImageMetadataUtils {
    @Data
    @Builder
    private static class NeededImageTag {
        private int height;
        private int width;
    }

    private static final Map<Class<? extends Directory>, NeededImageTag> SUPPORTED_TYPES_MAP
            = new ImmutableMap.Builder<Class<? extends Directory>, NeededImageTag>()
            .put(JpegDirectory.class, NeededImageTag.builder().height(JpegDirectory.TAG_IMAGE_HEIGHT).width(JpegDirectory.TAG_IMAGE_WIDTH).build())
            .put(PngDirectory.class, NeededImageTag.builder().height(PngDirectory.TAG_IMAGE_HEIGHT).width(PngDirectory.TAG_IMAGE_WIDTH).build())
            .put(GifHeaderDirectory.class, NeededImageTag.builder().height(GifHeaderDirectory.TAG_IMAGE_HEIGHT).width(GifHeaderDirectory.TAG_IMAGE_WIDTH).build())
            .put(BmpHeaderDirectory.class, NeededImageTag.builder().height(BmpHeaderDirectory.TAG_IMAGE_HEIGHT).width(BmpHeaderDirectory.TAG_IMAGE_WIDTH).build())
            .put(ExifIFD0Directory.class, NeededImageTag.builder().height(ExifIFD0Directory.TAG_IMAGE_HEIGHT).width(ExifIFD0Directory.TAG_IMAGE_WIDTH).build())
            .build();
    private static final Set<Class<? extends Directory>> SUPPORTED_TYPES = SUPPORTED_TYPES_MAP.keySet();

    public static Metadata getMetadata(String url) {
        try {
            final File outputFile = new File(url);
            final long fileSize = outputFile.length();
            final com.drew.metadata.Metadata metadata = ImageMetadataReader.readMetadata(outputFile);
            for (final Class<? extends Directory> type : SUPPORTED_TYPES) {
                if (metadata.containsDirectoryOfType(type)) {
                    final Directory directory = metadata.getFirstDirectoryOfType(type);
                    final NeededImageTag tag = SUPPORTED_TYPES_MAP.get(type);
                    return new Metadata(url, directory.getInt(tag.width), directory.getInt(tag.height), fileSize);
                }
            }
            return null;
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println(String.format("[ERROR] Get metadata from url %s: %s", url, e.getMessage()));
            return null;
        }
    }
}

Main program:

package com.min.memory.casestudy01.main;

import com.min.memory.casestudy01.entity.Metadata;
import com.min.memory.casestudy01.utils.ImageMetadataUtils;

public class ImageMetadataExample {

    public static void main(String[] args) {
        try {
            // The application finishes very quickly, which makes it hard to monitor,
            // so sleep for 10 seconds to leave time to attach JVisualVM.
            System.out.println("Sleeping for 10 seconds");
            Thread.sleep(10000);
            final String url = "/Users/daudm/Desktop/2000x2000px_keepcalm.jpg";
            for (int i = 0; i < 2000; i++) {
                Metadata metadata = ImageMetadataUtils.getMetadata(url);
                // getMetadata returns null for unsupported files; "%s" prints "null" instead of throwing.
                System.out.println(String.format("Count %d URL: %s, metadata: %s", i, url, metadata));
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}

Then we run the application and monitor it on JVisualVM:

Our application runs very fast and only consumes 21 MB in HEAP

In conclusion, I think memory management is important for every developer, whatever the programming language (Java, C, ...). A deeper understanding of memory management helps you write high-performance applications that can run on low-spec machines. Our applications run on the JVM, so we should understand the JVM architecture first.

Java Virtual Machine Architecture (JVM Architecture)

The JVM is only a specification, and it has many different implementations. You can compare it to an interface with many implementing classes in your code. To see which JVM you are running, run the command “java -version” in the terminal.

If you have installed Oracle JDK, you will see something like:

Java HotSpot(TM) 64-Bit Server VM.

If you have installed OpenJDK, you will see “OpenJDK 64-Bit Server VM” in the terminal (on a 64-bit OS). Personally, I recommend Oracle JDK. It's very stable and focuses on enterprise applications, so this article only covers it.

There are many articles about the JVM architecture on the internet. I recommend this one: https://medium.com/platform-engineer/understanding-jvm-architecture-22c0ddf09722
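The same information that “java -version” prints can also be read from inside a running program through standard system properties. A minimal sketch (the class name `JvmInfo` is made up for this illustration):

```java
public class JvmInfo {

    // Builds a one-line description of the JVM that is running this code.
    static String describe() {
        return System.getProperty("java.vm.name")
                + " | vendor: " + System.getProperty("java.vm.vendor")
                + " | version: " + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

This can be handy in startup logs, where it records which JVM implementation a server is actually running on.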

Source Image: PlatformEngineer.com

I will summarize some points in the above article:

1. Class Loader Subsystem: The JVM resides in RAM. During execution, the Class Loader Subsystem brings class files into RAM. This is Java's dynamic class loading: it loads, links, and initializes a class file (.class) when the class is first referenced at runtime. In the final step, initialization, the static variables of the class are assigned their declared values and its static initializer blocks are executed.

2. Runtime Data Area: the memory areas assigned when the JVM program runs on the OS

  • Method Area (shared among threads): it is sometimes called the Class Area because it stores all class-level data (runtime constant pool, static variables, field data, method data and code). There is only one method area per JVM.
  • Heap Area (shared among threads): all objects and arrays, together with their instance variables, are stored here. There is one Heap per JVM. The Heap area is the main target of GC.
  • Stack Area (per thread): a new stack is created at runtime for every thread. For every method call, one entry, called a stack frame, is pushed onto that stack. Each stack frame holds the local variable array, the operand stack, and a reference to the runtime constant pool of the class that the executing method belongs to.

3. Execution Engine: executes the bytecode that has been loaded into the Runtime Data Area.

  • Interpreter: starts interpreting the bytecode quickly but executes it slowly. The disadvantage is that when one method is called multiple times, it has to be interpreted again on every call.
  • JIT Compiler: addresses the disadvantage of the interpreter. When the execution engine finds repeatedly executed code, it uses the JIT Compiler to compile that bytecode into native code (machine code). The native code is stored in a cache, so later calls run faster.
  • Garbage Collector: collects and removes unreferenced objects. As long as an object is being referenced, the JVM considers it alive. Once an object is no longer referenced and therefore is not reachable by the application code, the garbage collector removes it and reclaims the unused memory. In general, garbage collection is an automatic process. We can request it by calling System.gc() or Runtime.getRuntime().gc(), but this is only a request: the JVM does not guarantee that a collection actually runs.
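The initialization step of the Class Loader Subsystem (point 1 above) can be observed directly: a class is initialized on its first use, and its static initializer runs once, before any constructor. The sketch below records the order of events. The names `ClassInitDemo` and `Lazy` are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class ClassInitDemo {

    static final List<String> EVENTS = new ArrayList<>();

    // Nested class that is not initialized until it is first used.
    static class Lazy {
        static {
            EVENTS.add("Lazy static block");
        }

        Lazy() {
            EVENTS.add("Lazy constructor");
        }
    }

    // Records a marker, then triggers the first use of Lazy.
    static List<String> run() {
        EVENTS.add("before first use");
        new Lazy(); // first active use: the JVM initializes Lazy, then runs the constructor
        return EVENTS;
    }

    public static void main(String[] args) {
        for (String event : run()) {
            System.out.println(event);
        }
    }
}
```

Creating a second `Lazy` instance would add another "Lazy constructor" entry but no second "Lazy static block", because a class is initialized only once per class loader.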

Memory Model (HEAP, Non-HEAP, Other Memory)

The JVM uses memory provided by the Operating System. It divides that memory into three areas: HEAP, Non-HEAP, and Other Memory.

Overview Memory Model
1. HEAP: consists of two parts: Young Generation (Young Gen) and Old Generation (Old Gen).
Source: PlatformEngineer.com

1.1 Young Generation: all new objects are created here. When the young generation fills up, a Minor GC is performed. It's divided into three parts: one Eden Space and two Survivor Spaces (S0, S1). Some points about the young generation:

  • Most of the newly created objects are located in the Eden Space.
  • If Eden space is filled with objects, Minor GC is performed and all surviving objects are moved to one of the survivor spaces.
  • Objects that survive many cycles of Minor GC are moved to the Old Generation space. This is usually controlled by an age threshold that young generation objects must reach before they are promoted to the old generation.

1.2 Old Generation: This is reserved for long-lived objects that survive many rounds of Minor GC. When the old generation is full, a Major GC is performed (which usually takes longer).

2. Non-HEAP (Off-HEAP): Sometimes we call it Off-HEAP. In Java 7 and earlier, this space was called the Permanent Generation (Perm Gen). Since Java 8, Perm Gen has been replaced by Metaspace. Java 7 is rarely used nowadays, because Java 8 was released in 2014 with many improvements, and Java 11 LTS is also available.

Metaspace stores per-class structures such as the runtime constant pool, field and method data, and the code of methods and constructors. Interned Strings, which used to live in Perm Gen, have been stored in the HEAP since Java 7.

By default, Metaspace grows automatically (up to what the underlying OS provides), while Perm Gen always had a fixed maximum size. Two new flags can be used to set the size of the Metaspace: “-XX:MetaspaceSize” and “-XX:MaxMetaspaceSize”.

3. Other Memory

3.1 CodeCache contains compiled code (i.e. native code) generated by the JIT compiler, JVM internal structures, loaded profiler agent code and data, etc.

3.2 Thread Stacks refer to the interpreted, compiled, and native stack frames.

3.3 Direct Memory is used by direct-buffer allocations (e.g NIO Buffer/ByteBuffer)

3.4 C-Heap is used, for example, by the JIT Compiler or by the GC to allocate memory for internal data structures.
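The HEAP and Non-HEAP areas described above can be listed at runtime through the standard `java.lang.management` API. This is a minimal sketch; the pool names it prints depend on the JVM and on the garbage collector in use, so the names are not something to rely on. The class name `MemoryPools` and the helper `countPools` are made up for this illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class MemoryPools {

    // Counts the memory pools that the JVM reports for the given type.
    static long countPools(MemoryType type) {
        long count = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            long usedKb = (usage == null) ? -1 : usage.getUsed() / 1024;
            System.out.println(pool.getType() + " | " + pool.getName() + " | used KB: " + usedKb);
        }
    }
}
```

On a HotSpot JVM the HEAP entries usually correspond to Eden, Survivor and Old Gen spaces, while the NON_HEAP entries include Metaspace and the code cache.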

Garbage Collection

As I said before, GC lets developers write code without allocating and freeing memory by hand, and without worrying much about memory issues. In real projects, however, we still run into many memory issues, and they make the application slow.

Thus, we should understand how GC works. All objects are allocated on the heap managed by the JVM. As long as an object is being referenced, the JVM considers it alive. Once an object is no longer referenced and therefore is not reachable by the application code, the garbage collector removes it and reclaims the unused memory.

How does GC manage objects in the HEAP? It works from a set of starting points called Garbage Collection Roots (GC roots), which are the references that connect application code to objects in the HEAP. The objects reachable from them form a tree. There are four types of GC roots: local variables, active Java threads, static variables, and JNI references. As long as our object is directly or indirectly referenced by one of these GC roots and the GC root remains alive, our object is considered reachable. The moment our object can no longer be reached from any GC root, it becomes unreachable and hence eligible for GC.

GC Roots are objects that are themselves referenced by the JVM and thus keep every other object from being garbage-collected (Source: dynatrace.com)

Mark and Sweep Model

To determine which objects are no longer in use, JVM uses the mark-and-sweep algorithm.

  • The algorithm traverses all object references, starting with the GC roots, and marks every object found as alive.
  • All of the heap memory that is not occupied by marked objects is reclaimed.

It's possible to have unused objects that are still reachable by an application because developers simply forgot to dereference them. This causes a memory leak, so you have to monitor and analyze your application to spot the problem.
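The two steps above can be sketched on a toy object graph. This is only an illustration of the idea, not how a real JVM collector is implemented: the `Node` class, the explicit `heap` list, and the method names are all made up for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

public class MarkAndSweepSketch {

    // A toy "object" that can reference other objects.
    static final class Node {
        final String name;
        final List<Node> references = new ArrayList<>();
        boolean marked;

        Node(String name) {
            this.name = name;
        }
    }

    // Mark phase: flag every node reachable from the roots.
    static void mark(Collection<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            if (!node.marked) {
                node.marked = true;
                pending.addAll(node.references);
            }
        }
    }

    // Sweep phase: remove unmarked nodes from the toy heap, reset the marks
    // of the survivors, and return the names of the reclaimed nodes.
    static List<String> sweep(List<Node> heap) {
        List<String> reclaimed = new ArrayList<>();
        Iterator<Node> it = heap.iterator();
        while (it.hasNext()) {
            Node node = it.next();
            if (node.marked) {
                node.marked = false;
            } else {
                reclaimed.add(node.name);
                it.remove();
            }
        }
        return reclaimed;
    }

    static List<String> demo() {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c");
        Node d = new Node("d");
        a.references.add(b); // a -> b, and a is the only root
        c.references.add(d); // c and d reference each other,
        d.references.add(c); // but nothing reachable from a root points at them
        List<Node> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        mark(Collections.singletonList(a));
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println("Reclaimed: " + demo()); // prints "Reclaimed: [c, d]"
    }
}
```

Note that `c` and `d` are reclaimed even though they reference each other. Reachability from the roots is what counts, which is why circular references alone do not keep objects alive in a tracing collector.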
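A common form of this mistake is a static collection that keeps growing. The sketch below uses a `WeakReference` only as a probe to observe whether the object is still alive. The class `StaticCollectionLeak`, its `CACHE` field, and its methods are made up for this illustration.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

public class StaticCollectionLeak {

    // A static field is a GC root: everything in this list stays reachable.
    private static final List<byte[]> CACHE = new ArrayList<>();

    // Stores a new array in the static list and returns a weak reference
    // that lets the caller check whether the array is still alive.
    static WeakReference<byte[]> remember(int sizeBytes) {
        byte[] data = new byte[sizeBytes];
        CACHE.add(data);
        return new WeakReference<>(data);
    }

    // Drops all references held by the static list.
    static void forgetAll() {
        CACHE.clear();
    }

    public static void main(String[] args) {
        WeakReference<byte[]> probe = remember(1024 * 1024);

        System.gc();
        // Still referenced from CACHE, so the array cannot be collected.
        System.out.println("While cached, alive: " + (probe.get() != null));

        forgetAll();
        System.gc();
        // Now unreachable; usually collected, but System.gc() is only a request.
        System.out.println("After clear, alive: " + (probe.get() != null));
    }
}
```

The first line always reports that the array is alive. The second usually reports that it is gone, although the JVM is free to postpone the collection.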

When objects are no longer referenced directly or indirectly by a GC root, they will be removed. Source: dynatrace.com

Stop the World Event

When GC is performed, all application threads are stopped until the operation completes. Since the Young Generation holds short-lived objects, Minor GC is very fast and the application is barely affected by it. Major GC, however, takes a long time because it checks all the live objects. Major GC should be minimized because it makes your application unresponsive for the duration of the GC.
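How often collections run, and how much time they take in total, can be read from the standard `GarbageCollectorMXBean` API. This is a minimal sketch: the collector names depend on the JVM and the selected GC, and the allocation loop is only an assumption about what is enough to trigger a Minor GC on a typical setup. The class name `GcActivity` and the `sink` field are made up for this illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GcActivity {

    // Written on every iteration so the allocations are not optimized away.
    static volatile byte[] sink;

    // One summary line per collector: name, number of collections,
    // accumulated collection time, and the memory pools it manages.
    static List<String> collectorSummaries() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime()
                    + ", pools=" + Arrays.toString(gc.getMemoryPoolNames()));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Create roughly 200 MB of short-lived garbage.
        for (int i = 0; i < 200_000; i++) {
            sink = new byte[1024];
        }
        for (String line : collectorSummaries()) {
            System.out.println(line);
        }
    }
}
```

On a generational collector, the young-generation collector usually shows a non-zero count after the loop while the old-generation collector stays at zero, which matches the Minor GC and Major GC behavior described above.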

Monitoring & GC Tuning

We can monitor a Java application from the command line or with tools. There are many tools: JVisualVM, JProfiler, Eclipse MAT, JetBrains JVM Debugger, NetBeans Profiler, ... I personally recommend JVisualVM, which is bundled with the JDK (up to Java 8). It's good enough for monitoring your application.

Jstat: we can use the jstat command-line tool to monitor JVM memory and GC activity. Example: “jstat -gc <pid> 1000” (prints memory and GC data every second).

Example with `jstat` command
Description of the keywords

Note: If you can't run the command or you get the error “Could not attach to <pid>”, run the command as the root user. If you want to know more about the output above, you can search for it online. :)

JVisualVM: we can open the GUI tool from the terminal with the command “jvisualvm”. I used this tool for the example at the beginning of this article. I personally recommend using JVisualVM for monitoring and GC tuning before releasing any feature to the beta/staging/production environment. You should check for memory issues to:

  • Guarantee your application consumes as little memory as possible.
  • Guarantee your application runs fast and has no memory leaks.

Note that your application can also use native memory (Metaspace, Direct Memory), which the GC does not manage the way it manages the HEAP. In that case, you have to allocate and release the memory yourself. When you use a 3rd-party library, check it carefully first. My team had a problem with a 3rd-party library when we integrated it into our project. We assumed it would use the HEAP, so we created multiple instances of it in our application, but it actually used direct memory (ByteBuffer). When we deployed the application to the server in the beta environment, everything worked fine. After performance testing with JMeter, we got an Out of Memory error (a memory leak).
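Direct memory usage does not show up in the HEAP graph, but it can be observed through the standard `BufferPoolMXBean` API. This is a minimal sketch, assuming the JVM reports a buffer pool named "direct" (HotSpot does; other JVMs may not, in which case the helper returns -1). The class name `DirectMemoryDemo` and the 16 MB buffer size are made up for this illustration.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

public class DirectMemoryDemo {

    // Bytes currently used by direct buffers, or -1 if the JVM
    // does not report a buffer pool named "direct".
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();
        // Allocated outside the Java heap.
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        long after = directMemoryUsed();

        System.out.println("Is direct: " + buffer.isDirect());
        System.out.println("Capacity: " + buffer.capacity() + " bytes");
        System.out.println("Direct memory before: " + before + ", after: " + after);
    }
}
```

Checking this number during a load test is one way to notice a library that allocates direct buffers behind the scenes, like the one in the story above. The total can be capped with the “-XX:MaxDirectMemorySize” option.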

Java Non-Standard Options

To improve the performance of your application, you should check and set the Java non-standard options appropriately. You can list the non-standard options from the command line with “java -X”. Please take a look at:

Some frequently used options:

  • -Xms<size>[unit] (‘g’ for GB, ‘m’ for MB, and ‘k’ for KB): For setting the initial heap size when JVM starts. Default: Initial heap size of 1/64 of physical memory up to 1 GB.
  • -Xmx<size>[unit] (‘g’ for GB, ‘m’ for MB, and ‘k’ for KB): For setting the maximum heap size. Default: Maximum heap size of 1/4 of physical memory up to 1 GB.
  • -Xss<size>[unit] (‘g’ for GB, ‘m’ for MB, and ‘k’ for KB): sets the Java thread stack size. The default depends on your OS. You can check it from the command line: java -XX:+PrintFlagsFinal -version | grep ThreadStackSize (unit: KB)
Default ThreadStackSize on my Mac: 1024KB
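To see the effect of these options, a small probe can print the heap limits and measure how deep a recursion gets before the thread stack is full. This is a minimal sketch: the class name `JvmOptionsProbe` and its methods are made up for this illustration, and the measured depth varies between runs and JVMs, so treat it as a relative indicator only.

```java
public class JvmOptionsProbe {

    // Heap figures in bytes: { maximum heap (-Xmx), currently reserved heap
    // (starts near -Xms), free memory inside the reserved heap }.
    static long[] heapSnapshot() {
        Runtime rt = Runtime.getRuntime();
        long total = rt.totalMemory();
        long free = rt.freeMemory();
        return new long[] { rt.maxMemory(), total, free };
    }

    // Each call adds one stack frame until the thread stack overflows.
    private static void recurse(int[] frames) {
        frames[0]++;
        recurse(frames);
    }

    // Counts how many frames of recurse() fit on the stack of a new thread.
    // A new thread gets the default stack size, which -Xss controls.
    static int maxRecursionDepth() throws InterruptedException {
        final int[] frames = new int[1];
        Thread worker = new Thread(() -> {
            try {
                recurse(frames);
            } catch (StackOverflowError expected) {
                // The stack is full; frames[0] holds the depth that was reached.
            }
        });
        worker.start();
        worker.join();
        return frames[0];
    }

    public static void main(String[] args) throws InterruptedException {
        long mb = 1024 * 1024;
        long[] heap = heapSnapshot();
        System.out.println("Max heap (MB): " + heap[0] / mb);
        System.out.println("Reserved heap (MB): " + heap[1] / mb);
        System.out.println("Free in reserved heap (MB): " + heap[2] / mb);
        System.out.println("Recursion depth reached: " + maxRecursionDepth());
    }
}
```

Running it as, for example, `java -Xms256m -Xmx512m -Xss512k JvmOptionsProbe` and then again with different values shows how the reported numbers follow the options.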

For more details about GC tuning, see https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/index.html. Sometimes, you have to take several HEAP dumps and compare them to find the problem.

Some notes to make good performance when developing the web application

  • Limit the creation of new objects and release references to them as soon as possible.
  • Use JVisualVM to monitor your application before releasing your application on the beta/staging/production environment.
  • Check carefully before using a 3rd-party library.
  • Learn the common causes of memory leaks and build best practices around them: Mutable Static Fields and Collections, Thread-Local Variables, Circular and Complex Bi-Directional References, ByteBuffer, BufferedImage, Unclosed Stream, Unclosed Connection, …
  • Review code carefully

Summary

Thank you for reading. This article is very long. On the internet, you can find many good articles about memory leaks, JVM architecture, and so on. I hope this article gives you something useful from my experience.
