Garbage Collection in Java explained

Rakesh Reddy
7 min read · Aug 30, 2023


“Java garbage collection is the process by which Java programs perform automatic memory management”. Java source code is compiled into bytecode, which the JVM (Java Virtual Machine) then runs. When a Java program runs on the JVM, objects are created in the heap space, which is a portion of memory dedicated to the program. Eventually, some objects will no longer be needed. To free up memory, the garbage collector discovers and removes these unused objects. Garbage collection is performed in the background by a daemon thread called the garbage collector.

Understanding the need for garbage collection:

In programming languages like C and C++, the programmer is responsible for destroying objects. If a programmer forgets to destroy unused objects, the program leaks memory; after a certain point there is no memory left to create new objects, and the application fails with an out-of-memory error. In Java, garbage collection happens automatically during the lifecycle of a program and frees the memory held by unused objects, which avoids this kind of leak (objects that are still referenced by mistake are never collected, so leaks remain possible). The garbage collection implementation lives in the JVM. Each JVM can implement its own version of garbage collection.

Object state: Dead vs Alive

When Java programs run on the JVM, objects are created in the heap space. Over the lifetime of a Java application, new objects are created and released, and at any point in time the heap contains 2 types of objects: dead and alive. Alive objects are still used and referenced from the application. Dead objects are no longer used or referenced from anywhere in the application. The garbage collector detects these dead objects and deletes them to free up memory. An object does not become a candidate for garbage collection until all references to it are discarded.

Let's look at the ways an object reference can be released so that the object becomes available for garbage collection:

Making a reference null: When every reference variable that points to an object is set to null, the object becomes unreachable, which makes it eligible for garbage collection.

null reference

Reassigning the reference variable: When a reference variable is reassigned to a different object, the previous object is no longer referenced (assuming no other variable points to it), so it is unreachable and eligible for garbage collection. For example, if st1 is reassigned to point to a second object, the first object that st1 referenced is now eligible for GC.

reassigning the reference

Creating objects inside a method: When a method is called and creates objects that are referenced only by its local variables, those objects become unreachable when the method finishes, which makes them eligible for garbage collection. For example, once method() returns, the objects it created are eligible for GC, unless one of them was returned or stored somewhere else.

Objects inside a method

Using an anonymous object: An anonymous object is created without storing its reference in any variable, so it becomes unreachable as soon as the expression that created it finishes, which makes it eligible for garbage collection.

anonymous object
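The four cases above can be put together in one small program. This is only a sketch: the `Student` class, its `created` counter, and the `run()` helper are names made up for the illustration, and the comments mark where each object becomes eligible for collection. When the JVM actually reclaims them is up to the collector.

```java
public class ReleaseReferenceDemo {

    // Hypothetical class used only for this illustration.
    static class Student {
        static int created = 0;      // counts constructor calls
        final String name;

        Student(String name) {
            this.name = name;
            created++;
        }
    }

    // Objects referenced only by local variables are unreachable after the method returns.
    static void method() {
        Student a = new Student("local-1");
        Student b = new Student("local-2");
        // a and b go out of scope here, so both objects become eligible for GC
    }

    // Runs all four cases and returns how many Student objects were created.
    static int run() {
        Student.created = 0;

        // 1. Making a reference null
        Student st = new Student("nulled");
        st = null;                   // "nulled" is now eligible for GC

        // 2. Reassigning the reference variable
        Student st1 = new Student("first");
        Student st2 = new Student("second");
        st1 = st2;                   // "first" is now eligible; "second" has two references

        // 3. Creating objects inside a method
        method();

        // 4. Using an anonymous object
        new Student("anonymous");    // never stored, eligible right away

        return Student.created;
    }

    public static void main(String[] args) {
        System.out.println("Objects created: " + run());
    }
}
```

Of the six objects created, only "second" is still reachable while run() is executing; the other five are candidates for the next collection.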

There are 3 basic steps in garbage collection (Mark, Sweep, Compact)

Mark: Marking objects as alive. In this step, the GC traverses the object graph, starting from the root references (such as local variables and static fields), to identify all the alive objects in memory. Objects that can be reached this way are marked as alive, and objects that cannot be reached are considered eligible for garbage collection.

Sweep: The second step is sweeping dead objects. After identifying alive and dead objects, GC will free the memory which contains the dead objects.

Compact: The dead objects removed during the sweep stage may not be next to each other, which can leave the memory fragmented. The compact phase moves the surviving objects into a contiguous block at the start of the heap.

Below is an illustration of how GC performs Mark, Sweep, and Compact.

Mark, Sweep, Compact
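The three steps can also be followed on a toy heap. The sketch below is not how any real JVM collector is implemented; it models the heap as a small array of slots, and the `Obj` class, the `layout` helper, and the object names A to E are all made up for the illustration.

```java
public class MarkSweepCompactDemo {

    // Hypothetical heap object: a name, outgoing references, and a mark bit.
    static class Obj {
        final String name;
        final java.util.List<Obj> refs = new java.util.ArrayList<>();
        boolean marked;

        Obj(String name) { this.name = name; }
    }

    // Mark: depth-first traversal of everything reachable from a root.
    static void mark(Obj o) {
        if (o == null || o.marked) return;
        o.marked = true;
        for (Obj ref : o.refs) mark(ref);
    }

    // Sweep: free every slot that holds an unmarked (dead) object.
    static void sweep(Obj[] heap) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !heap[i].marked) heap[i] = null;
        }
    }

    // Compact: slide survivors to the start of the heap and reset their mark bits.
    static void compact(Obj[] heap) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                Obj live = heap[i];
                heap[i] = null;
                live.marked = false;
                heap[free++] = live;
            }
        }
    }

    // Renders the heap as a string, "_" for an empty slot.
    static String layout(Obj[] heap) {
        StringBuilder sb = new StringBuilder();
        for (Obj o : heap) sb.append(o == null ? "_" : o.name);
        return sb.toString();
    }

    static String[] run() {
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C"),
            d = new Obj("D"), e = new Obj("E");
        Obj[] heap = { a, b, c, d, e, null };
        a.refs.add(c);               // A -> C
        c.refs.add(e);               // C -> E
        b.refs.add(d);               // B -> D, but nothing reaches B
        Obj[] roots = { a };         // A is the only root

        String before = layout(heap);
        for (Obj root : roots) mark(root);
        sweep(heap);
        String afterSweep = layout(heap);
        compact(heap);
        String afterCompact = layout(heap);
        return new String[] { before, afterSweep, afterCompact };
    }

    public static void main(String[] args) {
        for (String s : run()) System.out.println(s);
        // prints ABCDE_, then A_C_E_, then ACE___
    }
}
```

B and D reference each other's neighbourhood but are not reachable from the root, so they are swept even though D still has an incoming reference. That is the difference between reachability and simple reference counting.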

Several analyses have shown that most objects are short-lived, so running the mark and compact steps frequently over every object in the heap would be inefficient and time-consuming. To solve this, Java garbage collectors implement a generational garbage collection strategy that categorizes objects by age.

Generational Garbage Collection: The garbage collector classifies memory into 3 generations: Young Generation, Old Generation and Permanent Generation.

Young Generation: All newly created objects start in the young generation, which is subdivided into the Eden space (where all new objects start) and two Survivor spaces (S0 - FromSpace, S1 - ToSpace), where objects are moved from the Eden space after surviving a garbage collection cycle. When objects are garbage collected from the young generation, it is a minor garbage collection event.

Flow of objects in the young generation:

  1. First, objects are allocated in the Eden space, while both survivor spaces are empty.
  2. The JVM performs a minor GC when the Eden space is filled with objects (either dead or alive). Once the dead objects are removed, all the alive objects are moved from the Eden space to S0. Now, Eden and S1 are empty.
  3. When the Eden space is again filled with objects, another minor GC is performed and removes all the dead objects. This time, the alive objects from the Eden space and S0 are moved to S1. Now, Eden and S0 are empty. At any given point in time, one of the survivor spaces is empty.
  4. When surviving objects have been copied between the survivor spaces a certain number of times (the tenuring threshold), they are moved to the Old Generation.
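The flow above can be simulated with a few lists. This is a simplified model, not the JVM's implementation: the `alive` flag stands in for reachability, the `from`/`to` lists play the roles of S0 and S1, and the threshold of 2 is an arbitrary value picked for the demo (HotSpot exposes the real limit through the `-XX:MaxTenuringThreshold` option).

```java
import java.util.ArrayList;
import java.util.List;

public class YoungGenDemo {

    // Hypothetical threshold: objects older than this are promoted.
    static final int TENURING_THRESHOLD = 2;

    static class Obj {
        final String name;
        boolean alive = true;        // stands in for "still reachable"
        int age = 0;                 // number of minor GCs survived

        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();   // survivor space currently holding objects
    List<Obj> to = new ArrayList<>();     // survivor space currently empty
    final List<Obj> old = new ArrayList<>();

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);                 // step 1: new objects start in Eden
        return o;
    }

    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!o.alive) continue;  // dead objects are simply not copied
            o.age++;
            if (o.age > TENURING_THRESHOLD) old.add(o);  // step 4: promotion
            else to.add(o);                              // steps 2 and 3: copy to the empty survivor
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;        // swap roles so one survivor space is always empty
        from = to;
        to = tmp;
    }

    static String names(List<Obj> space) {
        StringBuilder sb = new StringBuilder("[");
        for (Obj o : space) sb.append(o.name);
        return sb.append("]").toString();
    }

    String state() {
        return "eden=" + names(eden) + " from=" + names(from)
             + " to=" + names(to) + " old=" + names(old);
    }

    static String[] run() {
        YoungGenDemo heap = new YoungGenDemo();
        Obj a = heap.allocate("a");
        Obj b = heap.allocate("b");
        Obj c = heap.allocate("c");
        b.alive = false;             // b dies before the first minor GC
        heap.minorGc();
        String first = heap.state();

        heap.allocate("d");
        c.alive = false;             // c dies while sitting in a survivor space
        heap.minorGc();
        String second = heap.state();

        heap.allocate("e");
        heap.minorGc();              // a survives its third GC and is promoted
        String third = heap.state();
        return new String[] { first, second, third };
    }

    public static void main(String[] args) {
        for (String s : run()) System.out.println(s);
    }
}
```

After the third collection, `a` has outlived the threshold and sits in the old list, while the younger `d` and `e` are still bouncing between the survivor spaces.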

Old Generation: Eventually, the GC moves the long-lived objects from young generation to old generation. When objects are garbage collected from old generation, then it is a major garbage collection event. A full garbage collection cleans up both young and old generations.

Permanent Generation: The JVM stored metadata such as classes and methods in the permanent generation. Starting with Java 8, the Metaspace memory space replaced the PermGen space. The implementation differs from that of PermGen: Metaspace is allocated from native memory rather than a fixed-size region, and by default it grows automatically. This avoids the problem of applications running out of memory because of the limited size of the PermGen space. Metaspace can be garbage collected, and classes that are no longer used can be cleaned up automatically when Metaspace usage reaches a threshold or its configured maximum size.

Now that we understand how the GC operates, let's look at the different types of garbage collectors available in the JVM.

Serial GC: This is a basic garbage collector designed for smaller applications running in single-threaded environments. It uses a single thread for collection, and when it runs it causes a “stop-the-world” event where the entire application is paused.

SerialGC

Parallel GC: This collector uses multiple threads to speed up garbage collection, but it does not run alongside the application. Multiple threads are used for minor garbage collection in the young generation. Older JVMs used a single thread for major garbage collection in the old generation, while recent HotSpot versions collect the old generation with multiple threads as well. Running Parallel GC also causes a “stop-the-world” event. It suits multi-core environments where a lot of work needs to be done and long pauses are acceptable.

ParallelGC

Concurrent Mark Sweep (CMS GC): In the CMS GC, multiple threads are used for minor and major garbage collection. CMS does most of its work concurrently with the application threads in order to minimize stop-the-world pauses. As a result, CMS uses more CPU than Parallel GC. So, if you can spare extra CPU in exchange for shorter pauses, CMS is a better choice than Parallel GC. Note that CMS was deprecated in Java 9 and removed in Java 14.

CMSGC

Garbage First (G1 GC): This has been the default collector in HotSpot since Java 9. G1 GC was intended as a replacement for CMS GC and was designed for multi-threaded applications that have a larger heap size available. It is parallel and concurrent like CMS, but under the hood it works differently from the older collectors. G1 is generational, but it does not have physically separate, contiguous young and old generations; instead, each generation is a set of regions, which allows the generations to be resized flexibly.

It partitions the heap into a set of equal-sized regions and uses multiple threads to scan them. Once the mark phase is complete, G1 knows which regions contain the most dead objects and performs garbage collection on those regions first, hence the name.

G1GC

Epsilon GC: This was released as an experimental feature in JDK 11 and is a do-nothing garbage collector. Epsilon GC handles memory allocation but does not reclaim any memory. Once the available Java heap is exhausted, the JVM shuts down with an OutOfMemoryError. This raises the question: why do we need a garbage collector that doesn’t collect garbage? There are use cases where we know that the available heap will be enough and we don’t want the JVM to spend any resources on garbage collection, for example performance testing.
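To check which of the collectors described above a given JVM is actually running, the standard `java.lang.management` API can be queried. The sketch below only reports names; the class name `ActiveCollectors` is made up, and the exact strings returned depend on the JVM vendor and version. The HotSpot launch flags in the comments select a collector explicitly.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class ActiveCollectors {

    // HotSpot flags for choosing a collector at startup:
    //   -XX:+UseSerialGC          Serial GC
    //   -XX:+UseParallelGC        Parallel GC
    //   -XX:+UseConcMarkSweepGC   CMS (only on JVMs that still ship it)
    //   -XX:+UseG1GC              G1 GC
    //   -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC   Epsilon GC
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        // Each bean represents one collector (typically one per generation or phase).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Collectors in this JVM: " + collectorNames());
    }
}
```

Running it as, for example, `java -XX:+UseSerialGC ActiveCollectors` and then again with `-XX:+UseG1GC` shows different bean names for the same program.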

Advantages of Java Garbage Collection: The biggest advantage of GC is that you don’t have to worry about deallocating objects after using them; the GC takes care of it. GC makes Java memory-efficient, as it removes unreferenced objects from heap memory to make room for newly created objects. A number of tools let you monitor heap use and garbage collection, and Java provides a variety of options for tuning the garbage collector to improve its efficiency.
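As a small illustration of monitoring from inside the program, the sketch below reads heap usage and collection counts through the standard management beans. The class name `HeapMonitor` and the allocation loop are made up for the demo, and the numbers printed vary from run to run, so none are shown. `System.gc()` is only a request that the JVM may ignore.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class HeapMonitor {

    // Bytes of heap currently in use, as reported by the JVM.
    static long usedHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    // Total number of collections across all collectors so far.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();   // -1 if the collector does not report it
            if (count > 0) total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("Used heap before: " + usedHeapBytes() + " bytes");
        System.out.println("Collections before: " + totalCollections());

        // Create short-lived garbage: each array is unreachable after its iteration.
        for (int i = 0; i < 100_000; i++) {
            byte[] garbage = new byte[1024];
        }
        System.gc();   // a hint, not a guarantee

        System.out.println("Used heap after: " + usedHeapBytes() + " bytes");
        System.out.println("Collections after: " + totalCollections());
    }
}
```

For monitoring without code changes, the JVM can also log collections itself, for instance with the `-verbose:gc` launch option.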
