ANDROID DEVELOPMENT

Android Garbage Collection in a Nutshell

Sayan Subhra Banerjee
9 min readMar 10, 2022

Have you ever wondered how Garbage Collection works under the hood in Android? 🤷‍♂️

The term “under the hood” basically, boils down to these 4 simple questions:

  1. What is GC?
  2. How it works internally?
  3. What happens when it fails?
  4. Who is eligible for GC?

On your marks, get set, GO!!!!

Photo by Clemens van Lay on Unsplash

1. What is GC?

GC is a mechanism of collecting unused objects and freeing up (recycling ♺) the memory space for further usage.

2. How it works internally?

Till now, there were 5 versions of GC, in the Android OS, namely:

  1. Dalvik GC (Until KitKat): “Stop the World GC”
  2. ART GC (Lollipop and Marshmallow): Generational GC
  3. ART GC (Nougat): ART/Dalvik Android Team rewrites the entire allocation process in assembly code.
  4. ART GC (Oreo): “Concurrent Copying GC”
  5. ART GC (Android 10 onwards): Re-introduced “Generational GC” as an extension of powerful “Concurrent Copying GC”!

Dalvik “Stop the World GC”

Until KitKat

This GC runs on : “Concurrent Mark-and-Sweep(CMS) algorithm

Let’s understand this algorithm:

Step 1: Find Roots:

  • The GC pauses all the threads in the system to find the roots (we will talk about the “roots” later in this course. At a very high level, for now, just think of it as a root node in a tree).
  • This phase is not concurrent, so identifying roots takes time, and during this time the application cannot do anything.
  • In this phase, one boolean flag “marked” is set to “false” by default for each and every object.

Step 2: Mark Reachable Objects:

  • The objects which have a direct or indirect reference to the roots are marked as REACHABLE in this phase! (We’ll also talk about Object Reachability at later point of this article. Please, bear with me! 🙏)
  • This phase is concurrent.
  • In this phase, that boolean flag “marked” is set to “true” to each and every object which is Reachable.
  • Unreachable objects are not collected immediately in this phase. It’s deferred until all available memory has been exhausted.
  • Depth-First-Search (DFS) traversal is followed.

NOTE: As this phase is concurrent, so the app is running again and it can again allocate new objects!
Which means,
Step1 and Step2 will again execute until and unless all of the reachable objects are identified and marked as true.

Step 3. Sweep Unreachable Objects:

  • The objects, which are not marked as reachable in the previous phase, are considered as “unreachable” objects and they are swept from the heap, i.e. Garbage Collected!
Mark-and-Sweep DFS Traversal

Disadvantage:

  1. AsUnreferenced objects are not reclaimed immediately, the short-lived objects remain in memory a lot longer after they become unused.
  2. Pauses the app thread for its job. Too frequent pauses can result into frame drops and eventually the app will start to stutter!
  3. Fragmentation takes place, which can result into OutOfMemory error when a big object (like bitmaps) tries to allocate memory!

“Concurrent Copying GC”

Also known as CC and was introduced on Oreo.

Here comes a really big improvement on the GC. They rewrote the entire collector. The new collector was named “Concurrent Heap Compaction Collector”.

  • This new version comes with an “always-defragmentation strategy for the “heap compaction” problem.
  • They made the GC “heap compaction always necessary, even in the foreground.
  • Because of this, the GC can compact more often the heap of the applications that are always in foreground (i.e. System Services, Play Services, SystemUI, etc.) along with the apps in background too.
    According to Google, this brings about a 30% savings on overall memory requirements.
  • This process is “Concurrent”, so the app threads are not paused!
  • GC divided the heap into “buckets” (e.g. 256k buckets to allocate objects). When a thread needs memory, the GC gives that thread a “bucket” to allocate the objects locally, instead of locking the entire system.
  • When a bucket gets defragmented, if it is less than 70% or 75% utilization, that bucket will get collected and the objects in there will be moved to another bucket, emptying the first one.
  • They also introduced: “bump-the-pointer technique for allocating objects
  • With this pointer technique, the only thing that is needed to allocate a new object is to move a pointer in a certain bucket.
    When a bucket is given to a thread, it looks up if the new object fits in the remaining space between the pointer and the end of the bucket.
    If yes, the new object is allocated and the pointer updated with the address of the new object.

This technique brings a marvellous 18 times faster allocation time than Dalvik and 70% faster than Nougat.

Disadvantage:

  • Removal of most famous “Generational GC”.

But the ART Team realized that and then they re-introduced “Generational GC” as an extension of “Concurrent Copying”, as part of Android 10.

Generational GC”

Introduced on : Lollipop, Marshmallow
Re-Introduced on: Android 10, as an extension of “Concurrent Copying” GC

This also runs on “Concurrent Mark-and-Sweepalgorithm (like before), but this time, with some added advantage, surprise surprise: “Generations”! 😬

To understand this, first we need to understand the heap structure.

We all know that, the “objects” live inside the Heap while “reference” and “variables” live in the Stack.

The ART (Android Runtime) GC basically scans the heap and analyzes the object. If it finds unreferenced objects, they will be marked for GC.

However, frequent garbage collection consumes lot of CPU and it will pause the app — resulting in frame drops, which can cause app stuttering, juddering or janky app.

By the way, if you want to know more about dropping/skipping frames in Android, please check this tutorial. 🙂

To overcome this, Heaps are broken down into 3 categories, namely:

  1. Young Generation (further divided into 3 categories: Eden, S0 and S1)
  2. Old Generation
  3. Permanent Generation

Each object inside the Heap has an “age”, (can be thought of as an integer value) which increases during the course of certain operations.

  • Young Generation ➞ contains freshly allocated “short-lived” objects.
    The GC happens here is known as Minor GC.
  • Old Generation ➞ contains “older” objects, that has lived for a longer period of time. (For example, Application class or Repositories).
    The GC happens here is known as Major GC.
  • Permanent Generation ➞ consists of metadata required for the runtime environment (such as, JVM, ART etc.) which is used to describe classes and methods. Classes that are no longer in use maybe “Garbage Collected” from this generation. We can use the -XX:PermGen and -XX:MaxPermGen flags to set the initial and maximum size of the “Permanent Generation”.

Let’s now try to understand the symmetrical rhythm of this process with a simple diagram:

  • All new allocations happen first within the Young Generation part of the heap (inside “Eden” to be precise).
  • At this point, the “S0” (Survivor 0) and “S1” (Survivor 1) stay empty!
  • Once “Eden” is filled with objects completely, a minor GC happens.
  • During this GC, all the unreferenced objects are released from the heap.
  • Only the referenced objects survive in this phase and are moved into S0.
  • Eden gets empty!
  • After some time, Eden gets filled up with the new allocations again.
  • GC takes charge!
  • Like before, all the unreferenced objects are released from the heap.
  • Survivors, however, are now moved into S1 (instead of S0) and the age (that integer value) of S0 objects are now “incremented” and then pushed into S1.
  • So now, Eden and S0 both are empty and only S1 has the referenced objects.
  • After some time, Eden gets filled up with the new allocations again.
  • GC shows up!
  • All the unreferenced objects are again released from the heap.
  • Survivors, however are now moved into S0 (instead of S1) and the age of S1 objects are “incremented” and pushed into S0.
  • Now both Eden and S1 are empty and S0 is having all the referenced objects.
  • This entire process will happen again and again.
  • If any of the referenced objects reach a certain age, then that specific object will get promoted to the “Old Generation”.
  • Then the “Old Generation” block gets filled up with the objects.
  • This time Major GC comes to rescue!

This symmetrical flow (if we look closely to this algorithm, we will indeed find a symmetry) of GC , is known as “Generational Garbage Collection”!

3. What happens when it fails?

Some objects won’t be garbage collected, if they holds a strong reference even after they have lived their life.
These type of survivors start to pile up pretty soon on the Heap memory.
And eventually, at some point, when a new object wants to allocate memory, it won’t be able to do that because there is no memory left for allocation! As a consequence of that, the app will be crashed with “OutOfMemoryError”!

This phenomena is known as Memory Leak!
Memory leaks are dangerous — even if it is a very small one!

Remember?

“A small leak will sink a great ship. — Benjamin Franklin

4. Who is eligible for GC?

To determine the eligibility criteria, we need to understand a very simple question:

“Is the object REACHABLE?”

Yes, it’s high time to understand “Object Reachability"! Finally!!

Now, what does this term REACHABLE stands for?

Well, Object Reachability can be thought of as a series of 3 simple questions:

  1. Is the object a ROOT?
  2. Does the object has a REFERENCE?
  3. Is any of the Parent object REACHABLE?
    (This is a recursive call on the parent)

Let’s try to understand this with a flowchart diagram:

From this flowchart diagram we can clearly see, the objects which are not reachable (marked in RED), is eligible for Garbage Collection!

In Application Memory Graph it looks like the following:

Green Circles in this diagram are considered as ROOTs.

So, what are the Roots?

These are the objects that basically hold other objects in memory.

Important ROOTs in Android App can be:

  • Objects referenced from static fields
  • Instances or custom subclass of Application class
  • Live Threads

Note: “Live Listeners”, “Live RxJava Subscriptions”, “Live Coroutines Jobs” can be built on top of other ROOTs.

Garbage Collector Operation Example:

Let's consider, We have instantiated 2 objects in our MainActivity class. Let’s call them MyObject1 and MyObject2.Also considering, 1. :MyObject2 is referenced by :MyObject1 and 
2. :MyObject1 is referenced by :MainActivity

Pictorial Representation:

Is :MyObject2 reachable?

  1. Is :MyObject2 a root? ➞ NO
  2. Is :MyObject2 referenced? ➞ YES

Is :MyObject1 reachable?

  1. Is :MyObject1 a root? ➞ NO
  2. Is :MyObject1 referenced? ➞ YES

So basically, :MyObject1 and :MyObject2 are REACHABLE if :MainActivity is REACHABLE (Transitivity).

Is :MainActivity reachable? YES

Activity objects are considered REACHABLE until onDestroy() lifecycle callback is invoked by Android.

Bonus tip : 😎

Special Edge case : Circular Reference

Here two objects are pointing to one another forming a Circular Dependency.
If the GC algorithm was totally naive, it would have got into an infinite loop between these two objects.
But you know what? GC is smart enough to understand this scenario and it will search for any other references; if there are no other references to either of the objects, both will be considered as unreachable and will be eligible for GC!

GC can handle circular references involving many more objects. So regardless of how many objects are there in the circle of chain, GC will understand that these objects basically point to one another and if there are no external references to that group of objects, all of them will be considered unreachable and their memory will be reclaimed!

Not bad huhh!!

That’s all for now!
If you have made it till the finish line, I hope you’ve enjoyed reading every bit of it!

If you’d like to see more content like this, please follow me and leave a clap 👏 for me. This will help motivate me writing more powerful contents!

Stay tuned! 🙂

--

--