Understanding and Implementing a Garbage Collector

Nitin Tulswani
4 min readSep 23, 2017

--

Introduction

Garbage collection is the process of recycling the dynamically allocated memory. It is performed by a garbage collector whose job is to recycle the memory which will never be used again (no object references or no reachability).

In high-level languages, like C, have memory management primitives, malloc() and free() which allocate the memory and release the allocated memory when it is not required anymore. But in JavaScript, memory allocation and deallocation is implicit.

Writing a garbage collector

There are various different ways of writing a garbage collector but we will use the simplest and the first algorithm ever invented by “John McCarthy” in 1958 as part of the implementation of Lisp.

The garbage collection algorithms rely on the conception of references. According to the Mozilla Developer Network documentation,

An object is said to reference another object if the former has an access to the latter (either implicitly or explicitly). For instance, a JavaScript object has a reference to its prototype (implicit reference) and to its properties values (explicit reference).

For the sake of simplicity, we will implement the basic concept of mark and sweep algorithm because the actual implementation involves a lot of optimisations.

Mark and Sweep Algorithm

The algorithm says that -

Starting at the root, traverse an entire object graph (objects are stored in heap) and mark the bits on it to 1 whenever we reach an object. After we’re done with the traversing, delete those objects whose bits are not marked to 1.

Now according to the algorithm, we’ve to start with the root. We’ll assume the root to be the first element of heap array. Garbage collector algorithms rely on notion of references or reachability, so to implement one we will need to follow all the reachable objects starting from the root.

Chain of objects

Let’s remove “C” reference from “A” and add another object “D”.

“C” reference is removed from “A” and is not reachable anymore (GC)

Now let’s remove “B” reference from “A”.

“B” reference is removed from “A”

It means that “D” still has a reference to “A” (via “B”) but it’s not reachable because “B” is not reachable anymore (remember the main notion is reachability and reference).

Let’s create a function that will traverse all the reachable objects starting from the root and will set the “__markBit__” bit on it to 1.

Mark() function

Now let’s create a function for sweep to move all the unmarked or unreachable objects to the free list.

Sweep() function

We have defined the process for mark and sweep, so now we’ll create our garbage collector function.

Garbage collector

And we’re done! Now let’s create one last function to execute the algorithm.

main() function

Output

Yeah! We did it 😎. Next time someone ask you to implement a garbage collector (mostly in interviews), you know how to do it 👍

This was possible only because of dmitrysoshnikov 👏👏 so don’t forget to follow him on Twitter. His repository is a gold mine if you love parsers, compilers and VM.

Full source code for the algorithm is available here.

Show off 😜

Hey if you are into React these days then check out my recent work. I’ve built a custom renderer for React which renders the React components to word documents.

Full source code and documentation is available here.

I’ve also built an image processing component for React. If you’re into shaders, webGL and image processing stuff, you’ll really like it 😄

Demo for react-imgpro

I’d love to hear your feedback ! I’m at @NTulswani on twitter so don’t forget to click on the follow button 😜.

Thanks for reading the article!

--

--

Nitin Tulswani

Focused on context sensitive information | Makes open source tools | Information design | Processing | Hello 👋🏽