Regardless of the language you use to write code, your program needs to allocate and access memory in order to store variables. In many high-level languages memory management is carried out for you, which makes for a simpler coding experience. Even if you’re using a high-level language it is still a good idea to have an understanding of the memory management process, and its potential pitfalls.
The memory lifecycle has three steps, common across most programming languages:
1. Memory Allocation
2. Using Variables
This stage of the memory lifecycle occurs when a variable that has been previously allocated memory is used by a program; for example its value is read or rewritten, the variable is passed to a function, etc.
3. Releasing Memory
Since there is no infallible method for determining when memory should be released, occasionally variables which are no longer needed by a program are not detected by the garbage collection algorithms and remain in memory (memory leaks).
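The three stages can be seen in a few lines of JavaScript. This is only an illustrative sketch: the `visitor` object and `describeVisitor` function are made-up names, and the engine decides for itself when the released memory is actually reclaimed.

```javascript
// 1. Allocation: the engine reserves memory for the object literal.
let visitor = { name: 'Ada', visits: 0 };

// 2. Use: the value is rewritten, read, and passed to a function.
function describeVisitor(v) {
  return `${v.name} has visited ${v.visits} time(s)`;
}
visitor.visits += 1;
const summary = describeVisitor(visitor);

// 3. Release: once nothing references the object any more, the garbage
// collector is free to reclaim it at some later point.
visitor = null;
```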
Let’s look at how garbage collectors evaluate which objects can be removed from memory. There are two primary garbage collection algorithms:
Reference-Counting Algorithm
This is the more naive of the two algorithms. In the reference-counting algorithm, an object is considered ‘garbage’ and ready for collection if no other part of the code references it, either implicitly or explicitly. Some older browsers such as Internet Explorer 6 and 7 relied on this approach, and some garbage collectors still use it in conjunction with the mark and sweep approach.
This algorithm has one large drawback: it cannot detect circular references. If two variables reference each other but are not needed anywhere else in the code, each still has a reference pointing at it, so by the standards of this algorithm both are still ‘needed’ and neither is collected.
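The circular reference problem can be illustrated with a toy model. This is not how any real engine is implemented; `makeToyObject`, `point`, `dropExternalReference` and `isCollectable` are hypothetical helpers that track a reference count by hand.

```javascript
// Toy model: each object records how many references point at it.
function makeToyObject(name) {
  return { name, refCount: 0, ref: null };
}

// Make `from` reference `to`, incrementing the target's count.
function point(from, to) {
  from.ref = to;
  to.refCount += 1;
}

// Simulate a variable holding the object going out of scope.
function dropExternalReference(obj) {
  obj.refCount -= 1;
}

// A pure reference counter only frees objects whose count reaches zero.
function isCollectable(obj) {
  return obj.refCount === 0;
}

const a = makeToyObject('a');
const b = makeToyObject('b');
a.refCount += 1; // held by a local variable
b.refCount += 1; // held by a local variable

point(a, b); // a references b
point(b, a); // b references a, forming a cycle

dropExternalReference(a);
dropExternalReference(b);

// Nothing outside the cycle can reach a or b, yet each still has a count
// of 1, so neither would ever be collected.
const aIsCollectable = isCollectable(a);
const bIsCollectable = isCollectable(b);
```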
Mark and Sweep Algorithm
This more widely used algorithm counts a variable as ready for collection if it is not connected to the global object. In the ‘marking’ part of this algorithm, the garbage collector visits all elements connected to the known root of a program (e.g. the DOM or the global object), and marks these elements as ‘reachable’, or ‘live’. It then recursively visits and ‘marks’ all elements connected to those live elements. This approach removes the problem of circular references: if an element is not connected to the global object, it will not be marked as live, regardless of whether it is referenced by other non-live elements.
In the ‘sweep’ phase the garbage collector clears the heap memory of all unmarked objects. This newly freed memory is then added to a list of available memory, and will be reallocated as new variables are created.
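The mark and sweep phases described above can be sketched over a tiny hand-built ‘heap’. This is a simplified model for illustration only: the `refs` arrays and the `markAndSweep` function are assumptions, and a real collector works on the engine's internal memory rather than on an array of objects.

```javascript
// Mark: walk everything reachable from the roots.
// Sweep: keep only the heap entries that were marked.
function markAndSweep(roots, heap) {
  const live = new Set();
  const pending = [...roots];

  while (pending.length > 0) {
    const obj = pending.pop();
    if (live.has(obj)) continue; // already marked, also guards against cycles
    live.add(obj);
    pending.push(...obj.refs);
  }

  return heap.filter((obj) => live.has(obj));
}

const root = { name: 'root', refs: [] };
const kept = { name: 'kept', refs: [] };
const cycleA = { name: 'cycleA', refs: [] };
const cycleB = { name: 'cycleB', refs: [] };

root.refs.push(kept);     // reachable from the root
cycleA.refs.push(cycleB); // a cycle that nothing else points to
cycleB.refs.push(cycleA);

const heap = [root, kept, cycleA, cycleB];
const survivors = markAndSweep([root], heap).map((obj) => obj.name);
// survivors is ['root', 'kept']: the unreachable cycle has been swept
```

Unlike the reference-counting approach, the two objects that only reference each other are never marked, so they are removed.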
Historically, all other processing was paused while garbage collection took place (known as a ‘stop-the-world’ approach), which means slow garbage collection can have a very real impact on performance. For example, in older browsers you can clearly see video stutter (known as ‘jank’) where garbage collection delays the loading of frames. The video below shows a side-by-side comparison of the same game demo being played in two different Chrome versions. The impact of the older garbage collector delaying frame loading is surprisingly large!
More modern garbage collectors try to minimize rendering delays and latency in a variety of ways. To take V8’s garbage collector Orinoco as an example, there are three main ways the program works to do so.
In the incremental approach, all garbage collection is carried out by the main thread, but in short intermittent stages interleaved with the program's own work. This reduces latency, even though the overall processing time is no shorter than with a traditional stop-the-world method. It is more complicated than parallel collection, because memory can change between incremental steps, invalidating work that has already been carried out.
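The idea of splitting the work into small slices can be sketched as follows. This is a toy model and not V8's actual implementation: `createMarker`, `markStep` and the `budget` parameter are hypothetical, and the sketch ignores the hard part mentioned above (the object graph changing between slices).

```javascript
// Keep the marking state between slices so work can be paused and resumed.
function createMarker(roots) {
  return { pending: [...roots], live: new Set() };
}

// Mark at most `budget` objects, then hand control back to the program.
// Returns true once there is nothing left to mark.
function markStep(marker, budget) {
  let visited = 0;
  while (marker.pending.length > 0 && visited < budget) {
    const obj = marker.pending.pop();
    if (marker.live.has(obj)) continue;
    marker.live.add(obj);
    marker.pending.push(...obj.refs);
    visited += 1;
  }
  return marker.pending.length === 0;
}

// A chain of five objects: 0 -> 1 -> 2 -> 3 -> 4
const chain = Array.from({ length: 5 }, (_, id) => ({ id, refs: [] }));
for (let i = 0; i < chain.length - 1; i++) {
  chain[i].refs.push(chain[i + 1]);
}

const marker = createMarker([chain[0]]);
let slices = 0;
let finished = false;
while (!finished) {
  finished = markStep(marker, 2); // the program could run between slices
  slices += 1;
}
// Five objects at two per slice are marked in three slices.
```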
Accidental global variables
This type of memory leak can be avoided by being deliberate about which variables you make global, and by always declaring local variables with let, const or var. Using strict mode also helps: assigning to an undeclared variable throws a ReferenceError instead of silently creating a global variable.
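A short sketch of the strict mode behaviour follows. The names `strictAssign` and `undeclaredTotal` are invented for this example. Without the `'use strict'` directive, in non-strict code, the same assignment would quietly create a global variable.

```javascript
function strictAssign() {
  'use strict';
  try {
    // No let/const/var: in strict mode this throws instead of creating
    // a global variable.
    undeclaredTotal = 42;
    return 'assigned';
  } catch (err) {
    return err.name;
  }
}

const outcome = strictAssign(); // 'ReferenceError'
```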
Let’s look at the setInterval() and setTimeout() methods. As long as a timer is active (which in the case of setInterval() can be until the program has finished running), its callback and everything that callback references cannot be marked as ready for garbage collection, even if those variables have otherwise gone out of scope.
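One way to limit this is to keep hold of the timer ID and clear the timer when it is no longer needed. The sketch below assumes a hypothetical `startPolling` helper and a made-up `largeCache` array standing in for whatever data the callback uses.

```javascript
function startPolling(onTick, intervalMs) {
  // While the interval is active, the callback keeps largeCache alive.
  const largeCache = new Array(10000).fill('payload');
  const timerId = setInterval(() => onTick(largeCache.length), intervalMs);
  let running = true;

  return {
    isRunning: () => running,
    stop() {
      // Clearing the timer releases the callback, and with it largeCache,
      // provided nothing else references them.
      clearInterval(timerId);
      running = false;
    },
  };
}

const poller = startPolling(() => {}, 1000);
// ...later, when polling is no longer needed:
poller.stop();
```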
In the above example, if we remove node and someExternalReference from scope, these will still point to each other, and so would not be picked up by the reference counting algorithm. This used to be a common source of memory leaks when this algorithm was still heavily used to free memory.
Whilst the introduction of the mark and sweep approach to garbage collection takes care of the circular reference issue, it remains best practice to remove any attached event listeners before removing an element from your program.
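A minimal sketch of that clean-up is shown here. To keep it runnable outside a browser it uses a plain `EventTarget` as a stand-in for a DOM element; the `button` and `onClick` names are assumptions. The important detail is that `removeEventListener` must be given the same function reference that was passed to `addEventListener`.

```javascript
// A plain EventTarget stands in for a DOM element in this sketch.
const button = new EventTarget();
let clicks = 0;

// Keep a named reference so the same function can be removed later.
function onClick() {
  clicks += 1;
}

button.addEventListener('click', onClick);
button.dispatchEvent(new Event('click')); // handler runs, clicks is now 1

// Detach the listener before discarding the element.
button.removeEventListener('click', onClick);
button.dispatchEvent(new Event('click')); // handler no longer runs
```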
Out of DOM references
Bear in mind that memory leaks of this type also affect the parent elements. For example, if you have a reference to a particular <li> element saved in your code and then attempt to remove the entire list, not only will the referenced <li> element be kept in memory but also the entire list. This happens as the DOM is doubly linked; all parents contain references to their children and vice versa. So if any element is attached to the global object, the entire tree of nodes will be prevented from being deallocated in memory.
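The effect of the double linking can be modelled with plain objects. This is a simplified stand-in for the DOM, not real DOM code: `createNode` and `appendChild` here are hypothetical helpers that only maintain `parent` and `children` links.

```javascript
function createNode(tag) {
  return { tag, parent: null, children: [] };
}

// Link in both directions, as the DOM does.
function appendChild(parent, child) {
  child.parent = parent;
  parent.children.push(child);
}

let list = createNode('ul');
for (let i = 0; i < 3; i++) {
  appendChild(list, createNode('li'));
}

// Save a reference to a single item, then drop our reference to the list.
const savedItem = list.children[1];
list = null;

// The whole list is still reachable through the saved item's parent link,
// so none of it can be collected while savedItem is alive.
const stillReachable = savedItem.parent;
```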
There is a particular scenario in which closures can result in memory leaks. It can be a little confusing, but is worth being aware of.
Consider this example:
As a quick reminder, a closure is the combination of a function with references to its lexical environment. Every time replaceMyVar runs, a new uselessMethod closure is created. This shares a lexical environment with uselessFunc, which references previousVar. Once a variable is used by a closure, it is kept in the lexical environment of all closures which share the same scope. In this example, since uselessFunc references previousVar, this variable is also bound to the uselessMethod closure since they share the same scope.
Although uselessFunc is never called (the clue is in the name here!), this reference pointing to previousVar binds it to uselessMethod. So in effect, we end up with a chain of uselessMethods referencing the previous myVar (which contained a uselessMethod, which referenced the previous myVar… and so on) from every time replaceMyVar is called. This keeps the previousVar objects ‘active’ in memory, and so prevents them from being eligible for garbage collection. So every time the replaceMyVar function runs another very large string is allocated to memory and never deallocated, contributing to a significant memory leak. The easy way to reduce the impact of this would be to assign previousVar the value of null between lines 16 & 17, which would keep the closure in uselessMethod’s scope, but with a much smaller value attached to it.
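The pattern and the suggested mitigation can be sketched as follows. This is a reconstruction based on the names used in the description above, so its layout and line positions are assumptions rather than the original listing. The `longString` property name and the loop at the end are also made up for illustration.

```javascript
let myVar = null;

function replaceMyVar() {
  let previousVar = myVar;

  // Never called, but because it references previousVar, that variable is
  // kept in the lexical environment shared with uselessMethod below.
  function uselessFunc() {
    if (previousVar) console.log('never runs');
  }

  myVar = {
    longString: '*'.repeat(1000000),
    // This closure shares its scope with uselessFunc.
    uselessMethod() {
      return 'still here';
    },
  };

  // Mitigation: the shared scope still holds previousVar, but it now
  // points at null rather than at the previous large object.
  previousVar = null;
}

for (let i = 0; i < 5; i++) {
  replaceMyVar();
}
```

With the final assignment in place, each call leaves only the newest object referenced by `myVar`, so the earlier ones become eligible for collection.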
- A more in-depth look into garbage collecting algorithms: https://www.educative.io/courses/a-quick-primer-on-garbage-collection-algorithms/jR2PP
- A deep dive into Orinoco’s processes (V8’s garbage collector) and memory allocation: https://v8.dev/blog/trash-talk