Java memory footprint — part 2

In part one we saw that Objects in general use much more memory than their primitive counterparts. Namely the overhead is more than 4 times the size of the payload (useful information represented as a primitive data type).

There are two more memory related issues, namely the footprint of most commonly used collection, and their resize strategy. Both this issues have practical usage and are worthy of attention.

Memory footprint of most commonly used collections

Let’s take a look at the java LinekdList class. How much memory does a LinkedList object use ?

LinkedList<Integer> l = new LinkedList<Integer>();

Without getting into details as much as we did in the first part, the LinkedList object will look similar to the image below:

From the memory footprint point of view, each LinkedList object will need memory to store the fields inherited from Object class, just like all other objects. Furthermore memory will also be allocated for an int field that stores the size of the list, and a reference to the first and last elements.

Also, for each value contained in the LinkedList, please note that there is the need of an internal object (namely a node) that contains the useful payload information, and the references to the prev and next list entries.

Here is the code for the Node class:

private static class Node<E> {
E item;
Node<E> next;
Node<E> prev;

Node(Node<E> prev, E element, Node<E> next) {
this.item = element;
this.next = next;
this.prev = prev;
}
}

To sum it up, in case of a LinkedList, on top of the actually useful memory (that stores the elements in the list), a lot of other objects will be allocated. These internal objects are needed for the LinkedList to function.

In the image above we see an estimation of the useful memory versus the overhead memory ratio. This ratio is roughly 1 : 1, so the larger the LinkedList the linearly larger the overhead.

A much simpler class is the ArrayList class:

ArrayList<Integer> l = new ArrayList<Integer>(50);

If we consider the inner workings of ArrayList, we will need memory to store the fields inherited from object, another 32 bit int to store the ArrayList size, and a reference to an array of Integers. The following image is a simplification of an ArrayList object’s memory footprint:

This class is much more generous with regards to its useful versus overhead memory usage. The memory used by its overhead is constant, while the more elements we add to it, the memory for actual useful data increases.

A similar case is the StringBuffer class:

Here we also have a constant memory for the overhead fields (inherited from Object, a reference to a char[]) while the useful payload memory increases linearly.

Last but not least let us take a look at the HashMap class.

HashMap<String, Integer> map = new HashMap<String, Integer>();

The HashMap object will need memory for the fields it inherits form the Object class, another float for the loadFactor, an int for the treshold, an int for the size, etc. On top of that, and array that stores the buckets, and for each bucket a list containing its entries. The approximation of this is in the image below:

While we do not want to get into the details of how a HashMap works, please note that for each key value entry there will be some overhead memory used, namely. Just for trivia, this is the code for the Node class, that implements Map.Entry, used by HashMap:

static class Node<K,V> implements Map.Entry<K,V> {
final int hash;
final K key;
V value;
Node<K,V> next;

Node(int hash, K key, V value, Node<K,V> next) {
this.hash = hash;
this.key = key;
this.value = value;
this.next = next;
}

public final K getKey() { return key; }
public final V getValue() { return value; }
public final String toString() { return key + "=" + value; }

public final int hashCode() {
return Objects.hashCode(key) ^ Objects.hashCode(value);
}

public final V setValue(V newValue) {
V oldValue = value;
value = newValue;
return oldValue;
}

public final boolean equals(Object o) {
if (o == this)
return true;
if (o instanceof Map.Entry) {
Map.Entry<?,?> e = (Map.Entry<?,?>)o;
if (Objects.equals(key, e.getKey()) &&
Objects.equals(value, e.getValue()))
return true;
}
            return false;
}
}

To sum up the HashMap memory usage, it is clear that the overhead scales linearly with the payload, namely that that the more entries we have the more overhead memory we use, in a roughly 2:1 ratio.

Collections resizing problem

What happens, memory wise, in the following lines of code ?

StringBuffer s = new StringBuffer();
s.append(“MY STRING”);
s.append(“ OF TEXT”);

In the first line, a new StringBuffer object gets instantiated. In the second line, the “MY STRING” string gets appended to it. An approximation of the memory footprint is in the image below:

The third line is a bit more tricky: as the default capacity of a StringBuffer is 16 characters, our object had enough space to append “MY STRING”, but it clearly has not enough capacity to also append the string “ OF TEXT”.

What will happen is that the StringBuffer’s object internal char array will get resized, namely its size will double from having space to store 16 chars to having enough space to store 32 chars. Our StringBuffer object, after line three, will look similar to the image below:

The default (initial) capacity of the StringBuffer class is 16. The resizing behavior of StringBuffer class is that it doubles its capacity every time it runs out of space.

In the table below we see the default capacity and the resizing factor for some of the most commonly used Java classes:

This behavior of certain classes will, if unchecked lead to the following situation: a large collection is almost completely filled and a couple of new elements are added to it. This will cause the collection to double its size, and while the first half of its memory will be filled with useful elements that are stored there, the second part of it will be using space but not storing almost any payload.

StringBuffer s = new StringBuffer();
s.append(“MY STRING”);
s.append(“ OF TEXT”);

For example, after the three lines of code above, the StringBuffer object will use space enough to store 32 characters but in fact will be storing only 17. Thus, space to store 15 characters will be wasted.

This is called the collection resizing problem, and as a solution, we can do the following: Whenever possible, use the constructor that takes the capacity as a parameter. Thus when we know the final storage capacity instantiate the collection to have that given capacity from the start.

Conclusions and best practices

  • If all other things are the same, use a collection that minimizes the overhead space
  • For small sized collections, if functionality is not the issue, use normal arrays instead
  • Avoid adding elements to the point your collections resize too many times
  • As much as this is possible, estimate the actually needed capacity of a given collection, and instantiate it to that capacity from start

In case you have not yet done so, please take a look at the first part of this article: Java memory footprint — part 1

Like what you read? Give Liviu Adrian Esanu a round of applause.

From a quick cheer to a standing ovation, clap to show how much you enjoyed this story.