Java Lists through Microscope

A closer look at Java’s List implementations

Murtuza Ranapur
The Startup
6 min readDec 20, 2019

--

Photo by Michael Longmire on Unsplash

Allow me to ask you one question, what’s your go-to List implementation? Let me guess, it starts with an ‘A’. ArrayList, correct? well, mine too. And it will be for a lot of other people. And why would that be? The answer to this is very simple, ArrayList is the first implementation that we learned or were taught, and we got stuck with it. To be honest, it took me some time to come along with other list implementations as I never felt the need of others and that’s why I didn’t even make an effort to search for them. ArrayList seemed good for all my use-cases. Well much about why ArrayList is mine and many others sweetheart, now let’s move on to business. But first a joke.

Pro Dude : You should try using Vector sometimes, it gives you thread safety
Noob Dude: But, Why would I use a vector if want to use a list?
Pro Dude : Well, Vector is a type of list
Noob Dude : Well, i don’t want to store co-ordinates

Good Ol’ array

Let’s take an example where we are using integer array to store marks of the students in a class to do some processing on it at a later point in your code

int [] marks = new int[5]//Currently class have 5 students
marks[0] = 98;
marks[1] = 99;
marks[2] = 95;
marks[3] = 97;
marks[4] = 93;//Wow that's some smart bunch of kids
//High scores raised many eyes and some papers were re-checked
//Some marks had to be updated
marks[2] = 75;
marks[3] = 84;

The array is initialized according to the strength of the class which is 5. We did a bunch of add operations and then we updated some values. Pretty straight forward right? Now let’s say 2 new students are added in the class. How do we go about adding 2 more? Well, there aren’t many things we can do here, our only option is to create a new array with the bigger size and copy the elements from the old one.

//We can achive this using Arrays.copyOf() function
int [] marksNew = Arrays.copyOf(marks, 7);
marksNew[5] = 65;
marksNew[6] = 87;

Wait, don’t we have to do it every time we need to add a new score? And doesn’t it increase the complexity of add operation to O(n)? Well yes, it does. But fear not, we already know a better solution, or do we?

Enter List, ArrayList

Now let’s do the same set of operations with ArrayList

List<Integer> marks = new ArrayList<>();//No need to specify size, but you can
marks.add(98);
marks.add(99);
marks.add(95);
marks.add(97);
marks.add(93);
//Let's update some marks
marks.set(2, 75);
marks.set(3, 84);
//Two new students added, no worries just add they
marks.add(65);
marks.add(87);

Very convenient isn’t it? We didn’t need to know the size of our data beforehand, didn’t need to maintain indexes of elements at least while we're adding them. All this convenience and we can still do our add operation at O(1). Wait, I am not sure about this O(1) part. Why don’t we take a closer look at the add method of the ArrayList?

It’s easy to guess from declaration at lines 1&2 that ArrayList is nothing but a wrapper over Object array. Let’s look at what add method is doing. Public add(E e) is calling parametrized add method with an array of current elements and size. Parametrized add method at line 4, first checks if the size of the list and elementData array is the same. If it is, then grow() method should be called. Let’s have look at grow() method.

So grow() function is calling parametrized grow function by passing the size incremented by 1. Now, look at line 2, freaking Arrays.copyOf(). In our example code, we also created a new array with size+1 using Arrays.copyOf() whenever we wanted to add a new element, but Java guys are doing things a bit differently, let’s see how. Notice the newCapacity() function call on the same line.

newCapacity() function increases the size of the array by half the current size. The capacity is increased in the following manner:

  • Given that, there are no elements in the array and we are trying to add an element in the array. The size of the array will be directly increased to 10 and then the new element will be added at index 0. So, basically this condition will be executed.
if (this.elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
return Math.max(10, minCapacity);
}
  • Next, when we have filled all the 10 positions and we are trying to add the 11th element, the size will be increased to 15. Here’s how:
//Let's say current value of oldCapacity is 10
int
oldCapacity = this.elementData.length;
int newCapacity = oldCapacity + (oldCapacity >> 1);
//>>(Right shift operator) by 1 divides the oldCapacity by 2, hence //the value of newCapacity is set to 15
  • As we keep on adding new elements the size of the array will keep on increasing until we finally reach 2147483647, which is basically Integer.MAX_VALUE. The rest of the condition’s present is the function makes sure we don’t go past this limit, and also gracefully handle integer overflows.

This approach is obviously better than a naive one, where we increase the size of the array on every addition. But even with this approach our worst-case complexity still stands at O(n). Well at least get operations are happening at O(1).

So, our approach was worse, Java’s approach was better but with the same worst-case complexity. The hope is not lost though, there is a way to achieve this at O(1). But first, let’s look at a similar implementation.

Vector is also a list implementation

The implementation of Vector is very similar to ArrayList. One difference is that all the operations of Vector are synchronized and another is in the way it increases the size.

The Vector’s newCapacity() method is very similar to ArrayList’s . It’s different only by how it calculates newCapacity variable.

//Let's say current value of oldCapacity is 10
int
newCapacity = oldCapacity + (this.capacityIncrement > 0 ? this.capacityIncrement : oldCapacity);
//If value of capacityIncrement is provided by the user
//let's say 4, then the newCapacity will be 14, else the size will
//be doubled i.e., newCapacity will become 20

Vector doubles in size each time it runs out of space. If this doesn’t satisfy your needs, Vector also allows user to pass the value by which it needs to increment. Though the calls to grow() will reduce, the worst-case complexity of add operation still stands at O(n).

LinkedList to the rescue

LinkedList is one more implementation of List, it makes use of doubly linked list to store values in the form of nodes. Node has three property one stores element itself, the second reference to the previous element and the third to next element. This makes it very easy to add and remove a new element. Let’s see some code.

As it is visible from the code, it is very straight forward to add a new element in the linked list. The code simply creates a node for a new element using its value, reference to previous and next node(Which is always null. It also marks the end of the list). Alas! we have found the implementation which performs add operation at O(1). But now our get operations will happen at O(n).

To reach the given index Linked list starts parsing either from start or end based on the index value. If the index is less then half the size then it starts from the start or else the end.

Conclusion

ArrayList and Vector add elements at O(n) but fetches element at O(1), while LinkedList adds elements at O(1) but fetches element at O(n). These data structures are meant to be used in different scenarios. Example, you can use:

  • ArrayList when use case involves more get operations the add/remove, such as managing students in a class, wherein addition of new students may happen rarely, while fetching may happen often.
  • LinkedList when you have more add/remove operations then fetch such as storing top 1000 logs in-memory.

There is a trade-off, there’s always a trade-off, always a price to pay. Moreover, the purpose of this article was not just to let you know of this trade-off but to have a closer look at the implementation itself. Often we know the fact about performance of a certain class or a library but never muster the courage to go an extra mile to look at code. But if you do, you may end up learning a lot. Are you still going to stick to ArrayList for all use-cases?

This is my first article so spare me for any mistakes. Just let me know in the comments if there are any.

--

--

Murtuza Ranapur
The Startup

Just a regular software Engineer passionate about Java and Functional Programming