What This Guide Is
This guide is intended as a primer and refresher on the linked list data structure. It will cover the structure itself, its computational timings, its common algorithms, and how to code up its various permutations.
What Is a Linked List?
Linked lists are considered by many to be the most basic data structure, and it’s easy to see why. A linked list comprises a series of nodes, each of which contains a pointer to the next node in the series. This typically culminates in a node which points to
null, signifying the end of the list. Visually, it’s something like this:
The first mistake that many people make when looking at a linked list is to assume it’s a weird implementation of an array. While the two structure have many superficial similarities, those similarities are just that: superficial. The two are distinct ways of storing related data.
Let’s examine what makes a linked list different from an array, namely the method of storage in memory.
Linked lists, on the other hand, can be spread about in memory. While this makes accessing a specific node harder, it makes creating new nodes and removing old ones much faster, as they can be created anywhere in memory. The
next attribute of a linked list node contains a pointer to the memory wherein the next node is stored.
If you were having trouble differentiating linked lists from arrays, this should help. You’ll notice right off the bat that arrays and linked lists are polar opposites in terms of timing. The reasoning is simple: because linked lists are not fixed to a single location in memory, they can very easily insert new content at the front (or head) of the list. However, appending to the end of the list requires traversing the entire list, as a basic linked list does not store how many elements it contains.
NOTE: other types of linked list do store this information. See Linked List With Tail and Size toward the end of this guide for more information.
The same is true of reading: accessing the Nth element in a linked list requires traversing N elements.
With an array it is easy to access the end of the list, but manipulating the middle or the beginning is complicated. Because it takes up a specific space in memory, in order to insert or remove from anywhere other than the front of the list requires shifting the entire rest of the list forward. If you insert new data at index 1, you need to move what used to be there to index 2. You then need to adjust index 2 forward to 3, and so on and so forth until the end of the array.
In linked lists, the old node can simply be removed, like so:
The pointer from
node 1 to
node 2 is overwritten to point to
node 3 instead. Nothing needs to move, because the data’s storage location in memory is irrelevant.
Under What Circumstances Is a Linked List Useful?
The answer is: lots of places! Linked lists being growable without significant impact to memory makes them very useful structures. A few places they shine are:
- Stacks and Queues: because these two data structures don’t care about data in the middle of an array, but only at the ends, linked lists can be used to make extremely efficient version of both.
- Any instance where only adjacent nodes are relevant: imagine a slideshow. You have a next button, and a previous button. A linked list would be a perfectly efficient way of storing this information, as when you will only every need to fetch the adjacent piece of data, which can be done very quickly. The same can be said of music players, or web browser history.
- You want an array, but plan to do a lot of inserting and not a lot of reading: if you have knowledge of the application of a data structure before you begin, it can allow you to make far more optimal choices. In this case, if you know you need to store data, but will be inserting items into it more often than you need to extract a single item out of it, then a linked list might suit your needs beautifully.
As with many aspect of programming, the trick to knowing when to apply a structure lies in careful consideration of the situation in which it is being utilized, followed by an analysis of how the pros and cons of each structure can be made to suit your needs. Linked lists seem like a weird little structure, but they actually have many potential applications.
Working With Linked Lists
In order to work effectively with linked lists, there’s a couple algorithms (using the term loosely) that a programmer should keep in their back pocket. Let’s go over them.
Linked List Traversal
Linked lists are most commonly iterated over by use of a
while loop. Since they lack indexes,
for loops actually aren’t tremendously useful. Instead, we move over them by checking if the current node (or its
next pointer) is
Here’s a basic print function:
This function moves through a linked list, and prints the value at each position. The
read variable is repeatedly set to its own
next property, moving it down the line until it is equal to
null (and has therefore reached the end of the list). Note that we do not use the value for head, but instead assign a new variable; if we were to reassign
head, we could potentially lose it and destroy the entire list. Better by far to use a secondary variable.
Removing A Value
Let’s say you want to remove the value
2 from a linked list. You’ll need to iterate over the list, looking one node forward until you locate the node containing
2. Then, you’ll want to stitch the current node to the node two ahead of it (or null, as the case may be), as illustrated in Fig 3.
Lines 2–8 cover edge cases. The first is checking to make sure the list contains anything. If it doesn’t we return
-1, the sign that we didn’t find the value in question. The second check ensures the value we are trying to remove isn’t the value of the head itself. If so, we simply delete the current head by setting
The rest utilizes a basic traversal, as seen in Fig 4, only instead of looking at the current node, we look at the
next node. When we find our target, we set
read.next.next, effectively skipping the node containing the value we wanted gone. We can be guaranteed that
read.next.next is a valid variable because our loop checks that
read.next is initialized.
Another common issue that can crop up in linked lists is a loop (or cycle). Consider the following:
This, obviously, is a problem. If we are using a check against
null to determine whether to stop a loop, this list would cause our previous functions to error out. While a well-made linked list won’t ever allow a coder to enter this state, it’s absolutely worth knowing how to check for.
The algorithm that is commonly employed is colloquially referred to as the tortoise and hare method of cycle detection. It checks for cycles by firing off two pointers, one of which moves at double the speed of the other. If they are ever pointing at the same thing, we know the list has a cycle. If, on the other hand, either of them ever reaches null, we know that the list terminates naturally.
Here’s an implementation:
The first thing we do for this algorithm is to check the first three positions for
null. Obviously, if any of them are
null, we know that our list terminates naturally. Then we set them off. We only check
.next for null, as
tort will be lagging behind; any value it hits will already have been checked by the
hare. If at any point they strictly equal one another, we know that we’ve found a cycle.
Programming a Linked List (& Permutations)
In this section we’ll go through three basic forms of linked list.
Singly Linked List
Every linked list starts with a node class. Here’s one:
This is pretty self explanatory. It is a small data structure that contains a constructor which takes a value and stores it in val. It also allocates space for a
Now that we have our Node class, we need our Linked List. A shell for our functions would look something like this:
All future code can be assumed to exist within this shell, in the area specified. There’s nothing remarkable here, save for the initialization of a head variable, which is currently set to
null(as the list is empty).
Let’s give this thing a little more grit by implementing
pop so we can edit the end of the list.
Let’s start by discussing
push. There’s a basic check at the front that looks to see if the list is empty; if so, we simply assign
this.head to a new Node with value of
val. Otherwise, we need to find the end of the list, then assign our new Node to the
.next of the previous final node.
pop is slightly more complicated. As with
push, we dedicate a bit of space to error checking. Lines 18–23 are all dedicated to closing out edge cases. Where it gets interesting is on line 27.
You’ll notice that this time we’re checking
read.next.next. This is because we want the second to last node. When
read.next.next === null, we’ll know that we’re at the second to last node in the list. Then, all we need to do is set
read.next to null, indicating that we’ve snipped off the last item of the list.
There’s a little additional logic here surrounding a return value, here called
ret. This is simply to mirror the behavior of an array. When a node is popped, we return its value. This behavior will exist in similar functions, and future iterations of this one, so don’t be alarmed.
unshift are much more succinct:
Because we already have access to
head, we know exactly where we’re putting the new node. In
shift we do a quick error check to make sure we aren’t shifting an empty list, then just set
unshift isn’t much more complex. We make a new node with its
next value set to
this.head. Then, we redeclare
this.head to be equal to our new node. Easy!
remove function utilized in Fig 5 can be used here, as well as the
contains. This function is very basic, and can be presented without further ado:
We use our basic while-loop-iteration, compare the value of each node to the value we received as an argument. If we find it, return true. Else, return false.
Linked List With Tail & Size
By this point you’re getting pretty familiar with the structure, and even have one of your own programmed out. You might be starting to wonder things like: why is accessing the end of the list so difficult? Why aren’t we tracking the size of the list?
The answer is simple: we weren’t because it was simpler. However, you can, and many people do! Very little in programming is set in stone. Making use-case modifications is often a useful, or even expected talent. So let’s take a look at what some of those functions look like if we try to implement a head and tail.
For starters, our constructor needs to change.
There. Now we initialize a linked list to have a size of zero, and a tail that’s equal to its initial head.
pop comes next.
push got much simpler! Aside from the edge case checking on lines 4–7, we can see that the whole operation can be done in 3 lines now, and in O(1) time. That’s a big boost! All we have to do is set
tail.next to a new node containing the value passed in, then increment up the size.
pop on the other hand got a bit gnarly. Lines 17–23 are all error checking (if you’re interested, you can see the complete code for the helper function
returnHeadAndClear()). What comes next seems overly complicated, though! Weren’t we storing the tail value now? What’s all this?
Well, in order to set the new tail, we need the second to last item in the linked list. This means traversing the whole list again, as our list contains pointers going in one direction. This is an issue which could be fixed, and will be addressed in the next section, Doubly Linked Lists.
The rest of the code is pretty similar, save for incrementing and decrementing the size counter. This new version seems great, but the inability to traverse in the other direction is vexing. Let’s look into fixing that.
Doubly Linked Lists
Previously we were coding singly linked lists. What is meant by this is that the pointers flow in one direction only. Nodes have a
next property, but not a
previous property. A list which flows in both directions can be said to be doubly linked.
Let’s look at the implementation.
To start, we have our two way Node. It has both a
next and a
previous property, allowing it to move in both directions. Let’s see what this does to our
Now this looks more like what we were hoping for.
push looks similar, although we now need to assign a
.previous value to the new node when we push it on.
pop on the other hand has become tiny. All it does is set the tail to the value of the previous node, then set that node’s next to
null. Exactly how we wanted to do it before, and now we can manipulate the tail of the list in O(1) time.
The other functions are still pretty unchanged, although they now feature some more upkeep to make sure their values of
previous are set properly. We can, however, now do fun things like printing the list backwards without it requiring nested loops:
Our linked list is now devilishly efficient for queues and stacks, and can be read forward and backwards. Not too shabby!
You should now have a very good idea of what a linked list is, and the ways you can expand upon it. If you wish to play with one that’s already written, check the sources for a link to my repository — it has a testing suite you can use in addition to the full code for all of these examples.