Do all data structures resemble a linked list?

Kelly Mejía
11 min readNov 20, 2020

--

This article elaborates on the reasons why linked lists are or are not the mother of all structures. Accordingly, it exposes why and why not every programming data structure resembles a linked list. The central hypothesis is that linked lists are the core structure from which all other data structures come from. Thus, if a person is able to understand and code linked lists, they will be able to understand the remaining data structures with ease.

What is a data structure?

First of all, let us define what a data structure is. While data refers to a value or collection of values which are meaningful, structure is the method for organizing the data. Therefore, a data structure is a method for storing meaningful values in a computer’s memory.

“Data structures are the building blocks of Computer Science” (Malhotra, 2019)

Figure 1: Data Structures representation

The most common uses of data structures are programs and softwares. A data structure should be efficient in order for a program to work in the best way possible. Efficient should be understood as the smart use of two resources: memory space and execution time. Also, a data structure can contain more than one data type. According to Aguilar & Martínez (2014), a data structure is a collection of one or more elements called members, which can be different in data type.

Monte Carlo Simulation of the price of a Stock. In this operation, more than 10,000 values where used repeatedly.
Figure 2: Monte Carlo Simulation of the future price of a stock. In this operation, more than 10,000 values were used repeatedly. This could not be possible without an efficient data structure.

Data structures are not only used for Computer Science means. A practical example of the usage and importance of efficient data structures in a different context is Finance. In such field, data structures become vital since financiers manage huge data sets of, for example, stock prices. If the data structures containing this values are not efficient, financiers wouldn’t be able to analyze the data and make decisions, which could ultimately lead to great monetary losses.

A data structure is sometimes defined as a logical or mathematical representation of the storage of data. It makes it easy to have a picture of the relationship that exists between its elements and how they are stored in memory efficiently. In this article, different types of data structures will be analyzed in order to determine their relation to linked lists.

Figure 3: Types of data structures. From Malhotra (2019)

What is a linked list?

A linked list is a non-primitive, linear data structure that saves data elements, which are not necessarily stored in contiguous, sequential slots of memory. Instead, the data is stored somewhere in memory and there are explicit links (pointers) to the memory slot where the data (following element) is stored. Linked lists use dynamic memory allocation: the elements are stored in memory slots at runtime.

“Series of discrete objects related to one another by memory pointers” (Smith, 2017).

Figure 4: Basic parts of a linked list

According to Smith (2017), a linked list is a sequence of elements that are known as nodes. Each node contains a link to the next node. The links are implemented as pointers. An initial pointer exists in memory to identify the first node, which is commonly referred to as head.

With linked lists, the insertion and deletion operations become very simple. Actually, insertion and deletion operations have a O(1) complexity because it is only necessary to modify the pointers. Thus, a linked list is extremely efficient at growing or shrinking, but not so much at searching or accessing a particular element. These operations have a O(N) complexity since the whole structure has to be traversed until the sought-after element is found.

Figure 5: Insertion of a new node in a linked list

Below is a simple implementation of a Linked List in C++:

LinkedList<string> myLinkedList = new LinkedList<string>();LinkedList<T> myLinkedList = new LinkedList<T>();

Practical Case: Bike route App

Imagine you get hired to build a mobile app to help skilled bike users track their routes on their smartphones. One of the basic requirements of the app is the ability to store waypoints in a route. This could be easily done with a linked list structure since a route has the basic characteristics of a linked list: it has a beginning and multiple nodes pointing to the next one.

Why do all structures resemble a linked list?

There are several reasons to believe all data structures resemble a linked list. First, all data structures are abstract except for vectors. This means that although a visual representation can be made, it will not necessarily hold in memory. Second, numerous data structures use basically the same arrangement of a linked list: of a collection of nodes containing data and a pointer aiming towards the next node in the list. Such data structures are trees, stacks, heaps, queues, circular linked-lists and doubly linked lists.

However, the aforementioned data structures are not the only existing ones. Even if they were, all these structures lie at the same level as linked lists, just as denoted in figure 3. Consequently, there is no documented evidence that shows that any other data structure proceeds from linked lists.

Nevertheless, doubly linked lists, stacks, queues, trees, and heaps have a great likeness with linked lists. In the following sections, such data structures are analyzed in order to figure out whether that is the case or not and the underlying reasons.

Doubly linked lists

A doubly linked list is a structure composed of nodes and pointers in which every node points to the next node and to the previous node. They are like two-way streets. Since the previous node to the head (first node) does not exist, the previous pointer of the first node of the list points to null. Similarly, the next link of the tail (last node) points to NULL. Doubly linked lists can be traversed in reverse.

Figure 6: Memory representation of a doubly linked list and its elements (nil = NULL)

The similarity of doubly linked lists and linked lists is obvious. A double linked list is a linked list that is not only linked forwards but also backwards. In fact, doubly linked lists have the same complexity for insertion and deletion as singly linked lists: O(1). Also, accessing and searching for a specific value have the same complexity in both structures: O(N). As both structures are highly similar, the knowledge of a linked list is enough to understand and implement doubly linked lists.

Stack

A stack is a non-primitive, linear data structure that stores elements based on the LIFO principle (Last In, First Out). Since the last elements that are included in the stack are the first to get out, deletion and insertion take place in the top position, thus having an O(1) complexity. The access and search operations have a O(N) complexity since the stack has to be traversed entirely in order to find or access a specific node, just as with linked lists. The most characteristic operations of a stack are push (add elements) and pop (remove elements).

Figure 7: Characteristic stack operations

A stack can be very useful to backtrack a series of operations. An example for this would be Undo in Microsoft Software, where you hit Ctrl + Z and the last operation you did is undone. On top of that, stacks can also be used to store the navigation history of a browser.

A stack can be implemented as a linked list or as an array. It is considered that a linked-list-based stack will be more efficient for sorting, insertion and removing operations. Therefore, having knowledge of the implementation and logic of a linked list will make anyone able to implement their own stack.

Queues

A queue is a non-primitive, linear data structure characterized by storing elements using the FIFO principle (First In, First Out). In real life, a queue can be seen in stores, banks or any other place where people get on a line in order to get a service. The first person in the line will be the first one to be assisted. New elements are added in one end (the rear or tail) and removed from the other end (the front or head). Such operations are called enqueue and dequeue.

Figure 8: Queue operations

A queue is similar to a stack. However, they differ in storing principle: while a stack is a LIFO structure, a queue is a FIFO structure. Hence, a queue is open from both ends whereas a stack in only open in one end.

Figure 9: Queue representation

A queue can be implemented as an array or as a linked list. The implementation as an array may be simpler, but is limited by the size of the array. With linked lists, size is not a problem due to dynamic memory allocation. Also, queues implemented as linked lists have a O(N) complexity for access and and search operations, and a O(1) complexity for insertion and deletion.

A queue is an abstract data structure because the memory slots in which elements are stored will not necessarily be contiguous. Thus, they can be seen as a linked list with special properties of insertion and deletion. Knowledge of linked list will therefore be vital in the implementation of a queue.

Trees

Trees are non-primitive, non-linear data structures which store data in nodes arranged in a hierarchical manner. The topmost node is called root. The root can have pointers to any number of nodes (children), but no other node below the root points to it. Every child can become the root of a new subtree. Trees are recursive structures due to their hierarchical arrangement.

Figure 10: Tree Representation

The most common form of trees are binary search trees (BST). Being the simplest form of trees, BST have 2 or less children at every level. That is, a BST has a root node, a left subtree, and a right subtree. Since trees are recursive, this statement holds for each and every subtree.

The advantage of trees is that the complexity for the insertion, deletion, search and access operations is O(log(N)). Hence, they are easier to manipulate than linked lists, being this the first difference between such structures. Although both a linked list and a tree share the usage of nodes, those of a tree have 3 elements instead of 2: left pointer, value, and right pointer. However, both share a starting point: the head of a linked list is the equivalent of a root of a tree.

Figure 11: Illustration of trees

Since there is no way of assuring that the memory slots assigned to tree nodes follows a hierarchical pattern, trees are abstract and can be implemented using knowledge of linked lists. However, a greater understanding of the particular programming language being used is needed because some relevant adjustments are needed in order to craft a tree out of a linked list, such as attaching two pointers to the same node.

Heap

A heap is just a binary tree with 2 characteristics: 1) it has to be completely full at all levels, and 2) the value of each node must be equal to or greater than the data stored in each one of its children. Thus, the root is the largest value of the heap. The last or deepest levels can be an exception, but will be filled from left to right until the level is full.

Its main use is as an implementation structure for a priority queue. A priority queue is an abstract data structure that has two main functions: insertion and deletion of the item with greatest key. The main operation of a heap is heapify.

Figure 12: Only (a) is a heap. (b) is not a heap because it is not full, and ( c ) does not follow the second property of heaps.

A heap can be implemented as a linked list. So, as mentioned before, knowledge of linked lists will come handy at the moment of implementing trees such as heaps.

Exception: Arrays

An array is a non-primitive, linear data structure. It is the mathematical concept of a vector and a matrix. An array is a sequence of data of the same type (it is a homogeneous structure). It has a user-defined number of slots (it is static) which receive an sub index as name.

Thus, if an integer array called A is of size 10, it will have 10 slots to store 10 different integers. These integers can be accessed by calling the name of the array and the particular sub index of the slot. It is important to mention that in many programming languages such as C, C++, Java, and Python, sub indices start at 0, while at some others they start at 1, such as R.

A[5] = 19
Figure 13: Array A[] of size 10

Therefore, accessing an element of an array is O(1). However, insertion, deletion, and search operations have a O(N) complexity. Since the size of the array is defined by the user in the code, the slots are stored in memory next to each other, thus making the array size unchangeable. This is a clear disadvantage of arrays compared to linked lists.

Figure 14: Array representation

Arrays do not resemble linked lists. In fact, arrays can also be a core data structure, since many other data structures can be implemented as arrays. Hence, arrays are a clear exception to the hypothesis.

Linked lists and arrays: both predecessors

In conclusion, data structures resemble linked lists because they can be implemented as such. However, using a linked list structure is not the only way of implementing some data structures (such as queues, stacks, and heaps). An array-based implementation is also plausible. Saying that every abstract data structure can be implemented using linked lists is ambitiously broad. Also, this assumption can be logically wrong since it generalizes a particular idea.

Therefore, linked lists constitute one of the pillars of data structures but are not the only predecessor of every single data structure. Both linked lists and arrays are the base of abstract data structures. In this way, any person who thoroughly comprehends linked lists and is able to implement them in any programming language, will be able to understand and implement other data structure.

As a result, linked lists are vital for understanding and implementing other abstract, non-primitive data structures. Hence the fundamental importance of linked lists to Computer Science.

References

Dale, N. B. (2003). C++ Plus Data Structures: Vol. 3rd ed. Jones & Bartlett Learning.

Joyanes Aguilar, L. & Zahonero Martínez, I. (2014). Programación en C, C++, Java y UML. 2ª edición. McGraw Hill Education.

Malhotra, D., & Malhotra, N. (2019). Data Structures and Program Design Using C : A Self-Teaching Introduction. Mercury Learning & Information.

Smith, W. (2017). Everyday Data Structures. Packt Publishing.

--

--