A discussion on Data Structures — Linked Lists.
In the first article of many, I explore Linked Lists and examine their structure, types and the various operations that can be performed on them.
Note: This article is part of a series of many articles focused on data structures and algorithms. See this for context:
A very general introduction.
What is a data structure?
A data structure is basically a group of data elements that are put together under one name, and which defines a particular way of storing and organizing data in a computer so that it can be used efficiently. — (Thareja, 2017)[1]
Primitive and Non-primitive
Data structures are generally organized into two types — primitive and non-primitive. Primitive data structures (or types) are the fundamental data types that we commonly see — integers, booleans, characters, etc. Non-primitive data structures are created by building upon the primitive data types. Examples include linked lists, stacks, queues, trees, graphs, etc.
Linear and Non-linear
Non-Primitive data structures are further organized into linear and non-linear data structures. Linear data structures store data in a linear (i.e. sequential) fashion. This linearity may be in terms of contiguous memory locations, or the linearity of links between memory locations.
One such linear data structure is the linked list…
Moving on to linked lists.
What is a linked list?
A linked list is a flexible data structure in which elements are stored in a chain-like structure. Each link in the chain corresponds to a node that wraps two things:
- A specific piece of information
- A reference to the next node (technically, the memory location of the next node).
This specific structure makes the linked list flexible in terms of how nodes are linked together. Regardless of where the individual nodes are stored in memory, the linearity of the list is always preserved and the elements of the list are traversed linearly. Secondly, we can identify different policies or techniques for insertion or deletion of information from the list and do not have to worry about the exact number of elements in the list, which may increase or decrease as per the needs of the programmer.
Per the figure above (forgive my fancy artwork), a head pointer points to the first node in the list. Each node in the list contains data and a next pointer. Jumping from node to node will eventually bring us to the tail of the linked list — the last node with a NULL reference. There’s nothing beyond this point except a dark, silent void.
The different kinds of linked lists.
The basic structure of linked lists proves to be highly extensible. As such, a few different variations of linked lists exist, differentiated based on their linkages (singly-linked/doubly-linked/circular linked lists and so on.).
- Singly-linked: as in Fig. 1 above.
- Doubly-linked: an inverse link also exists for each link. That is, for a link from node A to B, a backward link exists from B to A. This simplifies a lot of things. (Please don’t make me draw this.)
- Circular linked lists: a list where the tail node contains a reference to the first node. These lists may also be singly or doubly-linked! (I most definitely don’t want to draw this.)
Common operations on linked lists.
For simplicity’s sake (and also because I don’t want to end up writing a bazillion word article), I’m focusing on listing operations for singly-linked lists. It’s fairly simple to extend these operations for doubly-linked and/or circular linked lists (and I’ll also try and leave some references to catch up on).
Insertion
Strategies/policies to insert nodes include:
- Insertion at the start
- Insertion at the tail
- Insertion somewhere in the middle depending on either position or value.
In all cases except (1) above, the basic procedure remains the same. Traverse the linked list until we reach the desired position, then create a new node and initialize it with data. Finally, change the next node reference of the previous node in the list and point the new node to the existing successor.
In (1), we simply point the head pointer to the new node we want to insert and set the new node’s next node reference to point to the second node in the list.
To save yourself the jargon, take a look at the figure below.
Deletion
Just as in insertion, deletion can also be done in different ways, some of which are:
- Deleting the first node
- Deleting the last node
- Deleting the node at (or before, or after, you get it) a given position or value.
An important point of note is that in deletion, we need to store the previous node we visited. The reason for this becomes apparent in the following figure.
While the above cases and their descriptions are highly generalized, it is somewhat intuitive to come up with the logic for other types of insertion and deletion. Drawing figures to visualize each case certainly helps!
That’s pretty much it for linked lists. To understand how linked lists are realized in practice, take a look at the GitHub repository for this exercise. I’m trying to build a plug-and-play library for data structures and algorithms there!
Cheers,
Utkarsh.
Footnotes and References
References:
- Reema Thareja, 2017, Data Structures Using C, 2nd Ed., Oxford University Press.
To read more about linked lists, see:
To read about pointers (which are critical to the structure of non-primitive data structures and form the links or references to nodes), see: