Daily bit(e) of C++ | Linked lists

Šimon Tóth
7 min readMay 21

--

Daily bit(e) of C++ #140, C++ linked lists in interview problems.

Today we will take a short departure from the typical interview content schedule, and instead of an interview problem, we will go over a bit of theory and talk about linked lists.

While rare in practical applications, linked lists crop up frequently in interviews. Partly this is because the node structure lends itself to formulating tricky problems, similar to trees and graphs, without the added topological complexity.

std::list and std::forward_list

The standard library offers two list types, std::list — a doubly-linked list and std::forward_list — a singly-linked list. Forward list exists primarily as a space optimization, saving 8 bytes per element on 64-bit architectures.

Both offer perfect iterator and reference stability, i.e. the only operation that invalidates iterators or references is the erasure of an element, and only for the removed element. The stability does extend even to moving elements between lists.

#include <list>

std::list<int> first{1,2,3};
std::list<int> second{4,5,6};

// Get iterator to the element with value 2
auto it = std::next(first.begin());

// Move the element to the begining of the second list
second.splice(second.begin(), first, it);

// first == {1, 3}, second == {2,4,5,6}

// iterator still valid
// *it == 2

Open the example in Compiler Explorer.

The iterator stability is one of the use cases where we would use a std::list or std::forward_list in practical applications. The only reasonable alternative would be wrapping each element in a std::unique_ptr, which does offer reference stability (but not iterator stability) irrespective of the wrapping container.

#include <vector>
#include <memory>


std::vector<std::unique_ptr<int>> stable;
stable.push_back(std::make_unique<int>(1));
stable.push_back(std::make_unique<int>(2));
stable.push_back(std::make_unique<int>(3));

// get a stable weak reference (or pointer) to an element
int *it = stable[1].get();
stable.erase(stable.begin()); // invalidates all iterators
// it still valid, *it == 2

Open the example in Compiler Explorer.

Of course, we do pay for this stability with performance. Linked lists are node-based containers, meaning each element is allocated in a separate node, potentially very distant from each other in memory. When we combine this with the inherent overhead of the indirection, traversing a std::list can regularly end up 5x-10x slower than an equivalent flat std::vector.

Aside from iterator stability, we also get access to a suite of O(1) operations, and these can potentially outweigh the inherent overhead of a std::list.

#include <list>

std::list<int> data{1,2,3,4,5};

// O(1) splicing between lists, or within one list

// effectively rotate left by one element
data.splice(data.end(), data, data.begin());
// data == {2,3,4,5,1}

// O(1) erase

// iterator to element with value 4
auto it = std::next(data.begin(), 2);
data.erase(it);
// data == {2,3,5,1}

// O(1) insertion

// effectively push_front()
data.insert(data.begin(), 42);
// data = {42,2,3,5,1}

Open the example in Compiler Explorer.

We are locked out of some standard algorithms because std::list only provides bidirectional iterators and std::forward_list only forward iterators. Both lists expose custom implementations of sort, unique, merge, reverse, remove and remove_if as member functions.

#include <list>

std::list<int> data{1,2,3,4,5};

data.reverse();
// data = {5,4,3,2,1}

data.sort();
// data = {1,2,3,4,5}

data.remove_if([](int v) { return v % 2 == 0; });
// data == {1,3,5}

Open the example in Compiler Explorer.

The std::forward_list has an additional quirk; since we can only erase and insert after an iterator, the std::forward_list offers a modified interface.

#include <forward_list>

std::forward_list<int> data{1,2,3,4,5};

// before_begin() iterator
auto it = data.before_begin();

// insert and erase only possible after the iterator
data.insert_after(it, 42);
// data == {42,1,2,3,4,5}
data.erase_after(it);
// data == {1,2,3,4,5}

Open the example in Compiler Explorer.

Custom lists

When implementing a simple custom linked list, you might be tempted to use a straightforward implementation using a std::unique_ptr.

#include <memory>

struct Node {
int value;
std::unique_ptr<Node> next;
};

std::unique_ptr<Node> head = std::make_unique<Node>(20,nullptr);
head->next = std::make_unique<Node>(42,nullptr);
// head->value == 20
// head->next->value == 42

Open the example in Compiler Explorer.

Sadly, this approach isn’t usable. The fundamental problem here is the design. We are mixing ownership with structural information. In this case, this problem manifests during destruction. Because we have tied the ownership with the structure, the destruction of a list will be recursive, potentially leading to stack exhaustion and a crash.

#include <memory>

struct Node {
int value;
std::unique_ptr<Node> next;
};

{
std::unique_ptr<Node> head = std::make_unique<Node>(0,nullptr);
// Depending on the architecture/compiler, the specific number
// of elements we can handle without crash will differ.
Node* it = head.get();
for (int i = 0; i < 100000; ++i)
it = (it->next = std::make_unique<Node>(0,nullptr)).get();
} // BOOM

Open the example in Compiler Explorer.

If we desire both the O(1) operations and iterator stability, the only option is to rely on manual resource management (at which point we might as well use std::list or std::forward_list).

If we want to capture the structure of a linked list with reference stability, we can rely on the aforementioned std::vector and std::unique_ptr combination.

#include <vector>
#include <memory>

struct List {
struct Node {
int value;
Node* next;
};
Node *head = nullptr;
Node *new_after(Node* prev, int value) {
nodes_.push_back(std::make_unique<Node>(value, nullptr));
if (prev == nullptr)
return head = nodes_.back().get();
else
return prev->next = nodes_.back().get();
}
private:
std::vector<std::unique_ptr<Node>> nodes_;
};


List list;
auto it = list.new_after(nullptr, 1);
it = list.new_after(it, 2);
it = list.new_after(it, 3);

// list.head->value == 1
// list.head->next->value == 2
// list.head->next->next->value == 3

Open the example in Compiler Explorer.

The crucial difference from the previous approach is that the list owns all nodes, and the structure is encoded only using weak pointers.

Finally, if we do not require stable iterators or references but do require O(1) operations, we can use a flat list approach. We can store all nodes directly in a std::vector. The only problematic operation, in that case, is erase(), which is inherently linear for std::vector.

However, a linked list already inherently encodes the order of elements, so instead of erasing from the middle of a std::vector, we can always erase from the end by swapping the to-be-erased element with the last element in the std::vector.

#include <vector>

inline constexpr ptrdiff_t nill = -1;

struct List {
struct Node {
int value;
ptrdiff_t next;
ptrdiff_t prev;
};
ptrdiff_t new_after(ptrdiff_t prev, int value) {
storage.push_back({value, nill, prev});
if (prev != nill)
storage[prev].next = std::ssize(storage)-1;
else
head = std::ssize(storage)-1;
return std::ssize(storage)-1;
}
void erase(ptrdiff_t idx) {
// move head
if (idx == head)
head = storage[idx].next;
// unlink the erased element
if (storage[idx].next != nill)
storage[storage[idx].next].prev = storage[idx].prev;
if (storage[idx].prev != nill)
storage[storage[idx].prev].next = storage[idx].next;
// relink the last element
if (idx != std::ssize(storage)-1) {
if (storage.back().next != nill)
storage[storage.back().next].prev = idx;
if (storage.back().prev != nill)
storage[storage.back().prev].next = idx;
}
// swap and O(1) erase
std::swap(storage[idx],storage.back());
storage.pop_back();
}
ptrdiff_t get_head() { return head; }
Node& at(ptrdiff_t idx) { return storage[idx]; }
private:
ptrdiff_t head = nill;
std::vector<Node> storage;
};


List list;
ptrdiff_t idx = list.new_after(nill, 1);
idx = list.new_after(idx, 2);
idx = list.new_after(idx, 3);
idx = list.new_after(idx, 4);
idx = list.new_after(idx, 5);
// list == {1,2,3,4,5}

idx = list.get_head();
list.erase(idx);
// list == {2,3,4,5}

Open the example in Compiler Explorer.

Basic operations

When designing a solution for a coding problem, the three most frequent operations with linked lists are:

  • merge two sorted lists
  • reverse a sorted list
  • scan with two iterators

Both std::list and std::forward_list come with a built-in merge operation.

#include <list>
#include <forward_list>

{
std::list<int> left{2,4,5};
std::list<int> right{1,3,9};
left.merge(right);
// left == {1,2,3,4,5,9}
// right == {}
}

{
std::forward_list<int> left{2,4,5};
std::forward_list<int> right{1,3,9};
left.merge(right);
// left == {1,2,3,4,5,9}
// right == {}
}

Open the example in Compiler Explorer.

However, implementing one from scratch isn’t particularly complicated either. We consume the merged-in list, one element at a time, advancing the insertion position as needed.

#include <forward_list>

std::forward_list dst{1, 3, 5, 6};
std::forward_list src{2, 4, 7};

auto dst_it = dst.begin();

while (!src.empty()) {
if (std::next(dst_it) == dst.end() ||
*std::next(dst_it) >= src.front()) {
dst.splice_after(dst_it, src, src.before_begin());
} else {
++dst_it;
}
}
// dst == {1,2,3,4,5,6,7}
// src == {}

Open the example in Compiler Explorer.

The same situation applies to reversing a list. Both lists provide a built-in reverse operation with a custom implementation even simpler than merge.

#include <forward_list>


std::forward_list<int> src{1,2,3,4,5,6,7};
std::forward_list<int> dst;

while (!src.empty())
dst.splice_after(dst.before_begin(), src, src.before_begin());
// dst == {7,6,5,4,3,2,1}
// src == {}

dst.reverse();
// dst == {1,2,3,4,5,6,7}

Open the example in Compiler Explorer.

Finally, scanning with two iterators is a common search technique for finding a sequence of elements that conform to a particular property. As long as this property is calculated strictly from elements entering and leaving the sequence, we do not need to access the elements currently in the sequence.

#include <forward_list>

std::forward_list<int> data{4,2,1,1,1,3,5};

// find the longest subsequence with sum less than 4

// two iterators denoting the sequence [left, right)
auto left = data.begin();
auto right = data.begin();
int sum = 0;
int len = 0;
int max = 0;

while (right != data.end()) {
// extend right, until we break the property
while (sum < 4 && right != data.end()) {
max = std::max(max, len);
++len;
sum += *right;
++right;
}
// shrink from left, until the property is restored
while (sum >= 4 && left != right) {
sum -= *left;
--len;
++left;
}
}
// max == 3, i.e. {1,1,1}

Open the example in Compiler Explorer.

--

--

Šimon Tóth

I'm an ex-Software Engineer and ex-Researcher focusing on providing free educational content.