Skip Lists: A Randomized Data Structure for search and update

Published in

Techspace

6 min readSep 6, 2019

We already know that a sorted array will allow Ο(log n) time searches via the Binary Search algorithm. Unfortunately, update operations in a sorted array are computationally expensive i.e. O(n) because of the need to shift elements, while deleting or inserting a particular element in the array.

Linked List might help with these update operations as long as the positions of the elements is known in the list. With Linked List we can achieve update operations in constant amount of time just by manipulating pointers.

But, there is a problem with Linked List(as you might have already anticipated)..

We can’t perform Binary Search in a Linked List due to inability of direct accessing any element within the linked list. I mean we cannot jump to any index in a constant time, instead we have to traverse the Linked List to reach the desired position.

Now, you have understood the problem, Let’s find out the solution for it.

The Data Structure we are looking for, as our solution is Skip List.

Skip List is a clever compromise(in terms of space complexity) to efficiently support search and update operations. A skip list of the given items is a series of linked lists {S₀, S₁, …., Sₕ}. Each list Sᵢ stores a subset of the given items in increasing order, plus items with two sentinel values -∞ and +∞. List S₀ contains all the elements given and Sₕ only contains sentinel values. For i = 1, 2, 3, …, h-1, list Sᵢ contains(in addition to -∞ and +∞) a randomly generated subset of the items in list Sⱼ where j = i -1.

example of a skip list storing 10 elements.

Previously, I told you that we will be generating Sᵢ based on the randomly picking the elements of Sⱼ (j = i-1).

So, let’s clear it out..

We flip a coin, for each item in Sᵢ and place that item in Sₖ where k = i+1 if the coin comes up as “head”. Every item in the linked list is expected to be picked up with the probability of 1/2.

You might be thinking that’s a weird way to select an item for the list.. and yes you are right, but think for a second can you tell me a better way of choosing approximately half elements of the previous list to be put in the current list.

If we do that, we expect S₁ to have about n/2 elements, S₂ to have about n/4 elements and in general Sᵢ contains about n/2^i items.

A coin flip can be simulated with Python’s built-in pseudo-random number generator from the random module by calling randrange(2), which returns 0 or 1, each with probability 1/2.

I hope, you started to notice where we can introduce the ability of doing Binary Search within Skip list.

But, in case you don’t, I got you covered.

Before we start describing the methods for searching and insertion we need some helper functions:

next(p): Return the position following p on the same level.
prev(p): Return the position preceding p on the same level.
below(p): Return the position below p in the same tower.
above(p): Return the position above p in the same tower.

here, a tower is a vertical linked list.

Searching in a Skip List (O(log n))

Suppose we are searching for element k. We begin the SkipSearch method by setting a position variable p to the topmost, left position in the skip list S, called the start position of S. That is, the start position is the position of Sₕ storing the special entry with value −∞. We then perform the following steps, where value(p) denotes the value of the element at p:
1. If S.below(p) is None, then the search terminates — we are at the bottom and have located the element in S with the largest value less than or equal to the element k. Otherwise, we drop down to the next lower level in the present tower by setting p = S.below(p).
2. Starting at position p, we move p forward until it is at the rightmost position on the present level such that value(p) ≤ k. We call this the scan forward step. Note that such a position always exists, since each level contains the values +∞ and −∞. It may be that p remains where it started after we perform such a forward scan for this level.
We repeat these steps again and again till S.below(p) is None.

Insertion in a Skip List (O(log n))

Suppose we want to insert an element k, first we call the function SkipSearch(k). This gives us the position p of the bottom-level item with the largest value less than or equal to k (note that p may hold the special element with value−∞). If value(p) = k, this is the trivial case and the value is overwritten with itself. Otherwise, we need to create a new tower for element k. We insert k immediately after position p within S₀. After inserting the new element at the bottom level, we use randomization to decide the height of the tower for the new element. We “flip” a coin, and if the flip comes up tails, then we stop here. Else (the flip comes up heads), we backtrack to the previous (next higher) level and insert k in this level at the appropriate position. We again flip a coin; if it comes up heads, we go to the next higher level and repeat. Thus, we continue to insert the new element k in lists until we finally get a flip that comes up tails.
We link together all the references to the new element k created in this process to create its tower.

The algorithm uses an insertAfterAbove(p, q, k ) method that inserts a position storing the element k after position p (on the same level as p) and above position q, returning the new position r (and setting internal references so that next, prev, above, and below methods will work correctly for p, q, and r). The expected running time of the insertion algorithm on a skip list with n entries is O(log n).

After reading about Insertion, you might think what happens if the coin always tossed “head” because in that case we have h = ∞, which is not feasible since we can’t have memory and computational capability for inserting infinite values.

Let me prove to you that probability of happening something like this is nearly impossible,

According to Warmup Lemma:

The number of levels in n element skip list is O(log n) with high probability. We are going to prove it,

The probability of getting i consecutive heads when flipping coin, is 1/2^i.

Therefore, the probability that level i has at least one position is at most:

Pᵢ ≤ n/2^i

The probability that the height h of S is larger than i is equal to the probability
that level i has at least one position, that is, it is no more than Pᵢ . This means that h is larger than say c.logn (where c > 1) with the probability at most,

Pₕ ≤ n/2^(c.logn)

Pₕ = 1/n^(c-1)

the probability that h is smaller than c log n is at least 1 − 1/n ^(c−1) .
Thus, with high probability, the height h of S is O(log n).

Space Complexity of Skip List

From above, we know that the number of positions at level i is n/2^i, which means that the expected total number of positions in S is:

Therefore the space complexity of skip list is O(n).