Skip List

What is a skip list?

5 min readJun 25, 2018

Skip list is a quite new data structure, which was first described in 1989 by William Pugh. Simply speaking, skip list is a data structure that allows fast search within an ordered sequence of elements. The base level of skip list maintains all elements, and each higher level will have less elements.

In general, skip list has following advantages:

Skip list performs well on rapid insertions.
Easy to implement.
It allows you to retrieve the next element in constant time.
This data structure can be modified to some other data structures, such as trees, indexable skip lists, etc.

An simple skip list

A skip list consists of multiple linked lists. Search could take O(n) for both sorted and unsorted linked lists. Will skip list do a better job? Well, before analyzing time complexity of search operation, let us see how an simple skip list looks like.

In this example, there are two ordered linked lists. The base one , L₀ contains all elements. The upper layper L₁ contains four elements and each element in L₁ has a pointer, which points to the element of L₀ with the same value.

Search(x)

Before searching for an element in a skip list, we could convert it to a binary search tree by rotating and removing duplicate edges to get rid of circles.

This is just like a normal binary search tree, and search could be performed from the root. The only issue is that there are duplicate nodes, the duplicate node always go to the left subtree in this case.

Now, let us look at how a search operation is done on a skip list.

In Fig. 3, the red number represents the order of all steps required by one search operation. We always start from the first element in the highest layer, which is 14 of L₁. If searching for target 66, since 14 is less 66 and next node 34 is also less than 66, we go to node 34 in L₁. At node 34, the next node 42 is still less than 66, so we move to 42 in L₁. Now, we find out that next node 79 is larger than our target, which means we are going to overshoot if still searching in L₁. So we move node 42 of L₀. By taking similar actions, our algorithm will find target 66 in L₀. The search operation in a two level skip list could be summarized in following steps:

Walk right in the top level linked list L₁ until next node is larger than (or equal to) our target.
Walk down to L₀ and walk right in L₀ until the element is found or not.

Time complexity if search(x)

As we mentioned earlier, skip list is a data structure that allows fast search within an ordered sequence of elements. In our previous example in Fig. 3, assuming that the length of L₁ is |L₁| and length of L₀ is |L₀|, then we can say that the search operation approximately take |L₁|+|L₀|/|L₁|. Imagine each node of L₁ skips the same number of nodes of L₀, so that each node of L₁ skips |L₀|/|L₁| nodes. The worst scenario is moving down to L₀ aftering checking the last node in L₁ and find the target in L₀ at the last position. The worst time cost is |L₁|+|L₀|/|L₁|, which is minimized when |L₁|=|L₀|/|L₁|. The minimal time cost of worst case is 2∗√N, N is the length of |L₁|.

What if there are more than two sorted lists? Assume that we construct all higher layer lists based on the same relationship: |Lᵤ|=|Lᵥ|/|Lᵤ| for all possible u=v+1.

If there are 3 sorted lists, the time cost for search is: 3∗∛N

If there are 4 sorted lists, the time cost for search is: 4∗∜N

…

If there are n sorted lists, the time cost for search is: ⁿ√N

…

If there are ㏒(N) sorted lists, the time cost for search is: ㏒(N)∗㏒(N)th root of N, which is 2∗㏒(N). Skip list does do a better job than a normal linked list in search.

Insert(X)

Currently, our ㏒(N) time complexity for search is achieved based on a static skip list. If we insert or delete some nodes, the whole structure of skip list might be broken and ㏒(N) might go away. How do we insert(x) or delete(x) properly? Deletion is much simpler than insertion. When deleting an element, you first search for x. Once you find x, then you delete x in current and lower level of lists. On average, deletion has the same time complexity with search, which is ㏒(N). Insertion might be more complicated and more interesting. Let us talk about it in detail here.

This skip list is a little more complicated, which consists of 4 sorted lists.

The Pseudocode code for insert(x):

To insert an element x into a skip list, search(x) first to see where x fits into the bottom list. (Always insert into bottom list).
We will insert x into some of the upper level lists by flipping a fair coin. while getting head: Promote x to next level up

Take insert(67) as a running example. 67 will be certainly inserted into the base list. And from any level up, we are going to flip a fair coin to decide if we want to promote 67 to upper level. If we get head, we insert 67 to upper level. If we get tail, we stop the algorithm. In this specific example, we are so luckly that we have 4 heads, then we add 67 to 4 upper lists.

Disadvantages of skip list

Skip list does do a better than in search compared to linked list. But an obvious downside is taking up to0 much space. Let’s assume there are N elements, then the heighest node is approximately n=㏒(N), that gives us approximately 1 + ∑ⁿₒ 2⁻ⁱ N ≈ 2N.

Another concern is that when you need to move down to the lower level by following a pointer, which possibly give you cache miss. An simple fix is that use a fix-sized array, which could better make use of locality.

Conclusion

Skip list is a very efficient data structure by applying small tweaks, compared to other data structures, such as Binary Search Tree, Linked list, etc. Depending on different problems we want to solve, we could apply minor tweaks to achieve significant speed-up.