You probably know what binary search is but WTF is Eytzinger order?
You might have heard of a data structure called binary heap. It is a binary tree represented as an array, where the parent is at position
i, the left child is at position
i*2+1 and the right child is at position
Eytzinger order is a binary heap, where the value of left child is smaller than the parents value and the value of right child is bigger than the parent. This means that an array sorted by Eytzinger order is a binary search tree stored as a heap.
How is Eytzinger order better than Binary search?
When we traverse the Eytzinger ordered array. all the jumps are going in the same direction. This gives the CPU posibility to prefetch data. Binary search on the contrary jumps “randomly” back and forth in the array, making prefetching hard.
Also Eytzinger search is a very simple algorithm. Here is my implementation of it in Rust:
The algorithm has a minimal amount of branching and uses very simple arithmetic operations. If you are confused by
(candidate < value) as usize. It just converts
0 resulting in
index * 2 + 1 + 0 for left child traversal and
1 resulting in
index * 2 + 1 + 1 for right child traversal.
This kind of simplicity pays off. I did some performance benchmarks:
And the results look as following (the numbers are nano seconds, the search was performed on
The Eytzinger search is consistently faster than binary search, sometimes more than 3x.
How is Eytzinger order worse than Binary search?
For Binary search we need the array to be in ascending order. For Eytzinger search we need it in Eytzinger order. The ascending order provides us with couple of benefits. Given an index of value
X in a sorted array we can safely assume that all the values to the left are smaller equal the value
X and all values to the right are bigger equal. With Eytzinger order it is also possible to find values smaller / bigger than
X, but the computation will be much more complex. With Eytzinger order it will be also much harder to find all indexes for value
X, given we allow duplicate entries in the array. With Eytzinger search it is simple to find the first one, but finding all, results in complex sub-sequential traversals.
Last but not least the sorting process itself is well studied and understood for ascending order. There are multiple sorting algorithms, which perform in
O(log n). And can do it in-place, without the need for additional memory allocation.
I could not find any algorithms, which turn a random array to an Eytzinger order directly, so I tried to write my own. After multiple attempts I managed to invent 3 in-place sorting algorithms, which have a rather disappointing runtime performance and I would not recommend to use them on arrays larger than 50 elements. The technique, which seems to scale well (at least in runtime performance) is to use Rust
sort_unstable method, clone vector and copy elements from source vector to the new one according to index mapping.
By index mapping, I mean a mapping, which defines where an element at index
i in the sorted array should be placed in the Eytzinger array. The mapping is based on the size of the vector. For example a vector with 3 elements has a mapping
[1, 0, 2]. Meaning that elements at index
1 in sorted array should be at index
0 in Eytzinger array. The element at index
0 in sorted array should be at index
1 in Eytzinger array. And element at index
2 is at index
2 in both arrays.
Here is the index mapping for vectors of size 1 to 10:
[1, 0, 2],
[2, 1, 3, 0],
[3, 1, 4, 0, 2],
[3, 1, 5, 0, 2, 4],
[3, 1, 5, 0, 2, 4, 6],
[4, 2, 6, 1, 3, 5, 7, 0],
[5, 3, 7, 1, 4, 6, 8, 0, 2],
[6, 3, 8, 1, 5, 7, 9, 0, 2, 4]
For calculation of the index mapping I used following crate:
The author wrote an interesting algorithm to compute the mapping based on the length of the vector. If you are interested how it works in details, please browse the source code.
The code from my field trip into Eytzinger land can be found on Github:
I implemented a simple find first element algorithm in a slice which is sorted in Eytzinger order. The simple algorithm…
There you can also find the 3 in-place eytzinger sort algorithms, which I called March-Sort, Waltz-Sort and Traversal-Sort. Please let me know if you invented a faster in-place Eytzinger sort algorithm ☺️.