Sorting in Rust: Selection, Insertion, and Counting Sort
Sorting is an invaluable skill and often covered early in a computer science curriculum. Have you ever tried to look up a friends phone number in an unsorted list!? You’d have to look at every single entry. Sorting creates all sorts of ways to access data quicker.
Although languages provide inbuilt sorting functions, writing your own can sometimes produce faster code. Sorting also provides some excellent algorithms to implement in Rust.
Today you’ll implement some fundamental sorting algorithms in Rust and see an example of a really fast sort. If you have never set up Rust and want to follow along, check out this article first: “Beginner Bites: A taste of Rust, a safe, concurrent and practical language!”
Selection Sort
This sort is rarely used in practice, but is often taught first as it’s easy to understand.
The algorithm views the list as two pieces. A sorted part, and an unsorted part. At the start the sorted part is empty, and the entire list is the unsorted part. One at a time the smallest element in the unsorted part is moved to the end of the sorted part.
Visually it looks something like this:
[][4,2,1,3]->[1][2,4,3]->[1,2][4,3]->[1,2,3][4]->[1,2,3,4][]
When implementing the algorithm we only use a single list, and can swap the elements into the last position of the sorted part. The above example shown with swaps:
(swap 1 and 4) (no swap) (swap 3 and 4)
[4,2,1,3] -> [1,2,4,3] -> [1,2,4,3] -> [1,2,3,4]
You can imagine the sorted list building up.
This is a slow algorithm, because every time you want to put something in the sorted list, every element in the unsorted portion must be compared to find the minimum element.
The pseudocode is as follows:
selection_sort(list):
for i <- 0..n:
small = i
for j <- i+1 .. n:
if list[j] < list[small]:
small = j
// swap smallest onto end of sorted portion
list[small], list[i] = list[i], list[small]
The i
variable always points to the end of the sorted portion of the array, whilst the j
variable traverses the unsorted portion of the list finding the smallest element.
We can do a straight conversion to Rust:
fn selection_sort(list: &mut [i64]) {
for i in 0..list.len() {
let mut small = i;
for j in (i + 1)..list.len() {
if list[j] < list[small] {
small = j;
}
}
list.swap(small, i);
}
}
selection_sort
takes a single argument list
with the type &mut [i64]
. We’ll make this more general later, but for now we’re only taking in slices of 64 bit integers. The &mut
tells the caller of the function that we’ll be mutating their slice. In Rust mutability is explicit.
The 0..list.len()
syntax defines a range between 0 (inclusive) and the length of the list (exclusive) or [0..n)
.
Items are swapped using the swap
method.
This function can be improved if we allow more than just slices of i64’s. This function should be able to sort anything that can be comparable with <
, >
, and ==
. This can be accomplished using the Ord
trait.
A trait is sort of like an interface, saying that a particular type has particular methods. If you implement the Ord
trait on a type, you’re basically defining certain methods which are called when using the various comparison operator. Implement Ord
on your own types and learn about Ord
here.
You can change our function’s type signature to reflect this:
fn selection_sort<T: Ord>(list: &mut [T]) {
// ... code here ...
}
This function now accepts any type T
, provided the type T
implements the Ord
trait.
Now our function can be used more flexibly, but it’s still a bit ugly. Iterators can make the inner loop more expressive.
Replace the body of the function with:
for i in 0..list.len() {
if let Some((j, _)) = list.iter()
.enumerate()
.skip(i)
.min_by_key(|x| x.1) {
list.swap(i, j);
}
}
There’s a lot of Rust goodness here. Let’s go through the iterator first and leave the if
statement till last. iter()
gives us an iterator over the elements in the list
. Now we have access to all these methods!
enumerate
packages up each item in the iterator in a tuple with the item’s index:
list.iter(): [4, 3, 1, 2]
list.iter().enumerate(): [(0,4), (1,3), (2,1), (3,2)]
skip(i)
skips the sorted part of the list.
min_by_key
takes a closure, allowing you to specify what part of the item you want to find the minimum value in. In this situation we want the element at tuple index 1
. (Otherwise you’re finding the minimum by the index in the iterator.) x.1
gives the tuple element at index 1
.
min_by_key
returns an Option
. The Option
type can either be something or nothing. This is a way to never deal with null in Rust. If a function can ever return nothing, then the function will return something of type Option
. This can happen when finding the minimum of an empty list.
The if
statement pattern matches on Some(<value in here>)
. In this case specifically matching the first element of the tuple, which is the index we want to swap.
And that’s selection sort.
🤔 What happens if you run the algorithm on a sorted list? What happens if you run the algorithm on a partially sorted list? 🤔
Insertion Sort
Insertion sort is a great when your list is close to being sorted, or you’re sorting elements as they come in. Like the previous sort, insertion sort can be thought of as two separate lists. The left list is a sorted partial result, and the right list is unsorted. Insertion sort works by inserting the first element of the unsorted list into the correct position in the sorted partial list, thus growing the partial solution.
(2 inserted) (1 inserted) (3 inserted)
[4,2,1,3] -> [2,4,1,3] -> [1,2,4,3] -> [1,2,3,4]
[4][2,1,3]-> [2,4][1,3]-> [1,2,4][3]-> [1,2,3,4][]
Above is a similar example as the selection sort example, and below is a lovely gif from wikipedia.
🤔 What happens if you run insertion sort on a sorted list? What if the list is almost sorted? Is it more or less efficient than the Selection Sort? 🤔
Insertion sort can be implemented in the following way:
fn insertion_sort<T: Ord>(list: &mut [T]) {
for i in 1..list.len() {
for j in (1..i + 1).rev() {
if list[j - 1] <= list[j] { break; }
list.swap(j - 1, j)
}
}
}
I ran a quick benchmark over the two sorts using cargo bench
and got the following results:
test insertion ... bench: 86,197 ns/iter (+/- 58,163)
test selection ... bench: 100,634 ns/iter (+/- 15,839)
Just by eyeballing these you can see that the variation in running times is much higher for insertion sort, but it is quicker on average. These benchmarks are informal and running against random lists.
Counting Sort
This is a very fast sort for specific types of data. Counting sort is able to look at each element in the list exactly once, and with no comparisons generate a sorted list.
Sadly this algorithm can only be run on discrete data types. This means you know exactly what data you’ll encounter. The algorithm then maintains a tally of elements it has seen as it moves over the list. From this tally we can build a sorted list.
For example, if you have a long list containing only 0’s and 1’s. Count the 0’s and the 1’s, and then build a new list. This can be done without comparing elements.
Here’s an example of the counting sort on a list of 0’s and 1's:
pub fn count_sort_binary(list: &mut [u8]) {
let (zero_count, one_count) = list.iter()
.fold((0, 0),
|(zero, one), &el| {
if el == 0 {
(zero + 1, one)
} else {
(zero, one + 1)
}
});
for i in 0..zero_count {
list[i] = 0;
}
for i in zero_count..list.len() {
list[i] = 1;
}
}
I did another quick informal benchmark, testing the three algorithms against one another. I also added Rusts inbuilt .sort()
method to the benchmarks. This time sorting random binary lists.
Note: These benchmarks include the time taken to generate large lists as well as the sorting. Because the list generation is the same for each benchmark, I haven’t deducted that time.
test count ... bench: 2,350,004 ns/iter (+/- 368,795)
test inbuilt ... bench: 3,079,697 ns/iter (+/- 396,763)
test insertion ... bench: 9,515,089 ns/iter (+/- 1,499,061)
test selection ... bench: 51,361,635 ns/iter (+/- 21,383,817)
The results speak for themselves, although wow Rust has a quick inbuilt sort.
Have a go at implementing your own counting sort algorithm with the problems below:
Counting sort 1: https://www.hackerrank.com/challenges/countingsort1/problem
Counting sort 2: https://www.hackerrank.com/challenges/countingsort2/problem
Full counting sort: https://www.hackerrank.com/challenges/countingsort4/problem
Extension
Check out the radix sort to sort large numbers using your counting sort skills.
If you liked what you read, I’d love it if you showed it 👏👏👏. I love feedback so please leave a comment or message me on Twitter. I also love your ideas. If there is a blog post you wished existed on Rust, please get in contact with me.
Follow/message me on Twitter: @youCodeThings
❤ Thank you for reading! ❤
Edit: Thanks to Reddit users stjepang and JohnnyLight416 for corrections.