Sorting It All Out
One of the classic problems assigned to Computer Science students is sorting a list of elements. At face value it’s not at all an intimidating problem, in fact possible solutions seem fairly straightforward, and we’ve all undoubtedly sorted a list at some point in everyday life. Of course to some extent that’s true. Let’s consider this randomly generated list of 10 integers for example: [7, 8, 6, 2, 2, 8, 10, 4, 3]
Try and sort it in ascending order. At first glance a number of solutions pop-out to me (admittedly my first glance comes after spending the past couple hours reading/coding sorting algorithms):
1.Bring smallest elements to the front (Selection Sort): The simplest approach to me would be to just look for the minimum element and swap it to the front. Then repeat this process looking at the list starting with index one until you reach the final element.
2. Sort the list from left to right (Insertion Sort): Alternatively you can essentially just build up the list one element at a time. Start with index zero . Now add in index one [7, 8]. Then index two [6, 7, 8]. Carry on until you reach the end of the list.
Both these solutions will work and are really quite easy to implement(maybe 10 lines of code at most). But as is typical in Computer Science the challenge really lies in designing a solution that’s reasonably scalable and efficient.
I chose to implement Quicksort for a couple reasons:
a) It exhibits the optimal O(n log n) runtime in the average case
b) I thought the divide and conquer approach it uses was quite elegant
c) It’s the sort used by Java’s Arrays.Sort function which I figured to be an excellent endorsement
Informally the Quicksort algorithm works by randomly selecting an element, referred to as the partitioning element, and then splits the array in (ideally half) by swapping elements less than the partitioning element to the left and greater to the right. Continue partitioning with the left and right groups until the partitions reach length one or zero. The advantage then is with each progressive partition you only need to deal with hopefully half the elements as before. Check out this Princeton article for a more thorough walk-through of how Quicksort works step-by-step. Repeat this approximately n/2 times and we get O(n log n) complexity. Of course in the worst case, when each partition element selected happens to be the smallest or largest element in the grouping we end up breaking the array into chunks of one element and n-1 elements, essentially sorting the array one element at a time in O(n²).
def sort(array, start, end):
p = partition3(array, start, end)
sort(array, start, p-1)
sort(array, p+1, end)
def partition3(A, lo, hi):
pivot = A[hi]
i = lo - 1
for j in range(lo, hi):
if (A[j] < pivot):
temp = A[i]
A[i] = A[j]
A[j] = temp
if (A[hi] < A[i+1]):
temp = A[i+1]
A[i+1] = A[hi]
A[hi] = temp
Results of performance tests for the sorts was, in a word, underwhelming. I tested the sorts both against a small selection of large arrays and a large selection of small arrays. Neither returned favorable results, with Insertion Sort holding a sizable lead over Quicksort in both cases. Here’s the results as run from my laptop.
All in all, it was a sobering lesson on the gap between algorithmic complexity and actual performance. Hopefully I’ll get around to adding some optimizations to the Quicksort, namely switching to Quicksort when the partitions have reached a small input size, and rewriting the partitioning so that the depth of the recursive calls can be bounded to O(log n).