SORTing yourself out

Zheng Jie
9 min read · Apr 3, 2023


Howdy pardner! You may be wondering why on earth such a rudimentary article on Computer Science exists when sites like W3Schools and GeeksForGeeks are a click away. Me too, but as a future hermit code typer who hasn’t touched grass, it is important to revise the basics and stay current. Also because I require a break after 3 hours of literature review.

As someone once said:

Computer Science consists of Data Structures and Algorithms

Just like how Michael van Gerwen throws his darts, the above quote hits the bullseye, and it is undeniable that the crux of this often misunderstood and overcomplicated subject can be boiled down to those two things: Data Structures and Algorithms.

Thus, welcome to KrashMojo’s Top 10, and without further ado, let me share just some of the sorting algorithms that I’ve learnt and seen during my tour as an internet user.

Time and Space Complexity

Every employee in the history of companies has had some sort of performance indicator or metric for superiors and higher-ups to gauge how well they are doing or whether they require a change of roles (or career :( ). Using this analogy, we therefore require a metric to evaluate the speed of sorting algorithms. If we know the time and space complexity of these algorithms, we can make more informed decisions on which algorithm to use based on the intended use case. For example, if we require a sorting algorithm for a search engine (like Google), we obviously wouldn’t use an algorithm that is slower than Biden’s thought process and speech formulation and takes up more space than Nikocado Avocado.

The notion of complexity is actually a mathematical concept and is commonly notated as O(g(n)), or Big-O notation, where g(n) represents a function of n. n here refers to the size of the input (or how many items we want to sort in this case). Big-O notation essentially answers: in the worst case, how does my algorithm’s resource usage grow with respect to the size of the input? A time complexity of O(n), for example, means the upper bound of the running time of the algorithm is linear with respect to the size of the input (when the input size doubles, the worst-case running time doubles), whereas a space complexity of O(log n) means that the upper bound of the space required (to store variables, recursive calls and such) increases by one unit per twofold increase in input size.

Graphical representation of different complexities (https://danielmiessler.com/study/big-o-notation/)
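
To make this less abstract, here is a minimal sketch (a toy example of mine, not tied to any particular library) contrasting a linear scan, whose worst-case running time grows linearly with the input, with a binary search over a sorted list, whose worst-case running time grows logarithmically:

def linear_search(items, target):
    # O(n) time: in the worst case we look at every element once
    for idx, value in enumerate(items):
        if value == target:
            return idx
    return -1

def binary_search(sorted_items, target):
    # O(log n) time: each comparison halves the remaining search range
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

data = list(range(1_000_000))
print(linear_search(data, 999_999))  # up to ~1,000,000 comparisons
print(binary_search(data, 999_999))  # at most ~20 comparisons (log2 of 1,000,000)

Doubling the size of data roughly doubles the work for the first function, but only adds one extra comparison for the second.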

A more concise way would be to think of it as such:

Complexity = Number of steps * Time/Space per step

The quantity (time or space) may vary from step to step, so for more thorough calculations it is unwise to assume a constant quantity per step. A common assumption for Python users is that all basic operations (addition, subtraction, multiplication, division, length, etc.) take O(1) time, with tuple and list concatenation being notable exceptions.
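
As a quick sanity check of that assumption, here is a small sketch (the timings are purely illustrative and will differ on your machine) showing that building a list with append is far cheaper than rebuilding it with concatenation, precisely because concatenation copies the whole list every time:

import time

def build_with_concat(n):
    result = []
    for i in range(n):
        result = result + [i]  # creates a brand-new list each time: O(n) per step
    return result

def build_with_append(n):
    result = []
    for i in range(n):
        result.append(i)  # amortised O(1) per step
    return result

for builder in (build_with_concat, build_with_append):
    start = time.perf_counter()
    builder(20_000)
    print(builder.__name__, round(time.perf_counter() - start, 4), "seconds")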

Big-O notation is also simplified, i.e. all coefficients and less significant powers of n are eliminated. If we have a complexity of O(4n + 5), it would simplify to O(n). Logarithm bases are also dropped, since different bases only differ by a constant factor (base 2 is the usual convention, given how often we halve or double things in computing).

If you would like to know more, another flavour of time complexity (that I recently found out about) is pseudo-polynomial time, where the running time looks constant with respect to the number of inputs, but is actually polynomial in the magnitude (numeric value) of the input rather than its size in bytes. A notable example is a naive Greatest Common Divisor (GCD) function: it always takes exactly two numbers as input, but its running time grows with how big those numbers are, giving a pseudo-linear complexity of O(min(a, b)) where a and b are the starting inputs.
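
Here is a minimal sketch of what I mean by a naive GCD (a deliberately slow toy version written for illustration; Python’s built-in math.gcd uses Euclid’s algorithm and is far faster):

def naive_gcd(a, b):
    # Assumes a and b are positive integers (a toy sketch, not a library function).
    # The loop runs at most min(a, b) times, so the running time grows with the
    # *value* of the inputs, not with how many inputs there are (always two).
    for candidate in range(min(a, b), 0, -1):
        if a % candidate == 0 and b % candidate == 0:
            return candidate

print(naive_gcd(12, 18))  # 6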

Selection Sort

Commonly known as the “Poker Card Sort”, it resembles how you sort poker cards at a fundamental level (duh).

The algorithm walks through the entire input or “hand of cards”, selects the smallest-valued item, and puts it into a “sorted” hand. This process repeats until all items or “cards” are in the sorted hand, and only then are the items deemed sorted. In the implementation below, rather than a literal second array, the sorted “hand” is just the front portion of the same array, built up by swapping the smallest remaining item into place. Note that “sorted” here means ascending order.

def selection(arr):
    length = len(arr)

    for start in range(length - 1):
        # assume the item at the current start is the smallest, then scan the rest
        smallest_idx = start
        smallest = arr[start]
        for idx in range(start + 1, length):
            if arr[idx] < smallest:  # compare to find the smallest value
                smallest_idx = idx
                smallest = arr[idx]

        # swap start and smallest so the front part of arr is always sorted
        arr[start], arr[smallest_idx] = arr[smallest_idx], arr[start]

    return arr
Visual representation of selection sort: https://codepumpkin.com/selection-sort-algorithms/

This algorithm runs in O(n²) time and O(1) space. Worse still, selection sort performs the same number of comparisons regardless of how the input is arranged (boooo), since it has to “walk” to the end of the unsorted portion every iteration. This results in an arithmetic sum of the number of elements iterated through, and since iteration and comparison are O(1) operations:

O(1) * [n + (n-1) + … + 1] = O(1) * [n * (n + 1) / 2],

simplified to → O(1) * (n²) = O(n²)

Another way to think about it is that the outer for loop runs n - 1 times, and the inner loop runs at most n - 1 times, thus the total time would be (n - 1)² * O(1), which simplifies to the same result.
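
If you don’t trust the arithmetic, here is a small sketch (an instrumented copy of the function above that I wrote for this article) counting the comparisons on a 10-element array:

def selection_count(arr):
    comparisons = 0
    length = len(arr)
    for start in range(length - 1):
        smallest_idx = start
        for idx in range(start + 1, length):
            comparisons += 1  # one comparison per inner-loop step
            if arr[idx] < arr[smallest_idx]:
                smallest_idx = idx
        arr[start], arr[smallest_idx] = arr[smallest_idx], arr[start]
    return comparisons

n = 10
print(selection_count(list(range(n, 0, -1))))  # 45, i.e. 9 + 8 + … + 1
print(n * (n - 1) // 2)                        # 45, the closed form of that sum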

Bubble Sort

Ever wanted to feel like a kid again? With bubble sort, you most certainly cannot (sike) relive your childhood considering you’re subjecting yourself to this pain and misery.

The algorithm for bubble sort is simple:

  1. Compare the first two values
  2. If the first value is larger than the second, swap them and note the swap
  3. Repeat for each adjacent pair of values along the array
  4. Terminate when a full pass through the array produces no swaps
def bubble(arr):
    swap = 1  # set to 1 to enter the while loop
    while swap > 0:
        swap = 0  # swap variable to count swaps
        for idx in range(len(arr) - 1):
            if arr[idx] > arr[idx + 1]:  # check if the number after is smaller
                arr[idx], arr[idx + 1] = arr[idx + 1], arr[idx]
                swap += 1
    return arr

We have a variable swap to keep track of how many swaps occur per “walkthrough” of the array, and if swap is 0 after traversing the array, the loop terminates and the sorted array is returned.

Fun Fact: The reason bubble sort is named as such is that the higher-valued items “bubble up” and accumulate towards the top (the end of the array).

The time and space complexities are O(n²) and O(1) respectively. (Thanks to the swap counter, though, an already-sorted array takes just one pass, so the best case is O(n).)
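
To see that best case in action, here is a small sketch (an instrumented copy written for this article, not part of the algorithm) counting full passes over the array:

def bubble_passes(arr):
    passes = 0
    swap = 1
    while swap > 0:
        swap = 0
        passes += 1  # one full walkthrough of the array
        for idx in range(len(arr) - 1):
            if arr[idx] > arr[idx + 1]:
                arr[idx], arr[idx + 1] = arr[idx + 1], arr[idx]
                swap += 1
    return passes

print(bubble_passes(list(range(10))))         # 1 pass: already sorted, the O(n) best case
print(bubble_passes(list(range(10, 0, -1))))  # 10 passes: reversed input, the O(n²) worst case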

Insertion Sort

As any competent armed force would say: “Insertion of troops into enemy territory can help win battles”, and so can insertion sort. As the name implies, each item is “inserted” into its correct position in the order so far.

  1. If an element is smaller than the element before it, swap them.
  2. If the element, after one swap, is still smaller than the one now before it, swap again. Keep swapping until the element is no longer smaller than the one before it, or until it reaches the start of the list.
  3. Repeat for all elements.
def insertion(arr):
    for counter in range(1, len(arr)):

        while counter > 0:  # ensure the element doesn't go past the start
            before = counter - 1
            if arr[counter] < arr[before]:
                arr[counter], arr[before] = arr[before], arr[counter]  # swap
                counter -= 1  # new position of the element
            else:
                # stop once the element before is no longer larger
                break

    return arr
Visual representation of insertion sort: https://upload.wikimedia.org/wikipedia/commons/c/c8/Bubble-sort-example-300px.gif

The time and space complexities are (again) O(n²) and O(1) respectively, though if the array is already nearly sorted, the inner while loop exits almost immediately, so insertion sort’s best case is closer to O(n).

Spicy Sort

Hey fellow reader! Tired of quadratic-time sorting algorithms? Well, try our new and improved “Merge Sort”! Now with O(n log n) running time, this new and improved sorting algorithm can help you SORT yourself out in double-quick time! (SORT of).

The underlying concept of Merge Sort relies on the “Divide and Conquer” style of problem-solving and on recursion (which I dread, admittedly). Here is what it entails:

  1. If the array is one element long, return the array
  2. Divide the array in half
  3. Sort the left and right halves (by calling merge sort on each)
  4. Merge the two sorted halves in ascending order

The merge step (step 4 above) is rather interesting. Starting with two sorted arrays, left and right:

  1. If the first value in the left array is smaller, add that value first (since we want an ascending order). Otherwise, add the first value in right
  2. Remove that first element in left or right (whichever it came from) and repeat until at least one array is empty
  3. Since we can be assured at least one array is empty after the above process, proceed to join the non-empty array (if it exists) with the final array.
  4. Voila!
def merge_sort(arr):
    if len(arr) <= 1:  # an empty or single-element array is already sorted
        return arr

    mid = len(arr) // 2
    left = merge_sort(arr[:mid])  # sort the left half with itself
    right = merge_sort(arr[mid:])  # likewise

    final = []  # sorted list
    while left and right:
        if left[0] <= right[0]:  # compare the fronts of left and right
            final.append(left.pop(0))
        else:
            final.append(right.pop(0))
    final.extend(right)  # pick up the residuals, if any
    final.extend(left)  # we can be assured at least one of these is empty
    return final

Delightfully, the time complexity is a blessed O(n log n). Here’s why.


The function merge_sort splits the array in half, then calls itself on each half. The time complexity is thus determined by how many times merge_sort is called and how much work each call does (recall that complexity = number of steps * time/space per step). The function stops calling itself recursively only when the array has at most one element, so the array gets split log n times (think powers of 2). If each call to merge_sort took only O(1) time, wouldn’t the time complexity then be O(log n)?

Well hold on, skinny cheeks. Remember that merge_sort also merges the two halves! So a call of merge_sort on n elements does about n units of work in its merge step, meaning the cost for that specific call would be n! (that’s an exclamation mark, not a factorial, which would be catastrophic). Thus, considering all function calls:

O(n) + [O(n/2) + O(n/2)] + [O(n/4) + O(n/4) + O(n/4) + O(n/4)] + … + O(1)

Notice that there aren’t just log(n) terms, but log(n) levels of splitting (each level denoted by a set of square brackets). Within each of those log(n) levels, the O(g(n)) terms add up to O(n). Thus, since there are log(n) of these O(n) levels:

O(n) + O(n) + O(n) + … + O(n) = O(n) * log(n) = O(n log n)

How interesting! Isn’t recursion fun? (sarcasm detected)
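
If you’d like to see that result concretely, here is a small sketch (an instrumented variant of merge sort I wrote for this article, not the canonical implementation) that counts how many elements get written during all the merges and compares the total against n * log2(n):

import math

def merge_sort_counted(arr):
    # Returns the sorted array and the number of elements written during merges.
    if len(arr) <= 1:
        return arr, 0
    mid = len(arr) // 2
    left, left_ops = merge_sort_counted(arr[:mid])
    right, right_ops = merge_sort_counted(arr[mid:])
    merged, ops = [], left_ops + right_ops
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
        ops += 1
    ops += (len(left) - i) + (len(right) - j)  # leftovers get written too
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, ops

n = 1024
_, ops = merge_sort_counted(list(range(n, 0, -1)))
print(ops)                    # 10240
print(n * int(math.log2(n)))  # 10240: n elements written at each of the log2(n) levels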

BONUS ROUND DING DING DING

Just for the fun of it, here is Bogosort:

import numpy as np

def bogosort(arr):
    arr = np.array(arr)
    # keep shuffling until the array happens to come out in ascending order
    while not np.all(arr[:-1] <= arr[1:]):  # in one universe...
        np.random.shuffle(arr)  # shuffles in place (it returns None)

    return arr  # siuuuuu

Bogosort is THE MOST EFFICIENT algorithm at sorting. It randomly shuffles the elements in the array and prays it gets it right. If so, it returns the sorted array; if not, it shuffles again. (On average it needs about n! shuffles, and each check takes O(n) time, so the expected running time is O(n · n!), with no upper bound at all in the worst case.)

There is an ancient saying dating back to Sun Tzu and his Art of War:

“In some unfathomable but fathomable universe in amongst the multiverses, there will be at least one universe where Bogosort has time and space complexities of O(1)”

And to end on that note, I shall quote the famous Albert Einstein, a year after his “miracle year”:

“Just like life, I’d rather wait for the special moment where Bogosort works than slave away at O(n log n) sorting algorithms.”

Thank you, and Toodaloo!

