Published in

# Timsort: The Underrated Sorting Algorithm

In the world of computers, well defined implementable instructions with an aim to perform a computation are referred to as algorithms. Their usage can range from performing calculations to processing the data to a lot of other tasks. It is derived from a Latin word ‘Algoritmi’.

Algorithms are generally designed independent of language so that they can be implemented in any language to yield the desired output. However, one can even find some language specific algorithms. An algorithm usually have well defined inputs and outputs and are designed in such a way that they do not end up in an infinite loops or other similar situations.

# What is Sorting?

Sorting is one of the most used and popular algorithms out there. It can be defined as the process of arranging items systematically. Terms such as ordering or categorizing often refer to this same process. Almost every computational application ranging from a simple “To-do list” app to an app used for analyzing data uses some form of sorting algorithm to arrange data systematically. In computer science, standard order of sorting often refers to ascending order while reverse order is used to determine the descending order of sort.

There are a wide range of sorting algorithms out there. Some of the most commonly used sorting algorithms are Bubble sort, Insertion sort, Selection sort, Quick sort, Merge sort.

While the students community in the computer science world often debate over the most efficient sorting algorithm out there, it usually ends up being a debate on Quick sort vs Merge sort. Though both the algorithms are highly advanced and offer great efficiency in terms of time and memory space with quick sort often requiring a little more time and less space when compared to Merge sort, there is one algorithm which hardly anyone talks of. A true underdog in the arena of sorting algorithms.

# Timsort: Introduction

Timsort, named after its designer Tim Peters, was designed in 2001 for Python. The way it works is that it uses an algorithm that uses components of both insertion sort and merge sort. It first analyzes the list you are trying to filter and then selects a method based on the list analysis. One might be surprised to know that it is the default sorting algorithm of the Android Operating System, the world’s most popular mobile OS, and the GNU Octave, a software used for performing numerical experiments and analysis. Apart from these, whenever one uses “Arrays.sort()” in Java or “sort()” in Python, the Timsort algorithm is employed to sort the data.

The way Timsort works is that if the array or list provided contains elements less than 64, Timsort behaves like insertion sort while if the number of elements exceeds the mark of 64, an optimized version of merge sort is employed. The reason behind this is that for smaller lists, insertion sort provides better efficiency than merge sort in terms of both time and memory space. When the sample size gets larger, Merge sort is the better options as it is faster than most of the sorting techniques.

When number of elements are greater than 64, the array is divided into smaller blocks ranging from 32 elements to 64 elements and are known as “Run”. Once the Runs are determined, each run is subjected to insertion sort and after every run is in the sorted form, Timsort employs merge sort to stitch the runs back.

The way the size of the Runs are determined by the principle of that combining 2 lists works best when the number of runs is equal to, or slightly less than 2 power. Galloping is another concept used in Timsort to further improve the efficiency of the merging operations.

# How Efficient Is Timsort?

The best case time complexity of Timsort is “n” while the average and worst case are “n log(n)” and “n log(n)” respectively.

From the above table, we can conclude that at it’s best, Tim Sort outperforms both Quick and Merge Sort in terms of speed and is comparable to Merge Sort and at the same time also outperforming quick sort at its worst.

It is in the space complexity where Tim Sorts lags behind Quick sort being on par with Merge sort. Saying so, O(n) is not too bad as well but a minor setback.

# Implementing Timsort

Though it was originally implemented in Python, as every robust algorithm should, it can also be implemented in any other popular programming language out there. The following is my implementation of the Timsort algorithm in Kotlin.

Do note that it is not the complete version but a stripped down version just for explanation purposes. The full fledged implementation is over 3000 lines long and includes a lot of complex functions. If you are interested in the original source code, check the link here.

# Final Words

Timsort is one of the finest sorting techniques ever designed. Utilizing the ideas of merge sort, insertion sort and some of his own creativity, Tim Peters has been able to devise an impressive sorting algorithm that is both impressively fast and stable. Timsort is furthermore optimized to deal well with real-world data. Real-world data is not randomly distributed: it’s common to have sorted runs in the data to be sorted. Compared with a basic merge+insertion sort, Timsort attempts to detect and make use of such runs. This adds a small overhead of checking whether parts of the data are already sorted, but with typical real-world data this saves some actual sorting work which more than compensates.

--

--