All about sorting in Python: Numbers to Sorting Class Types
If you have ever sorted anything in Python, you know it is as simple as
something.sort()
or
sorted(something)
Let's explore in depth.
Warning: A lot of code coming in. All code snippets are images.
Please type the code yourself. For reference, a git link to the complete notebook is here.
Sorting a list of numbers looks as simple as this:
You have a list of numbers; just call the sort() method and it gets sorted in place. Want the reverse order? Just pass reverse=True and it sorts in descending order.
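A minimal sketch of the above, with made-up numbers:

```python
numbers = [5, 2, 9, 1, 7]  # sample values, made up for illustration

numbers.sort()  # sorts the list in place and returns None
print(numbers)  # [1, 2, 5, 7, 9]

numbers.sort(reverse=True)  # same list, now in descending order
print(numbers)  # [9, 7, 5, 2, 1]
```

Note that sort() returns None, so writing numbers = numbers.sort() would throw the list away.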
If you have a list of strings, they get sorted in lexicographical order (dictionary order, or alphabetic order). Characters are compared by their numeric codes (Unicode code points), so uppercase letters sort before lowercase ones.
That’s Simple Enough!
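A quick sketch with a made-up word list, showing how the numeric codes decide the order:

```python
words = ["banana", "apple", "Cherry", "date"]  # sample words, made up

words.sort()
print(words)  # ['Cherry', 'apple', 'banana', 'date']

# "C" comes first because its code point is smaller than that of "a"
print(ord("C"), ord("a"))  # 67 97
```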
What if it's not a list but, let's say, a tuple of something? Then what?
No problem, we have sorted(), which takes any iterable and gives back a new sorted list.
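For example, with a tuple of made-up numbers:

```python
numbers = (5, 2, 9, 1, 7)  # a tuple has no sort() method

result = sorted(numbers)  # always returns a new list
print(result)   # [1, 2, 5, 7, 9]
print(numbers)  # (5, 2, 9, 1, 7), the original is left untouched

descending = sorted(numbers, reverse=True)
print(descending)  # [9, 7, 5, 2, 1]
```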
Let's see something else now. Let's say we have a bunch of records for some people. Each record has a name, an age and a salary, and is stored as a tuple. For example, consider these four records:
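The original snippet is not available here, so the names and numbers below are hypothetical stand-ins with the same shape:

```python
# Hypothetical sample data: each record is (name, age, salary)
records = [
    ("John", 32, 52000),
    ("Alice", 41, 61000),
    ("Zara", 25, 48000),
    ("Mike", 37, 75000),
]
```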
Sorting the list directly sorts on the basis of the first element of each tuple (ties, if any, are broken by the next elements).
So by default the records get sorted by name.
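A small sketch of that default behaviour, using hypothetical records:

```python
# Hypothetical sample data: each record is (name, age, salary)
records = [
    ("John", 32, 52000),
    ("Alice", 41, 61000),
    ("Zara", 25, 48000),
    ("Mike", 37, 75000),
]

records.sort()  # tuples are compared element by element, name first
for record in records:
    print(record)
# ('Alice', 41, 61000)
# ('John', 32, 52000)
# ('Mike', 37, 75000)
# ('Zara', 25, 48000)
```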
If we wish to sort by age, we can use the key option of the sort method to change the key by which elements get sorted. The key option takes a function, most commonly a lambda, that accepts a single argument and returns a single result. The sort method calls this function once on each object of the list, in our case on each tuple/record, and uses the return values to sort the data. See this:
In the above code, lambda x: x[1] gives the element at index 1 of each tuple. So sorting is done by age and not by name.
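Sorting by age with the key option could look like this (the records are hypothetical):

```python
# Hypothetical sample data: each record is (name, age, salary)
records = [
    ("John", 32, 52000),
    ("Alice", 41, 61000),
    ("Zara", 25, 48000),
    ("Mike", 37, 75000),
]

records.sort(key=lambda x: x[1])  # x[1] is the age
for record in records:
    print(record)
# ('Zara', 25, 48000)
# ('John', 32, 52000)
# ('Mike', 37, 75000)
# ('Alice', 41, 61000)
```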
Try to sort by salary.
In older versions of Python (before 2.4), this key=callable feature was not available, so you had to use something called the Decorate-Sort-Undecorate approach, as described on the official python.org sorting page. See the example below:
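A rough sketch of Decorate-Sort-Undecorate, sorting hypothetical records by age. Putting the index in the middle of each decorated tuple is one common way to keep the original order for equal ages and to avoid comparing the records themselves:

```python
# Hypothetical sample data: each record is (name, age, salary)
records = [
    ("John", 32, 52000),
    ("Alice", 41, 61000),
    ("Zara", 25, 48000),
    ("Mike", 37, 75000),
]

# Decorate: prefix each record with the value we want to sort by
decorated = [(record[1], index, record) for index, record in enumerate(records)]

# Sort: plain tuple comparison now orders by age first
decorated.sort()

# Undecorate: strip the helper values again
by_age = [record for _, _, record in decorated]
print(by_age[0])  # ('Zara', 25, 48000)
```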
Sorting Class Objects
While writing this, my intention was to focus on sorting class types. So now put all your focus here. If I were to interview someone for Python at beginner level in industry, this would have been one of the topics on my list.
Let's first create a class whose objects we will sort: the same records, but as objects of a class Employee.
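One possible version of such a class, with hypothetical data. The __repr__ method is an extra added here only so that printed objects are readable:

```python
class Employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary

    def __repr__(self):
        # Not needed for sorting, only for readable output
        return f"Employee({self.name!r}, {self.age}, {self.salary})"


# Hypothetical sample data
employees = [
    Employee("John", 32, 52000),
    Employee("Alice", 41, 61000),
    Employee("Zara", 25, 48000),
    Employee("Mike", 37, 75000),
]
print(employees[0])  # Employee('John', 32, 52000)
```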
Now we have the class and list of employees. Time to sort.
But sorting directly gives an error, since Python doesn't know how to order these objects.
Error says ‘<’ not supported …
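A small sketch that triggers the error, assuming a bare-bones Employee class with hypothetical data:

```python
class Employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary


# Hypothetical sample data
employees = [Employee("John", 32, 52000), Employee("Alice", 41, 61000)]

try:
    employees.sort()  # no ordering is defined for Employee
except TypeError as error:
    print("TypeError:", error)
```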
What does that mean?
Try to understand it from the example below.
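Here is a minimal sketch: comparing two objects directly fails in the same way (class and data are hypothetical):

```python
class Employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary


a = Employee("John", 32, 52000)
b = Employee("Alice", 41, 61000)

try:
    a < b  # Python has no rule for ordering two Employee objects
except TypeError as error:
    print("TypeError:", error)
```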
So comparing two objects of a custom class is not supported by default, hence the error. You get the same error when calling sort(), because sort() internally compares objects using <.
To fix this, make our class objects comparable by the less-than operator. Doing that requires us to define the __lt__() magic method. Python calls this method when you compare two objects using the < operator. The modified class looks like this:
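A sketch of what the modified class might look like, comparing by name. Returning NotImplemented for other types is an optional extra that lets Python raise the usual TypeError in that case:

```python
class Employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary

    def __lt__(self, other):
        # Called for: self < other
        if not isinstance(other, Employee):
            return NotImplemented
        return self.name < other.name


a = Employee("John", 32, 52000)   # hypothetical data
b = Employee("Alice", 41, 61000)
print(a < b)  # False, because "John" comes after "Alice"
print(b < a)  # True
```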
Now the less-than comparison starts working and that’s the only thing we need. Now sorting will also work.
In the __lt__() magic method the comparison is done by name, so the result is sorted by name.
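With __lt__() in place, a plain sort() call works. A minimal sketch with hypothetical data:

```python
class Employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary

    def __lt__(self, other):
        return self.name < other.name  # default ordering: by name


# Hypothetical sample data
employees = [
    Employee("John", 32, 52000),
    Employee("Alice", 41, 61000),
    Employee("Zara", 25, 48000),
    Employee("Mike", 37, 75000),
]

employees.sort()  # uses __lt__ internally
print([e.name for e in employees])  # ['Alice', 'John', 'Mike', 'Zara']
```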
If you still want to change the sort key, we still have the key option in sort(). Let's sort by salary now.
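A sketch of sorting the same kind of objects by salary (hypothetical data). When a key is given, sort() compares the key values, so __lt__() of the class is not consulted:

```python
class Employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary

    def __lt__(self, other):
        return self.name < other.name  # default ordering: by name


# Hypothetical sample data
employees = [
    Employee("John", 32, 52000),
    Employee("Alice", 41, 61000),
    Employee("Zara", 25, 48000),
    Employee("Mike", 37, 75000),
]

employees.sort(key=lambda e: e.salary)  # the salaries get compared, not the objects
print([e.name for e in employees])  # ['Zara', 'John', 'Alice', 'Mike']
```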
It looks like the old way of calling sort() with the key argument is a better option than defining __lt__(). Still, defining the less-than magic method lets you define a default ordering for your class objects, which can be used for things apart from sorting (putting your objects in a heap, or a BST maybe).
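As an illustration of the heap case, here is a small sketch using the standard heapq module, which orders items with <. The class and data are hypothetical:

```python
import heapq


class Employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary

    def __lt__(self, other):
        return self.name < other.name  # default ordering: by name


heap = []
for employee in [
    Employee("John", 32, 52000),
    Employee("Alice", 41, 61000),
    Employee("Zara", 25, 48000),
]:
    heapq.heappush(heap, employee)  # relies on Employee.__lt__

first = heapq.heappop(heap)  # smallest item according to __lt__
print(first.name)  # Alice
```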
Some other ways to sort things
There's a module named operator which contains two functions of interest to us: attrgetter and itemgetter. If the names don't make it clear, have a look at an example:
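A short sketch of both functions on hypothetical data:

```python
from operator import attrgetter, itemgetter


class Employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary


record = ("John", 32, 52000)            # hypothetical tuple record
employee = Employee("John", 32, 52000)  # the same data as an object

get_second = itemgetter(1)   # callable that returns obj[1]
get_age = attrgetter("age")  # callable that returns obj.age

print(get_second(record))  # 32
print(get_age(employee))   # 32
```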
These functions return a callable (a fancy name for something that behaves like a function) that either extracts the element at a given index (itemgetter) or an attribute from a class object (attrgetter).
The above usage of itemgetter can be considered equivalent to lambda seq: seq[1], and that of attrgetter to lambda obj: obj.age.
And remember: equivalent, not exactly the same.
So our sorting code for tuples can be rewritten as:
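Something along these lines, again with hypothetical records:

```python
from operator import itemgetter

# Hypothetical sample data: each record is (name, age, salary)
records = [
    ("John", 32, 52000),
    ("Alice", 41, 61000),
    ("Zara", 25, 48000),
    ("Mike", 37, 75000),
]

records.sort(key=itemgetter(1))  # sort by age, the element at index 1
print([record[0] for record in records])  # ['Zara', 'John', 'Mike', 'Alice']
```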
and for Employee class like this:
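A matching sketch for the class version (hypothetical data):

```python
from operator import attrgetter


class Employee:
    def __init__(self, name, age, salary):
        self.name = name
        self.age = age
        self.salary = salary


# Hypothetical sample data
employees = [
    Employee("John", 32, 52000),
    Employee("Alice", 41, 61000),
    Employee("Zara", 25, 48000),
    Employee("Mike", 37, 75000),
]

employees.sort(key=attrgetter("age"))  # sort by the age attribute
print([e.name for e in employees])  # ['Zara', 'John', 'Mike', 'Alice']
```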
That's it about sorting. If you use some other interesting way as well, let me know in the comments. We can add it here.