Python’s Collections

Mehul Gala
4 min readNov 17, 2019

--

In the last two stories, we covered a few of Python’s handy features to write efficient code. In this section, we’ll peek into Python’s Collections which offers a range of cool advanced data structures (on top of the built-in ones) which you would need ‘once in a while’ dealing with complex problems.

Python provides the following built-in data structures for regular work.

List

An ordered collection of items. Heterogenous (can store items of more than one data type). Mutable (We can add or remove data at any point in time in its lifecycle, allowing it to stretch or shrink to any size). Random access (We can directly extract any random element).

The list is the most popular and widely used DS in Python. It combines the power of arrays (by giving us an illusion of contiguous memory locations thereby allowing random access) and Linked Lists (dynamic sizing and heterogeneity) and presents a hybrid DS which has the ‘best of both’ world.

Tuple

Almost identical to the list except for one feature. It is immutable. (We can only add items at the time of its creation, and not later on).

Set

An unordered collection of items. Unique (sets cannot hold duplicate items like lists and tuples). Mutable. Constant lookup time (Sets use hashmaps to store data, and they provide nearly constant O(1) lookup).

Dictionaries

An unordered collection of key-value pairs. Unique keys. Mutable. A popular DS to store data of the form ‘key-value pair’, and get constant lookup based on the key.

In most of your regular use-cases, your needs will be fulfilled with these DS. Python provides an additional Collection module that has few advanced data structures which can be used in more specific use cases.

Note: These DS are not part of the build-in scope. You would have to import them from the “Collections” module to make use of them.

Counter

It is a very interesting and useful DS. It takes in any iterable, counts the frequency of items present in it, and returns a dictionary of item-frequency pairs.

It is useful when our use case deals with the frequency of items as seen in this Poker Game example.

NamedTuple

Since tuples are immutable, a lot of read-only information exchange happens through tuples. The major drawback of the Tuples is that to extract the information we need to use the index which doesn’t convey any meaning to the information placed on that position.

Named tuples are extensions to the tuples where items from the tuples can be extracted using relevant names.

OrderedDict

They are just like the normal dictionary except one difference, they remember the order in which the items were added in it. For example, say we want to store the scores of the batsman in the recent ODI match, we will store it as the name of the batsman vs score, but to maintain the position in which they battered we would make use of the ordered dictionary as opposed to the normal dictionary.

Note: From Python 3.7, the normal dictionaries have started to remember the order in which the items were inserted, so when this Python version becomes widely adopted, the need for ordered dictionary collection will go away.

DefaultDict

If we try to access any key from a dictionary that does not exist, it raises the “KeyError” exception which we will have to handle in our code.

Otherwise, we need to first make check whether the ‘key’ exists in the dictionary using the “in” operator, before accessing any value. This can be a little annoying task especially when we had to do for all dictionary accesses.

The default dictionary solves this problem, by returning a default value if the key does not exist. This simplifies the caller’s worries.

Deque

Stacks and Queue are fundamental DS which you could build easily on top of the readily available built-in DS.

In Python, we can easily treat the list as a Stack, with its Push and Pop methods, and -1 index for the top item lookup. By convention, you can also treat the list as a Queue, but it is more preferable to use the standard library Queue module for this purpose which keeps things organized under the hood.

In the Collections module, Python provides Deque which is the double-ended queue for specific use cases (popular being the sliding window use case).

You would not need these advanced DS for regular work, but it’s good to know that they exist so that you can make use of them if they deem fit for your task at hand. Happy Coding!!

--

--