
Python Dictionary’s less known insight

Rachit Tayal
May 19, 2019

I have been programming in Python as part of my job as a Computer Vision engineer. To sharpen my Python skills, I usually browse the standard library docs and read articles and books.

If you are familiar with any programming language, you have probably come across the dictionary data structure. To quote:

A dictionary is an associative array (also known as hashes). … The values of a dictionary can be any Python data type. So dictionaries are unordered key-value-pairs.

If you want to get started with Python dictionaries, you can refer to this comprehensive blogpost.
Recently I came across this dictionary expression, which provides a useful insight into the internal working of dictionaries.

>>> {1: 'first', 1.0: 'second'}

What does the above expression evaluate to? Before answering, let's walk through a few concepts that put us in better shape to answer.

When Python processes a dictionary expression, it conceptually starts by creating an empty dictionary object and then assigns the keys and values in the order in which they appear in the expression. For example:

>>> d = {1: 'one', 2: 'two', 3: 'three'}

is equivalent to:

>>> d = dict()
>>> d[1] = 'one'
>>> d[2] = 'two'
>>> d[3] = 'three'
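
A quick way to check this equivalence is to build the same dictionary both ways and compare the results. The helper name `build_stepwise` is made up for this sketch, and it only illustrates the observable behaviour, not the bytecode CPython actually runs.

```python
def build_stepwise(pairs):
    """Insert key/value pairs one at a time, in the order given."""
    d = dict()
    for key, value in pairs:
        d[key] = value
    return d

literal = {1: 'one', 2: 'two', 3: 'three'}
stepwise = build_stepwise([(1, 'one'), (2, 'two'), (3, 'three')])

print(literal == stepwise)               # True
print(list(literal) == list(stepwise))   # True on Python 3.7+, where insertion order is preserved
```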

To decide whether two keys are the same, a dictionary compares their hash values and also checks them for equality. Equality is checked by the __eq__() method, and the hash value comes from the __hash__() method. To elaborate, let's consider an example. Below is a class EqualObj whose objects always return True when compared with any other object (since __eq__() returns True every time). However, the hash values of two such objects will differ, since __hash__() returns the id() of the object. In CPython, id() returns the address of the object in memory, which is unique among objects that exist at the same time.

>>> class EqualObj:
...     def __eq__(self, obj):
...         return True
...     def __hash__(self):
...         return id(self)

If we play around with the objects of the above class:

>>> obj1 = EqualObj()
>>> obj2 = EqualObj()
>>> obj1 == obj2
True
>>> obj1 == 'anyObject'
True
>>> hash(obj1)
140698402706224
>>> hash(obj2)
140698402705984
>>> hash(obj1) == hash(obj2)
False

From the standard Python docs:

These are the so-called “rich comparison” methods. The correspondence between operator symbols and method names is as follows: x<y calls x.__lt__(y), x<=y calls x.__le__(y), x==y calls x.__eq__(y), x!=y calls x.__ne__(y), x>y calls x.__gt__(y), and x>=y calls x.__ge__(y).

Hence obj1 == obj2 returns True in the case above.

If we create a dictionary with objects of the above class as keys:

>>> d = {obj1: 'first', obj2: 'second'}
>>> d
{obj1: 'first', obj2: 'second'}

Hence the dict d retains both keys: their hashes differ, even though the objects compare equal. If we create a dictionary from two objects that have the same hash but do not compare equal, we observe the same behaviour. Consider the class below.

class EqualHashObj:
    def __hash__(self):
        return 1

Instantiating the above class:

>>> obj1 = EqualHashObj()
>>> obj2 = EqualHashObj()
>>> obj1 == obj2 # x == y calls x.__eq__(y)
False
>>> hash(obj1)
1
>>> hash(obj2)
1
>>> hash(obj1) == hash(obj2)
True
>>> d = {obj1: 'first', obj2: 'second'}
>>> d
{obj1: 'first', obj2: 'second'}

We observe the same behaviour when the dictionary's keys are objects of EqualHashObj: both keys are retained.
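
The rule that both experiments point to can be written down as a small predicate: two keys are treated as the same entry only when their hashes match and they compare equal. The function name `same_key` is invented for this sketch. CPython's real lookup is written in C and also short-circuits on object identity, which the sketch mirrors with `a is b`.

```python
class EqualObj:
    """Compares equal to everything, but hashes by identity."""
    def __eq__(self, other):
        return True
    def __hash__(self):
        return id(self)

class EqualHashObj:
    """Always hashes to 1 and keeps the default identity-based __eq__."""
    def __hash__(self):
        return 1

def same_key(a, b):
    # Hypothetical helper: approximates how a dict decides that two keys collide.
    return hash(a) == hash(b) and (a is b or a == b)

e1, e2 = EqualObj(), EqualObj()
h1, h2 = EqualHashObj(), EqualHashObj()

print(same_key(e1, e2))   # False: equal, but the hashes differ
print(same_key(h1, h2))   # False: the hashes match, but the objects are not equal
print(same_key(1, 1.0))   # True: the hashes match and the values are equal
print(len({e1: 'first', e2: 'second'}), len({h1: 'first', h2: 'second'}))  # 2 2
```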

Python's dictionaries don't update the key object itself when the value for an existing key is updated. If two key objects are judged to be the same key (they compare equal via __eq__() and __hash__() returns the same value for both), the dictionary keeps the original key object and updates only the value. This is shown in the example below. This behaviour is consistent with performance optimisation, since there is no need to replace the key object when it is essentially the same. If you want to read more on why it is done, you should check out the hash table data structure, since dictionaries use it internally in most languages and Python is no exception.

>>> d = {1: 'first'}
>>> d[1.0] = 'second'
>>> d
{1: 'second'}
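
One way to confirm that the original key object survives the assignment is to inspect the type of the stored key afterwards. This is a minimal check, with the variable name `key` chosen only for illustration.

```python
d = {1: 'first'}
d[1.0] = 'second'

key = next(iter(d))          # the only key stored in the dict
print(d)                     # {1: 'second'}
print(type(key).__name__)    # int: the original key object was kept
print(hash(1) == hash(1.0), 1 == 1.0)  # True True
```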

Since the keys 1 and 1.0 compare equal and have the same hash, the key 1 is not replaced; only its value is updated to 'second'.
Hence our first dictionary expression {1: 'first', 1.0: 'second'} evaluates to {1: 'second'}. Only a single key object, 1, is present, and its value is the last one assigned, i.e. 'second'.
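
To make the mechanism concrete, here is a toy hash table that follows the same rule: when an incoming key matches a stored one, only the value is overwritten. The class `ToyDict` and its bucket layout are assumptions made for this sketch. It uses simple chaining, whereas CPython's dict uses open addressing, so treat it as an illustration of the idea and not as the real implementation.

```python
class ToyDict:
    """Tiny hash table with chaining; illustrative only, not CPython's layout."""

    def __init__(self, n_buckets=8):
        self._buckets = [[] for _ in range(n_buckets)]

    def __setitem__(self, key, value):
        h = hash(key)
        bucket = self._buckets[h % len(self._buckets)]
        for entry in bucket:
            stored_hash, stored_key, _ = entry
            # Same entry only if the hashes match and the keys are identical or equal.
            if stored_hash == h and (stored_key is key or stored_key == key):
                entry[2] = value      # overwrite the value, keep the stored key
                return
        bucket.append([h, key, value])

    def items(self):
        return [(k, v) for bucket in self._buckets for _, k, v in bucket]

t = ToyDict()
t[1] = 'first'
t[1.0] = 'second'
print(t.items())   # [(1, 'second')]
```

The stored key stays the int 1 because the matching branch never touches it; replacing it would be extra work with no visible benefit for keys that are considered the same.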
Now you know what’s happening behind the scenes.
I hope you find it useful.
