Re-Implementing Naive Hashing based Maps

Nicola Bernini
Algorithms and Data Structures
2 min readFeb 1, 2019
Original URL: https://static.commonlounge.com/fp/600w/0Zy2NcEg6MNf1bjqXsCKFku071520494728_kc

The idea behind this article is to understand better certain data structure by reimplementing them

What is a Map

Map is a Data Structure typically used to store Key-Values association as it allows to very efficient random access: O(1) read, insert, delete

What makes Map so efficient

The idea that powers this Data Structure is the hash function which essentially maps the key into a memory location of the underlying storage, which is then assumed to offer random accessible

An example of hashing

Let’s first identify the function domain and co-domain so let’s assume

  • key type is string and
  • storage is an array so to access its location a valid index, hence an index in [0..Size), is needed

So such a function can be implemented for example by summing the ASCII values of each char in the string and then computing the module with the array size

What about an example of implementation

Sure here it is

What are the know issues of this implementation

This example shows a very simple Map which just detects, but does not manage, a typical issue affecting hash based systems: hash collision.

It happens because typically hash function has no guarantee to be injective which means that 2 different keys might be associated to the same memory slot.

In this example, it is easy to figure out how this can happen: if two words have the same sum of ASCII chars (e.g. like “ciao” and “biap”) then it results in a hash collision

Is it possible to fix it

Sure, otherwise we won’t have map in production :)

We’ll see how in the next episode

--

--

Nicola Bernini
Algorithms and Data Structures

Machine Learning PhD, Physicist. Mainly interested in Deep Learning, Functional Programming. https://github.com/NicolaBernini