DS With JS — Hash Tables
Hash Tables have different names in different programming languages and come with slight variations. They are called:
- Maps in Java
- Hashes in Ruby
- Dictionaries in Python
- Objects in JavaScript
Every language has a built-in hash table just like arrays. Hash tables are very important and extremely useful across computer science. We see them a lot in databases or caches.
Hash Table (Objects) in JavaScript
The hash table or object is a structured data type in JavaScript.
It is used to store various keyed collections and more complex entities. — Object | JavaScript | MDN
Consider you found your dream house and it has many properties. How will we represent it using objects?
const my_house = {
has_garden: true,
living_rooms: 1,
study_rooms: 1,
bedrooms: 4,
street_name: 'Main Street',
nearest_highway: 'Highway 401',
nearest_bus_stop: 'Bus Park Depot',
};// Time complexity = O(1)
my_house.living_rooms; // 1// Time complexity = O(1)
my_house.street_name; // 'Main Street'// Time complexity = O(1)
my_house.house_number = 304;
In the object my_house
let us take one property house_number
.
my_house.house_number = 304;
Here house_number
is a key and 304
is the value.
With hash tables, we store key-value pairs.
Other variations of hash tables in JavaScript are Map, Set, WeakMap, and WeakSet.
Notice the time complexity for the operations get
and set
is O(1)
. Why is that?
Hash Function
The time complexity for the operations get
and set
is O(1)
because of the way we store objects in memory.
A black box called hash function takes the key and returns an address or index where the key-value will be stored.
A hash table uses a hash function to compute an index, also called a hash code, into an array of buckets or slots, from which the desired value can be found. During lookup, the key is hashed and the resulting hash indicates where the corresponding value is stored. — Hash table — Wikipedia
With the index in place the operations get
and set
takes constant time.
The hash function generates a value of fixed length for each input. Examples of a hash function are MD5, SHA-1, SHA-256, and more.
Key aspects of Hash Function
With hash functions keep the following things in mind:
- For a given input the output remains the same, i.e., the hash function is deterministic.
- Uppercase and lower cases are treated separately by the hash function. ‘Hello’ and ‘hello’ will generate two different hashes.
- With the output, we cannot go back to the input. The conversion from
key
to address is one way.
The address generated is not ordered hence unlike arrays, the key-value pairs are stored in random order — unordered.
Time Complexity of the Hash Function
Wouldn’t hash function slow things down? Shouldn’t converting each
key
to address be time consuming?
Let us consider my_house.house_number = 304;
again. Here the house_number
will be taken as input to the hash function, say MD5, and results in some string such as 8516ddd8bd901f568cefeadabac2de3
. This output is converted to an index space or address space where we store both the key and value.
This happens really fast and each language has its own implementation of hash functions. And these implementations have the big-O time complexity of O(1)
.
Hash Collisions
Till now while talking about hash functions we did not speak about one of the major pillars of programming — space.
Yes, for each key
we generate an address and store the key-value pair in the memory. But what about space? We know the memory is not infinite. It is limited. What if an address is generated which is not available?
Since the hash function takes space into account, the generation of an address outside memory is not possible. But then raises another problem.
The biggest problem with hash tables is that more than one key
get the same address causing hash collisions.
In these scenarios when we have enough data but limited memory, all the operations will be O(N/k)
where N
is the number of key-value pairs in the object and k
is the size of the hash table. Since k
is a constant we can drop it making the time complexity for lookup O(N)
.
They are unavoidable and have different ways of resolving them. One of the most common ways of solving hash collision is Separate chaining with Linked List.
Now because of hash collision, there can be a possibility that the lookup in hash tables can be O(N)
, where N
is the size of the object.
Hash Table Implementation in JS
In this section let us implement Hash Tables (objects) in JavaScript. The implementation is for viewing the complexity of different operations we do on the hash tables.
Time Complexity for different operations in Hash Tables
We can perform different operations on hash tables, the most common are as follows.
- Insert —
O(1)
- Lookup —
O(1)
- Delete (since it is an unordered list the shifting of items is not required, like in arrays) —
O(1)
The space complexity for Hash Tables: O(N)
, where N
is the number of keys.
Following is the boilerplate code we will be using for implementing hash tables in JavaScript.
Notice the little hash function — it is a very basic one that will do for our little learning.
We will be using an Array, to implement our Hash Table. The reason for this is to implement a simple solution for a hash collision where the size of the array will demonstrate the limited memory.
The this.data
will hold an array of key and value pairs. For the simple insert, the process is as follows
For hash collision, that is when the hash function generates the same address we will push the key-value pair in that address.
As per this solution, we will have to loop through two layers of arrays to fetch the key-pair values. In the worst-case scenario, we will be looping a 2-D array.
Let us see the full code now.
Hash Tables & Arrays
Hash tables look like a hack on top of arrays. Instead of using index
like 0
, 1
, 2
, and so on we use key
which are more readable.
There is no order in Hash Tables while data is ordered in Arrays.
Conclusion
Before choosing hash tables or objects in JavaScript for your program keep a watch on the following points.
✅ Fast lookups (good collision resolution needed)
✅ Fast inserts
✅ Flexible keys
❌ Unordered
❌ Slow key iteration
Hash tables are often used to reduce time complexity, with a trade-off of space complexity which is O(N)
.
I hope this article gives the insight required while choosing objects in your program.
References and Further Reading
The time complexity of Hash Functions
- Why is the cost of a hash lookup O(1) when evaluating the hash function might take more time than that?
- The time complexity of the Hash table
- Complexity of Hashing
- (When) is hash table lookup O(1)?
This blog is a part of the series Data Structure With JavaScript.