Simple Hash Map with Scala in 15 minutes
You are getting started with Scala?
You already passed the “Hello World” example and don’t know where to go next?
Then this post can help you. Implementing a Hash Map is a good way to get familiar with a new language, let's see how it can be done in Scala.
This article assumes that you are familiar with the data structure. If not, Hacker Earth has a really nice article to get you started.
Now, for this implementation, there are a few things I want to achieve:
- use chaining to resolve collisions
- support 3 operations:
add
,remove
, andget
- immutability (all modifications return a new map instead of changing the existing one) (but why?)
With those requirements in mind, let's write a short outline:
case class Entry[K, V](key: K, value: V)
class HashMap[K, V] private (entries: Vector[Vector[Entry[K, V]]]) {
def add(key: K, value: V): HashMap[K, V] = ???
def remove(key: K): HashMap[K, V] = ???
def get(key: K): Option[V] = ???
}
- Class
Entry
represents a key-value pair - I use
Vector
to hold the keys. Each key contains anotherVector
that holds all the values. Hence we haveVector[Vector[Entry[K, V]]]
- Since my hash map is immutable,
add
must return a new hash map. The original map is left untouched. get
returns anOption
that contains the value of that key if it exists andNone
otherwise???
meansthrow new NotImplementedException()
, which is quite convenient for writing code outline
Method add
is quite straight forward:
def add(key: K, value: V): HashMap[K, V] = {
val idx = indexFor(key)
// if the table is empty, initialize and then run 'add' again
if(entries.isEmpty) init.add(key, value)
// otherwise, if 'key' exists, replace its old value
// if not, associate 'value' with 'key'
else {
val chain = entries(idx)
chain.indexWhere(_.key == key) match {
case -1 => // key not found
val e = Entry(key, value)
new HashMap(entries.updated(idx, e +: chain))
case i =>
val replaced = chain(i).copy(value = value)
new HashMap(entries.updated(idx, chain.updated(i, replaced)))
}
}
}
private val initialCapacity = 16
private def init: HashMap[K, V] = {
new HashMap(Vector.fill(initialCapacity)(Vector.empty))
}
/** Returns the index of this key in the internal entry vector. */
private def indexFor(key: K): Int = {
key.hashCode() & entries.length
}
remove
is a bit tricky to write since we cannot do an 'in-place' removal (ahem, immutability ... ). So we have to use filter
:
def remove(key: K): HashMap[K, V] = {
val idx = indexFor(key)
val updated = entries.updated(idx, entries(idx).filter(_.key != key))
new HashMap(updated)
}
However, this is not the most efficient way because filter
will go through the whole collection instead of stopping once the unwanted element is seen. A more efficient approach to this case is to not remove anything at all but just 'mark' the element as removed. But I will not implement that here 😅
And lastly, we need to write out get
method:
def get(key: K): Option[V] = {
val idx = indexFor(key)
entries(idx).find(_.key == key).map(_.value)
}
Method find
of Vector
already returns an Option
if there is no element match with key
so we don't need to manually handle that. Utility methods like find
is the reason why I love Scala so much.
It allows me to leave work early.
You can use this new HashMap like this:
val map = new HashMap[Int, String](Vector.empty)
val web = map.add(1, "web") // 1 -> "web"
val dev = web.add(2, "dev") // 1 -> "web", 2 -> "dev"
And that’s it! We have completed our Hash Map implementation that satisfies the requirements we made at the beginning.
Notes
- Readers may notice that
Array
can be used for faster indexed access. Here, I useVector
to keep everything immutable. - This implementation uses a constant capacity (and it shouldn't). In practice, you should add some logic in method
add
to increase the capacity when needed. This is important to maintain performance.