How JavaScript works: Arrays vs Hash Tables

Published in

SessionStack Blog

11 min readAug 20, 2021

This is post # 40 of the series, dedicated to exploring JavaScript and its building components. In the process of identifying and describing the core elements, we also share some rules of thumb we use when building SessionStack, a JavaScript tool for developers to identify, visualize, and reproduce web app bugs through pixel-perfect session replay.

Data structures allow computers to efficiently store and manipulate data.

This article will discuss the operations on data structures, focus more on Arrays and Hash Tables, examine their differences, and solve a few algorithmic problems with each of them.

What is a data structure?

A data structure is a collection of values that can only be manipulated by algorithms. It is more like having a case or a wardrobe where you keep your clothes. Each of the clothes represents a particular value. We could have a school bag where we can keep our books too. The wardrobe and school bag which are data structures in this context have their specific way of managing their content.

In programming, we have a lot of data structures that are designed for specific challenges and problems. We can store strings, numbers, objects, arrays, and more. Everything is organized. Here is a list of data structures. Also, it can be said that:

Data structures + algorithms = programs

The two important things about data structures are how to build one and how to use it. Invariably, it is more about how to use them since some of the most used data structures are pre-built.

Data structures are also of different types such as linear and nonlinear. As you go deeper they become complex but then to be a good developer one needs a more understanding if not a fundamental knowledge of data structures (DS).

Let’s get a little deeper into it as we will look at Arrays and Hash tables.

Before we continue, we should know that some languages have specific data types that are not present in others. If JavaScript does not have something like a Stack, we have to build one (or use a third-party library).

Operations on Data Structures

To manipulate stored data we need to perform certain operations on it. Each of these operations is common and can only perform what they are made to do. We will discuss a few of them:

Traversing: Access each data item in the structure exactly once and process it. Here you go straight to the element in the DS.
Insertion: Adding another element in the DS. Insertion has a specific name depending on the DS, for example, in Stack, it is push, in Queue it is enqueue.
Deletion: Removing an element from a DS. In Stack, it is pop, in Queue it is dequeue.
Searching: Looking for a particular element in the DS, checking if it exists or not. It is a frequently used operation.
Access: Here you know the actual location and just want to get the value of the element.

Arrays

Arrays are a sequence of elements organized and identified by their indexes. It is a DS that contains several data values that are visualized in boxes. These values in the array must be of the same type to consider it as a correct array, it is either all values must be an integer, character, or whatsoever but should not be a mix.

The simplest form of an array is the one-dimensional array. There is also a multidimensional array. One good reason why arrays have been more popular is the fact that the indices are very accessible at runtime making it possible to perform iteration or write an interactive statement on many values in the array.

Before we go ahead let us look at some examples using arrays as DS and a few methods or operations on them:

We could also divide arrays into two categories which are static and dynamic.

Static arrays

The size or length of this type of array is determined at compile time (before run time), that is you have to specify the number of indices you want in that array. This type of array is only applicable in low-level languages like C++:

If we need to append a new value into the array, we will have to create a new array and state its new size. The memory is already determined in the RAM. People have come to say that one of the reasons why C++ is faster is because it does not just create unused memory.

Dynamic arrays

Dynamic arrays expand as you add more values to them. We do not need to determine the length of the array, it already resizes itself by doubling its size when we append another value. Let’s take for an example:

we allocate items to a dynamic array, it creates a size for 5 items for the start.
If we append the 6th item, it will double the size to 10.
If we append the 11th item, it will double the size to 20.

Have in mind that it deletes the old array for every append and creates a bigger one and moves the items into the new arrays. This process takes more time and going over this gives a worst case of 0(n). Although it is no big deal, it is great to be aware of this

Implementation

Remember when we said that the two important aspects of DS are how to build it and how to use it. This is what we will be explaining. First, we might want to know how to build an array even though the array is a common and mostly built-in DS in most languages. It is still great to understand how it works. It could come up as a technical interview question.

So, let’s build our array in JavaScript. First, we will create a class with its constructor which will hold two data variables: the length of the array and the values in the array itself.

Our first method in the class is how we would access an element in the array.

Since there is no element in the array then the output is undefined.

Next, we will create another important method that will add to the array. This is more like the append or better still the push method.

In this push method, the value to be added is added to the current length of the data, so if it was length = 0, it gets added to that position. The next line increases the length so if we have another value to push in, it gets added to that position too.

Our result will look like this:

Another functionality we will need to add is the pop method. This method will remove the last value in the array so if there are 6 elements in the array, the last item will be removed. Let’s see how it works:

With our pop method, we locate the last element in the array, delete it and reduce the length by one.

Our result will be:

You can still add your methods if needed but I just wanted to show you how Arrays are built underneath.

Hash Tables

This is my favorite DS and I like to take it as an associative array, elements are stored but these elements have a key associated with them. Okay, let me make it clearer. Hash table is a DS that basically stores information. This information has two components: key and value. So, we map a key to a value.

Hash tables are called differently in different languages. In Python they are dictionaries, in Ruby they are Hashes, and in Java they are Maps. It is one of the most used DS in the industry. They are used for caches which will significantly speed up data access rather than storing in the main database, and also used for database indexing which simply looks through your database table and tries to analyze and summarize so it becomes more like a shortcut.

There is also Map() which was introduced in 2015 and is regarded as a Hashmap. Both Hash Tables and Hashmap provide a key/value functionality but there is a slight difference. Hashmap offers the ability of the keys to be of any data type, unlike Hash Tables where the keys can only be integers and strings. Also, Hashmaps can not be converted to JSON.

Do not forget that Hash tables and Hashmap are Object implementations in JavaScript in case you get confused.

Hash function

This is simply a function that generates a fixed value of length for every input into it. The returned value is usually called the hash value. Okay, let’s make it clearer, the hash function computes an index for us in the Hash Table. It typically just takes in our object property and value, puts it into the hash function and this hash function decides where to put it in memory:

Hash collision

Most times when our hash function decides where to store our object property and value, it sometimes assigns it to the same slot which becomes a collision. This is a problem because every slot in the memory is supposed to store a single value.

If we observe the image below George and Cory have been collided by the hash function and assigned two values to one index:

Implementation

So, just as we built our own array we can do that for a hash table too. First, we create a class, and our first method will be a hash function. We then create an array in the constructor but not using the literal notation. The Array constructor will be used, with, using the size as the parameter.

Next, we create our hash function which will return a hash value. This value will be the location where our data will be stored. We are going to use a very simple hash function as building a reliable one is a complex cryptography topic while our main focus is to build a hash table:

The simple hash function does the following:

takes in the value
loops through each character
returns a character code for each character
multiplies it by the index
adds it to the initial hash value
we use the remainder operator so we don’t get above the length of the Array size

If we run this method we get the value like so:

We then focus on the next method which is the `set` method. It takes in two properties: key and value.

Our key property is being used for the hash function. If the memory location does not exist, we assign an array to it and push the key and value into it.

Next, we create a method that will return the item we had initially placed in the hash table. This is easier because we use the key to get the exact memory location from our hash function. Take a look at our getItem method below:

We need an error checking to see if that memory location even exists.
It’s not a perfect example but you get the idea of the implementation.

Hash Tables vs Arrays

From our observations, we have noticed the differences between hash tables and arrays. Hash tables tend to be faster when it comes to searching for items. In arrays, you have to loop over all items before you find what you are looking for while in a hash table you go directly to the location of the item. Inserting an item is also faster in Hash tables since you just hash the key and insert it. In arrays shifting the items is important first before inserting another one. The image below gives a time complexity of all the operations:

You have to be very careful about the data structures you pick for certain tasks, especially for ones that could potentially degrade the performance of your product. Having an O(n) lookup complexity for functionality that has to be real-time and depends on lots of data could make your product unusable.
Even if you feel like the proper decisions have been made, it’s always necessary to verify that this is indeed true and your users have a great experience with your product.

A solution like SessionStack will allow you to replay customer journeys as videos, showing you how your customers actually experience your product. You can quickly determine whether your product is performing according to their expectations or not. In case you see that something is wrong, you can explore all of the technical details from the user’s browser such as the network, debug information, and everything about their environment so that you can easily understand the problem and resolve it.

There is a free trial if you’d like to give SessionStack a try.