How JavaScript works: introduction to Graphs and Trees
This is post # 44 of the series, dedicated to exploring JavaScript and its building components. In the process of identifying and describing the core elements, we also share some rules of thumb we use when building SessionStack, a JavaScript tool for developers to identify, visualize, and reproduce web app bugs through pixel-perfect session replay.
In this article, we are going to dwell more on two Data structures that are quite popular but yet not fully understood. We will be looking closely at Graph and Tree
Before we move on, Data structures can be divided into two categories: linear and non-linear. Both Graph and Trees are non-linear as their elements do not form a sequence.
While looking at both, we will work on their implementations and see more on their similarities and differences.
Graph
A non-linear data structure can be undirected and directed graphs because they consist of a fixed set of vertices and a set of edges. The most important thing to note is that they always have a pairwise relation of edges(arcs) and vertices(nodes).
The image below is a full representation of what a Graph is.
Use cases of a Graph
There are many use cases of the Graph data structure, especially in social networks. For example, on Facebook, each user is represented as a vertex, details like id, name, address are held in it. The relationship between two entities is connected using the edge.
Another use case is in precedence constraints. For example, at University you have to pass a level course before you can take a higher-level course. Some courses precede others which define constraints.
Road networks also use graphs where cities/towns can be vertices and the roads linking them can be the edges.
Types of Graphs
We had said earlier that a Graph can be divided into directed and undirected:
- Directed graph: The directed graph is also called the di-graph which has a pair of ordered vertices. The order of the vertices is really important here. In the image below, (1) the directional edge shows that A has a relationship with B but B does not have a relationship with A.
- Undirected graph: In this graph, pairs have unordered vertices and the edges are not directional. In the image below, (4) the directional edge shows that A and B both have a relationship.
Graph representations
Graphs are represented in several ways. This means that we could have various methods of storing information on the computer. These representations depend on the edges and the kind of operations to be used.
Adjacent Matrix
By definition, an adjacency matrix is a 2D matrix of size VxV with 0s and 1s. So, if a[i][j]
is 1 then there is an edge connecting the vertex i
and j
. Let us see a picture to see the adjacency matrix:
Undirected graph
On the left is an undirected graph with five vertices connected with a corresponding adjacency matrix on the right. All the 0s in the matrix mean that there is no connection between the two vertices.
Directed graph
Looking closely, The Adjacency matrix is different for both the directed and undirected graph. It is 1 in an undirected graph indicating whether an edge connects them.
Pros
- Easier to implement
- It is 0(1) complexity just to remove an edge. We just make the corresponding matrix value turn from 1 to 0.
- Queries are done in 0(1) time. This means it is fast to check if an edge exists between two vertices.
Cons
- Adding a vertex will take 0(V²) time, This is because we will need to reproduce the result of the new graph after a vertex has been added.
- Takes a significant amount of space since the matrix size is
n * n
wheren
is the number of vertices.
Adjacent list
In this representation, an array of linked lists is used to store data. The index of the array is a vertex and the values in the linked list of each array is a vertex as well. The connection between the list is the edges. And also the size of the array is the number of vertices available.
In the image above, vertex 0 has an edge with 1, 2, and 3, which now completes the first array. Vertex 1 has an edge with 0 and 2, which now completes the second array. This is also similar to the vertices2 and 3.
Pros
- Space efficiency. This is because we save only the pairs that are connected by the edge. Unlike the adjacency matrix where we have to store 0 even if the edge is not connecting the vertices.
- Adding a vertex is easier here because we just add the value to the array. Remember that in an adjacency matrix, we are filling the matrix with true/false or the values and this requires 0² space complexity. Meanwhile, in the adjacency list, you are only storing the vertices that are connected in the linked list which is lesser memory.
Cons
- Queries to check if there is an edge between two vertices will take 0(V) time (V is the number of edges in the graph).
Other ways a Graph can be represented are Incidence Matrix and Incidence List.
Tree
The tree data structure can be regarded as a collection of nodes with a parent node. Each node is linked to the other by the edges and can be linked to multiple other nodes. This is almost like the data structure Linked List but the only difference is that each node can be linked to multiple other nodes. The image below shows what a tree looks like:
The two most important things to note are that there is a parent and a child node. The parent node has an edge upward to a node. For example, E has a parent node B, K and L have a parent node E.
A child node on the other hand has an edge down to a node. E has two child nodes K and L, H has a child node M and C has three child nodes F, G, and H.
But there is a constraint: two nodes cannot reference the same node. For example, C and D can not have an edge that connects to H at the same time.
Tree representations
There are various ways the tree-data structure can be represented. Take this as the types of trees we have.
Binary Search Tree
The Binary Search Tree has the following property the left child of each node must have a value less than that of the parent’s and the right child of the same parent node must have a value greater than the parent. And also subtrees of each node are also Binary Search Trees.
So, a Binary Search Tree is a type of tree data structure that has two child nodes and uses the binary search algorithm to search for a value. This binary search algorithm happens in 0(log(n)) time.
Operations
The operations in a binary search tree are:
- Insertion
- Deletion
- Searching
Insertion
Every node added to the tree is made a leaf node. Before we perform an Insertion we need to search the tree because the parent node first has to be compared with the left node if it is less than the parent node and also if the right node is greater than the parent node.
An important thing to note is that if the value of the new node equals the parent node then we have to exclude it because it will become a duplicate. Reason being that it will take more time complexity to perform a search operation because you still have to keep searching the tree even if the value is found.
The steps to insert a new node are stated as follows:
- Create a new node and set the left and right subtree to null.
- Start searching the tree: If the tree is empty, add the new node as root.
- If the tree is not empty, check if the new node is less than the parent root and if it is, adds it to the left. If it is not, check if it is greater than and if it is, add it to the right.
Deletion
Removing a node from a tree can occur in three situations. The nodes in red color below are what is to be deleted.
- Removing the node with no children: Just remove the node.
- Removing a node with one child: Remove the node(red) and replace it with its child.
- Removing a node with two children: Find the in-order node of the actual node(red) to be removed. Replace (4) with it and remove it from its original position.
Searching
To search a node in a tree, we use these steps.
- Check if there is a root node. If no root, the node does not exist.
- If a root exists, we check if it equals the key value to what we are searching for. Search is successful if the values are equal.
- If the search is unsuccessful, we check if the value being searched is less or greater than the parent root.
- If less, move to the left tree and search again. If greater, move to the right tree and search again.
Binary Search Tree complexities
Below is the time complexities for the Binary search tree.
The Space complexity for all the operations is
O(n)
.
Use cases of Binary Search Tree
Understanding this data structure is important because it has some important applications in the industry. A few of them are:
- Indexing in Databases uses the Binary search tree.
- It is used in Unix kernels to store directories because of its logarithmic time lookup.
- Mostly used in search applications where data is always being inserted and removed.
A Binary Search tree is one way of structuring data so we can find elements using binary search. Let’s say we have a physical dictionary and want to find a word in it, we need not search for it linearly or in sequence. What do we do? We use the binary search method. If ‘J’ is our keyword to search and we opened ‘F’, we now know that all the letters below F are useless to us. So we need to search more in the other half of the alphabet.
Implementation of Binary Search Tree
Let us build a BST ourselves so we see how it works. First, we create a class Node which will have three properties: data, left node, and right node.
The data
is what we are going to store and the left
and right
are going to point to the left and right nodes.
Next, we create the Binary Search tree class which will just have a constructor with a root node or parent node.
The first method we will need to implement is the `add` method which will add a node to the tree.
With the data passed into the method, we check if there is a parent node, and if not we check if it is less than the parent node so we move to the left tree. If the data is greater than the parent node, we move to the right tree. From the point where we have to check if it should be in the left or right tree has to be implemented using recursion.
Next, we need to remove from the BST so a remove method will be added:
This remove method is quite completed and longer because of the case when we need to remove a node that has two children.
Searching is another operation we need to add. This method takes in data and while the root node is not equal to the data being checked, we check for each node using a while
loop:
There is more to how a Binary Search Tree is being implemented but you get the picture.
Graph vs Tree
Remember a Graph consists of a set or collection of nodes and edges. Literally, if we had just one node and no edges, then we can not call that Graph.
For a Tree, most people say it is just a type of Graph, which is valid. Because it has nodes and edges. But a node in a Tree must have one parent and there are no cycles. This means a Tree is simply a Graph with no cycles.
Both Graph and Tree share similar algorithms to solve problems. For example, in a Graph, If I had a flight from the city of Sofia to Lagos, what will be the least number of flights needed to take? Or a question in Tree, who is the lowest manager in a corporate organization that sees people A and B.
You have to be very careful about the data structures you pick for certain tasks, especially for ones that could potentially degrade the performance of your product. Having an O(n) lookup complexity for functionality that is frequent and can be reduced to O(logn) could make your product unusable.
Even if you feel like the proper decisions have been made, it’s always necessary to verify that this is indeed true and your users have a great experience with your product.
A solution like SessionStack will allow you to replay customer journeys as videos, showing you how your customers actually experience your product. You can quickly determine whether your product is performing according to their expectations or not. In case you see that something is wrong, you can explore all of the technical details from the user’s browser such as the network, debug information, and everything about their environment so that you can easily understand the problem and resolve it.
There is a free trial if you’d like to give SessionStack a try.
If you missed the previous chapters of the series, you can find them here:
- An overview of the engine, the runtime, and the call stack
- Inside Google’s V8 engine + 5 tips on how to write optimized code
- Memory management + how to handle 4 common memory leaks
- The event loop and the rise of Async programming + 5 ways to better coding with async/await
- Deep dive into WebSockets and HTTP/2 with SSE + how to pick the right path
- A comparison with WebAssembly + why in certain cases it’s better to use it over JavaScript
- The building blocks of Web Workers + 5 cases when you should use them
- Service Workers, their life-cycle, and use case
- The mechanics of Web Push Notifications
- Tracking changes in the DOM using MutationObserver
- The rendering engine and tips to optimize its performance
- Inside the Networking Layer + How to Optimize Its Performance and Security
- Under the hood of CSS and JS animations + how to optimize their performance
- Parsing, Abstract Syntax Trees (ASTs) + 5 tips on how to minimize parse time
- The internals of classes and inheritance + transpiling in Babel and TypeScript
- Storage engines + how to choose the proper storage API
- The internals of Shadow DOM + how to build self-contained components
- WebRTC and the mechanics of peer to peer connectivity
- Under the hood of custom elements + Best practices on building reusable components
- Exceptions + best practices for synchronous and asynchronous code
- 5 types of XSS attacks + tips on preventing them
- CSRF attacks + 7 mitigation strategies
- Iterators + tips on gaining advanced control over generators
- Cryptography + how to deal with man-in-the-middle (MITM) attacks
- Functional style and how it compares to other approaches
- Three types of polymorphism
- Regular expressions (RegExp)
- Introduction to Deno
- Creational, Structural, and Behavioural design patterns + 4 best practices
- Modularity and reusability with MVC
- Cross-browser testing + tips for prerelease browsers
- The “this” variable and the execution context
- High-performing code + 8 optimization tips
- Debugging overview + 4 tips for async code
- Deep dive into call, apply, and bind
- The evolution of graphics
- Dockerizing a Node.js application
- A deep dive into decorators
- Best practices for data compliance
- Arrays vs Hash Tables
- Proxy and Reflect
- SVG and its use cases (part 1)
- Class static blocks + 6 proposed semantics