Graph Algorithms — Part 1

Allen Huang
Analytics Vidhya
Published in
7 min readApr 1, 2020

First of all, thanks for Mosh’s incredible course — The Ultimate Data Structures & Algorithms. I learned a lot from his courses, and this article is my notes of the graph algorithm part of his course.

Introduction

A Graph can represent connected objects. In a real-world scenario, it might be a social network, which represent people’s relation. When you want to describe the relationship between a bunch of objects, try to use a graph!

Components, neighbors

Graphs consist of nodes/vertices and edges. If two nodes are directly connected, they are neighbors/adjacent. In this graph, nodes represent different people, edges show the relationship between people. Each pair of nodes are adjacent. We call this kind of graph clique.

clique, indirect graph

By contrast, in the following graph, John is not directly connected with Sam. Therefore, John and Sam are not neighbors.

Direct and Undirect Graph

Above we are focusing on the undirect graph, which means edges do not have a direction. What is a direct graph? Think about Twitter! I am a fan of Stephen Curry and I followed him on Twitter, but he doesn’t follow me. Therefore, the relation between me and Curry needs a direction. That’s Direct Graph.

Direct Graph

Weighted and Unweighted Graph

Edges also can have weights that represent how strong the relationship is. For example, in a social network, if John and Mary have many interactions, their relationship will have a higher weight.

Weighted Graph

Representation

We have several ways to represent a graph. Here we are going to talk about the Adjacency Matrix and Adjacency List.

Adjacency Matrix

If two nodes are connected, we mark the intersection as 1 or true; otherwise, we mark it as 0 or false. We can easily implement this approach using a two-dimensional array. The disadvantage is we need to allocate extra space for this matrix. If we have n nodes, the space complexity is O(V²). (we use V instead of n when dealing with graphs. V refers to the number of nodes)

Adding a new node is O(V²) because we need to create a new array with one more row and column, and copy all of the existing items into our new array.

Also, to remove a node is O(V²) because we need to allocate a new smaller matrix and copy items around.

What about adding an edges? For example, we want to add a relation between John and Mary, the first step is to find the index of (John, Mary).

Hash Table

We can use a HashTable to store indexes. Therefore, adding a new edge only takes us an O(1) operation. To remove an edge go through the same process and it’s also O(1).

Query edges means we want to check if two nodes are connected. What we need to do is finding the index and looking for the value. That’s an O(1) operation.

Finding all of the adjacency of a given node is an O(V) operation. For example, we want to find the neighbors of Bob. First, find the index of Bob, that’s O(1). Then we need to look at every item in this row/column. How many items we have in a row/column? V items.

Adjacency List

Adjacency List

Adjacency List is an array of LinkedLists. Every item in this array is a LinkedList. And every list contains neighbors of a specific node.

This method is more space-efficient because we only store the edges that exist. The space complexity is O(V + E). E refers to total edges. In a worst-case, every node is connected to every other node. We call this kind of graph dense graph. The total number of edges is E = V * (V-1) and the space complexity is O(V²-1+V ) = O(V²). (We only care about the parts V² that grow faster than the rest)

dense graph

Adding a node takes O(1) operation, because we just need to add a new item in our adjacency list. (Here, we can use an ArrayList that is able to automatically increase in length)

To remove a node, firstly we need to remove the element from our adjacency list and make sure no other nodes have a link to this node, which means we need to iterate over the list and remove the target node from every LinkedList.

For example, according to the figure below, we have an adjacency list and we want to remove node 4. Firstly, we need to iterate over the array and remove node 4. That’s an O(V). In each iteration, we need to remove the target node from each LinkedList. Removing an item in a LinkedList takes O(n) operation, but the exact value of n varies from each LinkedList. Each edge in our graph corresponds to an element in the LinkedList. Overall, we have E edges, so there are E elements in our five LinkedLists. So the time complexity is O(V + E). In the worst-case scenario, E = V*(V-1), the time complexity is O(V²).

tips: the key is, number of edges in our graph (E) = total number of items in our LinkedLists.

Adding an edge runs in O(K). For example, we want to add a relation from B to A. In the first place, we need to find the index of B. Then we need to iterate over the LinkedList to make sure this relation does not currently exist. Finally, we can add a new relation. The first steps take only O(1) as a result of our HashMap. It takes O(1) to add a new item at the end of the LinkedList. The second step takes O(K) to iterate over a LinkedList. (K refers to the number of edges a given node has). Therefore, adding an edge is an O(K). In the worst case, a dense graph again, K = V-1, and the time complexity is O(V).

tips: If you are dealing with multigraph, which means two nodes can be connected by different edges. Then we don’t need the second step. Therefore, the time complexity of adding an edge in a multigraph is O(1).

Removing an edge is quite similar to adding an edge. Find the index O(1) -> iterate over the LinkedList to find the target edge O(K) -> remove it O(1). Again, in the worst case, it’s O(V).

Query edge and Finding neighbors in Adjacency List are both an O(K) operation. We need to iterate over the link list to see if the target edge exists (or find all of the neighbors), which takes O(K), and O(V) in the worst case.

comparison

average scenario
worst case (dense graph)

Overall, if you are dealing with a dense graph, using a Matrix is better; otherwise, using a List is more efficient.

Create a Graph using Java

Let’s create a graph class using Java. Mosh uses two HashMaps to create the graph, because it’s more object-oriented. And I also tried to use Array, LinkedList, and HashMap to create the graph.

Here, I would like to explain more about how these two HashMaps work. The first HashMap nodes help us to encapsulate our Node object. Users only need to pass a string as a parameter. The second HashMap is our Adjacency list. Here we are using a HashMap instead of array for quick lookup.

Let’s test our program!

Direct Graph Example

According to the example graph, we are going to add nodes and edges.

public class Main {

public static void main(String[] args) {
Graph g = new Graph();
g.addNode("A");
g.addNode("B");
g.addNode("C");
g.addNode("D");
g.addNode("E");
g.addNode("F");
g.addEdge("A","B");
g.addEdge("D","B");
g.addEdge("B","C");
g.addEdge("C","E");
g.addEdge("E","D");
g.addEdge("E","F");
g.print();
}
}

Program Output:

A is connected to [B]
E is connected to [D, F]
B is connected to [C]
C is connected to [E]
D is connected to [B]

g.removeNode("F");
g.removeEdge("A","B");
g.print();

Program Output:

E is connected to [D]
B is connected to [C]
C is connected to [E]
D is connected to [B]

Finally, I also used HashMap, LinkedList, and array to implement the same function of graph class. The time complexity between these two methods is totally the same.

public class Main {

public static void main(String[] args) {
Graph2 g = new Graph2();
g.addNode("A");
g.addNode("B");
g.addNode("C");
g.addNode("D");
g.addNode("E");
g.addNode("F");
g.addEdge("A","B");
g.addEdge("D","B");
g.addEdge("B","C");
g.addEdge("C","E");
g.addEdge("E","D");
g.addEdge("E","F");
g.removeEdge("A","B");
g.removeNode("F");
g.print();
}
}

Program Output:

A is connected to []
B is connected to [C]
C is connected to [E]
D is connected to [B]
E is connected to [D]

Thanks for your reading! Learning never ends.

Next: Graph Algorithms (2)

--

--