Intimidating Topological Sorting

Mohit Nihalani
Nov 3 · 7 min read

Accept it or not, everyone who is preparing for coding interviews is scared of graph algorithms. While I am also preparing for interviews and brushing up my skills, I thought of sharing one of the applications of the graph; Topological Sort. It is pretty easy, but most of the explanation in the books or online it pretty complex, so in this blog, I will be explaining how I understand the topological sort. I will try to make it as easy as possible.

I hope everyone knows the master of the graph traversal, yes you are correct, DFS or depth-first search and you would have landed in this article because you are looking at one of its applications, but if you don’t know please check out this link before moving forward.

To get warmed up, everyone knows about DAG, you guess correctly, directed acyclic graphs. Directed graphs are the graphs in which you can travel along with the directions of arrows and acyclic means the graph doesn’t form a cycle, though the below example forms it. So forgive me for this example. Just remember the golden rule of DAG, is if we start from a node, no sequences of edges will allow us to loop back, not even self-loop.DFS is one of the best algorithms to find cycles, but that’s for another day. I will write the post to find cycles in both directed and undirected graphs.

DAG is one of the most important concepts of computer science and the backbone of applications handling scheduling, for a system of tasks, which needs to be processed in the order, for example, dependency graphs, or prerequisite course scheduling which means you need to take course x before y.

Topological sorting is a common operation on DAG, it helps to order vertices in a line such that all directed edges go from left to right. Each DAG has such ordering and graph with cycles doesn’t have any, because there is no way you can keep going right and still return bo where you started from. The topological sort helps to find the bottlenecks in the network. The importance of topological sorting is that it gives us the ordering to process each vertex before any of its successors. If two vertices, x and y exist in a graph and a directed edge (x, y) exists between them, then topological sort will order the vertices in the way that we would encounter them in a path: x, followed by y.

The above graph shows the much better solution of DAG and topological sort. One thing you should keep in mind, there is no topological sort for cyclic graphs, since it is unclear where the sort itself should start as every node is dependent on some of the other.

Big Question: How to find such ORDER?

Solution: We will call one of our best friend, DFS. A directed graph is a DAG until no back edges are encountered. What is the back edge? I told you to go through the above link, and learn about DFS, please go now.

Ok, so if you are still here, you know what back edge is, and if you still don't know, backedge is like a safety edge, which connects that connects a node in the graph to one of its ancestors in its tree path, its parent or grandparent node, for example, or even back to itself!.

Algorithm

So if we perform the DFS on the graph, and then label the vertices in the reverse order, that they are marked processed finds a topological sort of DAG. Why is this true, consider what happens to a directed edge {x,y}.

  1. If we encounter edge y, and it is undiscovered yet, then we just start DFS of y before we can continue with x (I hope you went through this DFS). Thus y is marked completed(means I have processed all of its neighbors) before x and x appears before y in the topological order; as it is a must.
  2. If y is visited but not completely processed, then {x,y} is a back edge, so no topological order.
  3. If y is completed, then it must have been labeled before x. Therefore x appears before y in the topological order.

Important Facts about Topological Sorting:

  1. Only DAG’s can be topologically sorted.
  2. Every DAG can be topologically sorted.
  3. DAG’s can be topologically sorted in many ways. When there are n unconstrained jobs, then there are n! permutations of topological ordering.

Steps: For Below Example

  1. We will maintain two arrays visited[0..n] and completed[0..n]. Visited[i] shows that we have encountered this node, and completed will show that we have processed all of its neighbors.
  2. We will start from any node, let it be 1.
  3. Mark visited[1] = true. Then select one of its neighbour 2, mark visited[2] = true.
  4. Visit 3->4 as it is the end put completed[4] = true, and insert that into one of the output lists, then backtrack to 3, then visit its other neighbor 5, which is the dead end, add to the output list, backtrack again to 3, all neighbors of 3 are processed so add 3 to the output list and backtrack to 2.
  5. Backtrack, marking node completed, just as you do in DFS. We’ll continue to do this until we get to a node with an unvisited edge. Each time we do this, we’ll order the vertex as necessary.
  6. Finally, you will end up in the starting node, and then you are done.
  7. Just reverse the list and surprise, surprise you have topo sort of the DAG.
  8. Whenever you visit a node, just check if it visited or completed, if it is visited, then it must be completed, if not then there is a cycle, break out of the loop and print “No cycle found”.
  9. Just that it, you are good to crunch all those hard leetcode questions and nail the programming interviews.

Please try to implement on your own, but here is the sample code, you can look into:

T = []
visited = []

topological_sort( cur_vert, N, adj[][] ){
visited[cur_vert] = true
for i = 0 to N
if adj[cur_vert][i] is true and visited[i] is false
topological_sort(i)
T.insert_in_beginning(cur_vert)
}

Example:

You will tell me, it's a poor blog, just a theory not even a single coding example. Mohit! are you kidding?

Okay, here is one of the leetcode question, Course Schedule.

There are a total of n courses you have to take, labeled from 0 to n-1.

Some courses may have prerequisites, for example, to take course 0 you have to first take course 1, which is expressed as a pair: [0,1]

Given the total number of courses and a list of prerequisite pairs, is it possible for you to finish all the courses? This is a typical topo sort question, here we can see course [0,1] as two vertexes 0 and 1 and for course 0 you have to take course 1, we can visualize this as a directed edge from 0->1.

Please do this first on your own, don’t cheat yourself

I hope you at least tried. I am proud of you guys. Keep it up.

Okay, here is my solution, not the most optimal one, but it is 99% faster than the other solution, so it works for me.

public boolean canFinish(int numCourses, int[][] prerequisites) {
LinkedList<Integer>[] edges = new LinkedList[numCourses];
for(int i = 0;i<numCourses;i++){
edges[i] = new LinkedList<>();
}

for(int i = 0;i< prerequisites.length;i++){
edges[prerequisites[i][1]].add(prerequisites[i][0]);
}
boolean[] visited = new boolean[numCourses];
boolean[] completed = new boolean[numCourses];
//HashSet<Integer> completed = new HashSet<Integer>();
int i = 0;
boolean answer = true;
while(i<numCourses){
if(!visited[i]){
answer = dfs(edges,i,visited,completed);
}
if(!answer) return answer;
i++;
}
return answer;
}
public boolean dfs(LinkedList<Integer>[] edges, int vertex ,boolean[] visited, boolean[] completed){
//System.out.println(vertex);
if(visited[vertex] == true && completed[vertex] == false) return false;
if(visited[vertex] == true && completed[vertex] == true) return true;
visited[vertex] = true;

if(edges[vertex].isEmpty()){
completed[vertex] = true;
return true;
}
boolean result = true;
for(int edge : edges[vertex]){
result = dfs(edges,edge,visited,completed);
if(!result) return result;
}

completed[vertex] = true;
return result;
}

I have created two arrays, visited and completed and I am maintaining and updating the array as I am traversing through the graph, as I just explained above. In this question, I only need to find, if this type of order is possible, not to print the order. So you can work yourself how you will print the order, it just a slight change or you can solve course schedule2 which specifically asks for the order. It’s just a few lines of code here and there, you got this.

Now, just bookmark this page, so next time you feel intimidated by topological sort, you can have a quick glance.

Please put your sincere comments about how you like it, it was my first article, so I know it is a little bit messy (a lot of grammar mistakes), but I will keep improving it, as this is the goal of life. Till then practice hard, keep learning, become a better programmer, and soon I will come up with a new algorithm. Now go and practice some easy leetcode questions (I am kidding they are pretty tough but we can do it together).

Sneak Peek: In the next post, I will discuss KAHN’s Algorithm for topological sort. Yes, there is one more and this one is my favorite. So keep an eye on it.

Some Resources:

Mohit Nihalani

Written by

Computer Science Masters Student at New York University, Courant Institute of Mathematical Sciences

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade