Dijkstra — The Person & Algorithm

Balaje Shrinivaas
16 min readOct 14, 2023

--

A brief history of Dijkstra, his shortest path finding algorithm & implementation using Neo4j Graph Database

Table of Contents

Part 1
- Getting to know Edsger W. Dijkstra, the Computing genius and the events around the discovery of his famous namesake Algorithm (General Reading)

Part 2
- Exploring Dijkstra’s Algorithm in detail with two examples. We’ll also cover the limitations (General Reading)

Part 3
- Implementing Dijkstra’s Algorithm in Neo4j Graph Database (With Code Snippets)

Part 1: Dijkstra — The Person

It started as just another day in the year 1956. For the young Dijkstra, it was just another shopping jaunt with his fiancée in the streets of Amsterdam. Like every other couple, they took a short break in a café. While sipping the coffee, something else was running at the back of his mind.

Edsger Dijkstra

Edsger Dijkstra was writing the software for ARMAC, a computer developed in the Mathematical Centre, Netherlands. It was a revolutionary design at that time and was way better than the previous ARRA series of computers. Look at the sheer size of a computer back then!

ARMAC Computer, Mathematical Centre, Amsterdam

He had a challenge, though. How could he demonstrate to the world how powerful the computer was? Let’s take a step back.

How would one show the true power of any cutting-edge technology?

Google Deep Mind developed AlphaGo a few years ago. It made it play the game “Go” against the revered Grandmaster Lee Seedol in March 2016. AlphaGo went on to win 4 out of 5 matches. The 37th move in the second game took every Go player by surprise. That was one in Ten Thousand probability move for a human. It was a watershed moment for Artificial Intelligence.

Let’s abstract what happened here. To show the art of the possible for any new piece of technology, we need a use case that, when explained, is understandable by humans but not easily achievable because of the sheer volume of data and capability to perform the iterations in time to make decisions. In fact, AlphaGo sowed the seeds for its 2023 Albert Lasker award-winning variant AlphaFold, which predicts the three-dimensional protein structure from a one-dimensional amino acid sequence.

Dijkstra faced precisely the same challenge at that time. He needed the equivalent of AlphaGo to win the scientific community’s confidence before he moved on to solve what he thought in his mind as complex real-world problems. What could that demo use case be? One cannot show simple addition or multiplication of large numbers. We are not talking about Calculators. The problem has to be much more difficult.

When he took that short break in that café, this singular question buzzed like a bee in his mind. And then it came !!

While numerous paths exist from Rotterdam to Groningen, What is the shortest path one can travel?

Today, one would simply do the following, not an option available back in the time while Dijkstra was pondering this question.

How we search today

This question morphed into a much more profound question in the next minute.

How do you find the shortest path to travel from anywhere to anywhere?

With no pen or paper, in less than twenty minutes, in his mind, he worked out a procedure to find the shortest path from a given source (starting point) to all targets (destination points) in a network.

Using that technique, ARMAC could find the shortest path between 64 cities in the Netherlands. Dijkstra was just focused on the ARMAC and published the procedure after three full years in 1959. It was a simple 3-page publication, and the shortest path covered only in the second half of the publication

Dijkstra’s original publication of Shortest Path Finding Algorithm,1959

To his surprise, In 1960, Dijkstra saw his procedure in a German book of Management Science named after him “Das Dijkstra’sche Verfahren”, or Dijkstra’s procedure. From that point onwards, it was known by its current name, Dijkstra’s algorithm.

History shows that the fifties and sixties were one of the most prolific periods for algorithm invention. The race to land a man on the moon, the Cold War and the rapid industrialization after the World War contributed to this pace. The most important thing to note is Dijkstra’s algorithm was one of many shortest-path algorithms identified in the fifties and sixties. What makes it so unique then? To understand the significance of what Dijkstra figured out in his mind over a cup of coffee, we’ll lean on the shoulders of the Time Complexity!

In the world of computing and algorithms, Time Complexity, in simple terms, is not the time taken to complete a process but the number of times or number of operations or iterations one must go through for a given size of an input. It is expressed using Big O (read as ‘oh’) notation where the O stands for Order of the function.

For example, whether it is one page or ten pages, the “Print” button is pressed only once to print any number of pages. The print example is called a Constant Time process or O(1)

Assume there are 100 apartments in a housing community. For a search to find if there are green walls in any of the houses, the search party need to pay a visit to all 100 homes. Isn’t it? That is called the Linear Time Process. In Big O terms, for a given ’n’ number of inputs, Time Complexity is O(n), where the worst-case scenario is arrived at by iterating n number of times.

Let’s try a two-player number-guessing game. Tom will think of a number between 0 and 256, and Peter should find out. One approach is that Peter asks every number, and Tom responds with a yes or no. This approach is no different from the previous problem; hence, we know it is a linear time complexity approach.

What if we change the rules of the game, and at every step, Peter will split the numbers into two equal groups and ask Tom which group the number belongs to?

Assuming Tom had the number 149 in his mind, Peter would find it out in 8 iterations.

Steps

Peter can find any number between 0 to 1024 in 10 steps using the same approach or even a number from 0 to 33,554,432 in 25 steps. Did you spot the pattern?

Exponential Form

If we jog our memory from what we learned in high school mathematics, we can write the same equation using logarithms.

Logarithmic Form

The Time Complexity for searching a number in the new approach from 0 to n will be O(log n), or Logarithmic Time. As you can see, a massive increase in ‘n’ results only in a minuscule incremental number of steps, and any day is way better than a Linear Time process. In the world of Search, the method we used in this example is called Binary Search.

There is a reason why we took a detour to understand Time Complexity as a concept.

Graph

For G =(V, E) or in plain English, a Graph containing a set of Vertices V and Edges E, the algorithm from Bellman-Ford that was published in 1956 for finding the shortest path has a Time Complexity of O(E * V), while Dijkstra’s algorithm has a time complexity of O(V + E * logV) while employing minimum priority queue approach. For the moment, ignore the approach. Did you see that logV in place of V?

Dijkstra’s version of path finding became popular not because it was the first algorithm to do so, but because it delivered the shortest path using fewer operations.

For a Graph containing 20 Edges and 16 vertices,

Bellman-ford Time complexity = O (E * V ) = O ( 20 * 16 ) = O ( 320 )

Dijkstra O (V + E * logV) = O (16 + 20 * log 16 ) = O ( 16 + 20 * 4 ) = O (96)

In this example, irrespective of how much compute power one packs, the Dijkstra algorithm will need only 30% of the time Bellman-Ford will take.

When you go through the algorithm, you will realize one more thing. It is an Exact Algorithm. Unlike other algorithms, it always gives a definitive and the most optimal result.

The most fascinating part is Dijkstra saw this as a mere algorithm for the ARMAC computer demo. The world has forgotten ARMAC, but Dijkstra’s algorithm laid the foundation for all future classes of path-finding algorithms.

Dijkstra never rested on his laurels from the path-finding algorithm. He was one of the world’s first batch of programmers. The man was so ahead of his time that he gave “Programmer” as a profession while registering his marriage, only to be rejected by the Dutch authorities because they contested that there is no such profession. Remember, we are talking about the fifties. He finally gave “Theoretical Physicist”, his original specialization, to get the documentation done;-)

He coined what we know today as “Structured Programming”. He has a funny way of saying things. In 1968, he published his famous paper “Go To Statement Considered Harmful”, where he argues that one should abolish the Go To statement while writing software programs. Man, he was that angry with GoTo ;-) Suppose there are 100 lines of code. In the 10th line, if the program has a GoTo line 24, i.e., a statement directing the program to jump to line 24 and later in line 50, if there is a code re-direction to line 14. This results in a “Spaghetti code”, a piece of code that is difficult to follow, maintain and scale. He always argued vehemently for a code to follow a structure and not simply jump from one section of code to another but follow a

IF
The program meets Condition 1
THEN
Do Something — 1
ELSE
Do Some other thing-2

Today, these are considered to be the basics of programming. Along with some of his peers, Dijkstra has played a vital role in bringing these concepts to the mainstream.

A Compiler is a language which converts the main programming language into machine language, the language that is understood by the underlying hardware. Ever heard of playoff beards? In Ice Hockey, there is this tradition where some players and fans don’t shave till their team either wins or gets eliminated from the championship. Dijkstra is also well known for co-writing the compiler language for ALGOL 60, and his beard started as something he would not shave until he finished writing the code for the compiler. He successfully finished the compiler, but that signature beard stuck with him for the rest of his life.

Edsger Dijksra , Augut 2001

Dijkstra had this habit of handwriting several of his research pieces and opinions. It was so distinct and famous in research circles that in 1980 a Dijkstra font came into existence for computers. Next time you see the font in the dropdown option, you’ll surely know the history.

Dijkstra handwritten Font

For his contributions, In 1972, Dijkstra received the Turing Award, which is considered the equivalent of the Nobel Prize in computer science.

Part 2: Dijkstra — The Algorithm

We now know the colourful history of how the algorithm was discovered. It’s time to get to know the algorithm.

We’ll take the following Graph as our example.

Problem to Solve

We have taken a Directed Graph, i.e. we do not have lines but arrows as Edges indicating the direction of movement. Let us now find the shortest path using Dijkstra’s algorithm. I’ve tried to capture every iteration with different types of operations we perform during the course of the algorithm.

Iteration 0
Iteration 1
Iteration 2
Iteration 3
Iteration 4
Iteration 5
Iteration 6
Iteration 7
Iteration 8

At the end of all iterations, we have the following

End of all iterations

The Green dotted line in the final iteration belongs to the shortest path from a single source to all destinations.

Transformation
Shortest Path from A to every destination

Interestingly, on a directed graph, it boils down to reducing a given graph to a new one such that every node has a single incoming edge or, to put it differently, a single previous vertex.

Summary of all iterations

If we observe our final graph, except for Vertex F & H, the graph has not changed at all.

Hence, as a corollary on a directed graph, when a Vertex has only one incoming edge, we should know it will belong to the final shortest path graph.

By the way, here is a short summary of the different operations we performed to reach the final state :

Operations

How on earth did he figure all of this in less than 20 minutes? If this is not genius, what else is?

Video Game problem

We are given the challenge to find a way to identify the shortest distance from the source cell B1 to all other cells in a video game. To make the problem interesting, certain cells are blocked, and we are told only the Chess Rook style movements are possible, i.e. forward, backwards and sideways, and no diagonal movement is allowed.

We know each cell is one step away from the adjacent cell. Here is how we transform this problem into a Graph problem to find the shortest path

Let us perform all the iterations based on our dear Dijkstra’s algorithm.

To highlight the links that will be removed from the shortest path

We now have the shortest path from the source cell B1 to reach every cell.

At the end of every iteration, we pick the vertex closest to the source. If there are multiple vertices that are at the same shortest distance, we pick one of them to explore in the next iteration. Our final path arrangement will be different depending on which Vertex we pick.

Irrespective of which final shortest path arrangement we end up with, vertices are at the same distance from the source in all the possibilities.

Limitations of Dijkstra’s algorithm

We looked at two examples. In both of them, we had positive weights on the edges. What if there are negative weights? By the way, are even negative weights possible in a graph?

Let us take the graph from our first example and make a small change. This time around, let us assume the weights are not distances but the amount charged to our credit card to reach different vertices. Segment G → H is a new route the Government has opened. Additionally, anyone who takes that route for the first week gets $5 credited to their card. Since the weight is defined as “Debit Entry”, we’ll have to use a negative value to mark it a Credit.

Graph with negative weight

Dijkstra’s algorithm will not support such a graph.

It is designed to work only when the weights are non-negative numbers

Bellman-Ford’s version works for Graphs with negative weights, even if it means at higher Time Complexity.

Dijkstra’s algorithm is a fantastic choice when finding the shortest path to all destinations from a single source is needed. However, what happens when the problem statement is changed to finding the shortest path from a source to a particular destination?

Let’s take the same video game example. Suppose the destination is in the vicinity of the source. In that case, we start covering the destinations in the third or fourth iteration. However, what if the destination is cell D4?

Source to particular Destination

We start exploring the vertex only in the 12th iteration.

In every iteration, the Dijkstra algorithm is designed to find the path for the unexplored vertex closest to the source. So, it needs many iterations before it reaches a far-away vertex.

It does not have a mechanism to ignore the ones that are far away from the destination, like A1, A2, A3, A4 etc. Other algorithms were developed to address this concern, and of course, I intend to write a follow-up piece on one such algorithm ;-)

One must remember even those algorithms stand on the shoulders of Dijkstra’s version.

Part 3: Dijkstra Algorithm Implementation in Neo4J Graph Database

We focused on the underlying Graph Theory in the previous part. However, performing these iterations when there are millions of vertices is not possible using PowerPoint, the way I have created the previous snapshots :-)

This is where Graph Databases come into play. They are tuned to do all the heavy lifting and deliver performance at scale.

We’ll use the Neo4j Graph Database for this purpose. To be more precise, I am using the Neo4j AuraDS that comes pre-packaged with several graph algorithms.

Let us take our video game example for the data set. I’ll be using the Cypher Query Language, the native language that is used to communicate with the Neo4j Graph Database.

Step 1 — Create the cells

CREATE
(nodeA1:Cell{cId :'A1'}),(nodeA2:Cell{cId :'A2'}),(nodeA3:Cell{cId :'A3'}),
(nodeB1:Cell{cId :'B1'}),(nodeB3:Cell{cId :'B3'}),
(nodeC1:Cell{cId :'C1'}),(nodeC2:Cell{cId :'C2'}),(nodeC3:Cell{cId :'C3'}),
(nodeC4:Cell{cId :'C4'}),(nodeD1:Cell{cId :'D1'}),(nodeD2:Cell{cId :'D2'}),
(nodeD3:Cell{cId :'D3'}),(nodeD4:Cell{cId :'D4'}),(nodeE2:Cell{cId :'E2'});

Step 2— Create the link between the cells


MATCH
(nodeA1:Cell{cId :'A1'}),(nodeA2:Cell{cId :'A2'}),(nodeA3:Cell{cId :'A3'}),
(nodeB1:Cell{cId :'B1'}),(nodeB3:Cell{cId :'B3'}),
(nodeC1:Cell{cId :'C1'}),(nodeC2:Cell{cId :'C2'}),(nodeC3:Cell{cId :'C3'}),
(nodeC4:Cell{cId :'C4'}),(nodeD1:Cell{cId :'D1'}),(nodeD2:Cell{cId :'D2'}),
(nodeD3:Cell{cId :'D3'}),(nodeD4:Cell{cId :'D4'}),(nodeE2:Cell{cId :'E2'})
CREATE
(nodeB1)-[:LINKED_TO{distance :1}]->(nodeA1),(nodeA1)-[:LINKED_TO{distance :1}]->(nodeA2),
(nodeA2)-[:LINKED_TO{distance :1}]->(nodeA3),(nodeA3)-[:LINKED_TO{distance :1}]->(nodeB3),
(nodeB1)-[:LINKED_TO{distance :1}]->(nodeC1),(nodeB3)-[:LINKED_TO{distance :1}]->(nodeC3),
(nodeC1)-[:LINKED_TO{distance :1}]->(nodeD1),(nodeC1)-[:LINKED_TO{distance :1}]->(nodeC2),
(nodeD1)-[:LINKED_TO{distance :1}]->(nodeD2),(nodeD2)-[:LINKED_TO{distance :1}]->(nodeD3),
(nodeD3)-[:LINKED_TO{distance :1}]->(nodeD4),(nodeD2)-[:LINKED_TO{distance :1}]->(nodeE2),
(nodeC2)-[:LINKED_TO{distance :1}]->(nodeC3),(nodeC3)-[:LINKED_TO{distance :1}]->(nodeC4),
(nodeC2)-[:LINKED_TO{distance :1}]->(nodeD2),(nodeC3)-[:LINKED_TO{distance :1}]->(nodeD3),
(nodeC4)-[:LINKED_TO{distance :1}]->(nodeD4)

Let us now check how data is recorded in the database

MATCH (a)-[r]->(b) RETURN a,r,b

Neo4j Graph Data Science has a provision for storing data in memory, i.e., not on the disk but on the memory of the hardware. Analytical processing of information stored in memory is an order of magnitude faster than disk. This temporarily saved data in the memory is called a “Projection.”

Ok. Let’s project!

CALL gds.graph.project(
'myGraph',
'Cell',
'LINKED_TO',
{
relationshipProperties: 'distance'
}
)

Dijkstra’s Shortest Path: Single Source ( Cell B1 ) to All Target Cells

Here is the query to return the shortest path

MATCH (source:Cell{cId :'B1'})
CALL gds.allShortestPaths.dijkstra.stream('myGraph', {
sourceNode: source,
relationshipWeightProperty: 'distance'
})
YIELD index, sourceNode, targetNode, totalCost, nodeIds, costs, path
RETURN
index,
gds.util.asNode(sourceNode).cId AS SourceCell,
gds.util.asNode(targetNode).cId AS TargetCell,
totalCost AS ShortestDistance,
[nodeId IN nodeIds | gds.util.asNode(nodeId).cId] AS ShortPath
ORDER BY index

This is what we get as a response from the Database.

Response to Query from Neo4J DS

Dijkstra’s Shortest Path: Single Source ( Cell B1 ) to Specific Destination (D4)

MATCH (source:Cell{cId :'B1'}), (target:Cell{cId :'D4'})
CALL gds.shortestPath.dijkstra.stream('myGraph', {
sourceNode: source,
targetNode: target,
relationshipWeightProperty: 'distance'
})
YIELD index, sourceNode, targetNode, totalCost, nodeIds, costs, path
RETURN
index,
gds.util.asNode(sourceNode).cId AS SourceCell,
gds.util.asNode(targetNode).cId AS TargetCell,
totalCost AS TotalDistance,
[nodeId IN nodeIds | gds.util.asNode(nodeId).cId] AS ShortPath,
costs AS DistanceEveryStep
ORDER BY index

Let’s not forget to delete, a.k.a ‘Drop’ our projection from the memory

CALL gds.graph.drop('myGraph') YIELD graphName;

If I re-run the previous queries, we’ll receive a friendly error message :-)

And….. That’s a wrap, dear reader, and here is the Key takeaway :

Next time you appear lost in a café while your spouse is speaking to you, there’s a new excuse

Oh dear, I was in the middle of brewing my own algorithm ;-)

Thank you very much for reading this far. I would be pleased if you liked the content, and it would be an honour when you share it.

To know more about the writer, check out the following link :

https://www.linkedin.com/in/balaje-shrinivaas/

References :

  1. On the History of the Shortest Path Problem — Alexander Schrijver 2010 Mathematics Subject Classification: 01A60, 05–03, 05C38, 05C85, 90C27
  2. Letters to the editor: go to statement considered harmful — Edsger W.Dijkstra Communications of the ACM Volume 11 Issue 3 March 1968pp 147–148
  3. Dijkstra, E.W., 1959. A note on two problems in connexion with graphs. Numerische Mathematik, vol. 1: Springer Berlin / Heidelberg, pp. 269–271, 1959
  4. Edsger W.Dijkstra: Brilliant, Colourful, and Opinionated, published by CWI: Research Institute for Mathematics & Computer Science in the Netherlands
  5. Interview conducted by Phil Rana, Researcher at Charles Babbage Research Institute and Published in August 2010, Vol. 53 . №8, Communications of the ACM Titled ‘ An Interview with Edsger W.Dijkstra’
  6. Cornell University, Java & DS Course, Shortest Path History by David Gries

--

--

Balaje Shrinivaas

Passionate about Technology , Data Science, Economics and Capital Markets