Algorithms in Scala: Dijkstra Shortest Path

Published in

SE Notes by Alexey Novakov

5 min readNov 11, 2018

Some time ago I was studying programming algorithms courses at Coursera from Princeton University. Both courses are quite well structured and have lots of visualization on how particular algorithms work (sorting, searching, etc.). Courses are in Java, so it was easy for me to debug and tweak the code samples as well as my code assignments. Approximately at the same time I started programming in Scala (around 2015). Unfortunately, I could not submit code assignments not in Java due to Coursera style checker, which could downgrade overall rank of my solution.

Now jumping back to present time. The other days, I have been preparing myself for different job interview questions as well for algorithms, which are actually not my favorite type of questions. Anyways, I decided to translate the Shortest Path problem from Java to Scala. Code is based on Algorithms Book 4th edition.

Problem

We need to find a shortest path from some given vertex ‘v’ to destination vertex ‘w’. For example, we want to find shortest path from vertex 0 . to vertex 6. Answer should a path [0, 2, 7, 3, 6] with total distance or sum of weights as 1.51. Challenges:

Since we have multiple paths from ‘v’ to ‘w’ we have to search the one among of them and select the shortest one.
We need to respect the direction of edges between particular vertices while looking for a path between ‘v’ and ‘w’.
Some vertices could be unreachable, i.e. there is no path(s) to them. We will need to return specific constant value to indicate such situation.

Dijkstra Solution

Dijkstra’s algorithm solves the single-source shortest-paths problem in edge-weighted digraphs with nonnegative weights using extra space proportional to V and time proportional to E log V (in the worst case).

Going further, there are other algorithms to solve the shortest path problem with negative weights as well in acyclic graph, but they are not part of the post.

Implementation in Scala

Implementation is almost similar to what Robert Sedgewick and Kevin Wayne made in Java in their book. I have tried to make the code look more functional with a bit of ‘local‘ mutability. My googling for Scala Functional Dijkstra implementation found quite unnecessary complex implementation which was really hard to understand. Anyways, let’s see how it went:

Above we have defined two case classes to represent Graph and its Edges. I decided to use Map for adjacency list to simulate dynamically growing array. Original Java code is based on array, with size based on input parameter. My implementation is a bit more flexible. It allows not worry about number of vertices by adding new edges calling addEdge method. This method creates new copy of the Graph with just passed edge. However, Map will be using extra memory space due to its internal capacity approach. We know from Data Structures university course that Map must not be always full in order to be effective.

Above data structures allow us to build a graph like this:

import dijkstra.EdgeWeightedDigraphOps._

val g = EdgeWeightedDigraph()
  .addEdge(DirectedEdge(4, 5, 0.35))
  .addEdge(DirectedEdge(5, 4, 0.35))
  .addEdge(DirectedEdge(4, 7, 0.37))
  .addEdge(DirectedEdge(5, 7, 0.28))
  .addEdge(DirectedEdge(7, 5, 0.28))
  .addEdge(DirectedEdge(5, 1, 0.32))
  .addEdge(DirectedEdge(0, 4, 0.38))
  .addEdge(DirectedEdge(0, 2, 0.26))
  .addEdge(DirectedEdge(7, 3, 0.39))
  .addEdge(DirectedEdge(1, 3, 0.29))
  .addEdge(DirectedEdge(2, 7, 0.34))
  .addEdge(DirectedEdge(6, 2, 0.40))
  .addEdge(DirectedEdge(3, 6, 0.52))
  .addEdge(DirectedEdge(6, 0, 0.58))
  .addEdge(DirectedEdge(6, 4, 0.93))

(it is a shame that medium does not have proper code highlighting 🤦🏻‍)

Algorithm Implementation is based on operation called Relaxation.

We maintain two additional arrays:

1- to compute the shortest path distance (which is single value of Double type)

2 - array of Edges which form the the shortest path from some given source vertex to all the rest vertices. Both arrays length are equal to number of vertices in a graph. Let’s see how actual computation looks like:

Key points of the algorithm:

We build two arrays edgeTo and distTo to answer which is short path from source vertex to any given ‘w’ vertex.
We start from source vertex and considering that distance to it is 0.0
We sort other vertices by weight when explore them going from source vertex to adjacent vertices. Look at above PriorityQueue which we use to enqueue new vertices and dequeue just pushed vertices based on min weight.
Once we explored all reachable vertices from source vertex we stop the calculation, i.e. we no loner enqueue new vertices to our auxiliary queue data structure.

Let’s see how to read the path and distance having the computed result.

Reading of distance to destination vertex is quite trivial, we just use distTo array as is.

Reading the shortest past is a bit more tricky, but not so complicated as it could be. We use edgeTo array for that. We start from the last edge pointing to our destination vertex and then going step by step back to the source vertex. We do that by looking at array index which correspond to e.from value.

The answer for shortest path will be constructed like in the below trace:

1. e4(3, 6, 0.52)
2. e3(7, 3, 0.39), e4(3, 6, 0.52) then we prepended one more edge...
3. e2(2, 7, 0.34), e3(7, 3, 0.39), e4(3, 6, 0.52) and one more...
4. e1(0, 2, 0.26), e2(2, 7, 0.34), e3(7, 3, 0.39), e4(3, 6, 0.52) 
Last line is answer from pathTo function

Finding the Shortest Path

If we print state of the queue on every iteration, then we will see how the algorithm explores graph vertices. First we start from source vertex which is an input parameter and then we add new vertices as we go. We remove/dequeue one vertex per iteration which has the min weight in the queue.

Queue state: List((0,0.0))
Queue state: List((4,0.38), (2,0.26))
Queue state: List((7,0.75), (2,0.26), (5,0.73))
Queue state: List((3,1.14), (2,0.26), (5,0.73))
Queue state: List((6,1.66), (2,0.26), (5,0.73))
Queue state: List((5,0.73), (2,0.26))
Queue state: List((1,1.05), (2,0.26))
Queue state: List((2,0.26))
Queue state: List((7,0.60))
Queue state: List((3,0.99))
Queue state: List((6,1.51))

Summary

Current implementation is doing some validation of the source vertex index. However, we need to add more validation checks to make it more reliable. In order to make this implementation production ready we would need to optimize the memory usage, perhaps we could calculate the distances and list of edges without loading the adjacency lists into memory. It will be critical to come with some lazy-loading solution, having a graph of millions edges. Perhaps, this solution already exists, I didn’t google for it yet :-)

Links:

Princetone Algorithms: https://algs4.cs.princeton.edu/44sp/
Source code: https://github.com/novakov-alexey/scala-dijkstra