# Comments on an implementation of Prim’s algorithm in Ruby

Prim’s alrogithm is used to return a minimum spanning tree. For those not yet in the know, an MST is a tree made from a graph of points connected by undirected weighted edges. The MST must include every node in the graph and it must do so as efficiently as possible; that is, the sum of weights must be as low as possible. Conceptually this is not the most complicated data structure, and its potential real-world applications should be immediately apparent. Prim’s algorithm works as follows: pick an arbitrary node, add that node to ‘visited’ list, choose the smallest edge that connects to an unvisited node, add that node to ‘visited’ list. Continue in this manner, each time choosing the smallest edge that leads to an unvisited node from any visited node, being careful not to create any loops.

In pseudocode, the algorithm looks like this (from Mark Needham’s blog):

- Let X = nodes covered so far, T = edges covered so far, V = all the nodes in the graph
- Pick an initial arbitrary node s — it doesn’t matter which one it is
- while X ≠ V:
- let e = (u,v) be the cheapest edge of the graph where u ∈ X and v ∉ X

i.e. u is a node that has been covered and v a node that has not yet been covered - Add e to T
- Add v to X

Following the pseudocode as closely as possible,we end up with this Ruby code (source: Mark Needham).

`# Prim's Minimum Spanning Tree Algorithm - Naive version`

def file

@file ||= File.readlines("edges.txt")

end

def header

@header ||= file.take(1)[0]

end

def number_of_nodes

@number_of_nodes ||= header.split(" ")[0].to_i

end

def create_adjacency_matrix

adjacency_matrix = [].tap { |m| number_of_nodes.times { m << Array.new(number_of_nodes) } }

file.drop(1).map { |x| x.gsub(/\n/, "").split(" ").map(&:to_i) }.each do |(node1, node2, weight)|

adjacency_matrix[node1 - 1][node2 - 1] = weight

adjacency_matrix[node2 - 1][node1 - 1] = weight

end

adjacency_matrix

end

def find_cheapest_edge(adjacency_matrix, nodes_spanned_so_far, number_of_nodes)

available_nodes = (0..number_of_nodes-1).to_a.reject { |node_index| nodes_spanned_so_far.include?(node_index + 1) }

cheapest_edges = available_nodes.inject([]) do |acc, node_index|

get_edges(adjacency_matrix, node_index).select { |_, other_node_index| nodes_spanned_so_far.include?(other_node_index + 1) }.each do |weight, other_node_index|

acc << { :start => node_index + 1, :end => other_node_index + 1, :weight => weight }

end

acc

end

cheapest_edges.sort { |x,y| x[:weight] <=> y[:weight] }.first

end

def get_edges(adjacency_matrix, node_index)

adjacency_matrix[node_index].each_with_index.reject { |edge, index| edge.nil? }

end

def select_first_edge(adjacency_matrix)

starting_node = 1

cheapest_edges = get_edges(adjacency_matrix, 0).inject([]) do |all_edges, (edge, index)|

all_edges << { :start => starting_node, :end => index + 1, :weight => edge }

all_edges

end

cheapest_edges.sort { |x,y| x[:weight] <=> y[:weight] }.first

end

def nodes_left_to_cover

(1..number_of_nodes).to_a - @nodes_spanned_so_far

end

# Prim's algorithm

adjacency_matrix = create_adjacency_matrix

first_edge = select_first_edge(adjacency_matrix)

@nodes_spanned_so_far, @edges = [first_edge[:start], first_edge[:end]], [first_edge]

while !nodes_left_to_cover.empty?

cheapest_edge = find_cheapest_edge(adjacency_matrix, @nodes_spanned_so_far, number_of_nodes)

@edges << cheapest_edge

@nodes_spanned_so_far << cheapest_edge[:start]

end

puts "edges: #{@edges}, total spanning tree cost #{@edges.inject(0) {|acc, edge| acc + edge[:weight]}}"

The adjacency_matrix method simply prints a matrix with edge weights at certain (x,y) positions, indicated by integers in the adjacency matrix. An arbitrary first node is selected. Interestingly, our method to create an adjacency matrix from an edge list is used primarily by select_first_edge, which is called by the get_edges helper method. I’m left wondering if there’s a more efficient way to get edges from the text file, but it probably depends on the formatting of the document. Regardless, the code looks quite nice, and is pretty human-readable.

It’s difficult to test Mark’s code without knowing the format his data was stored in. Trying to modify the code to work on a list of comma-separated values didn’t work out very well. Significantly, the code relies on vertices ID’s as indexes when creating the adjacency matrix, so tables with long strings of digits as ID’s (as opposed to integers 1–10, for example) throw errors. Nonetheless the implementation is simple enough to read along with and follow, and it’s a good example of the way that an algorithm can be quickly turned into pseudocode which can then be transformed into executable Ruby.