Girvan-Newman Part 2 — Our Method

Maxwell Calvert
smucs
Published in
3 min readApr 14, 2022

Links to other parts: Part 1, Part 3

By Maxwell Calvert

Spring 2022

Resources Used

For our implementation of the Girvan Newman community detection algorithm, we coded it in C++ using the Boost Graph Library. The Boost Graph Library (BGL) is a C++ library that includes a wide range of functionality and implementations for graph data structures and algorithms. The graphs being read and handled by the BGL are in .graphml format, which contains all of the vertex and edge information for the graph.

Snippets of the .graphml file format

Our implementation

Our betweenness algorithm uses a map to track each edge present within a graph and its associated betweenness value.

std::map<Edge, int> betweenness_map;
auto bet_property_map = boost::make_assoc_property_map(betweenness_map);

Utilizing boost visitors, a list of shortest paths is generated for each vertex in a graph. For each edge present in this list, we simply increment the value for the corresponding edge in our map container. This is then done for every vertex in the graph such that each edge can now be associated with its overall betweenness. E.g.:

for(vertex i in Graph){
breadth_first_search(visitor(record_predecessors));
for(vertex j in Graph){
Edge e = edge present in predecessors[j];
map[e]++;
}
}

We then followed the basic Girvan-Newman algorithm and deleted the edge(s) with highest betweenness (note that we separately implemented both single and multiple edge deletion). Because we knew in advance the number of communities in our testing datasets, we could utilize boost’s connected_components function to determine when to stop our algorithm. Once this detected the proper amount of communities, we could stop deleting edges and thereby be left with a graph separated into disconnected clusters.

Before and after removal of interconnecting edges

Efficiency Comparison

An attempt was also made to compare our betweenness algorithm with Brandes’ Betweenness Centrality provided in the boost library. In comparing runtimes of the entire algorithm when the betweenness algorithms were swapped out, we discovered our own betweenness algorithm completing significantly faster than when using brandes_betweenness_centrality (to the tune of ~0.2 second runs for ours versus ~6 second runs on the same graphs for Brandes). The extreme disparity here led us to believe that there was some misunderstanding on our part of how boost’s algorithm works or perhaps some greater level of functionality added to boost’s that is not present in our own algorithm that might cause such a disparity. Ultimately, we decided the comparison was not helpful in measuring the efficiency of our own betweenness algorithm.

References

[1] The Boost Graph Library — 1.78.0

[2] Girvan-Newman algorithm | NetworkX Guide

[3]https://www.researchgate.net/publication/271472127_Community_structure_in_networks_Girvan-Newman_algorithm_improvement

[4] smu-cs-3353/22s-pa03-girvan-newman-ian_maxwell: 22s-pa03-girvan-newman-ian_maxwell created by GitHub Classroom

Writer Bio

Maxwell Calvert is an SMU graduate with degrees in Computer Science and Data Science. He is passionate about programming, growing personal skills, music, and gaming.

--

--