Attempt to Update Only Affected Communities In Girvan-Newman Algorithm Using Boost Graph Library

Miffy Liu
smucs
Published in
3 min readApr 30, 2022

Miffy Liu is a second-year Computer Science and Creative Computing student at Southern Methodist University.

Her desired track is Cybersecurity. Her future plan is to become a game designer.

Introduction

If we visualize our social network like a graph, where each node on the graph represents a person, and each edge on the graph represents a form of connection between two people (be it friendship or any kind of relationship)
then we can see, clearly, that some nodes are more closely related to each other than to other nodes.

We call those clustering nodes communities. So, how can we identify individual community in this massive interconnected network?

Community detection is one of the most practiced and revised algorithms for detecting patterns in graph . By rewriting properties of edges in a complex graph, we can analyze the .

For project 3, we were asked to find the possible communities in a network by implementing the Girvan-Newman algorithm in an unweighted, undirected graph.

According to Girvan and Newman, the basic steps of the GN Algorithm are:

1. Calculate the betweenness for all edges in the network.

2. Remove the edge with the highest betweenness.

3. Recalculate betweennesses for all edges affected by the removal.

4. Repeat from step 2 until no edges remain.

It is said that the worst case computation time is O(m^2 * n). However, if we recalculate only the edges, or rather, only the communities that are affected by the removal, the computation time could potentially increase.

Let’s consider this case, where some edges had already been removed, and the graph has been split into multiple communities:

img ref: https://community.neo4j.com/t/count-the-number-of-isolated-clusters-in-subgraph/33240

The algorithm will most likely choose to break down the biggest, most clustered community, BL, on the bottom left, and then recalculate the edge betweenness.

But we want it to calculate only the related edges — which are all edges that exist within BL. We don’t want the algorithm to calculate other edges as they will remain unchanged.

Here comes a question: on the first run, the algorithm was intended to go through all edges in the graph.

1. Calculate the betweenness for all edges in the network.

So how can we modify the algorithm that we only calculate the edges in the same community?

Implementation

  1. Pass the paired vertices (source,target) of the removed edge E as parameters into our breadth-first-search function
  2. Update the only the communities that have the passed vertices. If the removed edge causing one community to break down, calculate them separately.

Pseudocode:

void BFSCont(const Graph& g,Vertex source, Vertex target){
int groupIndexS = component[source];
updateGroup(g, groupIndexS);
int groupIndexT = component[target];
updateGroup(g, groupIndexT);
return findMaxCentralityEdge(g);}void updateGroup(Graph g,Vertex groupIndex){
Community = findCommunity(groupIndex);
for(Edge i : Community){
updateBetweenness(i);
}
}

Performance

I tested the original(notByGroup) and the improved(byGroup) algorithm on different vertex and edge sizes.

The Number of Vertices vs Computation Time (ms)

The Number of Vertices vs Computation Time(ms)

The Number of Edges vs Computation Time (ms)

The Number of Edges vs Computation Time(ms)

Conclusion

It is clear from the output graph that the computation time is less for the improved algorithm, albeit not by much. I expect the computation time will be significantly better with more edges and vertices, as it will cost more time to iterating through every affected and unaffected community every time an edge is removed.

References

Girvan, M., & Newman, M. E. (2002). Community structure in social and biological networks. Proceedings of the national academy of sciences, 99(12), 7821–7826.

Newman, M. E. (2006). Modularity and community structure in networks. Proceedings of the national academy of sciences, 103(23), 8577–8582.

--

--

Miffy Liu
smucs
0 Followers
Writer for

github username: 19miffyliu. Second-year CS student in SMU.