Girvan Newman Algorithm: Simpler Modularity Calculation

Joshua
smucs
Published in
2 min readApr 30, 2022

Hello! My name is Joshua Ayodele, and I am a third-year Computer Science student attending Southern Methodist University. Some of my hobbies include playing Texas Hold’em Poker and Chess.

Part 1 — Introduction

I. Girvan Newman Algorithm and Community Detection

The rise of social networks has been one of the most significant phenomena in the past two decades. They have gained massive attention primarily due to the success of online media sites such as Facebook, Instagram, Twitter, and Snapchat. On these networking sites, intricate and complex relationships occur among many individual entities that share a common point or node. In Computer Science, community detection or graph partition allows us to reveal these hidden relations among nodes in the network. In simpler terms, it is a type of unsupervised learning task that determines community groups based on multiple variables such as shared interests, occupation, modules, and their hierarchical organization, using the information encoded in a graph topology. Finding communities from a social network is an extremely difficult task as many communities overlap with one another. Many algorithms have been developed in the past to complete such a problem. One of the most famous is called the Girvan Newman Algorithm. This algorithm was first introduced by Michelle Girvan and Mark Newman and would prove to be a sufficient solution.

II. How the Girvan Newman Algorithm Works

The idea behind the algorithm was to find which edges in a network occur most frequently between other pairs of nodes. One does this by finding the edge-betweenness. Girvan Newman can be broken up into four main parts:

  1. For every edge in a graph, calculate the edge-betweenness
  2. Remove the edge with the highest betweenness.
  3. Recalculate edge betweenness for remaining edges.
  4. Lastly, Repeat steps 2 – 4 until all edges are removed.

The following is pseudocode that demonstrates this logic:

In order to calculate the edge-betweenness, it is vital to find all the shortest paths in the graph. The algo starts with one vertex, calculates the weight of the edge for ways that go through that node, and finally repeats for every node in the graph and sums the weights for each edge. Although simple, the algo does have some limitations. For starters, it is not very efficient with networks that contain many nodes and edges. Communities in vast and complex networks are already difficult to detect, leading to Girvan Newman not being an ideal candidate for larger datasets. Part 2 of this topic is located at https://medium.com/@millerct/d0a237b4d278. Give it a read as well!

References

  1. https://networkx.guide/algorithms/community-detection/girvan-newman/
  2. https://www.sciencedirect.com/topics/computer-science/community-detection#:~:text=Community%20detection%2C%20also%20called%20graph,Lancichinetti%20and%20Fortunato%2C%202009).

--

--