Base Implementation of the Girvan-Newman Algorithm (GMZ Series Part 1)

Oliver Z
smucs
Published in
2 min readApr 30, 2022

Introduction to Graph Vocabulary:

A network can be represented by a graph with vertices (nodes) that are linked to each other by edges.

Two vertices are adjacent vertices if they are connected to each other.

An edge and its endpoint are said to be incident. When we talk about the degree of a vertex, we mean the number of edges incident to that vertex, so the degree of a vertex is the number of edges that have that vertex as its endpoint. When considering loops in degrees of vertices (an edge that connects a vertex to itself), you count the loop twice.

A graph is regular if all of the vertices have the same degree.

What is the Girvan-Newman Algorithm?

Named after Michelle Girvan and Mark Newman, it is a hierarchical method used to detect communities in networks[1].

The Girvan Newman Algorithm essentially takes these networks from above and breaks them down into communities. Why do we care about this? There are lots of things we can do by breaking down networks into communities, for example, by identifying critical groups across a network social networks can deliver on their promise of providing content based on a person’s interests [1].

According to Jana Coronicová Hurajová, a leading researcher in this field, the algorithm “is based on the successive deletion of edges which have the maximum edge betweenness centrality which is the quantity measuring the frequency of appearance of an edge on geodesic paths in a graph.”[2]

The Girvan Newman essentially finds edges with this “maximum edge betweenness” (which are basically the edges joining communities) and it deletes them. By eliminating these edges, we can see individual, centralized communities come to the light. It’s like if you are unknotting your earbuds and you look for the part of the wire that is most common throughout the entire knot. You can find the other individual knots and handle them accordingly by removing the larger knots first. This is basically what the Girvan Newman Algorithm does.

The Algorithm also makes use of modularity, which is a measure of how well-defined communities are in a graph. Generally, and depending on the formula used to calculate modularity, its value can range from [-1, 1]. A graph with well-defined communities will have a value between 0.3 and 0.7. Modularity is also used to determine the performance of the algorithm when compared to the “real” communities formed by the graph.

How did we implement the algorithm?

To implement the Girvan Newman Algorithm, we closely followed NetworkX’s documentation about the Girvan Newman algorithm. [3] The specific pseudocode is:

REPEAT

LET n BE number of edges in the graph

FOR i=0 to n-1

LET B[i] BE betweenness centrality of edge i

IF B[i] > max_B THEN

max_B = B[i]

max_B_edge = i

ENDIF

ENDFOR

remove edge i from graph

UNTIL number of edges in graph is 0

Note that for our implementation itself, we used the Boost Library[4].

--

--