Community detection in a graph using Louvain algorithm with example

An important community detection algorithm for graphs & networks

Mehul Gupta
Data Science in your pocket
5 min readJun 18, 2022

--

While going through my Graph Analytics blog series, I read about the Louvain algorithm & felt it deserves a separate discussion (due to its complexity & scarce resources on the internet) & hence this post discusses how Modularity for a graph is calculated & how the Louvain algorithm works.

My debut book “LangChain in your Pocket” is out now !!

Find the vlog version of this post below

The most popular community detection algorithm in the space, the Louvain algorithm is based on the idea of graph (component) density i.e. something related to edges/connections frequency within a component compared to other components in the same graph. A node is added to a community if it improves its density else not. We call this term closely related to density as Modularity

How is this Modularity calculated?

It has a very complex formula

Q = 1/(2m) * Σⱼ ([ Aᵢⱼ — kᵢkⱼ / (2m)] * δ(cᵢ, cⱼ))

The above formula requires some explanation

Q= Modularity

m = Weights of all edges in the graph (if unweighted graph, the count of edges in the entire graph)

The summation has 3 terms

Aᵢⱼ = weight of the edge between node ‘i’ & node ‘j’ if weighted graph else 1 if edge between node i & j exists else 0 (in case of unweighted graph)

Kᵢ is the degree (total connections) of some node i.

δ(c, c) = 1 if both node i & node j are in same community else 0

So, the above summation term is 0 if i & j are from different communities.

I guess an example would do be better !!

Assuming the above graph, let’s understand the modularity formula using it

Modularity =

1/2m x [(A₁₁ — k₁k₁/2m) x δ(c₁,c₁)+(A₁₂ — k₁k₂/2m) x δ(c₁,c₂) + (A₁₃ — k₁k₃/2m) x δ(c₁,c₃) + (A₁₄ — k₁k₄/2m) x δ(c₁,c₄) + (A₁₅ — k₁k₅/2m) x δ(c₁,c₅) +

(A₂₁ — k₂k₁/2m) x δ(c₂,c₁) + (A₂₂ — k₂k₂/2m) x δ(c₂,c₂)+ ……

(A₅₄ — k₅k₄/2m) x δ(c₅,c₄)+ (A₅₅ — k₅k₅/2m) x δ(c₅,c₅) ]

Putting values in it

= 1/2*5 x [ (0–2*2/2*5) x 1 + (1–2*2/2*5)x 0 + (1–2*2/2*5)x0 + (0–2*2/2*5) x 0 + (0–2*2/2*5) x 0+

((1–2*2/2*5)x 0) + (0–2*2/2*5) x 1 ……

((1–2*2/2*5)x 0) + (0–2*2/2*5) x 1]

We will leave the calculation for you to calculate !!

But let’s discuss how the values were placed in the equation

  • m = 5 (total edges in graph=5, unweighted hence no weights considered)
  • For the term (A₁₂ — k₁k₂/2m) x δ(c₁,c₂), we have A₁₂=1 as there is an edge between 1 & 2. As the degree for every node in the graph is the same (2 edges for each node), the k₁k₂/2m term remains constant for every term of the summation. δ(c₁,c₂) is 0 as initially, the 2 nodes don’t belong to the same community, hence the entire summation term becomes 0!
  • For the term (A₁₁ — k₁k₁/2m) x δ(c₁,c₁), A₁₁=0 as there is no self-edge for node 1 to itself. In fact, all the Aᵢᵢ = 0 as there exists no self-edge in this graph. δ(c₁,c1)=1 as every node will belong to its own community.

A couple of points to note

  • δ(c₁,c2) will be 1 for every self node pair calculation (cases like i — i,

j — j) while Aᵢⱼ will be 0 in such terms if no self-edge exists

  • The summation term includes all possible node pairs, including self nodes (i — i). Hence if the graph has 5 nodes, we will have 25 summation terms

Are we done?

Not really, as this was just about how modularity is calculated, we haven’t started with Louvain yet !!

So Louvain algorithm involves the below steps

  • Initially, assign each node a unique community such that total nodes = total unique communities
  • Now, in an iterative manner, we will try assigning every node ‘i’ to its neighboring node ‘j’ community & recalculate the Modularity of the graph. If modularity improves as compared to when node ‘i’ wasn’t in the ‘j’ node’s community, we will assign ‘i’ to the community the same as ‘j’ else not. Remember, j is i’s neighbor
  • This will be repeated for multiple iterations until unless no further gain is observed in Modularity by moving any node to its neighbor’s community & we have reached a Maxima for modularity (maybe local maxima but fine)
  • Once done, what we do is create a new graph such that
  1. All nodes in their respective communities are clubbed to form a single node
  2. All edges connecting nodes of community 1 to nodes of community 2 in the previous iteration are clubbed to form a single edge between Community 1 & Community 2 & the edge weight for this new edge is equal:
  • Summation of weights of all edges between nodes of community 1 to nodes of community 2 in case of a directed graph
  • Edge count in case of an unweighted graph

3. Self edges (like i — i) becomes self edges for their respective communities with weight summation of all such self edges for the nodes of some community ‘1’ as the weight of self-edge for the particular community or count of all self edges in case of an unweighted graph.

We, again go back to the previous iteration where we calculate Modularity for each node of this new graph by moving nodes to the same community as the neighboring node’s community. If Modularity increases, this node is assigned to the neighbor’s community as earlier else not & this process of

Calculate Modularity & assign community to nodes

New graph formation depending on communities formed

Continues until no further change in communities is observed. The final communities are the communities of interest to us !!

The below example from the actual research paper may help

https://arxiv.org/abs/0803.0476

Let’s consider the above graph

  • We started off with separate communities for each node (image 1, left corner)
  • After iterating over nodes & calculating modularity by changing communities for nodes, we finally got 4 communities (Red, Blue, Sky Blue & Green), Image 2
  • In the new graph formed, we now have 4 nodes (representing the 4 communities found) with edge weights as aggregation discussed above.
  • Again the entire cycle of finding communities using Modularity starts in this new graph now where we are able to club these 4 communities into 2 (Last image, Right)
  • As no further communities are found, therefore we end up with 2 communities.

That’s a wrap for the day !!

--

--