Graph Neural Networks: the message passing algorithm

Rubens Zimbres
Google Developer Experts
6 min readJun 5, 2023


Graph Neural Networks are a type of neural networks used in data presented as graphs. Graphs are spacial structures made of vertices (nodes) and edges. There are many structures that are represented as graphs: structures in the three-dimensional space (x,y,z) as molecules of substances (caffeine, for instance), proteins (made of amino acids), DNA, computer networks and also structures like social networks. Here are some examples, made with Wolfram Mathematica:

Molecular structure of Caffeine
XYZ coordinates of atoms in a protein
Social Network
Social Network Communities

Basically, each node represents a person, an atom, a financial transaction and these nodes are connected through edges, that establish a relationship between these entities. Among people, this could be the strengthness of the tie, the social distance, the level of intimacy. Among atoms in a molecular structure, these edges may be covalent bonds. In financial transactions, these edges may define the distance of someone from a fraudulent transaction.

Considering the social network example (picture above), we have clusters of people densely connected that may be connected to an “influencer”, and also weak links (weak bonds), that connect different clusters of people, allowing for diversity of information. When we talk to each other, personally or via social media, our message travels through this social network and may be subject to deformations in its content, misinterpretations along the way. The same happens with atoms and their electromagnetic properties: the closer other atoms are, the more they will be influenced by these electromagnetic properties. Thus, after some distance, this influence fades away. Also, if this influence is allowed to permeate all the network structure, the whole network may converge to a single state, due to saturation.

But how can we represent these complex relationships in a mathematical way we are able to model these interactions? Initially, we should define the connections between each one of the actors. This is done through an Adjacency Matrix, where the same individuals are placed in the lines and columns of this matrix:

Network structure according to the Adjacency Matrix

Every number 1 in this adjacency matrix means a connection. We have a 5 x 5 matrix, where nodes 1 to 5 are placed in the lines and also in the columns. So, if you take individual 2, he is only connected to individual 5. Individual 1 is connect to individuals 3 and 5, and so on. To plot this network, I used the following code:

import numpy as np
import networkx as nx

Adj = np.array(
[[0, 0, 1, 0, 1],
[0, 0, 0, 0, 1],
[0, 0, 0, 1, 1],
[0, 0, 1, 0, 1],
[1, 1, 0, 0, 0]]
g = nx.from_numpy_array(Adj)
pos = nx.circular_layout(g)

fig, ax = plt.subplots(figsize=(8,8))
nx.draw(g, pos, with_labels=True,
labels={i: i+1 for i in range(g.number_of_nodes())}, node_color='#f78c31',
ax=ax, edge_color='gray', node_size=1000, font_size=20, font_family='DejaVu Sans')

Now we will multiply the Adjacency Matrix by a vector composed by the number of lines. So, we will have a 5 x 5 Matrix multiplied by a 5 x 1 vector. This means that n x p times p x m will give us a n x m vector. In this case, the 5 x 1 vector:

H = Adj @ np.array([1,2,3,4,5]).reshape(-1,1)

Note that in order to do this multiplication, you will transpose the p x m vector to [1,2,3,4,5] and multiply element-wise to each element of that line of the adjacency matrix and sum. The result is the sum of connected neighborhoods. Hold the H for a while.

Now we will find the Diagonal Degree Matrix, that is composed by the neighborhood sizes in the diagonal, the sum of each one of the columns in the matrix:

D = np.zeros(Adj.shape)
np.fill_diagonal(D, Adj.sum(axis=0))
Diagonal Degree Matrix

Now we will assign a weight to each edge. We do this by dividing the Identity Matrix by the Diagonal Degree Matrix.

D_inv = np.linalg.inv(D)
Inverted Degree Matrix

By multiplying inverted D by the Adjacency Matrix we will obtain an averaged Adjacency Matrix:

Averaged Adjacency Matrix

The concept of averaging is very important when we are dealing with a node that has not a single value, but is a collection of feature vectors, like a Graph Convolutional Network.

However, what we really want to operationalize is the Message Passing algorithm, represented by the following:

A hat applied repeatedly applied will allow the information to flow in the Graph Network. Given that A tilde equals Adjacency Matrix plus Identity Matrix, we have:

g = nx.from_numpy_array(Adj)
Adj_tilde = Adj + np.eye(g.number_of_nodes())

Now we need to create the square root of D tilde. We create a matrix with zeros, and add the values of the sum of lines of the Adjacency Matrix tilde in the diagonal.

D_tilde = np.zeros_like(A_tilde)
np.fill_diagonal(D_tilde, A_tilde.sum(axis=1).flatten())

Then we calculate the inverse square root of D tilde:

D_tilde_invroot = np.linalg.inv(sqrtm(D_tilde))

Now that we already have A tilde, and also the inverse square root of D tilde, we can calculate A hat:

A hat
A_hat = D_tilde_invroot @ A_tilde @ D_tilde_invroot

Note that @ in numpy means the same as matmul.

A hat result

Now we will implement the Message Passing algorithm. Let’s start from the message vector we have (H) and check how it flows in the Graph Network. We know that:

H = Adj @ np.array([1,2,3,4,5]).reshape(-1,1)

Now we make the information flow in the Graph Network:

epochs = 9
information = [H.flatten()]
for i in range(epochs):
H = A_hat @ H

Let’s see the flow of information in this heatmap. Note how each individual (x axis) obtains or loses information along time (y axis).

import matplotlib.pyplot as plt

plt.imshow(information, cmap='Reds', interpolation='nearest')

Let’s plot it:

fig, ax = plt.subplots(figsize=(12, 12))
from time import time

for i in range(0,len(information)):
colors = information[i]

g, pos, with_labels=True,
ax=ax, edge_color='gray', node_size=1500, font_size=30, font_family='serif',
vmin= np.array(information).min(), vmax=np.array(information).max())
plt.savefig('/home/user/Downloads/message/foo{}.png'.format(time()), bbox_inches='tight', transparent=True)

import glob
from PIL import Image

fp_in = "/home/user/Downloads/message/foo*.png"
fp_out = "/home/user/Downloads/message100_try.gif"

img, *imgs = [ for f in sorted(glob.glob(fp_in))], format='GIF', append_images=imgs,
save_all=True, duration=1200, loop=0)

Visually, the flow of information in the Graph Network will look like this, on each epoch:

In the plot below we can see how much information each one of the nodes of the network has, along time. Note the convergence of nodes 1, 3, 4 and 5:

For a practical application of the Message Passing algorithm in an Agent-Based Model, please refer to my model at COMSES, made with Python and NetLogo:

Cellular automata model of social networks (Version 1.0.0). CoMSES Computational Model Library. Available at:



Rubens Zimbres
Google Developer Experts

I’m a Senior Data Scientist and Google Developer Expert in ML and GCP. I love studying NLP algos and Cloud Infra. CompTIA Security +. PhD.