How to Create Network Graphs in Python?

Saliha Demez
3 min read · Apr 2, 2023

Network graphs are useful when you want to see how the sales of your products or product groups depend on each other. For example, if customers who buy product A often also buy product B, the two products are dependent.

First, you need a data frame of orders, named df in the code below, that contains the columns “Customer Name”, “Product Name”, and “Price” (the price of the ordered product).
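If you do not have real order data at hand, a small stand-in is enough to follow along. The sketch below builds a hypothetical toy data frame (the customer names, products, and prices are invented for illustration) and checks that the three required columns are present.

```python
import pandas as pd

# Hypothetical toy orders; every value here is invented for illustration.
df = pd.DataFrame(
    {
        "Customer Name": ["Ann", "Ann", "Ann", "Bob", "Bob", "Cem", "Dua"],
        "Product Name": ["A", "A", "B", "A", "B", "C", "C"],
        "Price": [10.0, 5.0, 5.0, 20.0, 10.0, 8.0, 4.0],
    }
)

# Check that the columns used in the rest of the walkthrough exist.
required = {"Customer Name", "Product Name", "Price"}
missing = required - set(df.columns)
if missing:
    raise ValueError(f"Missing columns: {sorted(missing)}")

print(df.head())
```

With real data you would load df from your own source instead, for example a CSV export of your orders.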

# import necessary libraries
import itertools

import pandas as pd
import matplotlib.pyplot as plt
import networkx
import networkx as nx
import plotly.graph_objs as go
# group by customer name and product name, sum the prices,
# and pivot the products into columns (one row per customer)

cum_TRY = (
    df.groupby(["Customer Name", "Product Name"])["Price"]
    .sum()
    .unstack(fill_value=0)
)
cum_TRY.head()

The output is a table with one row per customer and one column per product. Each cell shows how much money that customer spent on that product in total, with 0 where the customer never bought the product.
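To make the shape of this table concrete, here is a minimal sketch on hypothetical toy orders (all values invented). It uses pandas' pivot_table as an alternative way to get the same customer-by-product layout; this is a variation for illustration, not the code used in the rest of this post.

```python
import pandas as pd

# Hypothetical toy orders; every value here is invented for illustration.
df = pd.DataFrame(
    {
        "Customer Name": ["Ann", "Ann", "Ann", "Bob", "Bob", "Cem", "Dua"],
        "Product Name": ["A", "A", "B", "A", "B", "C", "C"],
        "Price": [10.0, 5.0, 5.0, 20.0, 10.0, 8.0, 4.0],
    }
)

# One row per customer, one column per product, total spend in each cell.
spend = df.pivot_table(
    index="Customer Name",
    columns="Product Name",
    values="Price",
    aggfunc="sum",
    fill_value=0,
)
print(spend)
```

Ann's two orders of product A are summed into a single cell, and customers who never bought a product get 0 in that column.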

# get the correlation matrix of cum_TRY
corr_TRY = cum_TRY.corr()
corr_TRY.head()

# the vertices are the products (the columns of the correlation matrix)
vertices = corr_TRY.columns.values.tolist()

# one candidate edge per pair of products, weighted by their correlation
edges = [((u, v), cum_TRY[u].corr(cum_TRY[v])) for u, v in itertools.combinations(vertices, 2)]

# keep only the pairs whose correlation coefficient is at least 0.5;
# change this threshold (between 0 and 1) to show more or fewer edges
edges = [(u, v, {'weight': c}) for (u, v), c in edges if c >= 0.5]
edges

“edges” contains every pair of products whose correlation coefficient passed the threshold, stored as (product, product, {'weight': coefficient}).
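The effect of the threshold is easiest to see on a tiny example. The sketch below uses a hypothetical spend table (values invented) and a hypothetical helper, edges_above, that reads the coefficients straight from the correlation matrix instead of recomputing them per pair. It is a variation on the code above, not a replacement for it.

```python
import itertools

import pandas as pd

# Hypothetical spend per customer; A and B rise together, C moves the opposite way.
spend = pd.DataFrame(
    {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 3, 2, 1]},
    index=["Ann", "Bob", "Cem", "Dua"],
)
corr = spend.corr()


def edges_above(corr, threshold):
    """Return (u, v, {'weight': c}) for each product pair with c >= threshold."""
    return [
        (u, v, {"weight": float(corr.loc[u, v])})
        for u, v in itertools.combinations(corr.columns, 2)
        if corr.loc[u, v] >= threshold
    ]


edges = edges_above(corr, 0.5)
print(edges)
```

Only the A and B pair survives a threshold of 0.5 here, because product C is negatively correlated with both of the others. Lowering the threshold adds edges; raising it removes them.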

# let's start creating the network graph; you can copy and paste from here on
G = nx.Graph()
G.add_edges_from(edges)

# edge labels: correlation coefficients rounded to two decimals
labels = nx.get_edge_attributes(G, 'weight')
final = dict()
for key in labels:
    final[key] = round(labels[key], 2)

# edge widths: correlation coefficients scaled by 6
widths = {key: value * 6 for key, value in nx.get_edge_attributes(G, 'weight').items()}
print(widths)
nodelist = G.nodes()

plt.figure(figsize=(25,25))

#pos = nx.shell_layout(G)
# compute the node positions and store them on the nodes
pos = nx.spring_layout(G, k=0.5, iterations=100)
for n, p in pos.items():
    G.nodes[n]['pos'] = p
# one trace that holds all edges as line segments
edge_trace = go.Scatter(
    x=[],
    y=[],
    line=dict(width=1, color='#888'),
    hoverinfo='none',
    mode='lines')
for edge in G.edges():
    x0, y0 = G.nodes[edge[0]]['pos']
    x1, y1 = G.nodes[edge[1]]['pos']
    # the trailing None breaks the line so consecutive edges are not joined
    edge_trace['x'] += tuple([x0, x1, None])
    edge_trace['y'] += tuple([y0, y1, None])
# one trace that holds all nodes as markers
node_trace = go.Scatter(
    x=[],
    y=[],
    text=[],
    mode='markers+text',
    hoverinfo='text',
    marker=dict(
        showscale=True,
        colorscale='YlOrRd',
        reversescale=False,
        color=[],
        size=37,
        colorbar=dict(
            thickness=7,
            # title.side is used because newer Plotly versions no longer accept titleside
            title=dict(text='Node Connections', side='right'),
            xanchor='left'
        ),
        line=dict(width=0)))
for node in G.nodes():
    x, y = G.nodes[node]['pos']
    node_trace['x'] += tuple([x])
    node_trace['y'] += tuple([y])


# color each node by its number of connections and label it with its name
node_adjacencies = []
node_text = []
for node_name, neighbors in G.adjacency():
    node_adjacencies.append(len(neighbors))
    node_text.append(str(node_name))
node_trace.marker.color = node_adjacencies
node_trace.text = node_text
title = "Network Graph Demonstration"
fig = go.Figure(data=[edge_trace, node_trace],
                layout=go.Layout(
                    # title.font is used because newer Plotly versions no longer accept titlefont
                    title=dict(text=title, font=dict(size=16)),
                    showlegend=False,
                    hovermode='closest',
                    margin=dict(b=21, l=5, r=5, t=40),
                    annotations=[dict(
                        text="Text Here",
                        showarrow=False,
                        xref="paper", yref="paper")],
                    xaxis=dict(showgrid=False, zeroline=False,
                               showticklabels=False, mirror=True),
                    yaxis=dict(showgrid=False, zeroline=False,
                               showticklabels=False, mirror=True)))

fig.update_layout(
    autosize=False,
    width=1000,
    height=800)

fig.show()

And here we have our network graph, which shows dependent products by drawing lines between them. As we can see, the more connections a product has, the darker its node color (you can reverse this) and the more central its position.

Using this network graph, we can extract important insights and design actions and campaigns that add high value to the company. Why not start by focusing marketing on the product with the most node connections? If sales of that product increase, sales of the many products highly correlated with it can be expected to increase as well.
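Finding the product with the most connections does not require reading it off the plot. A minimal sketch, assuming a hypothetical graph with four invented products, ranks the nodes by their degree (the number of edges attached to each node):

```python
import networkx as nx

# Hypothetical product graph; node names and weights are invented for illustration.
G = nx.Graph()
G.add_weighted_edges_from([
    ("A", "B", 0.90),
    ("A", "C", 0.70),
    ("A", "D", 0.60),
    ("B", "C", 0.55),
])

# G.degree() yields (node, number_of_connections) pairs.
ranked = sorted(G.degree(), key=lambda pair: pair[1], reverse=True)
for product, connections in ranked:
    print(product, connections)

most_connected = ranked[0][0]
```

Applied to the graph G built earlier in this post, the same sorted(G.degree(), ...) call would list your real products from most to least connected.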

References

  1. https://plotly.com/python/network-graphs/
  2. https://www.toptal.com/data-science/graph-data-science-python-networkx
