User Segmentation Based on Node Roles in the Peer-to-Peer Payment Network

Use the Neo4j Graph Data Science library to identify node roles, then feed them as features into the user segmentation process

Peer-to-peer payment network. Image by the author.
User segmentation process. Image by the author.

Features used for user segmentation:
  • Average transaction amount
  • Years since first transaction
  • Weighted in-degree (total amount received)
  • Weighted out-degree (total amount sent)
  • Betweenness centrality
  • Closeness centrality
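
To make the transaction-based features in the list above concrete, here is a minimal plain-Python sketch that derives them from a list of payments. The tuple layout `(sender, receiver, amount, date)`, the function name `transaction_features`, and the choice to average over both sent and received payments are assumptions made for illustration; the centrality features need the whole graph and are not covered here.

```python
from collections import defaultdict
from datetime import date


def transaction_features(transactions, today):
    """Compute per-user features from (sender, receiver, amount, date) tuples."""
    sent = defaultdict(float)      # weighted out-degree: total amount sent
    received = defaultdict(float)  # weighted in-degree: total amount received
    counts = defaultdict(int)      # number of payments a user took part in
    totals = defaultdict(float)    # summed amount over those payments
    first_seen = {}                # date of each user's earliest payment

    for sender, receiver, amount, when in transactions:
        sent[sender] += amount
        received[receiver] += amount
        for user in (sender, receiver):
            counts[user] += 1
            totals[user] += amount
            if user not in first_seen or when < first_seen[user]:
                first_seen[user] = when

    return {
        user: {
            "weighted_out_degree": sent[user],
            "weighted_in_degree": received[user],
            # Assumption: the average covers sent and received payments alike.
            "avg_transaction_amount": totals[user] / counts[user],
            "years_since_first": (today - first_seen[user]).days / 365.25,
        }
        for user in counts
    }


# Hypothetical toy data, not taken from the article's dataset.
transactions = [
    ("alice", "bob", 10.0, date(2020, 1, 1)),
    ("alice", "carol", 30.0, date(2021, 1, 1)),
    ("bob", "alice", 5.0, date(2022, 1, 1)),
]
features = transaction_features(transactions, today=date(2023, 1, 1))
```

On a real dataset the same aggregations would normally run inside the database; the sketch only shows what each feature measures.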

Environment setup

  • graphdatascience: Neo4j Graph Data Science Python client
  • seaborn: Visualization library
  • scikit-learn: Machine learning library, used here for t-SNE dimensionality reduction
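
Before going further, it can help to check that the three libraries are importable. The following is a small sketch using only the standard library; the helper name `missing_packages` is made up for this example, and the mapping reflects the fact that scikit-learn is imported as `sklearn`.

```python
from importlib.util import find_spec

# pip package name -> importable module name
REQUIRED = {
    "graphdatascience": "graphdatascience",
    "seaborn": "seaborn",
    "scikit-learn": "sklearn",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None
    ]


# Prints an empty list when everything is installed.
print(missing_packages())
```

Any name that shows up in the printed list can be installed with pip under that same name.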

Setting up the connection to Neo4j
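
As a rough sketch of this step, the snippet below reads the connection details from environment variables and opens a session with the `GraphDataScience` class of the Python client. The variable names `NEO4J_URI`, `NEO4J_USER`, and `NEO4J_PASSWORD` and the default values are assumptions for illustration; adjust them to your own setup.

```python
import os


def connection_settings(env=os.environ):
    """Read connection details from the environment, with local defaults."""
    return {
        # Hypothetical variable names and defaults; change them as needed.
        "uri": env.get("NEO4J_URI", "bolt://localhost:7687"),
        "user": env.get("NEO4J_USER", "neo4j"),
        "password": env.get("NEO4J_PASSWORD", "password"),
    }


def connect(settings):
    """Open a GDS client session; needs a running Neo4j with GDS installed."""
    # Imported lazily so the settings helper works without the client installed.
    from graphdatascience import GraphDataScience

    return GraphDataScience(
        settings["uri"], auth=(settings["user"], settings["password"])
    )


# Usage (requires a live database):
# gds = connect(connection_settings())
# print(gds.version())
```

Keeping credentials out of the notebook itself makes it easier to share the analysis without leaking passwords.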

Feature engineering

Sample network where nodes are colored by betweenness centrality, from white (lower score) to red (higher score). Image by the author.
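
To show what the betweenness score in the figure measures, here is a plain-Python sketch of Brandes' algorithm for a small unweighted, directed graph. It is meant for intuition only and is not the Graph Data Science implementation; the adjacency-dictionary input format and the function name are assumptions of this example.

```python
from collections import deque


def betweenness_centrality(graph):
    """Brandes' algorithm for unweighted, directed graphs.

    graph maps each node to an iterable of its successors;
    every node must appear as a key, even if it has no successors.
    """
    scores = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from the source.
        stack = []
        predecessors = {node: [] for node in graph}
        path_counts = dict.fromkeys(graph, 0)
        path_counts[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            stack.append(node)
            for neighbor in graph[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
                if distance[neighbor] == distance[node] + 1:
                    path_counts[neighbor] += path_counts[node]
                    predecessors[neighbor].append(node)

        # Walk back from the farthest nodes and accumulate dependencies.
        dependency = dict.fromkeys(graph, 0.0)
        while stack:
            node = stack.pop()
            for pred in predecessors[node]:
                share = path_counts[pred] / path_counts[node]
                dependency[pred] += share * (1 + dependency[node])
            if node != source:
                scores[node] += dependency[node]
    return scores


# Hypothetical toy network: every payment between leaves passes through "hub".
star = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}
star_scores = betweenness_centrality(star)
```

In the toy star network, the hub sits on every shortest path between the other users, so it receives the whole score while the leaves receive none. Because each undirected link is entered in both directions, every pair of leaves is counted twice.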
Sample network where nodes are colored by closeness centrality, from white (lower score) to red (higher score). Image by the author.
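
Closeness centrality can be illustrated in a similar way. The sketch below uses one common definition, the number of reachable nodes divided by the sum of shortest-path distances to them, on an unweighted graph given as an adjacency dictionary. The exact variant and direction used by the Graph Data Science library may differ, so treat this as an assumption-laden illustration.

```python
from collections import deque


def closeness_centrality(graph):
    """Closeness as (reachable nodes) / (sum of distances to them).

    graph maps each node to an iterable of neighbors; distances follow
    the direction of the listed links. Isolated nodes get a score of 0.
    """
    scores = {}
    for source in graph:
        # Plain breadth-first search to get hop distances from the source.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        total = sum(distance.values())
        reachable = len(distance) - 1  # exclude the source itself
        scores[source] = reachable / total if total > 0 else 0.0
    return scores


# Hypothetical toy network: an undirected path a - b - c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
path_scores = closeness_centrality(path)
```

The middle node of the path is one hop away from everyone and gets the highest score, which matches the intuition that closeness rewards nodes that can reach the rest of the network quickly.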

Feature exploration

Structure of the features_df dataframe. Image by the author.
Distributions of the features. Image by the author.
Structure of the pivot_features_df dataframe. Image by the author.
Results of the describe method. Image by the author.
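
For readers unfamiliar with the summary that the describe method produces, here is a small standard-library sketch that computes the same statistics for a single feature. It assumes the pandas defaults (sample standard deviation and linearly interpolated quartiles); the function name `describe` is reused only for readability.

```python
import statistics


def describe(values):
    """Summary statistics for one numeric feature (needs at least 2 values)."""
    ordered = sorted(values)
    # method="inclusive" interpolates linearly between the observed values.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "count": len(ordered),
        "mean": statistics.fmean(ordered),
        "std": statistics.stdev(ordered),  # sample standard deviation
        "min": ordered[0],
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": ordered[-1],
    }


# Hypothetical values for a single feature.
summary = describe([1, 2, 3, 4, 5])
```

A large gap between the 75% value and the maximum is a quick hint that a feature is heavily skewed, which is common for amounts and degree-like features in payment networks.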
Correlation matrix. Image by the author.
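
The numbers behind a correlation matrix like the one pictured can be reproduced with a few lines of plain Python. This sketch assumes the Pearson correlation coefficient and uses made-up feature names and values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equally long numeric sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of feature name -> values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical feature columns, one value per user.
columns = {
    "sent": [1, 2, 3, 4],
    "received": [2, 4, 6, 8],
    "years_active": [4, 3, 2, 1],
}
matrix = correlation_matrix(columns)
```

Features that are almost perfectly correlated carry nearly the same information, so it is worth considering whether to keep both before clustering.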

K-means clustering
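
To illustrate the idea behind K-means, here is a compact plain-Python sketch of Lloyd's algorithm. It is not the implementation used in the analysis: for reproducibility it swaps the usual random initialization for a deterministic farthest-point initialization, and the function name and toy data are assumptions of this example.

```python
import math


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm on a list of equal-length numeric tuples.

    Returns (labels, centroids). Initialization is deterministic:
    start from the first point, then repeatedly add the point that is
    farthest from the centroids chosen so far.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids)
        )
        centroids.append(farthest)

    labels = None
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(
                    sum(dim) / len(members) for dim in zip(*members)
                )
    return labels, centroids


# Hypothetical 2D feature vectors forming two obvious groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
```

Because K-means relies on Euclidean distance, features on very different scales usually need to be scaled first, otherwise the largest-valued feature dominates the cluster assignment.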

Inspect cluster results

Cluster size results. Image by the author.
Distribution of weighted out-degree per cluster. Image by the author.
Feature statistics of power user community. Image by the author.
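
Cluster sizes and per-cluster feature statistics like those shown above boil down to a group-by over the cluster labels. A minimal sketch, assuming each user is a dictionary of numeric features and using hypothetical feature names:

```python
from collections import defaultdict
from statistics import fmean


def cluster_profiles(rows, labels):
    """Size and mean feature values per cluster.

    rows is a list of dicts (feature name -> value), labels holds the
    cluster id of each row. Assumes no feature is itself named "size".
    """
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)

    return {
        label: {
            "size": len(members),
            **{name: fmean(m[name] for m in members) for name in members[0]},
        }
        for label, members in grouped.items()
    }


# Hypothetical users: two ordinary accounts and one heavy sender.
rows = [
    {"weighted_out_degree": 1.0, "weighted_in_degree": 2.0},
    {"weighted_out_degree": 3.0, "weighted_in_degree": 4.0},
    {"weighted_out_degree": 100.0, "weighted_in_degree": 0.0},
]
profiles = cluster_profiles(rows, labels=[0, 0, 1])
```

Comparing these per-cluster averages side by side is usually the fastest way to give each segment a meaningful name.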
Scatter plot visualization of identified clusters. Image by the author.

Conclusion

Tomaz Bratanic

Data explorer. Turn everything into a graph. Author of Graph Algorithms for Data Science, published by Manning.