A NetworkX-esque API for Neo4j Graph Algorithms

Mark Needham
Neo4j Developer Blog
Jun 8, 2018

A few years ago when I first started learning Python I came across the NetworkX library and always enjoyed using it to run graph algorithms against my toy datasets.

Nowadays Neo4j has its own Graph Algorithms library, but we have to call it via Cypher procedures, which isn’t quite as nice as calling Python functions.

Update: The O’Reilly book “Graph Algorithms on Apache Spark and Neo4j” is now available as a free ebook download from neo4j.com.

As a result, a couple of months ago I started writing a NetworkX-esque API that would provide a nice wrapper around Neo4j’s algorithms.

It’s still in the experimental stages so if you want to install it you’ll have to do so directly from GitHub. The following command will do the trick:

pip install git+https://github.com/neo4j-graph-analytics/networkx-neo4j.git#egg=networkx-neo4j

Once you’ve done that, you’ll need to create a driver that points to your Neo4j server.

Launch Graph of Thrones Sandbox

I’m going to show you how to connect to a Neo4j Sandbox instance preloaded with a Game of Thrones dataset that we can play with. If you want to follow along, go to neo4j.com/sandbox and launch the Graph Algorithms Sandbox.

Once that’s launched you should see something like this:

Click on the ‘Code’ and then ‘py’ tabs:

Here we have the details that we can use to create our driver. Copy the lines down to ‘driver =’ into your Python script or execute them in your Python terminal.

from neo4j.v1 import GraphDatabase, basic_auth

driver = GraphDatabase.driver(
    "bolt://54.174.242.100:36186",
    auth=basic_auth("neo4j", "invention-airship-gunnery"))

The values for your server will be different from mine, so make sure you update them appropriately.

Using networkx-neo4j

Now we’re ready to start using networkx-neo4j. Let’s first import the module:

import nxneo4j

Configuring our graph

Next we’re going to create a map explaining the node labels, relationship types, and properties used in the Graph of Thrones.

config = {
    "node_label": "Character",
    "relationship_type": None,
    "identifier_property": "name"
}
G = nxneo4j.Graph(driver, config)

We set:

  • node_label to Character so that we’ll only consider nodes with that label
  • relationship_type to None so that we’ll consider all relationship types in the graph
  • identifier_property to name, the node property that we’ll use to identify each node from the networkx-neo4j API
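
To make the role of these three settings more concrete, here is a rough sketch of how a wrapper could turn the config map into a Cypher procedure call. This is not taken from networkx-neo4j itself: build_pagerank_query is a hypothetical helper, the parameter names are made up for illustration, and the query assumes the Graph Algorithms library's algo.pageRank.stream procedure with illustrative iterations and dampingFactor values.

```python
def build_pagerank_query(config, iterations=20, damping_factor=0.85):
    """Hypothetical helper: return (cypher, params) for a streaming PageRank call.

    Assumption: the Graph Algorithms library exposes algo.pageRank.stream,
    taking a node label, a relationship type, and a config map.
    """
    cypher = (
        "CALL algo.pageRank.stream($label, $relationshipType, "
        "{iterations: $iterations, dampingFactor: $dampingFactor}) "
        "YIELD nodeId, score "
        "MATCH (n) WHERE id(n) = nodeId "
        "RETURN n[$identifierProperty] AS node, score"
    )
    params = {
        "label": config["node_label"],
        # None is sent as Cypher null, i.e. no restriction on relationship type
        "relationshipType": config["relationship_type"],
        # the property used to name each node in the results
        "identifierProperty": config["identifier_property"],
        "iterations": iterations,
        "dampingFactor": damping_factor,
    }
    return cypher, params


config = {
    "node_label": "Character",
    "relationship_type": None,
    "identifier_property": "name",
}
cypher, params = build_pagerank_query(config)
print(cypher)
print(params)
```

The point of the sketch is only that the label and relationship type narrow down which part of the graph the algorithm sees, while the identifier property decides how nodes are named in the results.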

PageRank

Now it’s time to start running some algorithms! We’ll start with the famous PageRank algorithm. Let’s find out who the most influential characters in Game of Thrones are:

sorted_pagerank = sorted(nxneo4j.centrality.pagerank(G).items(), key=lambda x: x[1], reverse=True)
for character, score in sorted_pagerank[:10]:
    print(character, score)

Tyrion-Lannister 11.6290205
Stannis-Baratheon 7.2328375
Tywin-Lannister 6.8489435
Varys 6.144811999999999
Theon-Greyjoy 4.654753499999998
Sansa-Stark 4.237233499999999
Walder-Frey 3.2422405
Robb-Stark 3.0707785
Samwell-Tarly 2.9794970000000007
Jon-Snow 2.920541

Hopefully there aren’t too many surprises there!
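
For readers who want a feel for what PageRank computes, here is a minimal pure-Python sketch on a made-up four-node graph. It is not how Neo4j implements the algorithm. It assumes the non-normalised formulation PR(n) = (1 - d) + d * sum(PR(m) / outdegree(m)), which is at least consistent with the scores above being well over 1, and the node names, damping factor, and iteration count are all illustrative.

```python
def pagerank(out_links, damping=0.85, iterations=20):
    """Non-normalised PageRank by repeated score propagation (illustrative sketch)."""
    # Collect every node, including ones that only appear as link targets.
    nodes = set(out_links)
    for targets in out_links.values():
        nodes.update(targets)

    scores = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the base score (1 - d).
        new_scores = {node: 1.0 - damping for node in nodes}
        for source, targets in out_links.items():
            if not targets:
                continue  # a node with no outgoing links passes nothing on
            share = damping * scores[source] / len(targets)
            for target in targets:
                new_scores[target] += share
        scores = new_scores
    return scores


# Toy graph (hypothetical): three nodes point at "hub", which links back to two of them.
toy_links = {
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
    "hub": ["a", "b"],
}
scores = pagerank(toy_links)
ranked = sorted(scores.items(), key=lambda x: x[1], reverse=True)
for node, score in ranked:
    print(node, score)
```

In this toy graph "hub" should come out on top because every other node links to it, while "c" stays at the base score because nothing links to it.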

Shortest Path

What if we want to find the shortest path between two characters?

nxneo4j.path_finding.shortest_path(G, "Tyrion-Lannister", "Hodor")

['Tyrion-Lannister', 'Robb-Stark', 'Hodor']

Notice that we refer to nodes by their name property. This is where the identifier_property that we defined in our config map is used.
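
As an illustration of the idea behind a shortest path query, the sketch below runs a breadth-first search over a small in-memory adjacency map. It assumes an unweighted, undirected graph, which may not match what the Neo4j procedure does internally. The two edges along the Tyrion-Lannister to Hodor path come from the result above, while the "detour" nodes are invented purely to give the search a longer alternative.

```python
from collections import deque


def shortest_path(adjacency, source, target):
    """Return the fewest-hops path from source to target, or None if unreachable."""
    if source == target:
        return [source]
    previous = {source: None}  # maps each visited node to the node we came from
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency.get(current, ()):
            if neighbour in previous:
                continue  # already reached by a path that is at least as short
            previous[neighbour] = current
            if neighbour == target:
                # Walk back through the predecessors to rebuild the path.
                path = [target]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Toy graph: the short route from the article plus a hypothetical longer detour.
toy_graph = {
    "Tyrion-Lannister": ["Robb-Stark", "detour-1"],
    "Robb-Stark": ["Tyrion-Lannister", "Hodor"],
    "detour-1": ["Tyrion-Lannister", "detour-2"],
    "detour-2": ["detour-1", "Hodor"],
    "Hodor": ["Robb-Stark", "detour-2"],
}
path = shortest_path(toy_graph, "Tyrion-Lannister", "Hodor")
print(path)
```

Breadth-first search visits nodes in order of hop count, so the first time it reaches the target it has found a path with the fewest hops.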

Finding Communities

We can also partition the characters into communities using the label propagation algorithm:

communities = nxneo4j.community.label_propagation_communities(G)
sorted_communities = sorted(communities, key=len, reverse=True)
for community in sorted_communities[:10]:
    print(list(community)[:10])

['Josmyn-Peckledon', 'Belwas', 'Rafford', 'Polliver', 'Petyr-Frey', 'Tristifer-IV-Mudd', 'Jeyne-Heddle', 'Urswyck', 'Falyse-Stokeworth', 'Hoster-Blackwood']
['Trystane-Martell', 'Blue-Bard', 'Matthos-Seaworth', 'Marya-Seaworth', 'Mors-Umber', 'Jaehaerys-I-Targaryen', 'Myrcella-Baratheon', 'Justin-Massey', 'Denys-Mallister', 'Clayton-Suggs']
['Oberyn-Martell', 'Nurse', 'Tommen-Baratheon', 'Tanda-Stokeworth', 'Garlan-Tyrell', 'Morgo', 'Qavo-Nogarys', 'Moon-Boy', 'Leonette-Fossoway', 'Allar-Deem']
['Owen', 'Jon-Snow', 'Gerrick-Kingsblood', 'Lanna-(Happy-Port)', 'Maekar-I-Targaryen', 'Gorne', 'Arron', 'Arson', 'Satin', 'Rast']
['Asha-Greyjoy', 'Palla', 'Squirrel', 'Tristifer-Botley', 'Yellow-Dick', 'Lorren', 'Jason-Mallister', 'Benfred-Tallhart', 'Kyra', 'Gynir']
['Harras-Harlaw', 'Baelor-Blacktyde', 'Dunstan-Drumm', 'Ralf-Stonehouse', 'Gorold-Goodbrother', 'Rodrik-Harlaw', 'Talbert-Serry', 'Sigfryd-Harlaw', 'Rodrik-Sparr', 'Wulfe']
['Alliser-Thorne', 'Othell-Yarwyck', 'Jaremy-Rykker', 'Ragwyle', 'Craster', 'Clubfoot-Karl', 'Blane', 'Donal-Noye', 'Halder', 'Mag-Mar-Tun-Doh-Weg']
['Tomard', 'Horton-Redfort', 'Lothor-Brune', 'Myranda-Royce', 'Grisel', 'Merrett-Frey', 'Loras-Tyrell', 'Nestor-Royce', 'Anya-Waynwood', 'Marillion']
['Marq-Piper', 'Rickard-Karstark', 'Margaery-Tyrell', 'Senelle', 'Hallis-Mollen', 'Harren-Hoare', 'Nan', 'Colen-of-Greenpools', 'Desmond-Grell', 'Edmure-Tully']
['Koss', 'Woth', 'Meralyn', 'Mad-Huntsman', 'Dobber', 'Ravella-Swann', 'Ternesio-Terys', 'Yoren', 'Amabel', 'Waif']

I’ve included more information about which algorithms are available in the README of the networkx-neo4j GitHub repository. There’s also a Jupyter notebook which has examples of all the available algorithms against this dataset.

If you get the chance to play around with this and have any feedback, please let me know in the Issues section of the repository or email us at devrel@neo4j.com.
