Visualize Semi-Structured Data Using Neo4j

Visualizing the Marvel and DC Universe Characters in a Graph Database

Graph databases are a powerful way to visualize connections between entities that do not fit a rigid, structured format. This tutorial is a step-by-step approach to generating simple graphs using Neo4j.

Goal

We will see how to use Neo4j to set up a quick query engine for generating graphs from a JSON document and also capture a structured format from the generated graph.

Data

The data is going to be a custom JSON file that contains information about the characters (heroes, villains, anti-heroes) from the DC and Marvel Universe.

Prerequisite

You should have the latest version of Neo4j Desktop installed on your machine. If not, you can install it from here (the installer depends on the OS you’re using).

I will be using Neo4j Desktop v1.3.4.

STEP 0: Upon successful installation, you will see the following window when you start Neo4j Desktop.

Step 0: Setting the Database Connection and Plugin Setup

STEP 1: Select the +Add Database option. Create a new local database. Give your Graph database a name, and a password that you will use for connecting with the database later. Click Create.

STEP 1: Creating your first Graph Database :)

STEP 2: Hit the ▶️ Start button. This starts the database with an empty graph (zero nodes and zero relationships), which we will build upon soon.

STEP 3: You will see three dots on the Database tile. Click the Manage gear icon ⚙️ to open the monitoring page of the graph database.

STEP 3: Click on the Manage Gear Icon

STEP 4: Navigate to the Plugins tab, and hit the Install button in the APOC (Awesome Procedures On Cypher) library section. This library lets us import different file formats into Neo4j and turn them into a graph representation.

STEP 5: Navigate to the Settings tab. You will find the database's configuration file, to which you need to add two permission settings:

apoc.import.file.enabled=true
apoc.export.file.enabled=true
Setup Snapshot in the Neo4j Browser
STEP 5: Activating the Import/Export capability of APOC package

Click the Apply button. This completes the setup; now it’s time to code. Before that, you deserve a cookie 🍪!

Good job! Let’s begin :)

We will use a JSON file to generate the nodes and relationships we need for the superheroes and villains from Marvel and DC. You can copy the below content and paste it in your code editor:

Data for the Graph Database

To place the file, go back to the Manage page of the database (the one used in Steps 3 to 5) and select the Open folder - Import option. This opens the import directory, where you can place the JSON file for this tutorial. Once the configuration is complete, start the database and open the Neo4j Browser.

A quick note about the data: it contains the squads and anti-squads of heroes, villains, and anti-heroes from the Marvel and DC Universe, with suitable properties assigned to each character. As a starting point, we could map which characters appeared in the same movie with a query in Neo4j.

Let’s Code 👨‍💻

Graphs, in general, have two main building blocks: nodes and relationships.

A node is the basic entity, like an object, with a label and some properties (metadata) attached to it. Nodes are connected to each other by relationships, which also carry a label and properties (metadata). Together, the nodes and relationships make up the graph database.

A relationship has a direction, indicated by an arrow. For example, we can denote that Iron Man hated Thanos using the following code:

Simple CREATE query in Cypher
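
As a minimal sketch of such a query (the original snippet may differ), the statement below creates both characters and a directed relationship between them. The HERO and VILLAIN labels are the ones used later in this tutorial; the HATES relationship type and the name property are illustrative assumptions.

```cypher
// Create two nodes and a directed relationship in a single statement.
// HATES and the name property are illustrative choices, not fixed names.
CREATE (ironman:HERO {name: "Iron Man"})-[:HATES]->(thanos:VILLAIN {name: "Thanos"})
RETURN ironman, thanos
```

The arrow points from Iron Man to Thanos, so the relationship reads "Iron Man hates Thanos" and not the other way around.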

To run the code, you can use Ctrl+Enter or Cmd+Enter.

The general syntax for creating a node-to-node relation is as follows:

(node1:Label{key:value,..})-[relationship:Label]->(node2:Label{key:value,..})

To visualize the whole graph, run the following (remember this command):

MATCH (n) RETURN (n)
Result of a simple CREATE query

Following this, we can add more nodes and relationships to the existing graph to make an exciting graph from it. Let’s use the JSON data for this.

The file data.json can be read using the following Cypher query. Cypher is the standard query language in Neo4j.
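
A minimal sketch of the load step, assuming the file is saved as data.json in the database's import directory and that the APOC settings from Step 5 are in place. It only returns the parsed document so you can inspect its shape before building nodes from it.

```cypher
// Load data.json from the import directory and inspect the raw result.
// With apoc.import.file.enabled=true, "file:///data.json" is resolved
// relative to the import directory.
CALL apoc.load.json("file:///data.json")
YIELD value
RETURN value
```

If this returns an error about file access, the two apoc settings from Step 5 are the first thing to re-check.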

Once we are able to get the JSON file into the graph database, we can convert it to a tabular structure and create relationships between the various levels of the structure in the data.

Query to create the Nodes

Lines 1 and 2 load the JSON data into Neo4j and store it in the variable value. This is then unwound using the UNWIND clause in Line 3, so the entire Marvel child-dictionary ends up in the variable v. In Line 4, we store the Heroes&Villains and Anti-Heroes information in two variables, hv and ahv, respectively.

Lines 5 through 7 parse each list of characters and create three classes of nodes {HERO, VILLAIN, ANTIHERO}, each carrying properties such as name, superpower, otherside, quotes, and the universe.

The FOREACH clause iterates over each item in a list of characters and runs an update clause for it. The | character separates the iteration variable and its list from the clauses that are executed for each element; here, each item is handed to a MERGE clause. MERGE behaves like a combination of MATCH and CREATE: it first looks for a node that matches the whole pattern (the label and every property given in the pattern) and creates the node only if no match exists. This avoids duplicate nodes when the query is run more than once.
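
The steps above can be sketched as follows. This is not the original query, so the line numbers mentioned in the text will not line up with it. The JSON key names (Marvel, Heroes&Villains, Heroes, Villains, Anti-Heroes) and the per-character keys are assumptions about the layout of data.json; adjust them to match your file.

```cypher
// Assumed (hypothetical) layout of data.json:
// { "Marvel": { "Heroes&Villains": { "Heroes": [...], "Villains": [...] },
//               "Anti-Heroes": [...] } }
CALL apoc.load.json("file:///data.json")
YIELD value
UNWIND value.Marvel AS v
WITH v["Heroes&Villains"] AS hv, v["Anti-Heroes"] AS ahv
// MERGE on the name only and SET the rest, so a key that is missing in the
// JSON does not make MERGE fail on a null property value.
FOREACH (h IN hv.Heroes |
  MERGE (n:HERO {name: h.name})
  SET n.superpower = h.superpower, n.otherside = h.otherside,
      n.quotes = h.quotes, n.universe = h.universe)
FOREACH (x IN hv.Villains |
  MERGE (n:VILLAIN {name: x.name})
  SET n.superpower = x.superpower, n.otherside = x.otherside,
      n.quotes = x.quotes, n.universe = x.universe)
FOREACH (a IN ahv |
  MERGE (n:ANTIHERO {name: a.name})
  SET n.superpower = a.superpower, n.otherside = a.otherside,
      n.quotes = a.quotes, n.universe = a.universe)
```

Merging on name alone treats the name as the identity of a character, so re-running the query updates the existing nodes instead of creating new ones. A character without a name would still make MERGE fail, since MERGE cannot match on a null property.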

Running the following query returns the graph generated for the heroes, villains, and anti-heroes.

MATCH (h:HERO), (v:VILLAIN), (ahv:ANTIHERO)
RETURN h, v, ahv
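
One caveat with a comma-separated MATCH: it combines every hero with every villain and every anti-hero, so it returns no rows at all if any one of the three labels has no nodes yet. A sketch of an alternative that returns each node once, whichever labels are present:

```cypher
// Return every node that carries at least one of the three labels.
MATCH (n)
WHERE n:HERO OR n:VILLAIN OR n:ANTIHERO
RETURN n
```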

When we visualize the nodes, we get a simple graph

Graph of Marvel Characters

Now that we have done the crucial part of creating the nodes in the database, we should connect them with relationships. In Neo4j, every relationship is stored with a direction. CREATE requires one, so we cannot create an undirected relationship, although we can ignore the direction while querying.
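
To ignore the direction in a query, leave the arrowhead off the relationship pattern. A small sketch:

```cypher
// No arrowhead: the pattern matches relationships in either direction.
MATCH (a)-[r]-(b)
RETURN a, r, b
```

Because the pattern matches both orientations, each relationship shows up in two result rows (once from each end), though the graph view in the Neo4j Browser still draws it once.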

This is done by using the MATCH and CREATE clauses. The direction is shown from left to right in Line 11, which can be read as:

Spiderman is the god-son of Ironman.

In Line 12, we create a right-to-left arrow from Ironman to Spiderman, which is the equivalent of:

Ironman is the god-father of Spiderman.

We can also create a relationship on a node that points to itself, as in Line 7 and Line 13.
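
A minimal sketch of these three cases in one query. It is not the original query, so the line numbers above do not apply to it. It assumes the two characters are stored as HERO nodes named "Spiderman" and "Ironman"; the relationship types (GOD_SON_OF, GOD_FATHER_OF, ADMIRES) are illustrative names.

```cypher
// Look up the two existing nodes first; if either is missing, nothing is created.
MATCH (spiderman:HERO {name: "Spiderman"}), (ironman:HERO {name: "Ironman"})
// Left-to-right arrow: Spiderman is the god-son of Ironman.
CREATE (spiderman)-[:GOD_SON_OF]->(ironman)
// Right-to-left arrow: Ironman is the god-father of Spiderman.
CREATE (spiderman)<-[:GOD_FATHER_OF]-(ironman)
// A relationship that starts and ends on the same node.
CREATE (ironman)-[:ADMIRES]->(ironman)
```

CREATE adds a new relationship every time the query runs. Swapping CREATE for MERGE here would keep a re-run from duplicating the relationships.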

Query to generate the relationships
Final Graph with Relationships (Heroes vs. Villains)

As you may have noticed, we have left the DC universe out of this graph. Let this be an exercise for you: work on a simple graph using the data.json file and create interesting relationships from scratch. You can try building something similar, or a more complicated graph, using what we have seen in this tutorial.

Maybe this is complicated :)

Conclusion

Congratulations! You have learned how to use the basic Cypher query clauses to create graphs from a JSON file. I hope you had fun programming using Neo4j.

As food for thought: one way this can be leveraged is by mining API responses that come back as JSON and converting them to nodes and relationships.
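
As a rough sketch of that idea, apoc.load.json also accepts an HTTP(S) URL in place of a file path. The URL below is a placeholder, not a real endpoint.

```cypher
// Placeholder URL: replace it with a real endpoint that returns JSON.
CALL apoc.load.json("https://example.com/api/characters")
YIELD value
RETURN value
LIMIT 5
```

From there, the same UNWIND, FOREACH, and MERGE pattern used for the local file can turn the response into nodes and relationships.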

The data and queries used in this tutorial can be found in this repository.

Feel free to experiment with the data, add more nodes, relations, and extend the tree!

I am planning to write more articles in the coming days about general programming concepts, data engineering, machine learning, basic data science concepts, and other analytic methodologies.

Feel free to connect with me on LinkedIn and GitHub.

This is my first article on Medium, your constructive comments and thoughts will help me in writing better articles. Feel free to reach out to me!

Written by Lingeshwaran Kanniappan

||~Engineer~Analyst~||

The Startup
