Week 35 — Discover AuraDB Free — NODES 2022 Sessions

Michael Hunger

Published in

Neo4j Developer Blog

8 min readSep 27, 2022

We’re really excited and looking forward to our annual online developer conference NODES 2022 on November 16 and 17.

The event will run around the globe with talks in all time zones.

You can save your spot now by registering early for some of the cool goodies that are coming your way.

We ran our Call for Papers in August and got 150 really great submissions by 130 speakers.

We selected 90 for the event, which was really not easy. Some of the sessions that didn’t make it for the conference will be run in later live streams.

Today, let’s look at the NODES data model and import the data for sessions and speakers into Neo4j from CSV.

If you’d rather watch the video of the live stream, here you go:

Next time we’ll import the data from a REST API, and then look at the schedule and session recommendations.

Let’s start by creating and connecting to our AuraDB Free instance.

If you want to try out the Workspace Beta too and provide feedback, please go to https://neo4j.com/product/workspace and sign up.

Create a Neo4j AuraDB Free Instance

Go to https://dev.neo4j.com/neo4j-aura to register or log into the service (you might need to verify your email address).

After clicking Create Database you can create a new Neo4j AuraDB Free instance. Select a Region close to you and give it a name, e.g. NODES Sessions.

Choose the “blank database” option, as we want to import our data ourselves.

On the credentials pop-up page, make sure you save the password somewhere safe. It’s best to download the credentials file, which you can also use for your app development.

The default username is always neo4j.

Then wait two to three minutes for your instance to be created.

Afterwards, you can connect via the Query Button with Neo4j Browser (you’ll need the password), or click Import for the Data Importer and Explore for Neo4j Bloom.

On the database tile you can also find the connection URL: neo4j+s://xxx.databases.neo4j.io (also contained in your credentials env file).

If you want to see examples for programmatically connecting to the database, go to the “Connect” tab of your instance and pick the language of your choice.

Source Data from Sessionize

Sessionize offers both an JSON REST API to fetch the data, as well as export as CSV/XLS.

We saved the CSV for sessions and speakers locally and can use it with Data Importer in Workspace.

The fields in our Session CSV are:

Session Id
Title
Description
Speakers
Session format
Level
Prerequisites for attendees?
Topic of your presentation
Neo4j use case
Your time zone
Status
Date submitted
Speaker Ids

The fields for our Speakers:

Speaker Id
FirstName
LastName
TagLine
Bio
timezone
City
Country
Ninja
LinkedIn
Blog
Twitter
Company Website
Profile Picture

Data Modeling and Import

TL;DR

If you want to shortcut the modeling you can grab the nodes-sessions-data-importer-2022-09-26.zip file from the GitHub repository and load it via the "Open model (with data)" menu entry in the three dots ….

If we open Workspace on our blank database (or Data Importer directly) we can add our CSV files on the left side.

Then we start mapping our data by:

adding the session node
setting its label to session
selecting the sessions CSV in the mapping view
and adding all relevant fields to the mapping from the file
the Session Id is automatically selected as id field

Then we do the same for the Speaker node, just with the speaker’s CSV.

To connect both, we drag out a relationship from speaker to session from the speaker node’s halo.

Give it the name PRESENTS and use the fields Session Id and Speaker Ids for the mapping.

Unfortunately, the comma separated Speaker Ids are not handled by data importer yet, so this will only connect sessions which have a single speaker.

But fear not — we will connect the rest with a bit of post-processing.

In Preview we see how our data will look in the graph, both the properties and relationships.

If we’re satisfied, we can click “Run Import” and it will take roughly a second to import the data.

If you click on the Show query links in the import report you’ll see the constraints and import query that data importer is running.

There you also see that for the relationships it tries to match the existing nodes for session and speaker based on Sessiond Id and Speaker Ids to connect them, which only works if there’s a single speaker for a session.

After running the importer we can open the “Query” tab by htting “Run Queries.”

Or just click on it on top.

Querying

A single pre-populated query shows us our graph, which should look very similar to the preview.

MATCH p=()-[:PRESENTS]->() RETURN p LIMIT 25;

We can also look for speakers that have more than one session, like Anton, but not yet for sessions with more than one speaker.

Let’s first find sessions that have no speakers yet.

match (s:Session) where not exists { (s)<-[:PRESENTS]-() }
return s.`Session Id` as session, s.`Speaker Ids` as speakers

These are the ones we want to fix in our post processing.

Post Processing

The approach we take for all these operations is the same:

Find the sessions to update, split a field by comma+space `, ` into a list of values. Then turn (UNWIND) this list of values into rows of values, MATCH or MERGE (get-or-create) nodes for the values and connect the session to them via a relationship.

List of speaker ids

MATCH (s:Session) WHERE NOT EXISTS { (s)<-[:PRESENTS]-() }
RETURN s.`Session Id` as session, split(s.`Speaker Ids`,', ') as speakers

List to rows

MATCH (s:Session) WHERE NOT EXISTS { (s)<-[:PRESENTS]-() }
WITH s, split(s.`Speaker Ids`,', ') as speakers
UNWIND speakers as speakerId
RETURN s.`Session Id` as session, speakerId

MATCH speakers and connect them

MATCH (s:Session) WHERE NOT EXISTS { (s)<-[:PRESENTS]-() }
WITH  s, split(s.`Speaker Ids`,', ') as speakers
UNWIND speakers as speakerId
MATCH (sp:Speaker {`Speaker Id`:speakerId})
MERGE (sp)-[r:PRESENTS]->(s)
RETURN *

Delete orphan speakers

As sessionize exported all speakers not just the ones with the accepted sessions, we now how to remove our orphans.

MATCH (sp:Speaker) 
WHERE NOT EXISTS { (sp)-[:PRESENTS]->() }
DELETE sp

Categorize other fields

We can now do the same for the other comma separated fields.

Level
Topic
Neo4j Use case

Here we generally call the list names, and value name and the node n to keep the editing needed to a minimum.

Level

MATCH (s:Session)
WITH  s, split(s.Level,', ') as names
UNWIND names as name
MERGE (n:Level {name:name})
MERGE (s)-[r:OF_LEVEL]->(n)
RETURN *

Use Cases

MATCH (s:Session)
WITH  s, split(s.`Neo4j Use-Case`,', ') as names
UNWIND names as name
MERGE (n:UseCase {name:name})
MERGE (s)-[r:USECASE]->(n)
RETURN *

Topics

match (s:Session)
WITH  s, split(s.`Topic of your presentation`,', ') as names
UNWIND names as name
MERGE (n:Topic {name:name})
MERGE (s)-[r:HAS_TOPIC]->(n)
RETURN *

Now we have a beautiful graph with different nodes for:

Session
Speaker
Level
Topic
Use Case

and their relationships.

Exploring the Results

We can open “Explore” to visualize our data a bit.

In case the “Perspective” is not really showing something, click on the “Untitled Perspective,” choose delete from the three dots, and then create/generate a new one.

First, we set the correct captions for Session (Title) and Speaker (FirstName and LastName) as well as some icons.

Then we can pick Show me A Graph from the drop-down to quickly show a graph and explore it a bit.

Next we can use our extracted topics and see which sessions share topics by entering: Session<tab>Topic<tab>Session<tab> into the search bar and hitting return.

Dashboard

We’re using NeoDash, a Neo4j Labs project for creating quick dashboards.

You can open it via https://tools.neo4jlabs.com and add your connection URL (from the aura console or your credentials download) to the form and click on the Open button for NeoDash.

You still need to provide the password — for security reasons it shouldn’t be passed through the URL.

In the video we go through the charts and queries for the dashboard. We’ll list them quickly here.

For the speaker locations we first have to compute the geolocation from their city and country.

Add geolocation for Speaker Cities

Let’s check it for a single speaker. We need to see if the city is actually empty or has some characters.

Then we can call apoc.spatial.geocodeOnce to geocode the city and country and look at the results.

match (sp:Speaker)
where size(sp.City) > 1
with sp limit 1
call apoc.spatial.geocodeOnce(sp.City+" "+sp.Country) yield location, data, latitude, longitude
return *

We can use the latitude and longitude from the result directly to create spatial point location properties in our speakers.

match (sp:Speaker)
where size(sp.City) > 1
call apoc.spatial.geocodeOnce(sp.City+" "+sp.Country) yield location, data, latitude, longitude
set sp.location = point({latitude:latitude, longitude:longitude})

Which we then can put on a Map-Chart in the Dashboard by just selecting the speakers that have a location property.

Conclusion

We hope this was a fun and useful sessions and you got excited for NODES 2022.

Stay tuned for the next time when we look at importing REST APIs, Schedule Modeling, and computing recommendations.