Property graphs and Elixir

Accessing Neo4j from Elixir with Bolt and Cypher


Previously in this series I have been looking at how to access RDF graphs from Elixir. But this is only one side of the graph story. There is also a whole other world of property graphs to consider.

Property graphs – also known as ‘labeled’ property graphs – are more generic than RDF graphs in that both nodes and edges may carry properties, and edges may be directed or undirected. That said, property graphs typically employ local names and have a weak data model, and so do not support any common semantics.

See here for a nice exposition by Jesús Barrasa of the main differences between these two graph paradigms.

Unquestionably the most popular of the graph databases is Neo4j which was one of the initial movers in this field. Neo4j has been a major player in driving forward the current interest in graph databases.

A couple of key Neo4j technologies should be called out:

  • Bolt is a high-performance network protocol that was introduced with the Neo4j 3.0 release in 2016 to speed up database connections. It uses a binary encoding over TCP or WebSockets and has built-in TLS support.
  • Cypher is the open declarative graph query language developed by Neo4j and now open sourced to the openCypher project. (See the Cypher Refcard for a handy quick reference.)

We’re going to make use of the bolt_sips package from Florin Pătraşcu which implements a Neo4j driver for Elixir wrapped around the Bolt protocol. (The package integrates and continues work from boltex, an independent implementation of the Bolt protocol in Elixir by Michael Schaefermeyer.)

This walkthrough will focus on a simple application for playing around with Neo4j property graphs using Elixir. (See the bolt_movies_elixir_phoenix project from Florin Pătraşcu for an example of a full-blown browser-based application using Phoenix.) We’ll look at the basics of querying for nodes, relationships and paths, and provide a basic test query library as well as project code and documentation.

We’ll also create a simple graph, see how to import other graphs and how to read graphs directly from GraphGist documents.

1. Create a ‘TestNeo4j’ project

First off, let’s create a new project TestNeo4j. See the bolt_sips project for installation instructions.

Now, I’ve a confession to make. Because of my limited knowledge of Elixir I had some difficulty in getting this set up so that authentication details from my config file were read correctly before a Bolt connection was established. Eventually I figured out a solution for an automated startup using a supervised process. But then I learned about a much simpler manual startup from Florin Pătraşcu. Let me describe that route first, and then the supervised solution.

Manual
We create a new project TestNeo4j (in camel case) using the usual Mix build tool invocation (in snake case):

% mix new test_neo4j

We then declare a dependency on bolt_sips in the mix.exs file:

defp deps do
  [
    {:bolt_sips, "~> 1.3"}
  ]
end

And we use Mix to add in the dependency:

% mix deps.get

And we add these lines either to the main config.exs file or to any environment specific imports (e.g. dev.exs) with the details updated as required:

config :bolt_sips, Bolt,
  url: "bolt://localhost:7687",
  basic_auth: [username: "neo4j", password: "neo4jtest"]

Note that the url: option uses an explicit "bolt:" URI scheme.

Let’s also clear out the boilerplate in lib/test_neo4j.ex and add in a @moduledoc annotation:

defmodule TestNeo4j do
  @moduledoc """
  Top-level module used in "Property graphs in Elixir" post.
  """

end

We’ll now add this TestNeo4j.init/0 function which we can invoke to manually start up the database connection:

## database
def init() do
  Application.get_env(:bolt_sips, Bolt)
  |> Bolt.Sips.start_link()

  Bolt.Sips.config()
end

And also to simplify naming in IEx we’ll add in a .iex.exs configuration file so that all Bolt.Sips and TestNeo4j function names can be treated as local:

% cat .iex.exs
import Bolt.Sips
import TestNeo4j

The application can now be started manually by calling the TestNeo4j.init/0 function which establishes the Bolt connection and returns the bolt_sips configuration via a call to Bolt.Sips.config/0:

% iex -S mix
Erlang/OTP 21 [erts-10.3] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [hipe] [dtrace]
Interactive Elixir (1.8.1) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> init()
[
socket: Bolt.Sips.Socket,
port: 7687,
hostname: 'localhost',
retry_linear_backoff: [delay: 150, factor: 2, tries: 3],
with_etls: false,
ssl: false,
timeout: 15000,
max_overflow: 2,
pool_size: 5,
url: "bolt://localhost:7687",
basic_auth: [username: "neo4j", password: "neo4jtest"]
]

We’re ready to go.

Supervised
Alternatively, to generate an automated startup, we can create a new project TestNeo4j using the usual Mix build tool invocation but with the --sup flag to set up an Application. (We’ll be using the Application to set up a supervision tree.)

% mix new test_neo4j --sup

We set up the dependency on bolt_sips and our configuration in config.exs as described above.

We now add the keyword option mod: line to our mix.exs file to automatically invoke the TestNeo4j.Application:

def application do
  [
    extra_applications: [:logger],
    mod: {TestNeo4j.Application, []}
  ]
end

And we update the TestNeo4j.Application.start/2 function in lib/test_neo4j/application.ex as:

def start(_type, _args) do
  import Supervisor.Spec

  children = [
    worker(Bolt.Sips, [Application.get_env(:bolt_sips, Bolt)])
  ]

  opts = [strategy: :one_for_one, name: TestNeo4j.Supervisor]
  Supervisor.start_link(children, opts)
end

The application will now be started automatically and can be tested by calling the Bolt.Sips.config/0 function:

% iex -S mix
Erlang/OTP 21 [erts-10.3] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [hipe] [dtrace]
Interactive Elixir (1.8.1) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> config()
[
socket: Bolt.Sips.Socket,
port: 7687,
hostname: 'localhost',
retry_linear_backoff: [delay: 150, factor: 2, tries: 3],
with_etls: false,
ssl: false,
timeout: 15000,
max_overflow: 2,
pool_size: 5,
url: "bolt://localhost:7687",
basic_auth: [username: "neo4j", password: "neo4jtest"]
]

See here for the project TestNeo4j code. (And note that the project TestNeo4j code also includes documentation which can be browsed here.)

2. Set up a Neo4j graph database

For running these experiments we will need to access a graph database in the form of a Neo4j instance.

There are two main options. We can either use a sandbox or install a local copy.

Sandbox
Follow the Developers > Sandbox link on the Neo4j site to set up a sandbox. Once logged in you can get the Bolt connection details from the Details tab and enter those into the config file. You can also set up a test account and install the Movies database with the command:

:play movies

Local copy
Use the Download Neo4j button (or follow the Products > Developer Tools link) on the Neo4j site to install Neo4j Desktop or one of the community or enterprise server editions. And again we can install the Movies database with the command:

:play movies

Both the sandbox and the local server can be queried through a browser interface although in this writeup we are rather more interested in remote access.

3. Test queries with Cypher

Now let’s look at querying with Cypher. We’ll first query for a single node and return that with its properties. Here we match on any single node (n), the parentheses in Cypher signalling that this is a node:

match (n) return n limit 1

Usefully, there is a handy Mix task, mix bolt.cypher, for sending a Cypher query from the command line, which we can invoke as:

% mix bolt.cypher "match (n) return n limit 1"
"match (n) return n limit 1"
[%{"n" => %Bolt.Sips.Types.Node{id: 0, labels: ["Movie"], properties: %{"released" => 1999, "tagline" => "Welcome to the Real World", "title" => "The Matrix"}}}]

Now let’s try this out in IEx, the Elixir shell, using the -S option to run our mix.exs script:

% iex -S mix
Erlang/OTP 21 [erts-10.2.3] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [hipe] [dtrace]
Interactive Elixir (1.8.1) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> conn = Bolt.Sips.conn
:bolt_sips_pool
iex(2)> Bolt.Sips.query!(conn, "match (n) return n limit 1")
[
%{
"n" => %Bolt.Sips.Types.Node{
id: 0,
labels: ["Movie"],
properties: %{
"released" => 1999,
"tagline" => "Welcome to the Real World",
"title" => "The Matrix"
}
}
}
]

Note here that the query function Bolt.Sips.query!/2 requires the pool name which is used to acquire the database connection, as well as the actual Cypher querystring. This pool name is returned by the Bolt.Sips.conn/0 function.
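Each row in the result is a map keyed by the names used in the return clause, so ordinary Elixir pattern matching is enough to pull values out. Here is a minimal sketch of that idea. The ResultSketch module and its movie_titles/1 function are hypothetical names of my own, and the rows are plain maps standing in for what Bolt.Sips.query!/2 returns, so that the example runs without a database:

```elixir
defmodule ResultSketch do
  # Hypothetical helper (not part of bolt_sips or TestNeo4j): collects the
  # "title" property of every row whose "n" value carries a "Movie" label.
  # Matching on map keys also works against structs such as
  # %Bolt.Sips.Types.Node{}, because structs are maps underneath.
  def movie_titles(rows) do
    for %{"n" => %{labels: labels, properties: %{"title" => title}}} <- rows,
        "Movie" in labels,
        do: title
  end
end

# Plain maps shaped like the rows shown above.
rows = [
  %{"n" => %{id: 0, labels: ["Movie"], properties: %{"released" => 1999, "title" => "The Matrix"}}},
  %{"n" => %{id: 1, labels: ["Person"], properties: %{"born" => 1964, "name" => "Keanu Reeves"}}}
]

IO.inspect(ResultSketch.movie_titles(rows))
```

Rows that do not fit the pattern (here the Person row, which has no "title" property) are simply skipped by the comprehension.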

We can also do the same to get a single relationship -[r]- with this query, where the square brackets denote a relationship, here bound to the variable r:

match ()-[r]-() return r limit 1

And note also that the relationship pattern -[r]- between any two nodes () here is undirected as no arrowhead is shown, so it matches a relationship in either direction.

And using the Mix task we get:

% mix bolt.cypher "match ()-[r]-() return r limit 1"
"match ()-[r]-() return r limit 1"
[%{"r" => %Bolt.Sips.Types.Relationship{end: 0, id: 7, properties: %{"roles" => ["Emil"]}, start: 8, type: "ACTED_IN"}}]

Or with IEx we get this:

iex(3)> Bolt.Sips.query!(conn, "match ()-[r]-() return r limit 1")
[
%{
"r" => %Bolt.Sips.Types.Relationship{
end: 0,
id: 7,
properties: %{"roles" => ["Emil"]},
start: 8,
type: "ACTED_IN"
}
}
]

And we can get a single path p between any two nodes () as:

match p = ()--() return p limit 1

And using the Mix task we get:

% mix bolt.cypher "match p = ()--() return p limit 1"
"match p = ()--() return p limit 1"
[%{"p" => %Bolt.Sips.Types.Path{nodes: [%Bolt.Sips.Types.Node{id: 1086, labels: ["Person"], properties: %{"born" => 1967, "name" => "Carrie-Anne Moss"}}, %Bolt.Sips.Types.Node{id: 1084, labels: ["Movie"], properties: %{"released" => 1999, "tagline" => "Welcome to the Real World", "title" => "The Matrix"}}], relationships: [%Bolt.Sips.Types.UnboundRelationship{end: nil, id: 1081, properties: %{"roles" => ["Trinity"]}, start: nil, type: "ACTED_IN"}], sequence: [1, 1]}}]

Or again with IEx we get this:

iex(4)> Bolt.Sips.query!(conn, "match p = ()--() return p limit 1")
[
%{
"p" => %Bolt.Sips.Types.Path{
nodes: [
%Bolt.Sips.Types.Node{
id: 1086,
labels: ["Person"],
properties: %{"born" => 1967, "name" => "Carrie-Anne Moss"}
},
%Bolt.Sips.Types.Node{
id: 1084,
labels: ["Movie"],
properties: %{
"released" => 1999,
"tagline" => "Welcome to the Real World",
"title" => "The Matrix"
}
}
],
relationships: [
%Bolt.Sips.Types.UnboundRelationship{
end: nil,
id: 1081,
properties: %{"roles" => ["Trinity"]},
start: nil,
type: "ACTED_IN"
}
],
sequence: [1, 1]
}
}
]

For future exploration we’ve defined some basic graph queries (TestNeo4j.nodes/1, TestNeo4j.nodes_and_relationships/1, TestNeo4j.paths/1, TestNeo4j.relationships/1) and variant forms to get a single entity (TestNeo4j.node1/1, TestNeo4j.node1_and_relationships/1, TestNeo4j.path1/1, TestNeo4j.relationship1/1). Each takes the connection as its argument. Here’s how to query for a single node:

iex(5)> conn |> node1()
[
%{
"n" => %Bolt.Sips.Types.Node{
id: 1435,
labels: ["Movie"],
properties: %{
"released" => 1999,
"tagline" => "Welcome to the Real World",
"title" => "The Matrix"
}
}
}
]

There is also a basic TestNeo4j.read_query/1 function to read from a filename in priv/queries/ (as well as a TestNeo4j.read_query/0 function with a default query).

Here’s a list of those helper functions:

# queries
read_query: 0,
read_query: 1,
node1: 1,
node1_and_relationships: 1,
nodes: 1,
nodes_and_relationships: 1,
path1: 1,
paths: 1,
relationship1: 1,
relationships: 1,

And for further documentation on the various types of graph entities supported (i.e. nodes, relationships, and paths), see Bolt.Sips.Types. (Details on how these entities are serialized in Bolt protocol messages are listed here under graph type structures.)
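To give a feel for how thin such helpers can be, here is a minimal sketch. It is not the project’s actual code: the QuerySketch module name is hypothetical, the query strings are assumed from the examples above, and the extra run argument (any two-argument function taking the connection and a query string, for example &Bolt.Sips.query!/2 inside the project) is only there so the sketch can run without a database:

```elixir
defmodule QuerySketch do
  # Assumed query strings; the real TestNeo4j definitions may differ.
  def node1_query, do: "match (n) return n limit 1"
  def relationship1_query, do: "match ()-[r]-() return r limit 1"
  def path1_query, do: "match p = ()--() return p limit 1"

  # `run` is a 2-arity function: (connection, cypher) -> result.
  def node1(conn, run), do: run.(conn, node1_query())
  def relationship1(conn, run), do: run.(conn, relationship1_query())
  def path1(conn, run), do: run.(conn, path1_query())
end

# A stand-in runner that echoes its arguments instead of contacting Neo4j.
echo = fn conn, cypher -> {conn, cypher} end

IO.inspect(QuerySketch.node1(:bolt_sips_pool, echo))
```

Keeping the query strings in their own functions also makes them easy to inspect in IEx before sending them to the database.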

So, the thing works just fine. We can hook up to a Neo4j database and query it. But now we’d like to be able to create our own graphs.

4. Create a simple graph for queries

So let’s look at the RDF model we have used in previous posts for describing the book ‘Adopting Elixir’ by Ben Marx, José Valim, and Bruce Tate.

This is the simple RDF graph we developed:

@prefix bibo: <http://purl.org/ontology/bibo/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
<urn:isbn:978-1-68050-252-7> a bibo:Book ;
dc:creator <https://twitter.com/bgmarx> ;
dc:creator <https://twitter.com/josevalim> ;
dc:creator <https://twitter.com/redrapids> ;
dc:date "2018-03-14"^^xsd:date ;
dc:format "Paper" ;
dc:publisher <https://pragprog.com/> ;
dc:title "Adopting Elixir"@en .

This simple graph relates five objects: a book node, three author nodes and a publisher node. (In the RDF, a sixth node is also used for typing the book.) Note that RDF properties are of two types: datatype properties for object attributes and object properties for relationships between objects.

To make for a better example let’s imagine that we extend this description to include explicit RDF types for the untyped objects, as:

@prefix ex: <http://example.org/> .
<https://twitter.com/bgmarx> a ex:Author .
<https://twitter.com/josevalim> a ex:Author .
<https://twitter.com/redrapids> a ex:Author .
<https://pragprog.com/> a ex:Publisher .

We now want to turn this (extended) RDF description into a Cypher specification for loading the graph into a fresh database.

Nodes
We’ll create Cypher labels based on the local names for RDF types:

  • bibo:Book → Book
  • ex:Author → Author
  • ex:Publisher → Publisher

We’ll also get some inspiration from the rdf2neo project where each node has an iri: property to record the original IRI. (Note that each node in rdf2neo also takes a default label of Resource. We’ll omit this here for clarity. It doesn’t really help.) And for additional properties we’ll just take the local names of the RDF datatype properties:

  • dc:date → date
  • dc:format → format
  • dc:title → title

Relationships
We’ll create directed relationships from the directed RDF object properties, again taking just the local names and following common practice using verb forms instead of noun forms:

  • dc:creator → AUTHORED_BY
  • dc:publisher → PUBLISHED_BY

(Also, we’re taking the liberty of using AUTHORED_BY instead of CREATED_BY as being a tad more specific.)

And to show off properties on relationships, we’ll add in a role property on the AUTHORED_BY relationship to give an (admittedly not very useful) string value denoting the priority of authorship.

We can now visualize this simple graph as:

So now we can express this Books graph in Cypher as:

//
// create nodes
//
CREATE
(book:Book {
iri: "urn:isbn:978-1-68050-252-7",
date: "2018-03-14",
format: "Paper",
title: "Adopting Elixir"
}),
(author1:Author { iri: "https://twitter.com/bgmarx" }),
(author2:Author { iri: "https://twitter.com/josevalim" }),
(author3:Author { iri: "https://twitter.com/redrapids" }),
(publisher:Publisher { iri: "https://pragprog.com/" })
//
// create relationships
//
CREATE
(book)-[:AUTHORED_BY { role: "first author" }]->(author1),
(book)-[:AUTHORED_BY { role: "second author" }]->(author2),
(book)-[:AUTHORED_BY { role: "third author" }]->(author3),
(book)-[:PUBLISHED_BY]->(publisher)
;

Here book, author1, author2, author3, and publisher are just variables being used for cross-referencing in the Cypher query.
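Since the Cypher above is just text, the node and relationship patterns could also be assembled from Elixir data. The following is only a sketch under my own assumptions: the CypherSketch module and its functions are hypothetical names, and inspect/1 is used as a quick way to quote simple strings and numbers, which is not a general-purpose Cypher escaper:

```elixir
defmodule CypherSketch do
  # Renders a keyword list of properties as a Cypher map literal.
  # inspect/1 yields double-quoted strings and bare numbers, which is
  # adequate for the simple values used in the Books graph.
  def props([]), do: ""

  def props(kv) do
    body = Enum.map_join(kv, ", ", fn {k, v} -> "#{k}: #{inspect(v)}" end)
    " { " <> body <> " }"
  end

  # (var:Label { ... })
  def node_pattern(var, label, kv \\ []), do: "(#{var}:#{label}#{props(kv)})"

  # (from)-[:TYPE { ... }]->(to)
  def rel_pattern(from, type, to, kv \\ []),
    do: "(#{from})-[:#{type}#{props(kv)}]->(#{to})"
end

IO.puts(CypherSketch.node_pattern("author1", "Author", iri: "https://twitter.com/bgmarx"))
IO.puts(CypherSketch.rel_pattern("book", "AUTHORED_BY", "author1", role: "first author"))
IO.puts(CypherSketch.rel_pattern("book", "PUBLISHED_BY", "publisher"))
```

A keyword list is used rather than a map so that the properties come out in the order they were written. For anything beyond fixed test data, building queries by string concatenation is best avoided.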

5. Query a simple graph

Now let’s create a new database and import the Books graph we just created. The test_neo4j app provides a basic TestNeo4j.books/0 function for returning the Cypher query for this graph. So this can be imported using the Bolt.Sips.query!/2 function, where the first argument is the Bolt pool name used to acquire the database connection, and the second argument is the Cypher querystring:

iex(1)> books()
"//\n// create nodes\n//\nCREATE\n(book:Book {\n iri: \"urn:isbn:978-1-68050-252-7\",\n date: \"2018-03-14\",\n format: \"Paper\",\n title: \"Adopting Elixir\"\n}),\n(author1:Author { iri: \"https://twitter.com/bgmarx\" }),\n(author2:Author { iri: \"https://twitter.com/josevalim\" }),\n(author3:Author { iri: \"https://twitter.com/redrapids\" }),\n(publisher:Publisher { iri: \"https://pragprog.com/\" })\n\n//\n// create relationships\n//\nCREATE\n(book)-[:AUTHORED_BY { role: \"first author\" }]->(author1),\n(book)-[:AUTHORED_BY { role: \"second author\" }]->(author2),\n(book)-[:AUTHORED_BY { role: \"third author\" }]->(author3),\n(book)-[:PUBLISHED_BY]->(publisher)\n\n;\n"
iex(2)> conn |> query!(books())
%{
stats: %{
"labels-added" => 5,
"nodes-created" => 5,
"properties-set" => 11,
"relationships-created" => 4
},
type: "w"
}

This confirms that five labeled nodes and four relationships have been created together with 11 properties.
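As a quick sanity check, those numbers can be reproduced by counting what the CREATE statement declares. This is just arithmetic on counts read off the query by hand; the StatsSketch module and the keys in its lists are my own hypothetical names:

```elixir
defmodule StatsSketch do
  # Property counts read off the CREATE statement: the book has four
  # properties, each author and the publisher has one (iri), and each
  # AUTHORED_BY relationship has one (role). PUBLISHED_BY has none.
  @node_props [book: 4, author1: 1, author2: 1, author3: 1, publisher: 1]
  @rel_props [authored_by_1: 1, authored_by_2: 1, authored_by_3: 1, published_by: 0]

  def expected do
    %{
      # Every node in the Books graph carries exactly one label.
      "labels-added" => length(@node_props),
      "nodes-created" => length(@node_props),
      "properties-set" => total(@node_props) + total(@rel_props),
      "relationships-created" => length(@rel_props)
    }
  end

  defp total(counts), do: counts |> Keyword.values() |> Enum.sum()
end

IO.inspect(StatsSketch.expected())
```

The 11 properties are therefore 8 on the nodes plus 3 on the AUTHORED_BY relationships.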

We can now test this graph by querying over all the nodes and inspecting each node:

iex(3)> conn |> query!("match (n) return n") |> Enum.each(fn item -> IO.inspect(item["n"]) end)
%Bolt.Sips.Types.Node{
id: 1044,
labels: ["Book"],
properties: %{
"date" => "2018-03-14",
"format" => "Paper",
"iri" => "urn:isbn:978-1-68050-252-7",
"title" => "Adopting Elixir"
}
}
%Bolt.Sips.Types.Node{
id: 1045,
labels: ["Author"],
properties: %{"iri" => "https://twitter.com/bgmarx"}
}
%Bolt.Sips.Types.Node{
id: 1046,
labels: ["Author"],
properties: %{"iri" => "https://twitter.com/josevalim"}
}
%Bolt.Sips.Types.Node{
id: 1047,
labels: ["Author"],
properties: %{"iri" => "https://twitter.com/redrapids"}
}
%Bolt.Sips.Types.Node{
id: 1048,
labels: ["Publisher"],
properties: %{"iri" => "https://pragprog.com/"}
}
:ok

And likewise we can query over all the relationships and inspect each relationship:

iex(4)> conn |> query!("match ()-[r]->() return r") |> Enum.each(fn item -> IO.inspect(item["r"]) end)
%Bolt.Sips.Types.Relationship{
end: 1045,
id: 1040,
properties: %{"role" => "first author"},
start: 1044,
type: "AUTHORED_BY"
}
%Bolt.Sips.Types.Relationship{
end: 1046,
id: 1041,
properties: %{"role" => "second author"},
start: 1044,
type: "AUTHORED_BY"
}
%Bolt.Sips.Types.Relationship{
end: 1047,
id: 1042,
properties: %{"role" => "third author"},
start: 1044,
type: "AUTHORED_BY"
}
%Bolt.Sips.Types.Relationship{
end: 1048,
id: 1043,
properties: %{},
start: 1044,
type: "PUBLISHED_BY"
}
:ok

We can also query for properties on relationships, testing for existence, and printing those property values out:

iex(5)> conn |> query!("match (n)-[r]->() where exists(r.role) return r.role") |> Enum.each(fn item -> IO.puts(item["r.role"]) end)
first author
third author
second author
:ok

Now let’s try querying over paths. Here we look for the path between a node with a :Book label and a node with an :Author label and an iri property of "https://twitter.com/redrapids":

iex(6)> conn |> query!("match p = (:Book)--(:Author {iri: 'https://twitter.com/redrapids'}) return p") |> Enum.each(fn item -> IO.inspect(item["p"]) end)
%Bolt.Sips.Types.Path{
nodes: [
%Bolt.Sips.Types.Node{
id: 1044,
labels: ["Book"],
properties: %{
"date" => "2018-03-14",
"format" => "Paper",
"iri" => "urn:isbn:978-1-68050-252-7",
"title" => "Adopting Elixir"
}
},
%Bolt.Sips.Types.Node{
id: 1047,
labels: ["Author"],
properties: %{"iri" => "https://twitter.com/redrapids"}
}
],
relationships: [
%Bolt.Sips.Types.UnboundRelationship{
end: nil,
id: 1042,
properties: %{"role" => "third author"},
start: nil,
type: "AUTHORED_BY"
}
],
sequence: [1, 1]
}
:ok

Now, if we want to find the longest (undirected) path in our graph we could try listing all unique path lengths:

iex(7)> conn |> query!("match p = ()-[*]-() return distinct length(p)")
[%{"length(p)" => 1}, %{"length(p)" => 2}]

Or we could just get the longest path length directly as:

iex(8)> conn |> query!("match p = ()-[*]-() return distinct length(p) order by length(p) desc limit 1")
[%{"length(p)" => 2}]
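The same answer can be had on the Elixir side from the list of distinct lengths. A minimal sketch, with PathLengthSketch as a hypothetical name and the rows copied from the first query’s output:

```elixir
defmodule PathLengthSketch do
  # Returns the largest "length(p)" value in the rows, or nil when the
  # query returned no rows (for instance on an empty database).
  def longest([]), do: nil

  def longest(rows) do
    rows
    |> Enum.map(&Map.fetch!(&1, "length(p)"))
    |> Enum.max()
  end
end

IO.inspect(PathLengthSketch.longest([%{"length(p)" => 1}, %{"length(p)" => 2}]))
```

Letting Cypher do the ordering and limiting, as in the query above, moves less data over the wire, so the client-side version is mainly useful when the full list of lengths is wanted anyway.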

So, let’s see an example of one of those paths of length 2:

iex(9)> conn |> query!("match p = ()-[*]-() where length(p) = 2 return p, length(p) limit 1")
[
%{
"length(p)" => 2,
"p" => %Bolt.Sips.Types.Path{
nodes: [
%Bolt.Sips.Types.Node{
id: 1046,
labels: ["Author"],
properties: %{"iri" => "https://twitter.com/josevalim"}
},
%Bolt.Sips.Types.Node{
id: 1044,
labels: ["Book"],
properties: %{
"date" => "2018-03-14",
"format" => "Paper",
"iri" => "urn:isbn:978-1-68050-252-7",
"title" => "Adopting Elixir"
}
},
%Bolt.Sips.Types.Node{
id: 1045,
labels: ["Author"],
properties: %{"iri" => "https://twitter.com/bgmarx"}
}
],
relationships: [
%Bolt.Sips.Types.UnboundRelationship{
end: nil,
id: 1041,
properties: %{"role" => "second author"},
start: nil,
type: "AUTHORED_BY"
},
%Bolt.Sips.Types.UnboundRelationship{
end: nil,
id: 1040,
properties: %{"role" => "first author"},
start: nil,
type: "AUTHORED_BY"
}
],
sequence: [-1, 1, 2, 2]
}
}
]

As noted, the Books graph is a very simple graph so for more expressive queries we’ll need to upgrade to a more interesting graph.

6. Back to the Movies graph

So we’ve added the Books graph but would now like to replace this with another graph, the Movies graph say. As with the earlier Books graph the test_neo4j app provides a basic TestNeo4j.movies/0 function for returning the Cypher query for the Movies graph, the same query that is generated in the :play movies command from before.

We’ve got a couple of useful little functions to help out with this: TestNeo4j.clear/1 for removing all nodes and relationships from the active database, and TestNeo4j.test/1 for testing the counts of nodes, paths, and relationships in the active database. (The TestNeo4j.reset/1 function is just an alias for TestNeo4j.clear/1. And we also include here the previously discussed TestNeo4j.init/0 function for a manual startup.)

There is also a basic TestNeo4j.read_graph/1 function to read from a filename in priv/graphs/ (as well as a TestNeo4j.read_graph/0 function with a default graph).

Here’s a list of those helper functions:

# database
clear: 1,
init: 0,
reset: 1,
test: 1,
# graphs
books: 0,
movies: 0,
read_graph: 0,
read_graph: 1,

So, let’s see where we are and clear this database. (Note that this does not reset the internal id values that are assigned to nodes and relationships, but we should regard these as implementation details anyway.)

iex(1)> conn |> test()
[%{"nodes" => 5, "paths" => 8, "relationships" => 4}]
iex(2)> conn |> clear()
%{stats: %{"nodes-deleted" => 5, "relationships-deleted" => 4}, type: "w"}
iex(3)> conn |> test()
[%{"nodes" => 0, "paths" => 0, "relationships" => 0}]

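Note that the paths count refers to one-hop paths. The count of 8 is twice the number of relationships, which is consistent with an undirected pattern matching each of the 4 relationships once in each direction.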

Now let’s populate this fresh database with the Movies graph using the TestNeo4j.movies/0 function:

iex(4)> movies()
"CREATE (TheMatrix:Movie {title:'The Matrix', released:1999, tagline:'Welcome to the Real World'})\nCREATE (Keanu:Person {name:'Keanu Reeves', born:1964})\nCREATE (Carrie:Person {name:'Carrie-Anne Moss', born:1967})\nCREATE (Laurence:Person {name:'Laurence Fishburne', born:1961})\nCREATE (Hugo:Person {name:'Hugo Weaving', born:1960})\nCREATE (LillyW:Person {name:'Lilly Wachowski', born:1967})\nCREATE (LanaW:Person {name:'Lana Wachowski', born:1965})\nCREATE (JoelS:Person {name:'Joel Silver', born:1952})\nCREATE\n (Keanu)-[:ACTED_IN {roles:['Neo']}]->(TheMatrix),\n (Carrie)-[:ACTED_IN {roles:['Trinity']}]->(TheMatrix),\n (Laurence)-[:ACTED_IN {roles:['Morpheus']}]->(TheMatrix),\n (Hugo)-[:ACTED_IN {roles:['Agent Smith']}]->(TheMatrix),\n " <> ...
iex(5)> conn |> query!(movies())
%{
stats: %{
"labels-added" => 171,
"nodes-created" => 171,
"properties-set" => 564,
"relationships-created" => 253
},
type: "w"
}
iex(6)> conn |> test()
[%{"nodes" => 171, "paths" => 506, "relationships" => 253}]

Note that the TestNeo4j.test/1 function agrees with the Bolt.Sips.query!/2 function return.

We can also try clearing the database and reimporting the graph:

iex(7)> conn |> clear()
%{stats: %{"nodes-deleted" => 171, "relationships-deleted" => 253}, type: "w"}
iex(8)> conn |> test()
[%{"nodes" => 0, "paths" => 0, "relationships" => 0}]
iex(9)> conn |> query!(movies())
%{
stats: %{ ... },
type: "w"
}
iex(10)> conn |> test()
[%{"nodes" => 171, "paths" => 506, "relationships" => 253}]

And now we can get down to the business of querying over this more elaborate graph:

iex(11)> conn |> query!("match (n:Movie) return n.title order by n.title")
[
%{"n.title" => "A Few Good Men"},
%{"n.title" => "A League of Their Own"},
%{"n.title" => "Apollo 13"},
%{"n.title" => "As Good as It Gets"},
%{"n.title" => "Bicentennial Man"},
%{"n.title" => "Cast Away"},
...
%{"n.title" => "Twister"},
%{"n.title" => "Unforgiven"},
%{"n.title" => "V for Vendetta"},
%{"n.title" => "What Dreams May Come"},
%{"n.title" => "When Harry Met Sally"},
%{"n.title" => "You've Got Mail"}
]

And, of course, we can replace this graph with other graphs of interest by clearing the database and querying with a Cypher querystring for creating graphs.

7. Read from a GraphGist

Neo4j also supports a teaching tool format called GraphGists. These are simple text-based documents based on the AsciiDoc format which also support GraphGist directives through special comments.

The intention is to build up a library of user-supplied graph models which can be rendered by the GraphGist application and viewed in a browser for instructional purposes. See the GraphGists page for library examples, or see also the neo4j-examples/graphgists project in GitHub.

Now the GraphGist format includes a setup section for graphs using a Cypher query. We can simply parse a GraphGist document for this query and play with this in our testing. The test_neo4j app provides a simple function TestNeo4j.parse/1 to regex this query string. This is just a very quick and dirty hack using an ungreedy, multiline regex roughly over the following pattern:

//setup
//hide
[source,cypher]
----
CREATE ...
----

This has been tested against the neo4j-examples/graphgists project and works on a majority of these examples, failing only where the example markup is not well behaved.
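For illustration, here is a minimal sketch of what such a regex extraction could look like. It is not the project’s TestNeo4j.parse/1: the GraphGistSketch module is a hypothetical name, the pattern is my own guess at the markup shown above (with the //hide line treated as optional), and it uses a lazy quantifier with the s modifier rather than the ungreedy flag:

```elixir
defmodule GraphGistSketch do
  # Assumed markup: a //setup comment, an optional //hide comment, a
  # [source,cypher] attribute line, then a block delimited by ---- lines.
  # The s modifier lets . span newlines; .*? stops at the first closing ----.
  def parse(text) do
    setup =
      ~r{//[ \t]*setup[ \t]*\n(?://[ \t]*hide[ \t]*\n)?\[source,[ \t]*cypher\][ \t]*\n----[ \t]*\n(.*?)\n----}s

    case Regex.run(setup, text, capture: :all_but_first) do
      [cypher] -> {:ok, cypher}
      nil -> :error
    end
  end
end

doc = """
= A tiny GraphGist

//setup
//hide
[source,cypher]
----
CREATE (SEA:Airport { name:'SEA' })
----

Some prose after the setup block.
"""

IO.inspect(GraphGistSketch.parse(doc))
```

Returning :error when no setup block is found makes the badly behaved documents easy to spot when running over a whole directory of GraphGists.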

There is also a basic TestNeo4j.read_graphgist/1 function to read a GraphGist from a filename in priv/graphgists/ (as well as a TestNeo4j.read_graphgist/0 function with a default GraphGist).

Here’s a list of those helper functions:

# graphgists
parse: 1,
read_graphgist: 0,
read_graphgist: 1,

Here we add in the Flight Analyzer model from Soheil Jazayeri (see source for the GraphGist). We clear the graph database and check that it is empty, extract the Cypher query from the GraphGist, and then update the database using this query, which returns some basic stats:

iex(2)> conn |> clear()
%{stats: %{"nodes-deleted" => 13, "relationships-deleted" => 14}, type: "w"}
iex(3)> conn |> test()
[%{"nodes" => 0, "paths" => 0, "relationships" => 0}]
iex(4)> cypher = parse(read_graphgist("flight-analyzer.adoc"))
"CREATE (SEA:Airport { name:'SEA' }),(EWR:Airport { name:'EWR' }),(IAH:Airport { name:'IAH' }),(SAT:Airport { name:'SAT' }),(MCO:Airport { name:'MCO' }),(LAX:Airport { name:'LAX' }),(FLL:Airport { name:'FLL' }),(ORD:Airport { name:'ORD' }),(MSP:Airport { name:'MSP' }),(KOA:Airport { name:'KOA' }),(PDX:Airport { name:'PDX' }),(SFO:Airport { name:'SFO' }),
<> ... <> "(f0:Flight { date:'11/30/2015 04:24:12', duration:218, distance:1721, airline:'19977' }),(f0)-[:ORIGIN]->(SEA),(f0)-[:DESTINATION]->(ORD),(t1f0:Ticket { class:'economy', price:1344.75 }),(t1f0)-[:ASSIGN]->(f0),(t2f0:Ticket { class:'business', price:1793 }),(t2f0)-[:ASSIGN]->(f0),(t3f0:Ticket { class:'firstClass', price:2151.6 }),(t3f0)-[:ASSIGN]->(f0)," <> ...
iex(5)> conn |> query!(cypher)
%{
stats: %{
"labels-added" => 1287,
"nodes-created" => 1287,
"properties-set" => 3111,
"relationships-created" => 1520
},
type: "w"
}

We can now run a query over the graph, here just listing out the distinct labels:

iex(6)> conn |> query!("match (n) return distinct labels(n)")
[
%{"labels(n)" => ["Airport"]},
%{"labels(n)" => ["Flight"]},
%{"labels(n)" => ["Ticket"]}
]

And now we can try out some more interesting graphs and queries.

Summary

I’ve shown here in this post how to access Neo4j graph databases with Elixir using the bolt_sips package. We have queried the graphs using the Cypher query language.

Beyond that I’ve shown how to create a simple graph (modified from an RDF graph) and how to import graphs and to parse out graphs from GraphGists. There seems to be plenty of scope for using Elixir to process property graph data.

And just as we have seen how Elixir can be used to query Neo4j property graphs with Cypher, in previous posts we saw how Elixir can also be used to query RDF graphs with SPARQL. It is rather intriguing to consider what role Elixir could play in moving data between these two graph worlds.




This is the sixth in a series of posts. See my previous post ‘Jupyter Notebooks with Elixir and RDF’.

You can also follow me on Twitter as @tonyhammond.