Experimenting with Neo4j and Apache Zeppelin

(Neo4j)-[:LOVES]-(Zeppelin)

Andrea Santurbano
Apache Zeppelin Stories
5 min read · Mar 17, 2016

“A web-based notebook that enables interactive data analytics.
You can make beautiful data-driven, interactive and collaborative documents with SQL, Scala and
more.”
(taken from Apache Zeppelin official website)

The result of the experiment

I fell in love with Apache Zeppelin for its native support for Apache Spark: it lets you create powerful (big) data analytics dashboards really fast.
After some weeks spent using it with Apache Spark, that “more” intrigued me, so I tried to integrate Zeppelin with Neo4j, a graph database that I have used in an epidemiology-related project.

The full notebook of this experiment is here:

To achieve my goal I used only plain JavaScript directly in the Zeppelin notebook, with just one (at most two) framework(s):

  • Sigma.js: a JavaScript library dedicated to graph drawing. Sigma.js has a plugin that connects to the Neo4j REST APIs out of the box;
  • Angular.js: an MVVM client-side framework. In particular, for Angular I use Zeppelin’s Angular display system (more details here).

Background: Zeppelin and Neo4j

You can get Zeppelin in several ways:

  1. Building it from source
  2. Downloading it from the official website
  3. Pulling a Docker container, such as https://hub.docker.com/r/conker84/zeppelin/ or https://hub.docker.com/r/epahomov/docker-zeppelin/
  4. Building a Docker container from a Dockerfile (https://github.com/conker84/docker-zeppelin)

Neo4j is a graph database developed by Neo Technology, Inc. Graph databases are well suited to deeply interconnected datasets that form a complex web, which is difficult to represent in the (old-fashioned) relational way.

You can download Neo4j from the official website.

Step 1 (First Paragraph): Create some util functions

The first paragraph creates the utility functions.

It contains two functions:

  • loadJS: dynamically loads JavaScript files using Promises, which simplifies the script injection phase;
  • randomColor: generates random colors; its use will become clear later.

How the first paragraph will be displayed
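
The first paragraph is shown only as a screenshot, so here is a minimal sketch of what two such helpers could look like. Only the names loadJS and randomColor (called later as angular.loadJS and angular.randomColor) come from the notebook; the function bodies are assumptions, not the original code.

```javascript
// Sketch only: the bodies below are assumed, not taken from the notebook.

// Inject a <script> tag and resolve once the browser has loaded it.
function loadJS(url) {
  return new Promise(function (resolve, reject) {
    var script = document.createElement('script');
    script.src = url;
    script.async = true;
    script.onload = function () { resolve(url); };
    script.onerror = function () {
      reject(new Error('Unable to load ' + url));
    };
    document.head.appendChild(script);
  });
}

// Build a random color as a 6-digit hex string such as "#3fa9c2".
function randomColor() {
  var value = Math.floor(Math.random() * 0x1000000); // 0 .. 0xffffff
  return '#' + ('000000' + value.toString(16)).slice(-6);
}
```

Since the later code calls these helpers through the global angular object, the notebook presumably assigns them there, for example with angular.loadJS = loadJS and angular.randomColor = randomColor.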

Step 2 (Second Paragraph): Create the query/visualization stage

The second paragraph contains the main code of this project.

Let’s walk through the code.

%angular
<form>
  <div class="form-group">
    <label for="query">Cypher Console</label>
    <textarea id="query" name="query" class="form-control" rows="3" placeholder="Write a cypher query here"></textarea>
  </div>
  <button class="btn btn-success" id="runQuery">Run it</button>
</form>
<div id="graph" style="height: 600px; width: 100%"></div>

This is the form where the user writes Cypher queries: a simple form styled with Bootstrap CSS classes (Zeppelin uses Bootstrap, so they are available).

var initSigma = function initSigmaFn () {
  sigma.classes.graph.addMethod('neighbors', function (nodeId) {
    var k,
      neighbors = {},
      index = this.allNeighborsIndex[nodeId] || {};

    for (k in index) {
      neighbors[k] = this.nodesIndex[k];
    }
    return neighbors;
  });
};

The initSigma function adds a utility method called “neighbors” to the Sigma.js graph class; it returns all the neighbors of a given node.
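
To make the lookup concrete, here is a small standalone sketch that mimics the two indexes the method reads. The data is made up, and the index shape (node id, then neighbor id, then edge id) is an assumption about how Sigma.js 1.x organizes allNeighborsIndex.

```javascript
// Hypothetical data shaped like the Sigma.js graph indexes used above.
var nodesIndex = {
  a: { id: 'a', label: 'Alice' },
  b: { id: 'b', label: 'Bob' },
  c: { id: 'c', label: 'Carol' }
};

var e1 = { id: 'e1', source: 'a', target: 'b' };
var e2 = { id: 'e2', source: 'b', target: 'c' };

// Assumed shape: node id -> neighbor id -> edge id -> edge
var allNeighborsIndex = {
  a: { b: { e1: e1 } },
  b: { a: { e1: e1 }, c: { e2: e2 } },
  c: { b: { e2: e2 } }
};

// Same logic as the "neighbors" method, without the Sigma.js graph class.
function neighbors(nodeId) {
  var result = {};
  var index = allNeighborsIndex[nodeId] || {};
  Object.keys(index).forEach(function (k) {
    result[k] = nodesIndex[k];
  });
  return result;
}
```

With this data, neighbors('b') returns the node objects for a and c, while an unknown id yields an empty object.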

Next comes the initGraph function, which is presented in several pieces.

function initGraph () {
  ...
  var endpoint = {
      url: 'http://192.168.1.119:7474',
      user: 'neo4j',
      password: 'andrea'
    },
    s = new sigma('graph');
  var dragListener = sigma.plugins.dragNodes(s, s.renderers[0]);
  ...
};

This part contains the general configuration of Sigma.js. It defines three objects:

  • endpoint: the data needed to connect to the Neo4j instance;
  • s: the Sigma.js instance; “graph” is passed as the container id, so the graph is rendered in that div;
  • dragListener: the Sigma.js dragNodes plugin instance.

function initGraph () {
  ...
  // Highlight the neighbors
  s.bind('clickNode', function (e) {
    var nodeId = e.data.node.id,
      toKeep = s.graph.neighbors(nodeId);
    toKeep[nodeId] = e.data.node;

    s.graph.nodes().forEach((n) => {
      if (toKeep[n.id]) {
        n.color = n.originalColor;
      } else {
        n.color = '#eee';
      }
    });

    s.graph.edges().forEach(function (e) {
      if (toKeep[e.source] && toKeep[e.target]) {
        e.color = e.originalColor;
      } else {
        e.color = '#eee';
      }
    });

    // Since the data has been modified, we need to
    // call the refresh method to make the colors
    // update effective.
    s.refresh();
  });

  // When the stage is clicked, we just color each
  // node and edge with its original color.
  s.bind('clickStage', function (e) {
    s.graph.nodes().forEach((n) => {
      n.color = n.originalColor;
    });

    s.graph.edges().forEach((e) => {
      e.color = e.originalColor;
    });

    // Same as in the previous event:
    s.refresh();
  });
  ...
};

This part handles highlighting and de-highlighting a node’s neighbors. In particular:

  • the clickNode event handles highlighting;
  • the clickStage event handles de-highlighting.

function initGraph () {
  ...
  var $runQuery = $('#runQuery'),
    $query = $('#query'),
    buttonLabel = $runQuery.text();
  $runQuery.off('click').on('click', function (e) {
    e.preventDefault();
    e.stopImmediatePropagation();
    var value = $query.val() || '';
    if (!value) {
      return;
    }
    // Before the request to Neo4j, the graph area is cleared
    s.graph.clear();
    // The "query" textarea is disabled and
    // the button label changes to "Loading..."
    $runQuery.text('Loading...');
    $query.prop('disabled', true);
    // The query is sent to Neo4j
    sigma.neo4j.cypher(
      endpoint,
      value,
      s,
      // This callback is called after
      // the response from Neo4j has arrived;
      // the 's' parameter is the sigma.js instance
      // where nodes and edges have been rendered
      function queryReponseCallback (s) {
        // The button label is restored
        $runQuery.text(buttonLabel);
        // The text area is enabled again
        $query.prop('disabled', false);
        s.graph.nodes().forEach((elem) => {
          var label = elem.neo4j_labels[0] || '';
          // Skip nodes without a label or without an assigned color
          if (!label || !Labels[label]) {
            return;
          }
          // Every node is colored according to its Neo4j label
          elem.color = Labels[label].color;
          elem.originalColor = elem.color;
        });
        // Refresh the sigma.js graph
        s.refresh();
      }
    );
  });
  ...
};

This part sends the Cypher query to the Neo4j server. Once the nodes and edges are loaded, every node is colored with the color assigned to its label (if it has one), so I have defined two properties:

  1. color: used by Sigma.js internally to color the node;
  2. originalColor: used to restore the original color after the node has been greyed out during neighbor highlighting.
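
For orientation, the kind of HTTP request a REST-based plugin might send can be sketched as follows. This assumes Neo4j’s transactional Cypher HTTP endpoint (/db/data/transaction/commit) with basic authentication; buildCypherRequest is a hypothetical helper, and the plugin’s actual internals may differ.

```javascript
// Hypothetical helper: describes a request against Neo4j's transactional
// HTTP endpoint. It only builds the request description, it sends nothing.
function buildCypherRequest(endpoint, cypher) {
  var credentials = endpoint.user + ':' + endpoint.password;
  // btoa exists in browsers; Buffer is the Node.js fallback.
  var encoded = typeof btoa === 'function'
    ? btoa(credentials)
    : Buffer.from(credentials).toString('base64');

  return {
    method: 'POST',
    url: endpoint.url + '/db/data/transaction/commit',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': 'Basic ' + encoded
    },
    body: JSON.stringify({
      statements: [{
        statement: cypher,
        // Ask for nodes and relationships rather than tabular rows
        resultDataContents: ['graph']
      }]
    })
  };
}

// Example with placeholder connection data (not a real server).
var request = buildCypherRequest(
  { url: 'http://localhost:7474', user: 'neo4j', password: 'secret' },
  'MATCH (n)-[r]->(m) RETURN n, r, m LIMIT 25'
);
```

Requesting the graph result format is what makes the response directly usable for drawing, since it lists nodes and relationships instead of table rows.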

function initGraph () {
  ...
  var Labels = {};

  // Calling Neo4j to get all its relationship types
  sigma.neo4j.getTypes(
    endpoint,
    function (types) {
      console.log('Relationship types %s', types.join(', '));
    }
  );
  // Calling Neo4j to get all its node labels
  // and assign a color to each one
  sigma.neo4j.getLabels(
    endpoint,
    function (labels) {
      console.log('Node labels %s', labels.join(', '));
      var colors = [];
      labels.forEach((label) => {
        var color = angular.randomColor();
        // Draw again until the color is not already in use
        while (colors.indexOf(color) > -1) {
          color = angular.randomColor();
        }
        colors.push(color);
        Labels[label] = {
          color : color
        };
      });
    }
  );
  ...
};

The last part of the initGraph function retrieves:

  • all relationship types (I do not use them in this experiment);
  • all node labels, assigning one color to each label.
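
The color assignment can also be isolated in a small function, which makes the "draw again until unused" rule easy to try outside the notebook. This is only a sketch: assignLabelColors and its nextColor parameter are hypothetical names, with nextColor standing in for angular.randomColor.

```javascript
// Hypothetical helper: gives every label its own color.
// nextColor is any function that returns a color string.
function assignLabelColors(labels, nextColor) {
  var used = [];
  var byLabel = {};
  labels.forEach(function (label) {
    var color = nextColor();
    // Keep drawing until the color has not been handed out yet
    while (used.indexOf(color) > -1) {
      color = nextColor();
    }
    used.push(color);
    byLabel[label] = { color: color };
  });
  return byLabel;
}

// Example with a deterministic stand-in generator that repeats itself.
var palette = ['#111111', '#111111', '#222222', '#222222', '#333333'];
var position = 0;
var labelColors = assignLabelColors(['Person', 'Movie', 'City'], function () {
  return palette[position++];
});
```

The result has the same shape as the Labels object above: one entry per label, each holding a color property.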

angular.loadJS('https://cdnjs.cloudflare.com/ajax/libs/sigma.js/1.1.0/sigma.min.js')
  .then(function () {
    initSigma();
    return angular.loadJS('https://rawgit.com/jacomyal/sigma.js/master/plugins/sigma.parsers.json/sigma.parsers.json.js');
  }, initGraph)
  .then(function () {
    return angular.loadJS('https://rawgit.com/jacomyal/sigma.js/master/plugins/sigma.neo4j.cypher/sigma.neo4j.cypher.js');
  })
  .then(function () {
    return angular.loadJS('https://rawgit.com/jacomyal/sigma.js/master/plugins/sigma.plugins.dragNodes/sigma.plugins.dragNodes.js');
  })
  .then(initGraph);

The last part of paragraph 2 loads the Sigma.js JavaScript dependencies one after another, then calls initGraph.
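
The same one-after-another loading can be written generically with a reduce over the list of URLs. This is a sketch under assumptions: loadInOrder is a hypothetical helper, and loader stands for any function that returns a Promise, such as angular.loadJS.

```javascript
// Hypothetical helper: calls loader(url) for each URL, strictly in order,
// starting each load only after the previous one has finished.
function loadInOrder(urls, loader) {
  return urls.reduce(function (chain, url) {
    return chain.then(function () {
      return loader(url);
    });
  }, Promise.resolve());
}

// Possible use in the notebook (not executed here):
//   loadInOrder([coreUrl], angular.loadJS)
//     .then(initSigma)
//     .then(function () { return loadInOrder(pluginUrls, angular.loadJS); })
//     .then(initGraph);
// coreUrl and pluginUrls are placeholders for the URLs listed above.
```

Loading in order matters here because the plugins extend the global sigma object, so the core library has to be present before they run.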

How the second paragraph will be displayed

The next step is to build a native Zeppelin interpreter that provides more features, such as:

  • Support for the Neo4j Bolt protocol instead of the REST APIs;
  • Cypher syntax highlighting;
  • Native graph visualization.

See you soon ;)
