An introduction to using Keras with Neo4j

David Mack
Feb 15, 2018

We demonstrate connecting a Neo4j graph database to Keras. We build a neural network achieving 100% test accuracy on a simple review prediction task.

Graph databases are a powerful way to store and analyse data. Often the relationships between things, for example between people, are as important as the properties of those things themselves. In a graph database it’s easy to store and analyse those relationships.

Thanks to their focus on relationships, graph databases enable many sorts of algorithms that are difficult to perform on a traditional SQL database. For example, finding the shortest path along roads between cities is a standard operation on a graph database, but is awkward to express and often slow on a SQL database.

Machine learning (‘ML’) is a powerful technology that’s being rapidly adopted by technology teams around the world. It’s able to solve problems that are very difficult to solve with hand-written code. For example, hand-written code was enough to beat the chess world champion in 1997, but it took machine learning for a computer to beat a professional Go player in 2015.

Building a graph-native ML system (‘graph ML’) has numerous benefits. Firstly, it allows the learning system to explore more of your data. Traditional learning systems train on a single table prepared by the researcher, whereas a graph-native system can access more than just that table. Secondly, graph ML can analyse the relationships between entities as well as their properties. This brings an additional dimension of information that graph ML can harness for prediction and categorisation.

There are not many resources on how to build graph-native machine learning systems. With Andrew Jefferson I’ve been researching different approaches. Over the upcoming months we’ll be sharing what we’ve found.

In this article I’ll demonstrate a very basic graph ML system that can solve a simple prediction challenge. Whilst this example barely scratches the surface of what is possible, it’s a good launchpad to introduce the technologies involved.

We’re going to be looking at product reviews. In our world there are people who write reviews of products. Here’s what this looks like in a graph:

In a graph database we can query information based on patterns. Neo4j, the database we’ll use here, uses a query language called Cypher. The above graph was generated by a simple query:


This looks for a node, of label PERSON, with a relationship of label WROTE, to a node of label REVIEW, with a relationship of label OF to a node of label PRODUCT. The qualifier “LIMIT 1” asks the database to just return one instance that matches this pattern.
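As a sketch, a query matching that description might look like the following, held in a Python string ready to pass to a Neo4j driver session. The variable names (person, review, product), the constant name EXAMPLE_QUERY and the exact layout are assumptions for illustration, not necessarily the query used to produce the graph above.

```python
# Hypothetical query built from the description above: a PERSON node that
# WROTE a REVIEW node that is OF a PRODUCT node, limited to one match.
EXAMPLE_QUERY = (
    "MATCH g = (person:PERSON)-[:WROTE]->(review:REVIEW)"
    "-[:OF]->(product:PRODUCT) "
    "RETURN g LIMIT 1"
)

print(EXAMPLE_QUERY)
```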

Neo4j implements a property graph model, in which nodes and relationships can have properties. This is a really flexible model allowing us to conveniently put data where we want.

In our example graph, reviews have a score property between 0 and 1:

review = {
"score": 1,
"id": "7409a120-c85e-4297-90a7-7a2e04cc5a43",

We’ve generated a very simple product review graph. It’s a synthetic dataset, generated from a probabilistic model we’ve designed for the purpose of learning about graph ML.

The challenge is to predict what review score a person will give to a product.

In our graph we have the following nodes with properties:

  • Person, with a style_preference vector of width 6 that one-hot encodes which product style they like
  • Product, with a style vector of width 6 that one-hot encodes which style that product is
  • Review, between person and product, with a floating-point score

Note on style and style_preference: These vectors are provided as one-hot encodings to keep the machine-learning model very simple. In a more realistic graph database these might be encoded as relationships.

Review scores are calculated as the dot product between a person’s style_preference and a product’s style.

For example, Jane with style_preference [0,1,0,0,0,0] reviews product Nintendo Switch with style [0,1,0,0,0,0], and her review score is 1.0. When she reviews a PS4, with style [1,0,0,0,0,0], her review score is 0.0.
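The same arithmetic can be written out in a few lines of plain Python. This is only a sketch of the scoring rule described above; the function name review_score is made up for illustration and is not taken from the article’s repository.

```python
def review_score(style_preference, style):
    """Dot product of a person's preference vector and a product's style vector."""
    if len(style_preference) != len(style):
        raise ValueError("vectors must have the same width")
    return float(sum(p * s for p, s in zip(style_preference, style)))


jane = [0, 1, 0, 0, 0, 0]             # Jane's style_preference
nintendo_switch = [0, 1, 0, 0, 0, 0]  # same style as Jane's preference
ps4 = [1, 0, 0, 0, 0, 0]              # a different style

print(review_score(jane, nintendo_switch))  # 1.0
print(review_score(jane, ps4))              # 0.0
```

Because both vectors are one-hot, the dot product is 1.0 exactly when the two styles match and 0.0 otherwise.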

Building your own graph ML system

You can download the code for this article from our public GitHub:

git clone

We’ve written this system in Python. Python is a popular choice amongst data scientists and AI researchers because it has many useful data analysis libraries, including TensorFlow, SciPy and NumPy. We’re going to use Keras for machine learning because it makes it very easy to write and train our simple network.

Let’s install our dependencies using pipenv:

$ cd article-0/
$ pip install pipenv
$ pipenv install
$ pipenv shell
(article-0) $

We’ve hosted a Neo4j database for you with the dataset already loaded. When you run ./ it will use our hosted database by default.

If you want to host the data locally, install Neo4j, and use our generate-data repository to generate dataset article-0 then update settings.json to point at your database.

You can inspect the data for yourself. Open the Neo4j browser with this username and password , then try the query MATCH g = (person) -[:WROTE]-> (review) -[:OF]-> (product) RETURN g LIMIT 100 :

The next step is to perform machine learning on the data in our graph database.

We’re going to use Keras for this (part of the TensorFlow project). Keras is a high-level wrapper around deep learning frameworks like TensorFlow. Keras lets you build and train a model with just a few lines of code, as the library takes care of much of the repetitive boilerplate.

The general scheme for our learning system is to pass in each (:PERSON)-->(:REVIEW)-->(:PRODUCT) path as a single vector, formed by concatenating the product’s style vector and the person’s style_preference vector, and ask the system to predict the review.score.

We’re going to build a two-hidden layer model — the input data is passed through two fully connected layers, then passed through a single node output layer. For a comprehensive introduction to feed-forward neural networks see some of the great tutorials online.

Let’s start putting together the code to query the database, so we can feed the data into Keras.

Keras provides a Sequence class for inputting data into the framework. We’ll create our own sub-class that is able to consume data from the Neo4j database.

Here’s the Cypher query we’ll use to get the input for the learning system:

MATCH p=
    (person:PERSON)
    -[:WROTE]->
    (review:REVIEW {dataset_name:{dataset_name}, test:{test}})
    -[:OF]->
    (product:PRODUCT)
RETURN person.style_preference + as x, review.score as y

As described earlier, we’re matching every (:PERSON)-->(:REVIEW)-->(:PRODUCT) path. We then concatenate the product’s style vector and person.style_preference to form a single vector, the x value (model input), and use review.score as the y value (prediction target).

We have two query parameters: dataset_name (which is ‘article_0’ for this experiment, and lets you keep multiple experiments together in one database) and test (the dataset contains separate data for training the model and for testing it; we’ll explain more about this later).

Thanks to our Cypher query, we’ve simplified the problem into row-based prediction, a familiar format in the machine-learning world. The goal of our machine learning system is to predict the y values, given the x values.

In we perform some boilerplate setting up the Neo4j driver, then query and package the data up for Keras. The data is split up into batches, as required by Keras:

data =, **self.query_params).data()
data = [ (np.array(i["x"]), i["y"]) for i in data]
# Split the data up into "batches"
data = more_itertools.chunked(data, self.batch_size)
# Format our batches in the way Keras expects them
 = [ (np.array([j[0] for j in i]), np.array([j[1] for j in i])) for i in data]
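To show what batching amounts to, here is a minimal, dependency-free sketch. Keras indexes a Sequence to fetch one batch and asks for its length to learn how many batches there are, so the stand-in class below implements just those two methods. The class name BatchedRows and the toy rows are assumptions for illustration. A real implementation would subclass the Keras Sequence class and return NumPy arrays, as in the snippet above.

```python
import math


class BatchedRows:
    """Stand-in for a Keras Sequence: holds (x, y) rows and serves them in batches."""

    def __init__(self, rows, batch_size):
        self.rows = rows              # list of (x, y) pairs, as produced by the query
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches, counting a final partial batch
        return math.ceil(len(self.rows) / self.batch_size)

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        start = idx * self.batch_size
        chunk = self.rows[start:start + self.batch_size]
        xs = [x for x, _ in chunk]    # batch of input vectors
        ys = [y for _, y in chunk]    # batch of prediction targets
        return xs, ys


# Toy rows: two concatenated one-hot vectors of width 2, and the matching score
rows = [([0, 1, 0, 1], 1.0), ([1, 0, 0, 1], 0.0), ([0, 1, 1, 0], 0.0)]
seq = BatchedRows(rows, batch_size=2)

print(len(seq))  # 2
print(seq[1])    # ([[0, 1, 1, 0]], [0.0])
```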

Now that we have the data ready, the final steps are to build a Keras model, then train it on our data.

Keras provides a very quick and easy API for building models. Here is the model we’ll use:

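from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(6, input_shape=(12,), activation='tanh'),  # first hidden layer
    Dense(6, activation='tanh'),                     # second hidden layer
    Dense(1, activation='tanh'),                     # single-node output layer
])
model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])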

The model has two dense hidden layers of width 6, each with tanh activation. These give it room to separately combine the corresponding elements from the style_preference and style vectors.

Then an output layer of width one is applied, also with tanh activation. The network at this point has generated a single value, the prediction for the review score.
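To make the shapes concrete, here is a plain-Python sketch of the same 12 → 6 → 6 → 1 layout with random, untrained weights. It is not the Keras implementation. The helper names (dense, random_layer, count_params) and the weight initialisation are made up for illustration.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer with tanh activation; weights[j][i] links input i to unit j."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def random_layer(n_in, n_out, rng):
    """Small random weights and zero biases for a layer of n_out units."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def count_params(layer_sizes):
    """Each layer has n_in * n_out weights plus n_out biases."""
    return sum(a * b + b for a, b in zip(layer_sizes, layer_sizes[1:]))


rng = random.Random(0)
sizes = [12, 6, 6, 1]
layers = [random_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

# Two concatenated one-hot vectors of width 6 give the 12 inputs
x = [0, 1, 0, 0, 0, 0] + [0, 1, 0, 0, 0, 0]
activation = x
for weights, biases in layers:
    activation = dense(activation, weights, biases)

print(len(activation))      # 1
print(count_params(sizes))  # 127
```

With only 127 trainable parameters, this is a very small network, which is consistent with how quickly it trains.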

With our model built, we compile it to use the popular Adam optimizer and mean squared error as the loss function.
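For reference, mean squared error is simply the average of the squared differences between the targets and the predictions. A minimal sketch in plain Python, with made-up example values:

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# One prediction is off by 0.5, the other is exact
print(mean_squared_error([1.0, 0.0], [0.5, 0.0]))  # 0.125
```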

As briefly touched upon earlier, we split our data up into separate training data and test data. The training data is used to train the neural network and the test data is used to evaluate the performance of the trained network.

We split the data in this way because neural networks are very good at memorising the answers to their training data, without learning to predict given new data. The separate test set lets us determine if the network has truly learned to predict for unseen data.
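In this dataset the split is stored in the graph itself: each review carries a test property, which the query above filters on. A sketch of the same partitioning in plain Python, assuming test is a boolean flag and using made-up review records:

```python
# Hypothetical review records; the ids and scores are illustrative only
reviews = [
    {"id": "r1", "score": 1.0, "test": False},
    {"id": "r2", "score": 0.0, "test": False},
    {"id": "r3", "score": 1.0, "test": True},
    {"id": "r4", "score": 0.0, "test": False},
]

# Training rows are those not flagged for testing; test rows are held out
train_rows = [r for r in reviews if not r["test"]]
test_rows = [r for r in reviews if r["test"]]

print(len(train_rows), len(test_rows))  # 3 1
```

The important property is that no review appears in both sets, so the test score reflects data the network has never seen.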

We use the training data to train the network:

seq_train = GraphSequence(args)
model.fit_generator(seq_train, epochs=10)

Then finally we evaluate its predictive capabilities using the test data:

seq_test = GraphSequence(args, test=True)
result = model.evaluate_generator(seq_test)
print(f"Accuracy: {round(result[1]*100)}%")
# Accuracy: 100.0%

The network achieves 100% accuracy on the test set: for every datum in the test set, it correctly predicts the review score. Training takes 12 seconds on a 2015 MacBook Pro without GPU acceleration (this is a very simple model!). Our challenge is complete!
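Since the scores in this dataset are exactly 0 or 1, one way to read ‘accuracy’ is as the fraction of predictions that land on the correct side of 0.5. The sketch below implements that reading in plain Python. It assumes the Keras accuracy metric behaves like a thresholded comparison for a single-unit output. That is an assumption about Keras rather than something shown in the code above, so check it against the version you use. The function name and example values are made up.

```python
def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of predictions that match the 0/1 target once thresholded."""
    hits = sum(
        1 for t, p in zip(y_true, y_pred)
        if (1.0 if p > threshold else 0.0) == t
    )
    return hits / len(y_true)


# Predictions need not be exact to count as correct
print(binary_accuracy([1.0, 0.0, 1.0, 0.0], [0.93, 0.04, 0.71, 0.48]))  # 1.0
print(binary_accuracy([1.0, 0.0], [0.93, 0.62]))                        # 0.5
```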

Since this network can predict the products that a person would be likely to review positively, it could be used to suggest purchases to that person.

You can see the complete code here.

We’ve demonstrated connecting a Neo4j graph database to Keras and a Cypher query that formats the data for training a neural network.

We’ve built a neural network achieving 100% test accuracy on the simple review prediction task.

Next steps

If you’re interested, try introducing new features into the Person and Product nodes that influence the review score. Also, try introducing some noise into the review score calculation and see how that affects the test accuracy. The neural network should be resilient to a moderate amount of noise, and should cope well with new features. If it’s struggling, try making the hidden layers a bit wider so it can perform more computations.
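As a starting point for the noise experiment, here is one possible way to perturb the score: add Gaussian noise to the dot product and clip the result back into the range 0 to 1. This is a sketch, not the generator used to build the dataset. The function name, the choice of Gaussian noise and the noise_std values are all assumptions.

```python
import random


def noisy_review_score(style_preference, style, noise_std, rng):
    """Dot-product score plus Gaussian noise, clipped to the range [0, 1]."""
    clean = sum(p * s for p, s in zip(style_preference, style))
    noisy = clean + rng.gauss(0.0, noise_std)
    return min(1.0, max(0.0, noisy))


rng = random.Random(42)               # fixed seed so runs are repeatable
jane = [0, 1, 0, 0, 0, 0]
nintendo_switch = [0, 1, 0, 0, 0, 0]

scores = [noisy_review_score(jane, nintendo_switch, 0.1, rng) for _ in range(5)]
print(scores)
```

Raising noise_std makes the targets less predictable, which gives a simple dial for testing how robust the network is.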

In our next articles, we will solve harder, more realistic graph problems. These will require graph-specific ML architectures.

We hope you enjoyed this article — please do let us know topics you’d like us to cover next, and feel free to ask questions.

