Getting Set Up with Kettle and Neo4j

Matt Casters
Dec 4, 2018

Neo4j is an awesome piece of technology. Running cutting-edge graph algorithms such as shortest path, community detection and centrality alongside transactional operations is only possible in a meaningful and performant way on a fully native graph database like Neo4j.

So how do you get started? Well, you need to load data into Neo4j first and this is where a data integration tool like Kettle might come in.

What is Kettle?

Kettle is an open source data integration (ETL) platform with the ability to visually design your work. It has been around for over 17 years, and in that time it has become quite mature, stable, high-performing and feature-rich.

From Kafka to Neo4j

For example, here is a visual representation of a Kettle “transformation” which does the following:

  • Receive messages from Kafka
  • Do look-ups in MongoDB
  • Write nodes and relationships to Neo4j

The technical aspect of setting up a transformation like this takes a few minutes at most so you can focus on things like data quality, the graph model, the requirements, data accuracy, performance and so on.

As you can imagine, doing this without any coding or scripting can save a lot of time, both during the set-up and in the later maintenance of your solution.
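For comparison, here is roughly what the three stages listed above would look like if coded by hand. This is only a sketch under loud assumptions: the message fields, the CUSTOMERS lookup table, the node labels and the BOUGHT relationship are all hypothetical, and in-memory stand-ins replace the real Kafka consumer, MongoDB collection and Neo4j session so that the example runs on its own.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical stand-in for a MongoDB collection keyed by customer id.
CUSTOMERS: Dict[str, Dict[str, str]] = {
    "c1": {"name": "Alice"},
    "c2": {"name": "Bob"},
}

# Hypothetical graph model: (Customer)-[:BOUGHT]->(Product).
MERGE_PURCHASE = (
    "MERGE (c:Customer {id: $customer_id}) "
    "SET c.name = $name "
    "MERGE (p:Product {sku: $sku}) "
    "MERGE (c)-[:BOUGHT]->(p)"
)


def receive_messages() -> Iterable[Dict[str, str]]:
    """Stage 1: stand-in for consuming messages from a Kafka topic."""
    yield {"customer_id": "c1", "sku": "kettle-01"}
    yield {"customer_id": "c2", "sku": "spoon-02"}


def look_up(message: Dict[str, str]) -> Dict[str, str]:
    """Stage 2: stand-in for a MongoDB look-up that enriches the message."""
    customer = CUSTOMERS.get(message["customer_id"], {})
    return {**message, "name": customer.get("name", "unknown")}


def to_cypher(row: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Stage 3: build the parameterized Cypher a Neo4j session would run."""
    params = {
        "customer_id": row["customer_id"],
        "name": row["name"],
        "sku": row["sku"],
    }
    return MERGE_PURCHASE, params


def run_pipeline() -> List[Tuple[str, Dict[str, str]]]:
    """Chain the three stages in the same order as the transformation."""
    return [to_cypher(look_up(message)) for message in receive_messages()]


if __name__ == "__main__":
    for statement, params in run_pipeline():
        print(statement, params)
```

Even this toy version leaves out connection handling, batching, error handling and retries, which is exactly the kind of plumbing a visual transformation takes off your hands.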

So how do you get started with Kettle itself and what about the Neo4j plugins?

Kettle download and installation

You can download Kettle (also known as “Pentaho Data Integration Community Edition”, which is waaay too long to say every time, so we just say “Kettle”) from the Pentaho project on SourceForge. Look for PDI-CE. The latest version right now is 8.1 (check for updates), and the download is about 1GB, bundled with all plugins!

Make sure a Java 8 or 9 runtime environment suitable for your computer system is properly installed. Java from Oracle or OpenJDK is recommended. Kettle runs fine on Windows, OSX or Linux.
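If you are not sure which Java you have, a small script can tell you. The sketch below is not part of Kettle; the function names are made up for illustration. It relies on two facts about the java launcher: `java -version` prints to standard error, and Java 8 and older report themselves as 1.x (for example "1.8.0_181") while Java 9 reports "9.x".

```python
import re
import subprocess
from typing import Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier use the legacy 1.x scheme, e.g. "1.8.0_181".
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def installed_java_major() -> Optional[int]:
    """Run `java -version`; return None if Java is missing or unparseable."""
    try:
        result = subprocess.run(
            ["java", "-version"],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        return None
    # The version banner goes to stderr; stdout is included just in case.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major in (8, 9):
        print(f"Java {major} found, good to go")
    else:
        print(f"Found Java version: {major}, expected 8 or 9")
```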

Now unzip the downloaded archive somewhere. It will give you an extra folder. This is all you need to do as far as Kettle is concerned.
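Any unzip tool will do for this step. If you prefer to script it, here is a minimal sketch using Python's standard zipfile module. The helper name and the paths in the usage example are hypothetical placeholders, so substitute the actual file you downloaded and the location you want.

```python
import zipfile
from pathlib import Path
from typing import Set


def extract_archive(archive: Path, destination: Path) -> Set[str]:
    """Unzip `archive` into `destination` and return the top-level names."""
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(destination)
        # The first path component of every entry is a top-level item.
        return {name.split("/")[0] for name in zf.namelist() if name.strip("/")}


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    downloaded = Path.home() / "Downloads" / "kettle-download.zip"
    target = Path.home() / "tools"
    print(extract_archive(downloaded, target))
```

One caveat: zipfile does not restore Unix permission bits, so on OSX or Linux you may need to make the extracted start scripts executable afterwards.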

Install the Neo4j plugins for Kettle

You can get the latest version of the plugins from the community project where the Neo4j Kettle plugins were first developed and where our improvements are done. From the releases, download the latest version archive: Neo4JOutput-<version>.zip

Unzip this where you placed your Kettle distribution in the folder.

Now the fun starts

Now you can start up the Kettle GUI, called Spoon. The naming is a silly pun on Kettle and anything kitchen related.

  • on Windows start
  • on OSX start or the app: “Data”
  • on Linux start

You will notice a welcome page with useful links:

The Spoon Welcome page

You can find the Neo4j plugins when you create a new transformation:

The Neo4j steps category in Spoon

Next steps

Here are a few things you can do to read up on Kettle and Neo4j and some pointers on where to get help:

Stay tuned for the next story in which I’ll be going over a few concrete examples of data loading into Neo4j.

Enjoy Kettle!


Neo4j Developer Blog

Developer Content around Graph Databases, Neo4j, Cypher…
