Graphs are data structures that are highly useful for understanding and representing many real-world problems in areas such as business, government, and science.
To take advantage of graph databases, we don’t need a Master’s in Graph Theory. We just need to understand what a graph is and be able to build one by drawing it on paper.
Mathematically speaking, a graph is just a collection of vertices and edges. Or, if you don’t like math, a set of nodes and the relationships that connect them. Graphs represent entities as nodes (vertices), and the ways those entities relate to each other are expressed as relationships (edges).
If you stop now and think about it, this structure allows us to model countless scenarios, from commercial systems to more complex problems such as optimization algorithms.
This graph model is formally known as a Property Graph. A property graph has the following characteristics:
- It contains nodes and relationships.
- Nodes contain properties (key-value pairs).
- Relationships are named and directed, and always have a start and end node.
- Relationships can also contain properties (key-value pairs).
Despite being intuitive and easy to understand, the property graph model can be used to describe almost all graph use cases.
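To make the four characteristics concrete, here is a minimal in-memory sketch in plain Ruby. It is only an illustration of the model, not how Neo4j stores data; the names Node, Relationship and PropertyGraph, as well as the sample people, are made up for this example.

```ruby
# A minimal in-memory property graph, for illustration only.
# Node, Relationship and PropertyGraph are hypothetical names, not Neo4j classes.
Node = Struct.new(:id, :properties)
Relationship = Struct.new(:type, :start_node, :end_node, :properties)

class PropertyGraph
  attr_reader :nodes, :relationships

  def initialize
    @nodes = []
    @relationships = []
  end

  # Nodes carry properties as key-value pairs.
  def add_node(properties = {})
    node = Node.new(@nodes.size + 1, properties)
    @nodes << node
    node
  end

  # Relationships are named (type), directed (start -> end),
  # and may carry their own properties.
  def relate(start_node, type, end_node, properties = {})
    rel = Relationship.new(type, start_node, end_node, properties)
    @relationships << rel
    rel
  end

  # Follow the outgoing relationships of a given type from a node.
  def outgoing(node, type)
    @relationships
      .select { |r| r.start_node.equal?(node) && r.type == type }
      .map(&:end_node)
  end
end

graph = PropertyGraph.new
alice = graph.add_node(name: "Alice")
bob   = graph.add_node(name: "Bob")
graph.relate(alice, :KNOWS, bob, since: 2014)

puts graph.outgoing(alice, :KNOWS).map { |n| n.properties[:name] }.inspect
```

Because relationships are directed, asking for Bob’s outgoing KNOWS relationships returns nothing, even though Alice knows Bob.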
As you are probably thinking, graph databases use the graph model to store data as a graph, with a structure consisting of vertices and edges, the two building blocks used to model any graph. In addition, you can use algorithms from the long history of graph theory to solve graph problems, often in less time than equivalent relational database queries on highly connected data. I’ll be covering some of them in my next posts.
Now, talking specifically about Neo4j: it is an open-source graph database supported by Neo Technology that stores data using the Property Graph model. It is reliable, with full ACID transactions; expressive, with a powerful, human-readable graph query language called Cypher; and simple, accessible through a convenient REST interface or an object-oriented Java API.
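As a small taste of what such algorithms look like, here is a sketch of breadth-first search, one of the classic graph algorithms, finding a path with the fewest hops between two nodes. It runs on a plain Ruby hash rather than on a database, and the friends data is invented for the example.

```ruby
# Breadth-first search over an adjacency list: returns one shortest path
# (fewest hops) from start to goal, or nil if the goal is unreachable.
def shortest_path(adjacency, start, goal)
  previous = { start => nil } # remembers how each node was first reached
  queue = [start]

  until queue.empty?
    current = queue.shift
    if current == goal
      # Walk the "previous" chain backwards to rebuild the path.
      path = []
      while current
        path.unshift(current)
        current = previous[current]
      end
      return path
    end

    adjacency.fetch(current, []).each do |neighbor|
      next if previous.key?(neighbor) # already visited
      previous[neighbor] = current
      queue << neighbor
    end
  end

  nil
end

# Made-up sample data: who is connected to whom.
friends = {
  "Alice" => ["Bob", "Carol"],
  "Bob"   => ["Dave"],
  "Carol" => ["Dave"],
  "Dave"  => ["Erin"]
}

puts shortest_path(friends, "Alice", "Erin").inspect
```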
Enough theory and talking for now. Let’s prepare our environment to play a little with Neo4j, and build a simple Rails application.
Installing Neo4j on development machines is very easy. If you are on OS X and are using brew, go ahead and issue
brew install neo4j in a terminal window.
Or, if you prefer, follow these five steps:
- Download the Neo4j Community package.
- Unzip on your installations folder, let’s say
- Create a symbolic link named neo4j to the unzipped folder. For instance:
ln -s ~/Applications/neo4j-community-2.1.2 ~/Applications/neo4j
- Create an environment variable named NEO4J_HOME pointing to this symbolic link.
- Change the PATH environment variable, adding the
This way, in the future, when you want to update the Neo4j database on your machine, you can just download the new version, unpack, and update the symbolic link pointing it to the new version.
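If you would rather script that link update than do it by hand, a rough Ruby sketch could look like the one below. The helper name repoint_symlink and the 2.1.3 folder are placeholders for this example, and the demonstration runs in a throwaway temporary directory instead of touching your real installation.

```ruby
require "fileutils"
require "tmpdir"

# Re-point a symbolic link at a new directory. Removing the old link first
# avoids creating the new link *inside* the directory the old one pointed to.
# repoint_symlink is a hypothetical helper name.
def repoint_symlink(link, new_target)
  File.delete(link) if File.symlink?(link)
  File.symlink(new_target, link)
end

# Demonstration in a temporary directory; the 2.1.3 folder is a placeholder.
Dir.mktmpdir do |dir|
  old_version = File.join(dir, "neo4j-community-2.1.2")
  new_version = File.join(dir, "neo4j-community-2.1.3")
  FileUtils.mkdir_p([old_version, new_version])

  link = File.join(dir, "neo4j")
  repoint_symlink(link, old_version) # first install
  repoint_symlink(link, new_version) # later upgrade

  puts File.readlink(link)
end
```

Anything that refers to the link, such as NEO4J_HOME, keeps working after the switch because the link path itself never changes.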
When you have it installed, open a terminal window and type neo4j start. This command will start the Neo4j server on your machine. Now go check it in your browser by accessing http://localhost:7474/. You’ll be presented with a super nice administration panel, where you can visualize the data stored in your Neo4j instance, manipulate data using the Cypher Query Language, check the instance configuration, and more.
Neo4j is built on top of Java and the rock-solid JVM. As we want to use (MRI) Ruby on Rails here, let’s connect our app using Neo4j’s REST API.
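To give an idea of what talking to that REST API involves under the hood, here is a sketch using only Ruby’s standard library. It assumes the transactional HTTP endpoint of Neo4j 2.x (/db/data/transaction/commit) and a server on the default port, so check the REST documentation for the version you installed. The helper names cypher_request and run_cypher are made up, and the neo4j gem does all of this for you.

```ruby
require "net/http"
require "json"
require "uri"

# Build (but do not send) a POST request that runs one Cypher statement.
# Assumption: the Neo4j 2.x transactional endpoint and its JSON payload shape.
def cypher_request(statement, parameters = {}, base_url = "http://localhost:7474")
  uri = URI.join(base_url, "/db/data/transaction/commit")
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request["Accept"] = "application/json"
  request.body = JSON.generate(
    statements: [{ statement: statement, parameters: parameters }]
  )
  [uri, request]
end

# Send the request and parse the JSON response. This needs a running
# Neo4j server, so it is defined here but not called.
def run_cypher(statement, parameters = {})
  uri, request = cypher_request(statement, parameters)
  response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
  JSON.parse(response.body)
end

# Only build and inspect the request, so the sketch runs without a server.
uri, request = cypher_request("MATCH (n) RETURN count(n) AS nodes")
puts uri
puts request.body
```

With the server started, calling run_cypher with the same statement should return the parsed JSON response.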
To make things simpler, we’ll use the awesome gem (surprisingly) called neo4j, from @andreasronge. Version 2.x is the stable version, but here we will use the gem directly from the master branch, where version 3 is under active development, and which enables us to use MRI Ruby and connect to Neo4j via its REST interface. If you are into JRuby, you can even use the stable version and connect using the embedded DB (by filesystem), which means a Neo4j instance running on the same JVM as your app.
But here we will use the first one. Go ahead, add the reference to your Gemfile and bundle install it:
gem 'neo4j', github: 'andreasronge/neo4j'
Let’s start with a dead simple app. Two models: Artist and Music. One artist can interpret many pieces of music, and a piece of music belongs to an artist.
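Independently of the gem, it can help to see what this model might look like as a graph. The sketch below only builds a Cypher statement and its parameters as plain Ruby values. The INTERPRETS relationship type, the name and title properties, and the helper name interpret_statement are assumptions made for illustration and are not taken from the demo app. The {artist} parameter syntax is the one used by Neo4j 2.x; newer versions use $artist instead.

```ruby
# Build a Cypher statement and its parameters for one artist/music pair.
# The INTERPRETS relationship type and the name/title properties are
# assumptions for this sketch, not the demo app's actual schema.
def interpret_statement(artist_name, music_title)
  statement = <<~CYPHER
    MERGE (a:Artist {name: {artist}})
    MERGE (m:Music {title: {title}})
    MERGE (a)-[:INTERPRETS]->(m)
    RETURN a.name, m.title
  CYPHER
  [statement, { artist: artist_name, title: music_title }]
end

# Sample values, chosen only for the example.
statement, parameters = interpret_statement("Nina Simone", "Feeling Good")
puts statement
puts parameters.inspect
```

MERGE matches an existing node or relationship, or creates it if it is missing, so running the statement twice for the same pair should not duplicate the artist.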
I will not paste the application code in this blog post, since we are using the alpha version of the neo4j gem and much of the code could become outdated quickly. Instead of replicating code here, you can check the live demo, which is running on Heroku, and the updated source code on my GitHub account.
Before you dive into the demo application code, just let me highlight some key points about using the neo4j gem in a Rails app. I bootstrapped the app with the Pah gem, and started learning (the hard way) how to make things work. So here are the main points that need your attention:
This blog post was intended to introduce you to the world of graph databases, giving some theory about graphs and some practical, hands-on experience with Neo4j and Rails. Although the graph model of the demo application looks very simple, much can be learned from it.
For future posts expect to read more about Neo4j, Cypher Query Language, and traversal algorithms.
So, what about learning by doing? I invite you to clone the sample app and start hacking on it right away! Add a feature, or improve the graph model in some way. Pull requests are welcome!
And remember: graphs are everywhere!
Originally published at tomasmuller.com.br on July 24, 2014.