If you’ve seen any of my previous blog posts (over at Kotan Code) then you’ll know that I’ve had quite a bit of experience with graph databases, graph database design, and more specifically, experience with Neo4j.
Lately I’ve been thinking about different ways to solve some storage problems and again graph databases came up. My problem with Neo4j is that out of the box (the free version), it doesn’t scale horizontally and none of the enterprise features are available. Worse, the enterprise version costs a lot of money… like Microsoft enterprise money.
So when I went looking, I stumbled on a server I’d never heard of before: OrientDB. Apparently it’s pretty huge in the EU but has yet to take hold in the US the way Neo4j has. What I found intriguing about OrientDB is that it claims to be the best of all worlds: a document-oriented database to rival MongoDB and a graph-oriented database to rival Neo4j, without having to make any sacrifices to accomplish this feat.
I’m a natural-born skeptic, so when someone makes a claim like this, I have to investigate. It’s been my experience that when something claims to be the best of all worlds, it ends up being mediocre at many things, having spread itself too thin. The software usually feels awkward because you can’t really tell what it wants to do best; it suffers from split personality syndrome.
One place where this flexibility shows up is schema enforcement: OrientDB lets you operate in full-schema, schema-less, and hybrid/mixed-schema modes. I opted for mixed schema, where I can choose the areas of tight enforcement and the areas that are loosely enforced or not enforced at all.
With OrientDB, you get two root classes (oh yeah, it’s got OO concepts like classes and inheritance, too…): the Vertex and the Edge. Seems pretty simple so far. I decided to go for a test domain of Interactive Fiction (remember Zork?!) where you’re essentially modeling an interactive book as a state tree, with transitions between states caused by player/reader choices. It seemed ideal for a sample graph, so I created two classes that inherit from Vertex: State and Examinable.
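For reference, the class setup described above might be expressed in OrientDB SQL roughly as follows. This is a sketch, not necessarily the exact commands used for this post. It assumes the built-in V and E base classes, the STRING property types are a guess, and Direction and Look are the edge classes that the statements further down rely on. The `--` lines are annotations for the reader; strip them if your console or driver rejects that comment style.

```sql
-- Vertex classes inherit from the built-in V class
CREATE CLASS State EXTENDS V
CREATE CLASS Examinable EXTENDS V

-- Edge classes inherit from the built-in E class
CREATE CLASS Direction EXTENDS E
CREATE CLASS Look EXTENDS E

-- Tight enforcement: both State descriptions are declared and mandatory
CREATE PROPERTY State.shortDescription STRING
ALTER PROPERTY State.shortDescription MANDATORY true
CREATE PROPERTY State.longDescription STRING
ALTER PROPERTY State.longDescription MANDATORY true

-- Loose enforcement: Examinable, Direction and Look declare no properties,
-- so fields like description, name and noun stay schema-less
```

If you wanted full-schema behavior for a class instead, OrientDB also has a strict mode that can be switched on per class (ALTER CLASS State STRICTMODE true), which rejects fields that were not declared.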
I could use their web GUI that looks as crisp and easy to use as Neo4j’s web GUI to graphically click around creating vertices and edges but I decided to do it in SQL to see what it felt like.
Here I create a vertex of class State and set two properties (both of which I’ve indicated are mandatory in the schema):
create vertex State set shortDescription = 'Foyer',
    longDescription = 'You find yourself in the foyer. There is a door heading north.'
After I’ve created a couple of state vertices, I can now create an edge. The SQL syntax for this is remarkably simple:
create edge Direction from #12:2 to #12:3 set name = 'north'
Now I want to create a new Examinable vertex and I want to use edges coming from the state to the examinable with a noun, to allow for commands like “examine light” or “examine lamp”:
create vertex Examinable set description = 'This is a lamp'
create edge Look from #12:2 to #15:0 set noun = 'light'
create edge Look from #12:2 to #15:0 set noun = 'lamp'
So now in my hypothetical interactive fiction game engine, if I want to gather up the list of things that can be looked at from the current state, I can just use a query like the following:
select noun from (traverse out_Look from #12:2) where @class = "Look"
The traverse command does what you’d expect: it follows a link or edge in a given direction (in my case “out”). Each record returned by the traversal exposes its class through the @class field, which is what lets me keep only the Look edges here.
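If your OrientDB version provides the expand() and outE() graph functions, the same list of nouns can probably be fetched without a traversal at all. This is an untested sketch that reuses the record ID #12:2 from above. As before, the `--` lines are reader annotations and may need to be stripped.

```sql
-- outE('Look') returns only the outgoing Look edges of the state,
-- so no @class filter is needed to drop the starting vertex
SELECT noun FROM (SELECT expand(outE('Look')) FROM #12:2)
```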
So now let’s assume someone enters “examine lamp” either by typing it or, if they’re on a mobile device, by using a radial menu where the nouns were populated with my previous query. To get the description of the lamp, I need to traverse out on the “Look” edge, filter my results where the noun is “lamp”, and then grab the description from that. The query looks like this:
select description from (
    traverse in from (
        select * from (traverse out_Look from #12:2) where noun = 'lamp'
    )
) where @class = 'Examinable'
So this looks a little awkward, especially when compared to the ASCII-art elegance of some Cypher queries I’ve written for Neo4j. But it’s still fairly easy to follow if you work outward from the innermost query.
First, I traverse the outbound “Look” edges and keep the ones whose noun is “lamp”. Then I traverse the in link from those edges (edges have an out and an in; the trick is figuring out which to use when), which gets me the Examinable vertex I want. I have to filter down to just the Examinable vertices because any OrientDB traversal always includes the origin/root node, so I have to skim that back off the top in order to get the description I want.
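One possible alternative, assuming an OrientDB version with the expand(), outE() and inV() graph functions, is to hop from edge to vertex explicitly instead of traversing. I have not verified this against a live server, so treat it as a sketch. The `--` lines are reader annotations that you may need to remove before running it.

```sql
-- Innermost: outgoing Look edges of the current state
-- Middle: keep the edge whose noun matches, then hop to its target vertex
-- Outermost: read the description from that Examinable vertex
SELECT description FROM (
  SELECT expand(inV()) FROM (
    SELECT expand(outE('Look')) FROM #12:2
  ) WHERE noun = 'lamp'
)
```

The nesting is about as deep as in the traverse version, but because inV() yields only the edge’s target vertex, there is no root record to filter back out with @class.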
Overall I’m impressed with OrientDB, and its performance metrics look pretty amazing: either on par with or better than Neo4j’s. I am using the SQL syntax but I’ve only had a few hours of exposure, so I’m betting the above query could be refactored into something more elegant by people who know more idiomatic ways of doing things.
I’m definitely going to continue looking into OrientDB as an alternative to Neo4j, especially given the control over schema enforcement, the object inheritance model that translates directly to binding to Java POJOs using the OrientDB API (OrientDB is written in Java), and obviously the price.