Learn Grakn with Java

Modelling the Simpsons’ Family Tree

Jo Stichbury
Vaticle
Mar 23, 2017

The Simpsons (Image by Loren Javier: CC BY-ND 2.0)

So far on this blog, we have mostly given coding examples that use our declarative query language, Graql, though we’ve also briefly covered Python, R and Haskell. In this blog post, I’m going to share some details of how to use Java in a basic example that can be extended as a template for your own projects. I’ll briefly describe how to get set up, then how to build an ontology, add data and make some queries. To dive further into developing on Grakn with Java, I recommend you also take a look at the documentation on the Client Java API.

What is Grakn?

Grakn is the database for AI. AI systems need a knowledge base to manage their data because they produce and consume more complex information than average software. Grakn is a database in the form of a distributed knowledge base with a reasoning query language that allows you to model, verify, scale, query and analyse complex data easily. By providing a query language that uses machine reasoning to uncover knowledge too complex to infer with human cognition, Grakn allows organisations to grow their competitive advantage, all the while reducing engineering time, cost, and complexity.

Why Java?

The Grakn codebase is written in Java, and is open-source and available on GitHub. You have a natural advantage if you use the Java APIs to work with a graph:

  • you can step into the code to investigate how it works.
  • you can work closer to the ‘metal’ of the platform.
  • you get to use Java, which is a widely-adopted, general-purpose language.

However, there is a drawback to working with Java: Graql was designed specifically for querying, so its keywords and syntax are more concise and intuitive than the comparatively verbose Java APIs.

Setting up

The first step is to download the latest version of Grakn, and unzip it into your preferred location. Please be aware that at present (on version 0.12.0) we support development on macOS and Linux only. We do aim to provide support for Windows in the future.

Grakn requires Java 8 (Standard Edition) with the $JAVA_HOME environment variable set accordingly. If you don’t already have it installed, you can find it here.

If you intend to build the Grakn codebase, or develop on top of it, you will also need Maven 3. All Grakn applications have the following Maven dependency:

This dependency will give you access to the Core API. It will also allow you to use an in-memory graph, which can be handy if you want to test something on the stack without needing a running instance of the Grakn server. However, in most cases, you will want to persist to the graph and access the entirety of the Grakn stack. For that, you need a running instance of Grakn engine, which means that you need to run the following in the terminal:

This will start:

  • an instance of Apache Cassandra, which currently serves as the supported backend for Grakn, although we may add others in future.
  • Grakn engine, which is an HTTP server providing batch loading, monitoring and the browser dashboard.

Useful commands

  • To start Grakn, run grakn.sh start
  • To stop Grakn, run grakn.sh stop
  • To remove all graphs from Grakn, run grakn.sh clean

Grakn Engine is configured by default to use port 4567, but this can be changed in the grakn-engine.properties file, found within the /conf directory of the installation.
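If your own application needs to know which port the engine is listening on, one option is to read it from a properties file and fall back to the default. The following is a minimal sketch in plain Java rather than anything from the Grakn codebase: the class name EnginePortDemo and the property key server.port are assumptions made for illustration, so check the key against your own grakn-engine.properties file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative helper; NOT part of the Grakn distribution.
class EnginePortDemo {

    // Default engine port, as stated in the text above.
    static final int DEFAULT_PORT = 4567;

    // ASSUMPTION: the property key is hypothetical; verify it in your own file.
    static final String PORT_KEY = "server.port";

    // Parse properties text and return the configured port, or the default if the key is absent.
    static int portFrom(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty(PORT_KEY, String.valueOf(DEFAULT_PORT));
        return Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        System.out.println(portFrom("server.port=4568\n")); // prints 4568
        System.out.println(portFrom(""));                   // prints 4567
    }
}
```

In a real project you would load the text from the file in the /conf directory instead of passing a string literal.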

For further information about getting set up, or if you run into any problems with the above, please see the Setup Guide in the Grakn documentation.

Your Java application will require the following dependency when it is running against a Titan backend, which is what is configured for you by default:

If you want your server to run against an OrientDB backend instead, replace titan-factory with orientdb-factory inside the <artifactId></artifactId> element. Please note that, at this time, OrientDB support is still in the early stages of development.

Hello Simpsons example

The example we will build takes inspiration from the genealogy example we have used throughout our documentation. We have kept it very simple (as close to a Hello World as you can get while still being useful as a template for creating and querying a graph) and we’ve used The Simpsons to provide family data. You can find it in our sample-projects repository on GitHub and it will also be built into the Grakn distribution zip file from our next release (v0.12.0).

If you are running the Grakn server locally then you can initialise a graph:

If you are running the Grakn server remotely, you must initialise the graph by providing the IP address and port number of your server as the first parameter:

Note that the string keyspace uniquely identifies the graph and allows you to create different graphs. The default keyspace is “Grakn”. The parameter is not case sensitive, so the following two graphs are the same:

When you obtain a graph from a factory, you are obtaining a transaction, which ultimately must be closed or aborted. The graphs are effectively singletons specific to their keyspaces, so be aware that in the following example, changes to graph1, graph2, or graph3 will all be persisted to the same graph:
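To make the two points above concrete (keyspaces are not case sensitive, and there is effectively one graph per keyspace), here is a minimal, self-contained sketch in plain Java. It does not use the Grakn library: KeyspaceRegistryDemo, ToyGraph and graphFor are hypothetical names invented for this illustration, and lower-casing the key is just one assumed way of getting case-insensitive behaviour.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy stand-in for a graph factory; NOT the Grakn API.
class KeyspaceRegistryDemo {

    // Hypothetical graph type that simply records the changes made to it.
    static final class ToyGraph {
        final String keyspace;
        final List<String> changes = new ArrayList<>();

        ToyGraph(String keyspace) {
            this.keyspace = keyspace;
        }
    }

    // One graph per normalised keyspace.
    private static final Map<String, ToyGraph> GRAPHS = new ConcurrentHashMap<>();

    // Normalise the keyspace so that "Simpsons" and "simpsons" resolve to the same graph.
    static ToyGraph graphFor(String keyspace) {
        String key = keyspace.toLowerCase(Locale.ROOT);
        return GRAPHS.computeIfAbsent(key, ToyGraph::new);
    }

    public static void main(String[] args) {
        ToyGraph graph1 = graphFor("Simpsons");
        ToyGraph graph2 = graphFor("simpsons");
        ToyGraph graph3 = graphFor("SIMPSONS");

        graph1.changes.add("add homer");
        graph3.changes.add("add marge");

        System.out.println(graph1 == graph2); // prints true
        System.out.println(graph2.changes);   // prints [add homer, add marge]
    }
}
```

All three variables refer to the same underlying object, which is why a change made through any one of them is visible through the others.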

Graph API: GraknGraph

The Graph API, GraknGraph, is a low-level API that encapsulates the Grakn knowledge model. It provides Java object constructs for the Grakn ontological elements (entity types, relation types, etc.) and data instances (entities, relations, etc.), allowing you to build up a graph programmatically. It is also possible to perform simple concept lookups using the graph API, which I’ll illustrate presently. First, let’s look at building up the graph.

Building an ontology

Let’s see how we can build a simple ontology using the Graph API. We will use the same ontology that the Basic Ontology documentation builds in Graql, which you may already be familiar with. If you’re not, the ontology is fully specified in Graql here. First we need a graph:

Building the ontology is covered in writeOntology(). I won’t reproduce it all here, but I will summarise the main parts. First, the method adds the resource types using putResourceType():

Then it adds roles using putRoleType():

Next, it adds the relation types using putRelationType(), followed by relates() to set the roles associated with the relation and resource() to state that it has a date resource:

Finally, entity types are added using putEntityType() followed by plays() and resource():

Now to commit the ontology:
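Taken together, the order of the steps in this section can be pictured with a small, self-contained sketch. It is a toy in-memory model written in plain Java, not the Grakn library: ToyOntology and ToyType are hypothetical classes that only borrow the method names mentioned above (putResourceType, putRoleType, putRelationType, putEntityType, relates, plays, resource), their signatures are assumptions, and the type names are illustrative picks in the spirit of the genealogy example.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy in-memory ontology; these classes are NOT the Grakn API.
class ToyOntologyDemo {

    // A named type that records the roles it relates or plays, and the resources it has.
    static final class ToyType {
        final String kind; // "resource", "role", "relation" or "entity"
        final String name;
        final Set<String> relates = new LinkedHashSet<>();
        final Set<String> plays = new LinkedHashSet<>();
        final Set<String> resources = new LinkedHashSet<>();

        ToyType(String kind, String name) {
            this.kind = kind;
            this.name = name;
        }

        ToyType relates(ToyType role) { relates.add(role.name); return this; }
        ToyType plays(ToyType role) { plays.add(role.name); return this; }
        ToyType resource(ToyType resourceType) { resources.add(resourceType.name); return this; }
    }

    static final class ToyOntology {
        final Map<String, ToyType> types = new LinkedHashMap<>();

        // In this toy, "put" returns the existing type if there is one, otherwise creates it.
        private ToyType put(String kind, String name) {
            return types.computeIfAbsent(name, n -> new ToyType(kind, n));
        }

        ToyType putResourceType(String name) { return put("resource", name); }
        ToyType putRoleType(String name) { return put("role", name); }
        ToyType putRelationType(String name) { return put("relation", name); }
        ToyType putEntityType(String name) { return put("entity", name); }
    }

    // Same order as described above: resource types, roles, relation types, entity types.
    static ToyOntology writeOntology() {
        ToyOntology ontology = new ToyOntology();

        ToyType firstname = ontology.putResourceType("firstname");
        ToyType date = ontology.putResourceType("date");

        ToyType spouse1 = ontology.putRoleType("spouse1");
        ToyType spouse2 = ontology.putRoleType("spouse2");

        ontology.putRelationType("marriage").relates(spouse1).relates(spouse2).resource(date);
        ontology.putEntityType("person").plays(spouse1).plays(spouse2).resource(firstname);

        return ontology;
    }

    public static void main(String[] args) {
        for (ToyType type : writeOntology().types.values()) {
            System.out.println(type.kind + " " + type.name);
        }
    }
}
```

The ordering matters in the same way as in the real method: roles and resource types have to exist before a relation type or entity type can refer to them.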

Loading data

Now that we have created the ontology, we can load in some data using the Graph API. We are going to add some Simpsons.

The example project does this in writeSampleRelation_Marriage(). First it creates a person entity named homer:

We can compare how a Graql statement maps to the Graph API. This is the equivalent in Graql:

The code goes on to create another person entity, named marge, and then marries them, setting an approximate date (the actual date that the Simpsons married has never been shared):
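The shape of that data (two entities, each with a name resource, joined by one relation that has two role players and a date) can be sketched with a toy in-memory model. As before, this is plain Java and not the Grakn library: ToyGraph, ToyEntity, ToyRelation and their methods are hypothetical, and the date is a placeholder rather than the value used in the sample project.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory data model; these classes are NOT the Grakn API.
class ToyDataDemo {

    // Placeholder only; NOT the date used in the sample project.
    static final String PLACEHOLDER_DATE = "1970-01-01";

    static final class ToyEntity {
        final String type;
        final Map<String, Object> resources = new LinkedHashMap<>();

        ToyEntity(String type) { this.type = type; }

        ToyEntity resource(String name, Object value) { resources.put(name, value); return this; }
    }

    static final class ToyRelation {
        final String type;
        final Map<String, ToyEntity> rolePlayers = new LinkedHashMap<>();
        final Map<String, Object> resources = new LinkedHashMap<>();

        ToyRelation(String type) { this.type = type; }

        ToyRelation rolePlayer(String role, ToyEntity player) { rolePlayers.put(role, player); return this; }
        ToyRelation resource(String name, Object value) { resources.put(name, value); return this; }
    }

    static final class ToyGraph {
        final List<ToyEntity> entities = new ArrayList<>();
        final List<ToyRelation> relations = new ArrayList<>();

        ToyEntity addEntity(String type) {
            ToyEntity entity = new ToyEntity(type);
            entities.add(entity);
            return entity;
        }

        ToyRelation addRelation(String type) {
            ToyRelation relation = new ToyRelation(type);
            relations.add(relation);
            return relation;
        }
    }

    // Create homer and marge, then marry them.
    static ToyGraph writeSampleMarriage() {
        ToyGraph graph = new ToyGraph();
        ToyEntity homer = graph.addEntity("person").resource("firstname", "Homer");
        ToyEntity marge = graph.addEntity("person").resource("firstname", "Marge");
        graph.addRelation("marriage")
             .rolePlayer("spouse1", homer)
             .rolePlayer("spouse2", marge)
             .resource("date", PLACEHOLDER_DATE);
        return graph;
    }

    public static void main(String[] args) {
        ToyGraph graph = writeSampleMarriage();
        System.out.println(graph.entities.size() + " entities, " + graph.relations.size() + " relation");
        // prints 2 entities, 1 relation
    }
}
```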

Querying the graph using GraknGraph

The runSampleQueries() method shows how to run a simple query using the GraknGraph API. The first query is “What are the instances of type person?”, which in Graql is simply match $x isa person;

Querying the graph using QueryBuilder

It is also possible to interact with the graph using a separate Java API that forms Graql queries. This is via GraknGraph.graql(), which returns a QueryBuilder object, discussed in the documentation here. It is useful to use QueryBuilder if you want to make queries using Java, without having to construct a string containing the appropriate Graql expression. Taking the same query: What are the instances of type person?

Which leads us to the common question…

When would you use GraknGraph and when would you use QueryBuilder?

Graph API:

If you are primarily interested in mutating the graph and doing simple concept lookups, the Graph API will be sufficient, for example for:

  • Manipulation, such as inserting Lisa and Maggie Simpson into the graph.
  • Advanced construction of an ontology, for example adding further relations to model Mr Burns as an employer.

QueryBuilder — the “Java Graql” API:

This is best for advanced querying where traversals are involved. For example “Who is married to Homer?” is too complex a query for the Graph API. Using a QueryBuilder:
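To see why the second kind of question is a traversal rather than a lookup, it can help to write both out by hand. The sketch below is plain Java over a toy data set and does not use GraknGraph or QueryBuilder: Person, Marriage, instancesOfPerson and marriedTo are hypothetical names for illustration. The first method is a single pass over one type; the second has to hop from a person, through a relation, to the other role player, which is the kind of pattern a query language expresses for you.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy comparison of a simple lookup and a traversal; NOT the Grakn API.
class ToyQueryDemo {

    static final class Person {
        final String firstname;
        Person(String firstname) { this.firstname = firstname; }
    }

    static final class Marriage {
        final Person spouse1;
        final Person spouse2;
        Marriage(Person spouse1, Person spouse2) {
            this.spouse1 = spouse1;
            this.spouse2 = spouse2;
        }
    }

    // Simple lookup: "What are the instances of type person?"
    static List<String> instancesOfPerson(List<Person> people) {
        List<String> names = new ArrayList<>();
        for (Person person : people) {
            names.add(person.firstname);
        }
        return names;
    }

    // Traversal: person -> marriage -> the other spouse.
    static List<String> marriedTo(String firstname, List<Marriage> marriages) {
        List<String> spouses = new ArrayList<>();
        for (Marriage marriage : marriages) {
            if (marriage.spouse1.firstname.equals(firstname)) {
                spouses.add(marriage.spouse2.firstname);
            } else if (marriage.spouse2.firstname.equals(firstname)) {
                spouses.add(marriage.spouse1.firstname);
            }
        }
        return spouses;
    }

    public static void main(String[] args) {
        Person homer = new Person("Homer");
        Person marge = new Person("Marge");
        List<Person> people = Arrays.asList(homer, marge);
        List<Marriage> marriages = Arrays.asList(new Marriage(homer, marge));

        System.out.println(instancesOfPerson(people));     // prints [Homer, Marge]
        System.out.println(marriedTo("Homer", marriages)); // prints [Marge]
    }
}
```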

Visualising the graph

Now is a good time to look at the graph! The Grakn visualiser provides a graphical tool to inspect and query your graph data. You can open the visualiser by navigating to localhost:4567 in your web browser. The visualiser allows you to make queries or simply browse the knowledge ontology within the graph.

The screenshot below shows a basic query (match $x isa person; offset 0; limit 100) typed into the form at the top of the main pane, and visualised by pressing “>” to submit the query:

You can zoom the display in and out, and move the nodes around for better visibility. Please see our Grakn visualiser documentation for further details.

What else can I do with this example?

This example has been created, as much as anything, as a template that you can take to form the basis of your own projects. Feel free to add some more people to the graph, or make some additional queries. If you need some ideas, you’ll find extra examples of using Java Graql in the Graql documentation for match, insert, delete and aggregate queries.

What else can I do with Grakn’s Java APIs?

We also provide APIs for migrating data into Grakn, which are discussed, with examples, in the documentation here.

There is also a Java client for loading large quantities of data into a Grakn graph using multithreaded batch loading. The loader client operates by sending requests to the Grakn REST Tasks endpoint and polling for the status of submitted tasks, so you no longer need to implement these REST transactions. The loader client additionally provides a number of useful features, including batching insert queries, blocking, and callback on batch execution status. Configuration options allow the user to fine-tune batch loading settings. Further information can be found in the documentation here.
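The general pattern behind those features (split the inserts into batches, run the batches on several threads, report back per batch, and block until everything has finished) can be sketched with the standard java.util.concurrent classes. This is not the Grakn loader client and makes no REST calls: ToyBatchLoaderDemo, toBatches and load are hypothetical names, and the batch size and thread count are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Toy batch loader built on the JDK only; NOT the Grakn loader client.
class ToyBatchLoaderDemo {

    // Split the queries into consecutive batches of at most batchSize.
    static List<List<String>> toBatches(List<String> queries, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < queries.size(); i += batchSize) {
            int end = Math.min(i + batchSize, queries.size());
            batches.add(new ArrayList<>(queries.subList(i, end)));
        }
        return batches;
    }

    // Run each batch on a thread pool, call onBatchDone per batch, and block until all are done.
    // Returns the number of batches that were submitted.
    static int load(List<String> queries, int batchSize, int threads, Consumer<List<String>> onBatchDone) {
        List<List<String>> batches = toBatches(queries, batchSize);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch done = new CountDownLatch(batches.size());
        try {
            for (List<String> batch : batches) {
                pool.submit(() -> {
                    try {
                        // A real loader would send the batch to the server at this point.
                        onBatchDone.accept(batch);
                    } finally {
                        done.countDown();
                    }
                });
            }
            done.await(); // blocking: wait for every batch to finish
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for batches", e);
        } finally {
            pool.shutdown();
        }
        return batches.size();
    }

    public static void main(String[] args) {
        List<String> queries = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            queries.add("insert $x" + i + " isa person;");
        }
        AtomicInteger loaded = new AtomicInteger();
        int batches = load(queries, 4, 2, batch -> loaded.addAndGet(batch.size()));
        System.out.println(batches + " batches, " + loaded.get() + " queries loaded");
        // prints 3 batches, 10 queries loaded
    }
}
```

The callback runs on the worker threads, so anything it touches (here an AtomicInteger) needs to be safe for concurrent use.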

What other languages can I use with Grakn?

At present, we have some support for Python, R and Haskell, although coverage is incomplete at this stage of development.

Where can I find out more?

Image from http://simpsons.wikia.com/wiki/List_of_chalkboard_gags

To find out more, take a look at our documentation on the Java APIs, and our growing set of Javadocs, which are under development.

We are always happy to help. A good way to ask questions is via our Slack channel. We also have a discussion forum. For news, sign up for our community newsletter and — if you’d like to meet us in person — we run regular meetups.

If you enjoyed this post, please take the time to recommend it or leave us a comment. I would like to thank Borislav Iordanov and Nicholas D for their contributions, and I am particularly grateful to Filipe Pinto Teixeira for writing the code and Grakn Labs documentation it uses.
