TITAN DB SETUP WITH CASSANDRA

Knoldus Inc.

Published in Knoldus - Technical Insights, Jun 20, 2016

CONNECTING AND CONFIGURING TITAN DB WITH CASSANDRA

Step 1:
Download Cassandra, version apache-cassandra-3.5.
Download Titan DB, version titan-1.0.0-hadoop1.

Step 2:
Extract both downloads, for example into
/var/lib/cassandra and /var/lib/titan respectively.

Step 3: Configure Cassandra:
If you’ve installed Cassandra with a deb or rpm package, the directories that Cassandra uses should already exist and have the correct permissions. Otherwise, check the following settings in conf/cassandra.yaml: data_file_directories (/var/lib/cassandra/data), commitlog_directory (/var/lib/cassandra/commitlog), and saved_caches_directory (/var/lib/cassandra/saved_caches). Make sure these directories exist and are writable by the user that runs Cassandra.
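
The directory check above can be scripted. This is a minimal sketch, assuming the default paths quoted in the paragraph; check_dir is a hypothetical helper name, not part of Cassandra.

```shell
# check_dir (hypothetical helper): report whether a path exists as a
# directory and is writable by the current user.
check_dir() {
    if [ ! -d "$1" ]; then
        echo "MISSING: $1"
        return 1
    fi
    if [ ! -w "$1" ]; then
        echo "NOT WRITABLE: $1"
        return 1
    fi
    echo "OK: $1"
}

# Example call (paths assumed from the defaults above):
#   for d in /var/lib/cassandra/data /var/lib/cassandra/commitlog \
#            /var/lib/cassandra/saved_caches; do check_dir "$d"; done
```

Run it as the same user that will start Cassandra, since writability depends on who is asking.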

By default, Cassandra writes its logs to /var/log/cassandra/. Make sure this directory exists and is writable, or change this line in conf/log4j-server.properties (used by Cassandra versions before 2.1):

log4j.appender.R.File=/var/log/cassandra/system.log

Note that in Cassandra 2.1+ (which includes the 3.5 release used here), logging is handled by logback, so change the log location in conf/logback.xml instead, for example:

<file>/var/log/cassandra/system.log</file>

JVM-level settings such as heap size can be set in conf/cassandra-env.sh.
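
As an illustration, conf/cassandra-env.sh ships with two heap variables commented out; uncommenting them overrides the automatic sizing. The sizes below are placeholders rather than recommendations, and the two variables are meant to be set or unset as a pair.

```shell
# Fragment for conf/cassandra-env.sh (values are illustrative only).
# Total JVM heap for Cassandra:
MAX_HEAP_SIZE="4G"
# Size of the young generation:
HEAP_NEWSIZE="800M"
```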

Step 3 (continued): Start Cassandra

And now for the moment of truth: start Cassandra by invoking ‘bin/cassandra -f’ from the command line. The service should start in the foreground and log verbosely to the console. Assuming you don’t see messages containing words like “error” or “fatal”, or anything that looks like a Java stack trace, everything should be working.

Press “Control-C” to stop Cassandra.
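
If Cassandra is not running in the foreground, the same check for alarming words can be made against the log file. This is a rough sketch: scan_log is a hypothetical helper, and the pattern is deliberately broad, so it may also flag harmless lines that merely mention one of the words.

```shell
# scan_log (hypothetical helper): print log lines that mention error, fatal,
# or exception (case-insensitive), with line numbers.
# Returns 1 if any such line was found, 0 otherwise.
scan_log() {
    if grep -n -i -E 'error|fatal|exception' "$1"; then
        return 1
    fi
    return 0
}

# Example call (log path taken from the logging settings above):
#   scan_log /var/log/cassandra/system.log && echo "log looks clean"
```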

Step 4:
Configure Titan to use Cassandra as its storage backend:

You can use Titan with Cassandra in two ways:
4.1) The default: use the Cassandra instance bundled with Titan, accessed through the cassandra or cassandrathrift backend. The startup script forks Cassandra, checks it with nodetool, and also starts Elasticsearch (the index backend) and Gremlin Server automatically.

To run Titan with the default configuration, run:
bin/titan.sh start
Forking Cassandra…
Running `nodetool statusthrift`. OK (returned exit status 0 and printed string “running”).
Forking Elasticsearch…
Connecting to Elasticsearch (127.0.0.1:9300)…. OK (connected to 127.0.0.1:9300).
Forking Gremlin-Server…
Connecting to Gremlin-Server (127.0.0.1:8182)…. OK (connected to 127.0.0.1:8182).
Run gremlin.sh to connect.

Then connect with the Gremlin console:
bin/gremlin.sh
         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: aurelius.titan
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
11:05:19 INFO org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph - HADOOP_GREMLIN_LIBS is set to: /var/lib/titan/titan-1.0.0-hadoop1/lib
plugin activated: tinkerpop.hadoop
plugin activated: tinkerpop.tinkergraph
gremlin> Gremlin.version()
==>3.0.1-incubating
gremlin> Titan.version()
==>1.0.0

4.2) The second way is to configure Titan to use your separately installed Cassandra as its backend.

To point Titan at your installed Cassandra, edit the properties file in Titan's conf folder.

Edit conf/titan-cassandra-es.properties to be as follows:

gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
# Other values: cassandrathrift, astyanax (synonym: cassandra), embeddedcassandra, inmemory
storage.backend=cassandra
storage.hostname=127.0.0.1

# Index backend
index.search.backend=elasticsearch
index.search.directory=/tmp/es
index.search.elasticsearch.local-mode=true
index.search.elasticsearch.client-only=false
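
Before opening the graph, it can help to confirm which backend and host the file actually sets. A minimal sketch follows; get_prop is a hypothetical helper that handles only simple key=value lines, not the full Java properties syntax (no line continuations or escapes).

```shell
# get_prop (hypothetical helper): print the value of a key from a
# .properties file, skipping comment lines. $1 = key, $2 = file.
get_prop() {
    # Escape dots so the key is matched literally in the sed pattern.
    key=$(printf '%s\n' "$1" | sed 's/\./\\./g')
    grep -v '^[[:space:]]*#' "$2" |
        sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" |
        tail -n 1
}

# Example call:
#   get_prop storage.backend conf/titan-cassandra-es.properties
```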

Start Cassandra:
In another terminal window, go to the Cassandra bin directory and start Cassandra with the cassandra -f command:

➜ ~ cassandra -f

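Next, enable Cassandra's Thrift server using nodetool. Titan 1.0 talks to Cassandra over Thrift, which is not running after a default start of this Cassandra version, as the first status check shows.

From the Cassandra installation directory: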
>bin/nodetool statusthrift
not running

>/var/lib/cassandra/apache-cassandra-3.5$ bin/nodetool enablethrift

>/var/lib/cassandra/apache-cassandra-3.5$ bin/nodetool statusthrift
running
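
The status check and the enable step can be combined so that Thrift is only enabled when needed. This sketch assumes nodetool prints exactly "running" or "not running", as in the transcript above; ensure_thrift is a hypothetical helper name.

```shell
# ensure_thrift (hypothetical helper): enable the Thrift server only if
# `nodetool statusthrift` does not already report "running".
# $1 = path to the nodetool executable.
ensure_thrift() {
    nodetool_bin=$1
    if [ "$("$nodetool_bin" statusthrift)" = "running" ]; then
        echo "thrift already running"
    else
        "$nodetool_bin" enablethrift && echo "thrift enabled"
    fi
}

# Example call:
#   ensure_thrift /var/lib/cassandra/apache-cassandra-3.5/bin/nodetool
```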

Start your Titan server:
You do not need to run bin/titan.sh start this time, because you have already started Cassandra and enabled Thrift manually (and, if required, Elasticsearch or Solr).

Connect to Titan DB using the Gremlin Console:
bin/gremlin.sh
         \,,,/
         (o o)
-----oOOo-(3)-oOOo-----
plugin activated: aurelius.titan
plugin activated: tinkerpop.server
plugin activated: tinkerpop.utilities
plugin activated: tinkerpop.hadoop
plugin activated: tinkerpop.tinkergraph

gremlin> Gremlin.version()
==>3.0.1-incubating

gremlin> Titan.version()
==>1.0.0

// Opening the graph with this configuration creates the graph and its keyspace in Cassandra
gremlin> g = TitanFactory.open('conf/titan-cassandra-es.properties')
==>standardtitangraph[cassandra:[127.0.0.1]]

Now let's see the result in Cassandra.
Check the Cassandra log, or open cqlsh, and you will see that a titan keyspace has been created:
cqlsh> describe keyspaces;

titan system_auth mykeyspace system_traces
system_schema system system_distributed

cqlsh> use titan;
cqlsh:titan> describe tables;

titan_ids edgestore system_properties_lock_
edgestore_lock_ graphindex_lock_ graphindex
txlog systemlog system_properties
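
The keyspace check can also be scripted rather than read off the screen. This is a sketch that assumes cqlsh is on the PATH and lists keyspace names separated by whitespace, as in the output above; has_keyspace is a hypothetical helper.

```shell
# has_keyspace (hypothetical helper): read `describe keyspaces` output on
# stdin and succeed only if the exact keyspace name $1 is listed.
has_keyspace() {
    # Put every name on its own line, then match the whole line literally.
    tr -s '[:space:]' '\n' | grep -q -x -F "$1"
}

# Example call:
#   cqlsh -e 'describe keyspaces' | has_keyspace titan && echo "titan keyspace exists"
```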

Now let's return to the Gremlin console:

gremlin> g1 = g.traversal()
==>graphtraversalsource[standardtitangraph[cassandra:[127.0.0.1]], standard]

gremlin> g1.V()
gremlin> g1.V().count()
==>0

Create vertices:

[code language=”scala”]
gremlin> v1 = g.addVertex(T.label, "person", "name", "marko", "age", 29)
gremlin> v2 = g.addVertex(T.label, "software", "name", "lop", "lang", "java")

// Create an edge between the two vertices created above
gremlin> v1.addEdge("created", v2, "weight", 4)
==>e[2rl-360-4r9-38g][4104-created->4192]

// Commit the transaction so the new elements are persisted to Cassandra
gremlin> g.tx().commit()

gremlin> g1.V().has('name','marko').values('name')
==>marko
gremlin> g1.V().has('name','marko')
==>v[4104]
[/code]

Now let's look at the stored data in Cassandra.
In cqlsh, switch to the titan keyspace and query the titan_ids table:
cqlsh> use titan;
cqlsh:titan> select * from titan_ids;

[code language=”scala”]
key | column1 | value
--------------------+------------------------------------------------------------------------------------------------------+-------
0x0000000000000003 | 0xfffffffffffec77f000535233381db083766303030313031373330372d6b6e6f6c6475732d566f7374726f2d3335353832 | 0x
0x6000000000000003 | 0xfffffffffffec77f0005352337d4aac83766303030313031373330372d6b6e6f6c6475732d566f7374726f2d3335353832 | 0x
0x6000000000000000 | 0xffffffffffffd8ef0005352337cf82783766303030313031373330372d6b6e6f6c6475732d566f7374726f2d3335353832 | 0x
0x0000000000000004 | 0xffffffffffffff9b00053523337cc2583766303030313031373330372d6b6e6f6c6475732d566f7374726f2d3335353832 | 0x
0x0000000000000004 | 0xffffffffffffffcd0005352333779a083766303030313031373330372d6b6e6f6c6475732d566f7374726f2d3335353832 | 0x
0x0800000000000000 | 0xffffffffffffd8ef000535233387b7083766303030313031373330372d6b6e6f6c6475732d566f7374726f2d3335353832 | 0x
0x0800000000000003 | 0xfffffffffffec77f00053523338cd7883766303030313031373330372d6b6e6f6c6475732d566f7374726f2d3335353832 | 0x
[/code]

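Some test analysis in Gremlin using the built-in graphs. (This transcript appears to come from an earlier TinkerPop 3 milestone; in 3.0.1, a traversal source is obtained with graph.traversal() rather than g.of().)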
gremlin> g = TinkerFactory.createClassic()
==>tinkergraph[vertices:6 edges:6]
gremlin> input = g.V(2,3,4).toList()
==>v[2]
==>v[3]
==>v[4]
gremlin> sm = g.of().inject(input).unfold().in().groupCount().next().sort { -it.getValue() }
==>v[1]=3
==>v[4]=1
==>v[6]=1
gremlin> m = sm.iterator().next().getValue()
==>3
gremlin> result = sm.grep { it.getValue() == m }*.getKey()
==>v[1]

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

To write a vertex from Titan in JSON (GraphSON) format:

gremlin> graph = TitanFactory.open('conf/titan-cassandra-es.properties')
==>standardtitangraph[cassandra:[127.0.0.1]]
gremlin> g = graph.traversal()
==>graphtraversalsource[standardtitangraph[cassandra:[127.0.0.1]], standard]
gremlin> f = new FileOutputStream("vertex-1.json")
==>java.io.FileOutputStream@702b55af
gremlin> graph.io(graphson()).writer().create().writeVertex(f, g.V(4120).next(), BOTH)
==>null
gremlin> f.close()
==>null
Output:

[code language=”scala”]
{"id":4120,"label":"users","properties":{"birthdate":[{"id":"1z7-36g-35x","value":"02-09-1991"}],"last_activity":[{"id":"35v-36g-5j9","value":"0512541454"}],"last_longitude":[{"id":"3yb-36g-745","value":"263.2365"}],"last_elivation":[{"id":"4cj-36g-7wl","value":"452.3220"}],"gender":[{"id":"2df-36g-3yd","value":"female"}],"nickname":[{"id":"1kz-36g-2dh","value":"testuser2"}],"registration":[{"id":"4qr-36g-8p1","value":"4569856455"}],"id":[{"id":"16r-36g-1l1","value":"user2"}],"picture":[{"id":"2rn-36g-4qt","value":"/var/lib/titan/pictures/user2.png"}],"last_latitude":[{"id":"3k3-36g-6bp","value":"265.245"}]}}
[/code]

You can inspect the output more easily at:
http://jsonviewer.stack.hu/

To export the graph in XML (GraphML) format and view it in Cytoscape:
g = TitanFactory.open('cassandra:127.0.0.1')
os = new FileOutputStream("UserGraph.xml")
g.io(graphml()).writer().normalize(true).create().writeGraph(os, g)
os.close()


Written by Knoldus Inc.