Elasticsearch + Cassandra = Elassandra

Searching data stored in Cassandra is difficult. Elassandra is one way to solve that.

Elassandra is a fork of Elasticsearch modified to run as a plugin for Apache Cassandra in a scalable and resilient peer-to-peer architecture. Elasticsearch code is embedded in Cassandra nodes, providing advanced search features on Cassandra tables, and Cassandra serves as the Elasticsearch data and configuration store.

Elassandra “Hello World” Example in Docker

  1. docker run --name elassandra-1 -d strapdata/elassandra
  2. docker run --name elassandra-2 -d -e CASSANDRA_SEEDS="$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' elassandra-1)" strapdata/elassandra
  3. docker exec -it elassandra-1 bash
  4. cqlsh
  5. create keyspace if not exists twitter with replication={ 'class': 'NetworkTopologyStrategy', 'DC1': '1' };
  6. create table twitter.user ( 
     username text, 
     mail text, 
     attrs map<text, text>, 
     primary key (username) 
    );

  7. exit
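
Before indexing, it can help to confirm that both nodes have joined the ring. The following is a minimal sketch, assuming the usual nodetool status output in which each node line starts with a two-letter state such as UN (Up/Normal). The helper name count_up_nodes is hypothetical and not part of Elassandra.

```shell
# Hypothetical helper: count the nodes reported as Up/Normal ("UN").
# Reads `nodetool status` output on stdin and prints a number.
count_up_nodes() {
  awk '$1 == "UN" { n++ } END { print n + 0 }'
}

# Usage, from the container shell (needs a running node, so not run here):
#   nodetool status | count_up_nodes
```

With the two containers started above, a result of 2 would suggest that both nodes are up.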

Let’s index our Cassandra table with Elasticsearch:

  1. Create an index for our Cassandra table in Elasticsearch.
curl -XPUT "http://localhost:9200/twitter" -d '{
  "settings" : { "keyspace" : "twitter" },
  "mappings" : {
    "user" : {
      "properties" : {
        "username" : {
          "type" : "string",
          "index" : "not_analyzed"
        },
        "mail" : {
          "type" : "string",
          "index" : "not_analyzed"
        }
      }
    }
  }
}'

And the response of the PUT request will be something like this:

{"acknowledged":true,"shards_acknowledged":true}
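
When these steps run from a script, the response can be checked instead of read by eye. This is a rough sketch that only looks for the "acknowledged" flag shown above. The helper name is_acknowledged and the file body.json (holding the request body) are assumptions made for illustration.

```shell
# Hypothetical helper: succeed only if the JSON on stdin
# contains "acknowledged": true.
is_acknowledged() {
  grep -q '"acknowledged"[[:space:]]*:[[:space:]]*true'
}

# Usage (needs a running node, so not run here):
#   curl -s -XPUT "http://localhost:9200/twitter" -d @body.json \
#     | is_acknowledged && echo "index created"
```

The pattern requires a quote directly before the word, so "shards_acknowledged" alone does not satisfy it.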

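  2. Let Elassandra discover the table's columns. The “discover” value is a regular expression, and Cassandra columns whose names match it are added to the mapping.

curl -XPUT "http://localhost:9200/twitter/_mapping/user" -d '{
  "user" : {
    "discover" : "[a-zA-Z].*",
    "properties" : {
      "name" : {
        "type" : "string",
        "index" : "analyzed"
      }
    }
  }
}'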
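
To see which columns a discover expression would pick up, the pattern can be tried against the column names locally. This sketch assumes that the whole column name has to match the expression, and it uses grep's extended regular expressions, which agree with Java's for a simple pattern like this one. The helper name matching_columns and the column _hidden are hypothetical.

```shell
# Hypothetical helper: print the names that fully match a pattern.
# $1 is the regular expression, the remaining arguments are column names.
matching_columns() {
  pattern=$1
  shift
  printf '%s\n' "$@" | grep -E -x -- "$pattern"
}

# The three columns of twitter.user match; a name starting with
# an underscore does not.
matching_columns '[a-zA-Z].*' username mail attrs _hidden
```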

Let’s add some records to Cassandra:

  1. cqlsh
  2. insert into twitter.user (username, mail, attrs) values ('cilesizemre', 'cilesizemre@gmail.com', {'name': 'Emre', 'surname': 'Cilesiz'});
  3. exit

Let’s search:

curl 'localhost:9200/twitter/user/_search?pretty=true&q=mail:cilesizemre@gmail.com'

And the response will be something like this:

{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "twitter",
        "_type" : "user",
        "_id" : "cilesizemre",
        "_score" : 0.2876821,
        "_source" : {
          "mail" : "cilesizemre@gmail.com",
          "attrs" : {
            "name" : "Emre",
            "surname" : "Cilesiz"
          },
          "username" : "cilesizemre"
        }
      }
    ]
  }
}
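
The search above puts the query straight into the URL, which works for this address but can break once a query contains spaces or other special characters. A small sketch of the same request with curl doing the URL encoding follows. The helper name search_users is hypothetical, and the host, index and type are the ones used in this post.

```shell
# Hypothetical helper: search twitter/user with a URL-encoded query string.
# $1 is the query, e.g. 'mail:cilesizemre@gmail.com'.
search_users() {
  # -G sends the --data-urlencode values as URL query parameters (GET).
  curl -s -G "http://localhost:9200/twitter/user/_search" \
    --data-urlencode "pretty=true" \
    --data-urlencode "q=$1"
}

# Usage (needs a running node, so not run here):
#   search_users 'mail:cilesizemre@gmail.com'
```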

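As a final note: don’t forget that Elassandra builds its Elasticsearch indexes on top of Cassandra secondary indexes, so writes to an indexed table also have to update the index. This can cause performance problems.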