Elassandra

Cassandra + Elasticsearch, The marriage made in heaven

λ.eranga
Apr 18, 2018 · 3 min read

Cassandra

About cassandra

Cassandra is a distributed, highly available data store. It has high write throughput, and any node in the cluster can accept writes. In CAP terms, Cassandra can be classified as an Available and Partition-tolerant (AP) system that supports eventual consistency.

Concerns

  1. It does not support full-text search
  2. Secondary indexes, which are used for additional search paths, add overhead

Elasticsearch

About elasticsearch

Elasticsearch is a powerful search engine built on Lucene indexes. It stores documents as JSON objects and provides full-text search over the document data. When storing a document, it maps the individual fields of the document into an indexed form. In CAP terms, Elasticsearch can be classified as a Consistent and Available (CA) system (it is not fully partition tolerant).

Concerns

  1. Writes are not masterless: each write must go through the primary copy of a shard, so not every node can accept it independently
  2. Not ideal as primary data storage (Elasticsearch appears to lose writes, both updates and even non-conflicting inserts)
  3. No multi-datacenter clustering (it has to be achieved with an additional tool such as Kafka)

Elassandra

Cassandra + Elasticsearch = Elassandra

Elassandra embeds Elasticsearch into Cassandra as a plugin, so the features of both are available in one system. The drawbacks of Cassandra and Elasticsearch mentioned above can be overcome with Elassandra. Some notable features:

  1. Full-text search on Cassandra data
  2. JSON REST API to access Cassandra data
  3. Search across multiple keyspaces in one query

Architecture

Each Elassandra node runs an Elasticsearch instance and a Cassandra instance. It provides both the cqlsh query interface and a REST API to manipulate the data. When data is written to Cassandra, it is automatically (synchronously) indexed in Elasticsearch via a custom secondary index. Written data first goes to an in-memory Lucene index and is periodically flushed to disk alongside the Cassandra SSTables. Data written to Elasticsearch becomes searchable after the index is refreshed. By default, Elasticsearch refreshes its indexes every second (the default refresh_interval is 1 second). An Elassandra cluster follows the same masterless ring architecture as Cassandra and replicates data according to the replication factor.

In Elassandra, the original JSON documents are stored in Cassandra. Only the Lucene indexes are stored in Elasticsearch (there is no Elasticsearch _source). An Elasticsearch index can also be built from existing Cassandra data (basically, the index is built from the data in the SSTables).

More information about Elassandra can be found in the references at the end of this post.

Elassandra example

The following is an example scenario for Elassandra, in which I use Elassandra to build a REST API over Cassandra data. I create the data via cqlsh on Cassandra and search it via the Elasticsearch REST API.

Configure Elassandra

Here I create the Cassandra keyspace and tables, then sync the Cassandra keyspace into an Elasticsearch index. Finally, I populate the data via the Cassandra CQL API.

1. Run elassandra

2. Create keyspace
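
As an illustration of this step, here is a minimal sketch that only builds the CQL statement to paste into cqlsh. The keyspace name storage_documents comes from this post; the datacenter name DC1, the replication factor of 1, and the helper create_keyspace_cql are assumptions for a single-node test setup.

```python
def create_keyspace_cql(keyspace: str, datacenter: str, replication_factor: int) -> str:
    """Build a CREATE KEYSPACE statement that uses NetworkTopologyStrategy."""
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{'class': 'NetworkTopologyStrategy', "
        f"'{datacenter}': {replication_factor}}};"
    )


# 'DC1' and a replication factor of 1 are assumptions, not values from the post.
statement = create_keyspace_cql("storage_documents", "DC1", 1)
print(statement)
```

The datacenter name has to match the one your node actually reports (for example in the output of nodetool status), otherwise the keyspace will have no replicas.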

3. Create UDT and Table
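
The schema used in the original listing is not reproduced here, so the following is a hypothetical schema that the remaining sketches in this post build on: a signature UDT and a documents table holding a list of signatures. All type, table, and column names are assumptions. The script only prints the CQL statements to run in cqlsh.

```python
KEYSPACE = "storage_documents"  # keyspace name taken from this post

# Hypothetical UDT: one signature attached to a document.
CREATE_TYPE = f"""
CREATE TYPE IF NOT EXISTS {KEYSPACE}.signature (
    signer text,
    digest text
);
""".strip()

# Hypothetical table. A UDT used inside a collection has to be frozen.
CREATE_TABLE = f"""
CREATE TABLE IF NOT EXISTS {KEYSPACE}.documents (
    id text PRIMARY KEY,
    name text,
    approved boolean,
    created_at timestamp,
    signatures list<frozen<signature>>
);
""".strip()

for statement in (CREATE_TYPE, CREATE_TABLE):
    print(statement, end="\n\n")
```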

4. Sync index

The command for this step creates an index named document in Elasticsearch, mapped to the Cassandra storage_documents keyspace.
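
A sketch of what such an index-creation request could look like. It assumes a node listening on http://localhost:9200 and a table named documents (hypothetical). The keyspace index setting and the discover mapping attribute are used here as I understand them from the Elassandra documentation, so check them against the version you run. The code only prepares the request and does not send it.

```python
import json
import urllib.request


def build_index_body(keyspace: str, table: str) -> dict:
    """Index definition that points the index at a keyspace and asks
    Elassandra to derive the mapping from the table's CQL columns."""
    return {
        "settings": {"keyspace": keyspace},
        "mappings": {table: {"discover": ".*"}},
    }


def create_index_request(base_url: str, index: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) a PUT request that creates the index."""
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


body = build_index_body("storage_documents", "documents")
request = create_index_request("http://localhost:9200", "document", body)
print(request.get_method(), request.full_url)
print(json.dumps(body, indent=2))
```

To actually create the index, pass the prepared request to urllib.request.urlopen while the node is running.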

5. View elasticsearch information

We can view the created index and the cluster information via the Elasticsearch REST API.
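
For example, the standard Elasticsearch endpoints below list the indexes, report cluster health, and show the mapping of the index. This is a sketch that assumes a node on http://localhost:9200 and the index name document; the fetch helper is hypothetical and is only defined here, not called.

```python
import urllib.request

BASE_URL = "http://localhost:9200"  # assumed address of a local node

INFO_ENDPOINTS = {
    "indices": "/_cat/indices?v",
    "cluster_health": "/_cluster/health?pretty",
    "mapping": "/document/_mapping?pretty",
}


def info_url(name: str, base_url: str = BASE_URL) -> str:
    """Return the full URL for one of the information endpoints."""
    return base_url + INFO_ENDPOINTS[name]


def fetch(name: str, base_url: str = BASE_URL) -> str:
    """GET one endpoint and return the response body (needs a running node)."""
    with urllib.request.urlopen(info_url(name, base_url)) as response:
        return response.read().decode("utf-8")


for name in INFO_ENDPOINTS:
    print(f"{name}: GET {info_url(name)}")
```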

6. Insert sample data
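
The sample rows from the original listing are not reproduced here. As a stand-in, this sketch generates INSERT statements for a hypothetical documents table (columns id, name, approved, created_at, and a list of signature UDT values with signer and digest fields). The rows themselves are made up for illustration.

```python
def cql_text(value: str) -> str:
    """Quote a string as a CQL text literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def insert_cql(doc: dict) -> str:
    """Build an INSERT statement for the hypothetical documents table."""
    signatures = ", ".join(
        f"{{signer: {cql_text(s['signer'])}, digest: {cql_text(s['digest'])}}}"
        for s in doc["signatures"]
    )
    return (
        "INSERT INTO storage_documents.documents "
        "(id, name, approved, created_at, signatures) VALUES ("
        f"{cql_text(doc['id'])}, {cql_text(doc['name'])}, "
        f"{str(doc['approved']).lower()}, {cql_text(doc['created_at'])}, "
        f"[{signatures}]);"
    )


# Made-up sample rows, for illustration only.
SAMPLE_DOCS = [
    {"id": "doc-1", "name": "invoice", "approved": True,
     "created_at": "2018-04-01 10:00:00+0000",
     "signatures": [{"signer": "alice", "digest": "abc123"}]},
    {"id": "doc-2", "name": "contract", "approved": False,
     "created_at": "2018-04-02 09:30:00+0000",
     "signatures": [{"signer": "bob", "digest": "def456"}]},
]

for doc in SAMPLE_DOCS:
    print(insert_cql(doc))
```

Because indexing happens on the Cassandra write path, rows inserted this way through cqlsh should become searchable through the REST API after the next index refresh.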

Search Elassandra

To search the data, I use the Elasticsearch REST API that is available with Elassandra.

1. Single boolean match
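
A minimal sketch of a bool query with a single match clause, using the standard Elasticsearch query DSL. The field name name, the value invoice, the index name document, and the node address are assumptions. The request is prepared but not sent.

```python
import json
import urllib.request


def single_bool_match(field: str, value: str) -> dict:
    """Bool query with one must clause wrapping a match query."""
    return {"query": {"bool": {"must": [{"match": {field: value}}]}}}


def search_request(body: dict, index: str = "document",
                   base_url: str = "http://localhost:9200") -> urllib.request.Request:
    """Prepare (but do not send) a POST to the index's _search endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


body = single_bool_match("name", "invoice")
request = search_request(body)
print(request.get_method(), request.full_url)
print(json.dumps(body, indent=2))
```

Passing the prepared request to urllib.request.urlopen runs the search against a live node; the JSON response carries the matches under hits.hits.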

2. Multi boolean match
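
With several criteria, each one becomes its own match clause inside must, so a document has to satisfy all of them. The sketch below builds such a body for POSTing to the index's _search endpoint; the field names name and approved are hypothetical.

```python
import json


def multi_bool_match(criteria: dict) -> dict:
    """Bool query in which every field/value pair is a required match clause."""
    must = [{"match": {field: value}} for field, value in criteria.items()]
    return {"query": {"bool": {"must": must}}}


# Hypothetical criteria: documents named "invoice" that are approved.
body = multi_bool_match({"name": "invoice", "approved": True})
print(json.dumps(body, indent=2))
```

Moving clauses from must to should would turn the AND semantics into OR, which is a design choice that depends on the search you need.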

3. Nested match
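
A nested query searches inside the elements of a collection of objects, here a hypothetical signatures column holding UDT values. The sketch assumes that the field is mapped with the Elasticsearch nested type; if it is mapped as a plain object, the nested query is rejected and a regular match on signatures.signer would be used instead.

```python
import json


def nested_match(path: str, field: str, value: str) -> dict:
    """Nested query matching one field of the objects stored under `path`."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"match": {f"{path}.{field}": value}},
            }
        }
    }


# Hypothetical: find documents that carry a signature by "alice".
body = nested_match("signatures", "signer", "alice")
print(json.dumps(body, indent=2))
```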

4. Boolean match with nested match
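
The two previous kinds of clause can be combined by placing a nested query next to a match query inside the same bool/must list. The field names below (name, signatures, signer) are hypothetical, and the nested clause again assumes a nested mapping for the collection.

```python
import json


def bool_with_nested(field: str, value: str,
                     path: str, nested_field: str, nested_value: str) -> dict:
    """Bool query requiring a top-level match and a match inside nested objects."""
    nested_clause = {
        "nested": {
            "path": path,
            "query": {"match": {f"{path}.{nested_field}": nested_value}},
        }
    }
    return {"query": {"bool": {"must": [{"match": {field: value}}, nested_clause]}}}


# Hypothetical: documents named "invoice" that were signed by "alice".
body = bool_with_nested("name", "invoice", "signatures", "signer", "alice")
print(json.dumps(body, indent=2))
```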

5. Pagination
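
Elasticsearch paginates with the from and size parameters of the search body. The sketch below converts a 1-based page number into an offset; the helper name paginated and the match_all query are illustrative choices.

```python
import json


def paginated(query: dict, page: int, page_size: int) -> dict:
    """Search body returning the given 1-based page of results."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {"from": (page - 1) * page_size, "size": page_size, "query": query}


# Third page with 10 hits per page: skip the first 20 hits.
body = paginated({"match_all": {}}, page=3, page_size=10)
print(json.dumps(body, indent=2))
```

By default Elasticsearch rejects requests where from + size exceeds 10000 (the index.max_result_window setting), so this style suits shallow paging rather than full scans.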

6. Sort
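
Sorting is expressed with a sort list in the search body. The sketch assumes a hypothetical created_at column that is mapped as a date; sorting on analyzed text fields generally does not work without extra mapping options.

```python
import json


def sorted_search(query: dict, field: str, descending: bool = True) -> dict:
    """Search body that orders the hits by a single field."""
    order = "desc" if descending else "asc"
    return {"query": query, "sort": [{field: {"order": order}}]}


# Hypothetical: newest documents first.
body = sorted_search({"match_all": {}}, "created_at")
print(json.dumps(body, indent=2))
```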

7. Pagination with Sort
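
Both options can live in the same body: sort defines the order, then from and size cut a page out of the ordered hits. A sketch under the same assumptions as before (hypothetical name and created_at fields, 1-based page numbers):

```python
import json


def sorted_page(query: dict, sort_field: str, page: int, page_size: int,
                descending: bool = True) -> dict:
    """Search body returning one page of hits ordered by `sort_field`."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    return {
        "from": (page - 1) * page_size,
        "size": page_size,
        "sort": [{sort_field: {"order": "desc" if descending else "asc"}}],
        "query": query,
    }


# Hypothetical: second page of invoices, five per page, newest first.
body = sorted_page({"match": {"name": "invoice"}}, "created_at", page=2, page_size=5)
print(json.dumps(body, indent=2))
```

A stable sort order matters here: without an explicit sort, hits are ordered by relevance score, and documents with equal scores may shift between pages.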

References

  1. http://elassandra.readthedocs.io/en/latest/
  2. https://github.com/strapdata/elassandra
  3. https://simongui.github.io/2016/07/20/elassandra.html
  4. https://medium.com/@itseranga/cassandra-lucene-queries-with-udt-e5d1a10d2b9c

Rahasak

Have less, be more