Cassandra is a distributed, highly available data store. It has high write throughput and can accept writes on any node in the cluster. Cassandra can be identified as
Partition Tolerance system which supports
- It does not support full-text search
- Secondary indexes, which are used for additional searching, add overhead
Elasticsearch is a powerful search API based on Lucene indexes. It stores documents as JSON objects and facilitates full-text search over the document data. When storing documents, it maps each field in the document into an indexed form. Elasticsearch can be identified as
Available system (it's not fully
- Write supports only for
- Not ideal for primary data storage (Elasticsearch appears to lose writes, both updates and even non-conflicting inserts)
- No multi-datacenter clustering (multi-datacenter clustering needs an additional tool like
Cassandra + Elasticsearch = Elassandra
Elassandra embeds Elasticsearch in Cassandra as a plugin, so the features of both Elasticsearch and Cassandra are available in Elassandra. The drawbacks mentioned above (which are on
Elasticsearch) can be overcome with Elassandra. Following are some notable features.
- Full text search on cassandra data
- JSON REST API to access cassandra data
- Search on multiple keyspaces in one query
Each Elassandra node comes with an Elasticsearch instance and a Cassandra instance. It provides both a cqlsh query interface and a REST API to manipulate the data. When data is written to Cassandra, it is automatically (synchronously) indexed in Elasticsearch via a custom secondary index. Written data first goes to an in-memory Lucene index and is periodically flushed to SSTables on disk. Data written to Elasticsearch becomes searchable after the index is refreshed. By default Elasticsearch refreshes indexes every second (the default refresh_interval is 1 second). An Elassandra cluster follows a masterless ring architecture (the same as Cassandra). It replicates the data according to the
In Elassandra the original JSON documents are stored in Cassandra. Only the Lucene indexes are stored in Elasticsearch (there is no Elasticsearch _source). An Elasticsearch index can be built from existing Cassandra data (basically, the index is built from the data in
More information about Elassandra can be found here.
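The one-second refresh delay described above can, in plain Elasticsearch, be tuned per index through the index settings endpoint. The following is a minimal sketch, assuming an Elassandra node exposes the Elasticsearch REST API at http://localhost:9200 and that an index named document exists. The helper name build_refresh_request is hypothetical, and whether a given Elassandra version honors this setting should be checked against its documentation. The request is only built here, not sent.

```python
import json
import urllib.request

# Assumption: the Elasticsearch REST API of an Elassandra node listens here.
BASE_URL = "http://localhost:9200"


def build_refresh_request(index, interval="1s", base_url=BASE_URL):
    """Build (but do not send) a PUT request that sets refresh_interval."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_refresh_request("document", "5s")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To send it against a running node:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode("utf-8"))
```

A longer interval trades search freshness for less indexing work, so newly written rows take longer to show up in search results.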
Following is an example scenario for Elassandra, in which I use it to build a REST API over Cassandra data. I’m creating the data via cqlsh on Cassandra and searching it via the REST API on Elasticsearch.
Here I’m creating Cassandra keyspaces and tables, then syncing the Cassandra keyspace into an Elasticsearch index. Finally, I’m populating the data via the Cassandra CQL API.
1. Run elassandra
2. Create keyspace
3. Create UDT and Table
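As a rough sketch of what steps 2 and 3 might look like, the snippet below renders CQL statements as strings that could be pasted into cqlsh. The keyspace example_ks, the datacenter DC1, the type signature, and the table documents are all hypothetical names chosen for illustration, not the ones used in this scenario. NetworkTopologyStrategy is assumed here as the replication strategy.

```python
def keyspace_cql(keyspace, datacenter="DC1", replicas=1):
    """CQL to create a keyspace (datacenter name is an assumption)."""
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{'class': 'NetworkTopologyStrategy', "
        f"'{datacenter}': {replicas}}};"
    )


def udt_cql(keyspace):
    """CQL to create a hypothetical user-defined type."""
    return (
        f"CREATE TYPE IF NOT EXISTS {keyspace}.signature "
        "(name text, digest text);"
    )


def table_cql(keyspace):
    """CQL to create a hypothetical table that uses the UDT above."""
    return (
        f"CREATE TABLE IF NOT EXISTS {keyspace}.documents ("
        "id text PRIMARY KEY, name text, "
        "signatures list<frozen<signature>>);"
    )


if __name__ == "__main__":
    # Print the statements in the order they must be executed.
    for statement in (keyspace_cql("example_ks"),
                      udt_cql("example_ks"),
                      table_cql("example_ks")):
        print(statement)
```

The UDT is wrapped in frozen<> because Cassandra requires user-defined types inside collections to be frozen.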
4. Sync index
The command below creates an index named document on Elasticsearch. It corresponds to the Cassandra
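For illustration, an index-creation request of this kind could be built as follows. This is a sketch under assumptions: it relies on the keyspace index setting and the discover mapping attribute as I understand them from the Elassandra documentation, the node address http://localhost:9200 is assumed, and the keyspace example_ks and table documents are hypothetical names. Only the index name document comes from the text. The request is built but not sent.

```python
import json
import urllib.request

# Assumption: the Elasticsearch REST API of an Elassandra node listens here.
BASE_URL = "http://localhost:9200"


def build_index_request(index, keyspace, table, base_url=BASE_URL):
    """Build a PUT request that maps an index onto a Cassandra keyspace.

    "discover": ".*" asks Elassandra to derive field mappings from every
    column of the table (assumed behavior, check your version's docs).
    """
    body = {
        "settings": {"keyspace": keyspace},
        "mappings": {table: {"discover": ".*"}},
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = build_index_request("document", "example_ks", "documents")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To send it against a running node:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode("utf-8"))
```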
5. View elasticsearch information
We can view the created index information and the cluster information via the Elasticsearch REST API.
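A small sketch of such lookups, assuming the node is reachable at http://localhost:9200 and the index is named document. The endpoints used (the index itself, _cat/indices, and _cluster/state) are standard Elasticsearch ones. The helper names info_urls and fetch are hypothetical.

```python
import urllib.request

# Assumption: the Elasticsearch REST API of an Elassandra node listens here.
BASE_URL = "http://localhost:9200"


def info_urls(index, base_url=BASE_URL):
    """Return the URLs for index, index-list, and cluster-state lookups."""
    return {
        "index": f"{base_url}/{index}?pretty",
        "indices": f"{base_url}/_cat/indices?v",
        "cluster_state": f"{base_url}/_cluster/state?pretty",
    }


def fetch(url, timeout=5):
    """Perform a GET request and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    for name, url in info_urls("document").items():
        print(name, "->", url)
        # Uncomment when a node is running:
        # print(fetch(url))
```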
6. Insert sample data
To search the data I’m using the Elasticsearch REST API, which is available with Elassandra.
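As a hedged example, a full-text query against the document index could be built like this. The node address http://localhost:9200, the field name name, and the search text are assumptions for illustration. The _search endpoint and the match query are standard Elasticsearch features. The request is built but not sent.

```python
import json
import urllib.request

# Assumption: the Elasticsearch REST API of an Elassandra node listens here.
BASE_URL = "http://localhost:9200"


def build_search_request(index, field, text, base_url=BASE_URL):
    """Build a POST request that runs a match query on one field."""
    body = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # "name" is a hypothetical field; use a column of your own table.
    req = build_search_request("document", "name", "report")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To send it against a running node:
    # with urllib.request.urlopen(req) as resp:
    #     hits = json.loads(resp.read())["hits"]["hits"]
    #     print(hits)
```

Because the row data lives in Cassandra, rows inserted through cqlsh should appear in these search results once the index has been refreshed.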