Elassandra is a distributed storage system built by combining Elasticsearch and Cassandra. It ships Elasticsearch as a Cassandra plugin, so it exposes the Cassandra API as well as the Elasticsearch API. When data is saved to Cassandra, it is automatically indexed in Elasticsearch. It's an ideal and powerful solution for full-text search on Cassandra. Read more about Elassandra from here.
In this post I'm gonna show how to deploy an Elassandra cluster in a multi data center environment. Following are the steps to follow.
I'm gonna deploy the Elassandra cluster on two data centers, DC1 and DC2. Each data center has two Elassandra nodes, so altogether there will be 4 nodes in the cluster. Following are the nodes. In each data center I have a seed node, which is used to bootstrap the cluster between data centers. It is a best practice to have more than one seed node per data center. Here I have used one seed node since only two nodes are available per data center.
To run Elassandra, Java 8+ is required. Here I'm gonna install OpenJDK 8 (if you want, you can install Oracle JDK as well; Oracle JDK is the recommended version for Elassandra). Following is the way to install it.
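Before going further it can help to confirm that the node really has Java 8 or newer. The following is only a sketch: `java_major` is a helper name I made up for illustration, and the install command in the comment assumes a Debian/Ubuntu host, which this post does not state.

```shell
# Assumption: on Debian/Ubuntu, OpenJDK 8 can be installed with
#   sudo apt-get install -y openjdk-8-jdk

# java_major is a hypothetical helper: it turns a Java version string
# such as "1.8.0_292" or "11.0.2" into its major version number.
java_major() {
  case "$1" in
    1.*) echo "$1" | cut -d. -f2 ;;  # legacy scheme: 1.8.x means Java 8
    *)   echo "$1" | cut -d. -f1 ;;  # newer scheme: 11.x means Java 11
  esac
}

# Report the installed version, if a java binary is on the PATH.
if command -v java >/dev/null 2>&1; then
  installed=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
  echo "Found Java major version: $(java_major "$installed")"
else
  echo "java not found on PATH"
fi
```

Any reported major version of 8 or higher satisfies the requirement mentioned above.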
There are several ways to install Elassandra, for example from a tarball or an rpm package. Here I'm gonna install it from the tarball. To do that, I first need to download and extract the Elassandra tarball.
The Elassandra directory contains all the configurations and scripts needed to run the cluster. Following is the structure of the directory.
The next step is to set the cluster configurations in the Cassandra config file, cassandra.yaml. All the configuration files are located in the elassandra-18.104.22.168/conf directory. Following are the configurations that I have added to cassandra.yaml. Add this information on each and every node in the cluster (all nodes in DC1 and DC2).
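As a rough illustration of the kind of settings involved, here is a sketch that prints a cassandra.yaml fragment for one node. The helper name, the cluster name and the IP addresses are all placeholders of mine, not the values used in this cluster; the snitch is the GossipingPropertyFileSnitch that this post relies on.

```shell
# Hypothetical helper: prints the cassandra.yaml settings that typically
# differ per node in a multi data center cluster. Merge the output into
# conf/cassandra.yaml by hand.
#   $1 = this node's IP address
#   $2 = comma-separated seed IPs (one seed per data center in this setup)
cassandra_yaml_fragment() {
  cat <<EOF
cluster_name: 'elassandra-demo'
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "$2"
listen_address: $1
rpc_address: $1
endpoint_snitch: GossipingPropertyFileSnitch
EOF
}

# Example with placeholder addresses (not the real nodes of this cluster).
cassandra_yaml_fragment 10.0.1.1 "10.0.1.1,10.0.2.1"
```

The cluster name and the seed list should be identical on every node, while the two address settings change per node.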
Data center config
After defining the Cassandra configurations I need to define the data center configurations. The data center configuration file is located at elassandra-22.214.171.124/conf/cassandra-rackdc.properties. Following are the configurations I have added. Basically they indicate the rack and data center of each and every node.
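A minimal sketch of what this file could contain, written through a small helper. The helper name and the rack name `RAC1` are assumptions for illustration; the `dc` and `rack` keys are the ones GossipingPropertyFileSnitch reads.

```shell
# Hypothetical helper: writes cassandra-rackdc.properties into a conf directory.
#   $1 = conf directory, $2 = data center name, $3 = rack name
write_rackdc() {
  printf 'dc=%s\nrack=%s\n' "$2" "$3" > "$1/cassandra-rackdc.properties"
}

# Demo run against a scratch directory; RAC1 is an assumed rack name.
demo_conf=$(mktemp -d)
write_rackdc "$demo_conf" DC1 RAC1
cat "$demo_conf/cassandra-rackdc.properties"
```

On the nodes of the second data center the same call would use `DC2` instead of `DC1`.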
Remove cassandra topology
GossipingPropertyFileSnitch always loads cassandra-topology.properties when that file is present, so I have removed that file from each node.
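The removal can be sketched as follows. `ELASSANDRA_HOME` is a placeholder variable of mine for the directory where the tarball was extracted; adjust it to your own path.

```shell
# ELASSANDRA_HOME is a placeholder for the extracted tarball directory.
ELASSANDRA_HOME="${ELASSANDRA_HOME:-$HOME/elassandra}"
topology="$ELASSANDRA_HOME/conf/cassandra-topology.properties"

# -f keeps this quiet and successful when the file is already gone.
rm -f "$topology"

if [ ! -e "$topology" ]; then
  echo "cassandra-topology.properties is not present"
fi
```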
Now everything is ready to start the cluster. I can start Elassandra on each and every node, one by one. The Elassandra start scripts are located in the bin directory of the Elassandra installation.
After starting Cassandra on all nodes I can view the cluster status with the nodetool command, which is located in the bin directory of the Elassandra installation. Following is the nodetool status output of my cluster. It shows all the nodes and the health information of the cluster.
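The start and status steps can be wrapped in two small helpers, sketched below. The function names and `ELASSANDRA_HOME` are placeholders of mine, and the `-e` flag reflects my understanding of how the Elassandra launcher enables Elasticsearch, so check it against your version.

```shell
# ELASSANDRA_HOME is a placeholder for the extracted tarball directory.
ELASSANDRA_HOME="${ELASSANDRA_HOME:-$HOME/elassandra}"

# Start this node; -e is assumed to start Elasticsearch alongside Cassandra.
start_node() {
  "$ELASSANDRA_HOME/bin/cassandra" -e
}

# Show every node in the ring together with its state and data center.
cluster_status() {
  "$ELASSANDRA_HOME/bin/nodetool" status
}
```

A reasonable order is to call `start_node` on the seed nodes first, then on the remaining nodes, and finally `cluster_status` on any node.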
Create keyspace and table
I can connect to Cassandra with the cqlsh command, which is located in the elassandra-126.96.36.199/bin directory, to create the keyspace and tables. Following is the way to do that.
I have created the keyspace with Replication Factor = 2. When creating the keyspace I have specified the RF for each data center.
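A keyspace with a per data center RF of 2 could look like the sketch below. The keyspace name `demo_ks` and the table columns are placeholders I picked for illustration; only RF = 2, the data center names and the `connects` table name come from this post.

```shell
# Keyspace name and columns are placeholders, not the author's real schema.
cat > schema.cql <<'EOF'
CREATE KEYSPACE IF NOT EXISTS demo_ks
  WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 2, 'DC2': 2};

CREATE TABLE IF NOT EXISTS demo_ks.connects (
  id text PRIMARY KEY,
  name text
);
EOF

# Load it from any node, for example:
#   "$ELASSANDRA_HOME/bin/cqlsh" <node-ip> -f schema.cql
```

The data center names in the replication map have to match the `dc` values configured in cassandra-rackdc.properties, otherwise no replicas are placed there.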
Create elasticsearch index
Finally, I can create the Elasticsearch index for the connects table. Following is the way to do that with an HTTP PUT request.
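Such a request might be sketched like this. The keyspace name `demo_ks`, the file name and the `create_index` helper are placeholders of mine; the `discover` mapping option is, as far as I know, Elassandra's way of deriving field mappings from the CQL columns, so treat the payload as an assumption rather than the exact request used here.

```shell
# Payload for the index: "discover" with the regex ".*" is assumed to map
# every CQL column of the connects table to an Elasticsearch field.
cat > connects-index.json <<'EOF'
{
  "mappings": {
    "connects": { "discover": ".*" }
  }
}
EOF

# Hypothetical helper: $1 = node address. The index is named after the
# placeholder keyspace demo_ks so that it maps onto that keyspace.
create_index() {
  curl -XPUT "http://$1:9200/demo_ks" \
       -H 'Content-Type: application/json' \
       -d @connects-index.json
}
```

Calling `create_index <node-ip>` from the directory that holds the JSON file sends the PUT request to that node.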
I have executed this command from a node in DC1. Following is the output.
The replication (number of replicas) of the created Elasticsearch index is set from the Cassandra Replication Factor: number_of_replicas = RF - 1 in each data center.
Sharding depends on the number of nodes in the data center. The Elasticsearch numberOfShards is just information about the number of nodes in the data center (numberOfShards equals the number of nodes in the data center).
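Applied to this cluster, the two rules work out as in the small arithmetic sketch below. The variable names are mine; the inputs are the RF of 2 and the two nodes per data center described in this post.

```shell
rf=2            # replication factor per data center
nodes_per_dc=2  # two Elassandra nodes in each data center

number_of_replicas=$(( rf - 1 ))   # number_of_replicas = RF - 1
number_of_shards=$nodes_per_dc     # one shard per node in the data center

echo "number_of_replicas=$number_of_replicas number_of_shards=$number_of_shards"
# prints: number_of_replicas=1 number_of_shards=2
```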