Mert Çalışkan
Apr 13, 2015

Running Cassandra Cluster on Docker

Spotify offers a Cassandra Docker image that can run either as a single node or as a cluster. The cluster image does not use vnodes, so some configuration, such as setting a few environment variables, is needed before running the containers. This entry walks step by step through getting a Cassandra cluster up.

Keep in mind that copying and pasting code snippets can cause problems if straight quotes have been turned into curly ones (“ and ”). Replace any curly quotes with straight ones before running a command.

First pull the image with:

docker pull spotify/cassandra:cluster

which fetches an image of about 900 MB, so make sure you are not on 3G ☺. Next we need to define the CASSANDRA_TOKEN environment variable for our first node. Each node needs a different token, and since we are not using Cassandra's vnodes (more on vnodes in another entry), we have to specify the tokens manually. With Cassandra's default Murmur3Partitioner, tokens are signed 64-bit hash values in the range -2^63 to 2^63 - 1.

The Python one-liner below creates 3 evenly spaced tokens in that range:

python -c "print([str(((2**64 // 3) * i) - 2**63) for i in range(3)])"

It prints:

['-9223372036854775808', '-3074457345618258603', '3074457345618258602']
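For a cluster of a different size, the same arithmetic can be wrapped in a small function. This is only a sketch: the name `evenly_spaced_tokens` is made up for illustration, and it assumes the signed 64-bit token range described above.

```python
def evenly_spaced_tokens(node_count):
    """Return node_count tokens spread evenly over the signed 64-bit ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Integer division keeps the values exact (no floating point rounding).
    step = 2**64 // node_count
    return [step * i - 2**63 for i in range(node_count)]


if __name__ == "__main__":
    # One token per line, ready to paste into CASSANDRA_TOKEN.
    for token in evenly_spaced_tokens(3):
        print(token)
```

Called with 3, it yields the same three values as the one-liner, as integers rather than strings.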

To run our first node:

docker run -d -v /var/lib/cassandra/c1:/var/lib/cassandra -e "CASSANDRA_TOKEN=-9223372036854775808" --name c1 spotify/cassandra:cluster

We’ll use this first node as the seed node: newcomers gossip with it to register themselves in the cluster. So we need its IP address.

To get the IP of the first node:

docker inspect -f '{{.NetworkSettings.IPAddress}}' c1

which returns an IP such as 172.17.0.3.

To start the 2nd and 3rd nodes, the docker commands are as follows:

docker run -d -v /var/lib/cassandra/c2:/var/lib/cassandra -e "CASSANDRA_TOKEN=-3074457345618258603" -e "CASSANDRA_SEEDS=172.17.0.3" --name c2 spotify/cassandra:cluster

docker run -d -v /var/lib/cassandra/c3:/var/lib/cassandra -e "CASSANDRA_TOKEN=3074457345618258602" -e "CASSANDRA_SEEDS=172.17.0.3" --name c3 spotify/cassandra:cluster
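With more than a handful of nodes, typing these commands by hand gets error-prone. The sketch below only builds the command strings, using the same flags and environment variables as above. The function name `docker_run_command` is an illustrative assumption, and the seed IP has to be replaced with whatever docker inspect reported for your first node.

```python
import shlex

IMAGE = "spotify/cassandra:cluster"


def docker_run_command(name, token, seed_ip=None):
    """Build the docker run command for one node as a shell-safe string."""
    args = [
        "docker", "run", "-d",
        "-v", "/var/lib/cassandra/{0}:/var/lib/cassandra".format(name),
        "-e", "CASSANDRA_TOKEN={0}".format(token),
    ]
    if seed_ip is not None:
        # Only the non-seed nodes are pointed at the seed node.
        args += ["-e", "CASSANDRA_SEEDS={0}".format(seed_ip)]
    args += ["--name", name, IMAGE]
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # 172.17.0.3 is the example seed IP from the text; yours may differ.
    print(docker_run_command("c2", -3074457345618258603, "172.17.0.3"))
    print(docker_run_command("c3", 3074457345618258602, "172.17.0.3"))
```

Printing the commands instead of executing them leaves room to review each line before pasting it into a shell.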

After creating all the nodes, the docker ps command should show them running:

CONTAINER ID   IMAGE                       COMMAND                CREATED              STATUS              PORTS                                                                           NAMES
7fb20068c639   spotify/cassandra:cluster   "cassandra-clusterno   31 seconds ago       Up 28 seconds       7199/tcp, 8012/tcp, 9042/tcp, 9160/tcp, 22/tcp, 61621/tcp, 7000/tcp, 7001/tcp   c3
a4067e2b9c6b   spotify/cassandra:cluster   "cassandra-clusterno   About a minute ago   Up About a minute   22/tcp, 61621/tcp, 7000/tcp, 7001/tcp, 7199/tcp, 8012/tcp, 9042/tcp, 9160/tcp   c2
620b848522e8   spotify/cassandra:cluster   "cassandra-clusterno   2 minutes ago        Up 2 minutes        22/tcp, 61621/tcp, 7000/tcp, 7001/tcp, 7199/tcp, 8012/tcp, 9042/tcp, 9160/tcp   c1

Now it’s time to open a bash shell in one of the nodes (c1 in our case):

docker exec -it c1 /bin/bash

Run cqlsh with:

cqlsh 172.17.0.3

which drops you at the cqlsh prompt. Since there is no keyspace yet (the RDBMS equivalent is a database), let’s create one named UTS:

create keyspace UTS with replication = {'class' : 'SimpleStrategy', 'replication_factor':3};

Let’s switch to this keyspace with:

use UTS;

Now we can create a very simple test table:

CREATE TABLE MyTable (
id text,
value text,
PRIMARY KEY (id)
);

Running

select * from MyTable;

will result in 0 rows.

Let’s insert some data with:

INSERT INTO MyTable (id, value) VALUES ('1', 'Mert');
INSERT INTO MyTable (id, value) VALUES ('2', 'Ahmet');
INSERT INTO MyTable (id, value) VALUES ('3', 'T2');

Executing the same select query now returns:

 id | value
----+-------
  3 |    T2
  2 | Ahmet
  1 |  Mert

(3 rows)

Since the replication factor is 3, data inserted into one node is replicated to the other 2. Of course, in a real deployment these nodes should run on separate hosts, and clients should connect through the cluster abstraction of the client API. If all the nodes reside on the same host (say boot2docker), there is no feasible way for a client to access the 3 nodes as a cluster.
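To see why every node ends up holding every row here, it helps to sketch how SimpleStrategy places replicas: the first replica goes to the node that owns the row's token, and the remaining ones go to the next nodes clockwise on the ring. The toy model below is an illustration under stated assumptions, not Cassandra code. `replicas_for` and `RING` are hypothetical names, and the row token is passed in directly rather than computed with the real partitioner.

```python
from bisect import bisect_left

# Token -> node name, using the three tokens assigned to c1, c2 and c3.
RING = {
    -9223372036854775808: "c1",
    -3074457345618258603: "c2",
    3074457345618258602: "c3",
}


def replicas_for(row_token, ring, replication_factor):
    """Return the nodes holding a row, walking clockwise from its owner."""
    tokens = sorted(ring)
    # The owner is the first node whose token is >= the row token,
    # wrapping around to the start of the ring if there is none.
    start = bisect_left(tokens, row_token) % len(tokens)
    # A cluster cannot hold more replicas than it has nodes.
    count = min(replication_factor, len(tokens))
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(count)]


if __name__ == "__main__":
    print(replicas_for(0, RING, 3))
    print(replicas_for(0, RING, 1))
```

With three nodes and a replication factor of 3, the result always contains c1, c2 and c3; only the starting node changes with the row token. With a replication factor of 1, each row would live on a single node.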

Mert Çalışkan

Opsgenie Champion at Atlassian. Oracle Java Champion. AnkaraJUG Lead. Author of Beginning Spring & PrimeFaces Cookbook.