Spinning up a Cassandra Cluster on Google Cloud (for free) with just a browser

Sreekar Reddy
4 min read · Oct 13, 2016


Installing and starting distributed software like Cassandra or Hadoop on your desktop is simple: just run the startup script it provides, and it pretty much works. But when you want to run a cluster on your desktop, you face multiple complex options where things always go wrong and make you cry.

Rather than learning big data, you end up learning how to use bridged networking in VirtualBox. Instead, I have helped people spin up a Cassandra cluster on Google Cloud in less than 20 minutes. It is that quick because the networking is handled for you, and you don’t need to learn about SSH-ing and other such things up front (you will pick them up eventually anyway).

Google Cloud is a cloud platform where you can spin up servers and install your software on them. It provides a lot of other advanced services, but they are not of interest to us here. The feature we use is called “Compute Engine”.

So sign up for the Google Cloud 60-day free trial and log in.

This is the first screen you will see after logging in. Click “GO TO CONSOLE”.

Now you are on the console screen. The console is the main screen from which you can access all the features of Google Cloud.

Click “Compute Engine”, where we can create our own servers.

Compute Engine is the feature we use to create our own server, which is called a “VM (Virtual Machine)” or “VM Instance” (for AWS folks, it is the equivalent of EC2). Now create an instance and select the following options.

Enter your server properties here

You can use the following configuration for reference:

  1. Name: any name
  2. Zone: any zone is fine; ideally select the one nearest to you
  3. Machine type: select the RAM and the number of CPUs (cores)
  4. Boot disk (the OS you want on your server): select CentOS 7

You can leave the rest as is. Then do the same for one more server. Google Cloud takes a little while to spin up the two machines.

You should see two machines created, each with a green symbol beside it.

Now it is time to connect to your servers (which are in the cloud), install Cassandra, and make a cluster. Follow this to open a command-line terminal on your newly created Linux machines.

This kind of access is called SSH. The commands you enter in this window run on your server.

Once you have that, download Cassandra and its required software using the following steps. (Note: do this on both servers.)

Get Java 8: sudo yum install java-1.8.0-openjdk.x86_64
Get a program called wget to download Cassandra: sudo yum install wget
Get Cassandra: wget http://mirrors.ibiblio.org/apache/cassandra/3.9/apache-cassandra-3.9-bin.tar.gz
Get net-tools, which provides the ifconfig command used below: sudo yum install net-tools
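Before moving on, it can help to confirm that the installs actually worked on each server. Here is a minimal sketch of such a check; the function name missing_tools is my own invention for illustration and is not part of Cassandra or CentOS.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every command passed as an
# argument that cannot be found on the PATH. Prints nothing if all exist.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}
```

Running "missing_tools java wget ifconfig" on each server should print nothing if the three installs above succeeded; any name it does print is a tool you still need to install.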

At this point you could start Cassandra on both servers and use each one as a normal standalone node. But to make them a cluster, we need to change a few lines in a configuration file called cassandra.yaml.

Unpack Cassandra: tar -xzvf apache-cassandra-3.9-bin.tar.gz
cd apache-cassandra-3.9
cd conf

Find the IP address of your machine with the “ifconfig” command. Copy the inet address of the main network interface (not the loopback address 127.0.0.1).
Open cassandra.yaml in Vim and change the following settings:

seeds: IP_ADDRESS (replace the default localhost address, 127.0.0.1; leave everything else on the line intact)
listen_address: IP_ADDRESS
start_rpc: true
rpc_address: IP_ADDRESS
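
If you would rather not make these edits by hand in Vim, the same four changes can be scripted with sed. This is only a sketch: the function name configure_node is made up for illustration, and it assumes the file still has the stock layout, where the seeds entry is an indented line starting with "- seeds:" and the other three settings start at the beginning of a line.

```shell
#!/bin/sh
# Hypothetical helper: configure_node FILE NODE_IP SEED_IP
# Rewrites seeds, listen_address, start_rpc and rpc_address in FILE.
# Assumes the stock cassandra.yaml layout; every other line is left alone.
configure_node() {
    conf=$1 node_ip=$2 seed_ip=$3
    sed \
        -e "s/^\([[:space:]]*- seeds:\).*/\1 \"$seed_ip\"/" \
        -e "s/^listen_address:.*/listen_address: $node_ip/" \
        -e "s/^start_rpc:.*/start_rpc: true/" \
        -e "s/^rpc_address:.*/rpc_address: $node_ip/" \
        "$conf" > "$conf.tmp" && mv "$conf.tmp" "$conf"
}
```

For example, with placeholder addresses, the first server would run "configure_node cassandra.yaml 10.0.0.1 10.0.0.1" from the conf folder. A second server at 10.0.0.2 would run "configure_node cassandra.yaml 10.0.0.2 10.0.0.1", so that its seeds entry points at the first server. Replace the addresses with the ones ifconfig showed you.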

Start Cassandra on one server, using the following command:

cd ~/apache-cassandra-3.9/bin
./cassandra

Now, on the other machine, we need one more change in the config file so that it points to this server: the seeds setting.

seeds: IP_ADDRESS of the previously started server (earlier we replaced localhost with this machine’s own IP; now use the first server’s IP instead)

Now start Cassandra on this machine the same way you did on the first server.

You can check the status of the cluster using nodetool, which is in the bin folder of Cassandra. It should show two servers in the cluster. Try some CQL commands and see them being applied on both machines.

nodetool status
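
In the nodetool status output, each node gets one line that begins with a two-letter state, and UN means Up and Normal. If you want a quick yes or no on whether both servers joined, a small filter can count those lines. This is a sketch, and count_up_nodes is a name I made up for it.

```shell
#!/bin/sh
# Hypothetical helper: reads `nodetool status` output on stdin and prints
# how many node lines have the state UN (Up, Normal) in the first column.
count_up_nodes() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}
```

From the bin folder, "./nodetool status | count_up_nodes" should print 2 once both servers are in the cluster.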

With VirtualBox, you need to configure the network properly so that these IPs can reach each other, which is hard and requires some networking knowledge. With Google Cloud, I found it pretty easy because everything can be done from the browser.
