How to deploy Cassandra on Google Cloud Platform and connect to it with a few clicks
Over the past year, I’ve spent a lot of time working with Cloud Bigtable on Google Cloud Platform (GCP). Lately, I’ve wanted to get a better understanding of similar big data offerings on the market, specifically HBase and Cassandra.
I’d never set up anything like Cassandra before and was intimidated by all the tutorial videos I watched, but it proved to be a very manageable task using GCP’s Click to Deploy solutions.
In a GCP project with billing enabled, navigate to the Cloud Marketplace and search for “Cassandra.”
There are a few options that come up, but I chose the Cassandra (Google Click to Deploy) version that runs on Google Compute Engine.
Click “Launch on Compute Engine” and you’ll be brought to a deployment configuration form. You can leave all the defaults or modify them if you are more familiar with Cassandra. The only thing I changed was enabling Stackdriver Logging and Monitoring in case I needed them for debugging. Once you’re happy with the configuration, submit the form to deploy your Cassandra cluster!
After a few minutes your cluster will be deployed and you can connect.
Click the SSH button on the right-hand pane to open a shell connected to your cluster. In the shell you can access the Cassandra command-line client by typing cqlsh. You can create a keyspace and a table, add some data, and query it, all through that shell. I followed this Hello World and found it pretty helpful.
I also wanted to connect to my Cassandra cluster with one of the client drivers to see how I would use it in a real application. To do that, you’ll need to create a new firewall rule that allows TCP traffic on port 9042, so the client can talk to the cluster. You can use this gcloud command:
gcloud compute firewall-rules create cassandra-client --allow tcp:9042
or create the firewall rule in the GCP console under VPC network.
Add a firewall rule that targets all instances in the network, uses 0.0.0.0/0 as the source IP range, and allows TCP:9042. A source range of 0.0.0.0/0 opens the port to any address on the internet, so if you’re using this for production purposes, you should restrict the rule to the source addresses and target instances that actually need access.
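Before moving on to a driver, it can help to confirm that port 9042 is actually reachable from your machine. Here is a minimal sketch that uses only the JDK and simply attempts a TCP connection. The class name PortCheck, the method isReachable, and the timeout values are my own illustrative choices, and the host is whatever external IP you pass in.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck {
    /** Returns true if a TCP connection to host:port opens within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host: treat all as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the external IP of a Cassandra VM as the first argument.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = 9042; // CQL native transport port opened by the firewall rule
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 3000));
    }
}
```

A false result only tells you the TCP connection failed. It does not tell you whether the cause is the firewall rule, the wrong IP, or Cassandra not listening.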
Now that your firewall is set up, you can connect with the client of your choice. I decided to use the DataStax Java Driver, and performed a quick query to see if my connection succeeded. I used the external IP of the first VM (which can be found under Compute Engine in VM Instances) as the contact point and the region (the zone without the letter suffix) as the name of the local datacenter.
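As a rough sketch of how those two values fit together, the snippet below derives the local datacenter name from a zone and builds the contact point using only the JDK. The class CassandraEndpoint, the helper regionFromZone, the IP 203.0.113.10, and the zone us-central1-f are all placeholders of mine, not values from my deployment. The commented-out part shows how I would expect the values to be passed to the driver, assuming version 4.x of the DataStax Java Driver is on the classpath.

```java
import java.net.InetSocketAddress;

class CassandraEndpoint {
    /** Strips the trailing zone letter, e.g. a zone ending in "-f" loses that suffix. */
    static String regionFromZone(String zone) {
        int lastDash = zone.lastIndexOf('-');
        boolean hasLetterSuffix = lastDash > 0
                && lastDash == zone.length() - 2
                && Character.isLetter(zone.charAt(zone.length() - 1));
        if (!hasLetterSuffix) {
            throw new IllegalArgumentException("Not a zone name: " + zone);
        }
        return zone.substring(0, lastDash);
    }

    public static void main(String[] args) {
        String externalIp = "203.0.113.10"; // placeholder: external IP of the first VM
        String zone = "us-central1-f";      // placeholder: zone the VM was deployed in

        InetSocketAddress contactPoint = new InetSocketAddress(externalIp, 9042);
        String localDatacenter = regionFromZone(zone);

        System.out.println("contact point: "
                + contactPoint.getHostString() + ":" + contactPoint.getPort());
        System.out.println("local datacenter: " + localDatacenter);

        // Assumption: DataStax Java Driver 4.x (com.datastax.oss.driver.api.core.CqlSession).
        // try (CqlSession session = CqlSession.builder()
        //         .addContactPoint(contactPoint)
        //         .withLocalDatacenter(localDatacenter)
        //         .build()) {
        //     Row row = session.execute("SELECT release_version FROM system.local").one();
        //     System.out.println(row.getString("release_version"));
        // }
    }
}
```

Querying system.local is a convenient smoke test because that table exists on every node, so it works before you have created any keyspace of your own.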
And I was happy to see that my query worked!