Adding/Replacing Cassandra Nodes: you might wanna cleanup!

Himanshu Agrawal
BYJU’S Exam Prep Engineering
May 13, 2022

Nodetool, a must-have for those managing Cassandra in production!


Well, if you are running a Cassandra cluster, you should certainly get familiar with nodetool's features. That is what helped me save a lot of disk space.

Since you are managing a Cassandra cluster, there have probably been times when you scaled up the number of nodes in your cluster, or even replaced an old node with a new one. It's common, and we at BYJU’S Exam Prep had been doing the same until we realized that our disk space was growing continuously and that we needed to scale up disk space as well.

Easy! I could just go ahead and upgrade the disks from 1 TB to 2 TB. I might end up paying a bit more for storage, but who cares!

Well, it turns out that the more data a node holds on disk, the more time it takes to boot up when you restart your cluster. As the DB engine starts, it has to open all the SSTables and replay the commit logs; the more data it has to work through, the longer startup takes.

Something else we noticed was that the older nodes had a huge amount of data whereas the new nodes had comparatively little. So one might think of rebalancing the nodes!
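If you want to see the imbalance for yourself, nodetool will show you how much data each node holds. The keyspace name below is just a placeholder; the Load column is the figure to watch:

# Run on any node; the Load column shows how much data each node stores on disk
nodetool status

# Optionally scope the ownership numbers to a single keyspace
nodetool status my_keyspace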

Do you need to?

Here’s what we did:

nodetool cleanup

That's pretty much it! We just ran cleanup! It took somewhere around 18–20 hours for us, but the results were impressive.
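For reference, this is roughly how the command is invoked. Without arguments, cleanup processes every non-system keyspace on the node it runs on; you can also limit it to one keyspace or specific tables (the names below are placeholders). Keep in mind that it only cleans the node you run it on, so repeat it on every node that might hold stale ranges.

# Clean up all non-system keyspaces on this node
nodetool cleanup

# Or restrict it to one keyspace, or even specific tables
nodetool cleanup my_keyspace
nodetool cleanup my_keyspace my_table

Newer Cassandra versions also accept a -j flag to run several cleanup jobs in parallel at the cost of extra CPU; check nodetool help cleanup for what your version supports.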

When do you need it?

  • When we add a new node to the cluster, Cassandra moves some partition ranges from each existing node to the new node, but it does not automatically remove the keys that the original nodes no longer own. This is what created all the mess for us. I am sure the Cassandra developers had good reasons not to remove the discarded keys automatically; one I can think of is preventing data loss while the partitions are being re-replicated.
  • This also happens when you decrease the replication factor, of course, because partitions belonging to the replicas you just disowned are still lying around on various nodes (see the sketch after this list).
  • Another cause is moving tokens, which is not a common scenario, so let us just skip it.

If you perform any of the above operations, you must run a cleanup.
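To make the replication-factor case concrete, here is a rough sketch of the "lower the RF, then clean up" flow. The keyspace, datacenter name, and host list are placeholders, and the remote nodetool calls assume JMX is reachable from where you run them; otherwise, run cleanup locally on each node.

# 1. Lower the replication factor (example: NetworkTopologyStrategy with one DC)
cqlsh -e "ALTER KEYSPACE my_keyspace WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 2};"

# 2. Run cleanup on every node so the replicas it no longer owns get dropped
for host in 10.0.0.1 10.0.0.2 10.0.0.3; do
  nodetool -h "$host" cleanup my_keyspace
done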

How it worked for us

Here’s a graph of Disk Space Used by one of our Cassandra nodes. But first, how did we end up in this scenario?
We had a 3-node cluster with 1 TB of disk on each node, and for some reason we replaced two of the nodes with new ones.

The difference, as you can see, is huge: from 824 GB down to 504 GB on the node that we didn’t replace. Well, it was expected, considering we replaced two-thirds of the nodes.
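If you want to track the same numbers on your own cluster, the simplest checks we know of are the size of the data directory and the Load reported by the node itself. The path below is the typical default; adjust it to your data_file_directories setting.

# On-disk size of the data directory (default location; yours may differ)
du -sh /var/lib/cassandra/data

# Load as reported by the local node
nodetool info | grep -i load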

Considerations before cleanup

The node you run cleanup on will consume significant CPU resources during the process. Before running it in production, find a suitable window with low traffic, or make sure you have enough nodes in the cluster to handle read/write ops and avoid downtime. Also, since you don’t know how long it’s going to take, run the command in a detachable session or in the background (tmux works well for this).
For us, traffic drops significantly during night hours, so that was a plus point. Also, under normal circumstances our CPU usage sat around 20–30%, which left plenty of room for cleanup to do its work.
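As a rough sketch, here is how you could keep cleanup running after you disconnect and keep an eye on its progress. The session name, keyspace, and log path are arbitrary.

# Option 1: run it inside a detachable tmux session (detach with Ctrl-b d)
tmux new -s cassandra-cleanup
nodetool cleanup my_keyspace

# Option 2: run it in the background and log the output
nohup nodetool cleanup my_keyspace > cleanup.log 2>&1 &

# Check progress from another shell; cleanup operations show up alongside compactions
nodetool compactionstats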
