The other day, while working with glusterfs, I found something strange. I/O was happening properly and all the brick processes were up on the nodes, but when I checked the status of the volume, certain bricks showed up as stopped! Strange, isn’t it?
We’ll explore what exactly caused this, and why you shouldn’t mess with ye olde sock files.
So, I started digging deep (well, a fancy term for going through the logs. If it weren’t for the logs, debugging would be a herculean task for any issue in any project!). Now, the brick processes are up and the I/O is happening as usual with no hiccups… so who is to blame?
Before jumping into the log analysis (a fancy term for grepping the logs), I had actually tried the usual:
# gluster vol start <volume-name> force
# systemctl restart glusterd
So, the brick processes were running and so was glusterd, but glusterd still thought that the brick process was not up.
After some time (banging my head on the logs… maybe we should have some log analysis tooling for glusterfs, rather than, you know, manually going through them?), I did see something peculiar in glusterd.log:
- The glusterd process had found the brick process to be already running (yes, because I didn’t kill it).
- It picks up the sock file in /var/run/gluster that is specific to the brick process and tries to establish a connection over it.
- There’s a message in which the status for the brick is set to Stopped!
So, naturally, the next step was to check what was wrong with the sock file, because that’s what is used for the communication.
And lo and behold… the sock file was missing (how or why, I don’t know; it was just missing… maybe some rogue script is to blame here). So that is why the status was being set to Stopped: glusterd couldn’t connect to the brick process.
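For anyone who wants to repeat that check quickly, here is a minimal sketch of how one might test a sock file path from the shell. The function name `sock_state` is made up for this post and is not part of glusterfs; the path is whatever sock file path you pull out of your own glusterd.log, so nothing about the file name is assumed here.

```shell
#!/bin/sh
# Hypothetical helper (not part of glusterfs): classify a path as a live
# UNIX socket file, some other kind of file, or missing altogether.
sock_state() {
    if [ -S "$1" ]; then
        # Path exists and is a socket file
        echo "present"
    elif [ -e "$1" ]; then
        # Something is there, but it is not a socket
        echo "not-a-socket"
    else
        # Nothing at that path: the case I ran into
        echo "missing"
    fi
}

# When run as a script, check every path given on the command line.
for path in "$@"; do
    printf '%s: %s\n' "$path" "$(sock_state "$path")"
done
```

Save it as a script and pass it the sock file path from the log. A result of `missing` for a brick whose process is still running would match the situation described above.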
I’ve raised an issue upstream and am thinking of adding a sanity check on the sock file path, which glusterd currently takes for granted.
We should probably check that it actually exists. We have our fair share of orphans and zombies in *nix; nobody needs a ghost now :p