Hadoop Namenode Override Issue

Ramprakash
Published in Analytics Vidhya
Aug 26, 2020 · 3 min read

Recently, I worked on setting up a Hadoop cluster on two nodes. I developed a bash script to automate the Hadoop installation on both the master and slave machines. The script performs all the required steps and configures all the files (core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, etc.) appropriately.
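For illustration, the part of such a script that writes core-site.xml might look like this; the hostname master-node and port are placeholders, not the values from my actual script:

# Hypothetical snippet: point every node's HDFS client at the namenode
# "master-node:9000" is a placeholder, not from the original script
cat > $HOME/hadoop/etc/hadoop/core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master-node:9000</value>
  </property>
</configuration>
EOF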

I ran the installer script and everything worked fine. A Hadoop cluster with two datanodes was set up.

The next day, I messed everything up by mistake. I ran the same installer script again on the namenode machine, which overrode the entire configuration setup on the master node. Even the namenode got formatted. Luckily, I didn't have any data in that cluster, so no harm was done.
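The formatting almost certainly came from a step like this in the script; re-running it wipes the namenode's metadata and generates a fresh clusterID:

$HOME/hadoop/bin/hdfs namenode -format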

After the override, I tried to start all the Hadoop services on all the nodes again using this command:

$HOME/hadoop/sbin/start-all.sh

I verified with the command below whether all the services had started on the namenode:

jps
[Screenshot: services running on the namenode]
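On a healthy master node in this setup (which also runs a datanode), the output looks roughly like this; the PIDs are illustrative:

1984 NameNode
2101 DataNode
2263 SecondaryNameNode
2437 ResourceManager
2605 NodeManager
2781 Jps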

When I checked on the datanode machine, the DataNode service had not started; only the NodeManager service was running:

[Screenshot: services running on the datanode]
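Roughly like this, with the DataNode process missing (PIDs illustrative):

2231 NodeManager
2410 Jps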

I tried stopping and starting all the Hadoop services from the namenode again, but the result was the same: the DataNode service would not start on the slave node, and the cluster was showing only one datanode.
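That restart was just the standard stop/start pair, run from the namenode:

$HOME/hadoop/sbin/stop-all.sh
$HOME/hadoop/sbin/start-all.sh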

I wasn't sure what to do, as I was facing this issue for the first time. Then I checked the log files on the slave node, and they showed an "Incompatible clusterIDs" error between the two machines.

The error looks like this:

[Screenshot: DataNode log file on the slave node]
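The DataNode log message for this failure typically looks like the following; the storage path and CID values here are placeholders:

java.io.IOException: Incompatible clusterIDs in /home/user/hadoop/data/dataNode: namenode clusterID = CID-aaaa-1111; datanode clusterID = CID-bbbb-2222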

What this means: when I first started the Hadoop cluster, an ID was generated for the cluster and populated on both the namenode and the datanode. But when I re-installed on the namenode, the clusterID was regenerated on the namenode and not on the datanode, which in turn caused this incompatible cluster ID issue.
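The clusterID lives in a VERSION file under each node's storage directory. The exact locations depend on dfs.namenode.name.dir and dfs.datanode.data.dir in hdfs-site.xml; the paths below are illustrative, not taken from my setup:

# On the namenode machine (path depends on dfs.namenode.name.dir)
grep clusterID $HOME/hadoop/data/nameNode/current/VERSION
# On the datanode machine (path depends on dfs.datanode.data.dir)
grep clusterID $HOME/hadoop/data/dataNode/current/VERSION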

So what I did was manually copy the clusterID value from the namenode into the datanode's VERSION file, and then start the Hadoop services on the datanode:
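You can make that edit by hand in the datanode's VERSION file, or with something like this (same illustrative paths as above):

# On the datanode machine: overwrite the stale clusterID with the namenode's value
# (paste the value read from the namenode's VERSION file, as shown above)
sed -i 's/clusterID=.*/clusterID=CID-paste-namenode-value-here/' $HOME/hadoop/data/dataNode/current/VERSION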

$HOME/hadoop/sbin/hadoop-daemon.sh start datanode

Now it's working fine, and the cluster shows two nodes.
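You can confirm the datanode count from the command line as well:

$HOME/hadoop/bin/hdfs dfsadmin -report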

So, if you face an issue where the namenode is not detecting a datanode, first check whether the clusterID is the same on both machines. If not, update it and try to start the datanode again.

Happy ending!
