Hadoop Installation on Ubuntu 16.04
Install Java 8 (example commands below)
Install openssh-server
Download Hadoop
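One way to get these on Ubuntu 16.04 (the webupd8team PPA below is an assumption; it matches the java-8-oracle JAVA_HOME path used later):
#add-apt-repository ppa:webupd8team/java
#apt-get update
#apt-get install oracle-java8-installer
#apt-get install openssh-server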
#adduser hadoop
#vi /etc/hosts
Enter the IP address and hostname of the master and slave nodes. The names must match the ones used in the Hadoop configs below,
e.g.:
192.168.1.x hadoop-master
192.168.1.x hadoop-slave-1
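A quick way to confirm the names resolve (hostnames as in the example above):
#ping -c 1 hadoop-master
#ping -c 1 hadoop-slave-1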
Copy the link location of the Hadoop tar file, which is already hosted on the FTP server:
=====================================================
#wget ftp://192.168.1.15/bigdata/hadoop-3.2.0.tar.gz
#ls -l
#tar -xvzf hadoop-3.2.0.tar.gz
#mv hadoop-3.2.0 /opt/hadoop
#chown -R hadoop:hadoop /opt/hadoop
#cd /opt/hadoop
#cd
#vi .bashrc
=======================ADD these LINES===========================
export JAVA_HOME=/usr/lib/jvm/java-8-oracle/jre
export HADOOP_HOME=/opt/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib/native"
Run this command to reload .bashrc:
#source .bashrc
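If the variables took effect, the hadoop command is now on the PATH; a quick check:
#hadoop version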
#vi /opt/hadoop/etc/hadoop/hadoop-env.sh
======================ADD this LINE=================================
export JAVA_HOME=/usr/lib/jvm/java-8-oracle/jre
#vi /opt/hadoop/etc/hadoop/core-site.xml
==============================ADD in between <configuration></configuration>====
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop-master:9000</value>
</property>
#vi /opt/hadoop/etc/hadoop/hdfs-site.xml
=============ADD in between <configuration></configuration>===================
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/opt/hadoop/hadoopdata/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/opt/hadoop/hadoopdata/hdfs/datanode</value>
</property>
#mkdir -p /opt/hadoop/hadoopdata/hdfs/namenode
#mkdir -p /opt/hadoop/hadoopdata/hdfs/datanode
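If these directories were created as root, hand them back to the hadoop user so the daemons can write to them:
#chown -R hadoop:hadoop /opt/hadoop/hadoopdata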
#vi /opt/hadoop/etc/hadoop/mapred-site.xml
==========ADD in between <configuration></configuration>======================
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
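On Hadoop 3.x, MapReduce jobs may also need HADOOP_MAPRED_HOME passed to the containers (otherwise they can fail with "Could not find or load main class MRAppMaster"); these properties are commonly added alongside the one above, using this guide's install path:
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/hadoop</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/hadoop</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=/opt/hadoop</value>
</property>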
#vi /opt/hadoop/etc/hadoop/yarn-site.xml
=======ADD in between <configuration></configuration>=========================
<property>
  <name>yarn.acl.enable</name>
  <value>0</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>hadoop-master</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
#vi /opt/hadoop/etc/hadoop/workers
Enter the slave hostname, one per line, e.g.:
hadoop-slave-1
#vi /opt/hadoop/etc/hadoop/copyconfig.sh
# copy this directory's config files to every node listed in workers
for node in $(cat workers); do
  scp * $node:/opt/hadoop/etc/hadoop/
done
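Run the script from /opt/hadoop/etc/hadoop (so that workers and * resolve), and only after the passwordless SSH setup below, or scp will prompt for a password on every file:
#cd /opt/hadoop/etc/hadoop
#sh copyconfig.sh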
#apt-get install openssh-server
#ssh-keygen -t rsa -P ""
#cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
#chmod 600 $HOME/.ssh/authorized_keys
#ssh localhost
Copy the public key to the slave host:
#ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@hadoop-slave-1
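Verify that the passwordless login works before continuing:
#ssh hadoop@hadoop-slave-1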
Go to the Hadoop bin directory and format the NameNode:
#cd /opt/hadoop/bin
#hdfs namenode -format
#cd ..
#sbin/start-dfs.sh
#sbin/start-yarn.sh
Finally, run jps to list the Java processes running in the Hadoop environment:
$jps
NameNode
ResourceManager
SecondaryNameNode
Jps
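The web UIs are another quick check (Hadoop 3.x default ports):
NameNode        http://hadoop-master:9870
ResourceManager http://hadoop-master:8088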
==================ON THE SLAVE================
#vi /etc/hosts
Enter the IP address and hostname of the master and slave nodes,
e.g.:
192.168.1.41 hadoop-master
192.168.1.43 hadoop-slave-1
Repeat the same configuration on the slave node, except do not edit the workers file
(/opt/hadoop/etc/hadoop/workers).
Everything else is configured exactly as on the master.
************Don't start or restart any Hadoop service on the slave***************
Restart the Hadoop services on the master:
IN MASTER
==========
cd /opt/hadoop/
sbin/start-all.sh
Then check on the slave node by running:
$ jps
NodeManager
Jps
DataNode
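Back on the master, confirm the slave's DataNode has registered with the NameNode:
#hdfs dfsadmin -report
The report should list one live datanode.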