Installation of Hadoop 2.7.3
Prerequisites for Hadoop :
Java
Install jdk-8 : sudo apt-get install openjdk-8-jdk
Set JAVA_HOME : gedit /etc/environment
Put the below line at the end of the file :
JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64"
check : java -version
If it shows a version number, Java has been installed successfully.
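The /etc/environment edit above can be sketched safely against a scratch copy (editing the real file needs sudo; the JDK path below is the example from this guide):

```shell
# Sketch: append JAVA_HOME to a scratch file instead of /etc/environment,
# so the real system file is untouched. The JDK path is the example above.
ENV_FILE="$(mktemp)"
echo 'JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64"' >> "$ENV_FILE"

# Show the line that would go at the end of /etc/environment.
grep JAVA_HOME "$ENV_FILE"
```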
SSH
Install ssh : sudo apt-get install ssh
Generate public key:
ssh-keygen -t rsa -P ""
Make generated public key authorized :
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
check : ssh localhost
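The SSH steps above can be sketched end to end. This is a minimal sketch that works in a scratch directory so an existing key in ~/.ssh is not overwritten; in a real install the files go under $HOME/.ssh:

```shell
# Sketch of the passwordless-SSH setup, run against a scratch
# directory instead of the real $HOME/.ssh so nothing is overwritten.
SSH_DIR="$(mktemp -d)/.ssh"
mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"

# Generate an RSA key pair with an empty passphrase (-P "").
ssh-keygen -t rsa -P "" -f "$SSH_DIR/id_rsa" -q

# Authorize the public key for login on this machine.
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"

ls "$SSH_DIR"
```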
Now we'll start the installation of Hadoop.
Step 1 : Download Hadoop
http://hadoop.apache.org/releases.html
Step 2 : Unpack the downloaded file.
tar -xvzf [path and name of file with extension]
Step 3 : Move the folder 'hadoop-2.7.3' to your 'home' directory.
Step 4 : Add the Hadoop variables to the 'bashrc' file
open bashrc file : gedit ~/.bashrc
At the end, paste the below lines :
# Start of Hadoop Variables
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
export HADOOP_HOME=/home/lee/Project/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
# End of Hadoop Variables
Note : You have to check only 2 things here :
1. Check that JAVA_HOME matches your Java installation path.
2. Check that HADOOP_HOME points to the folder where you unpacked Hadoop.
Change the paths accordingly.
Note : Run the below command to load the Hadoop variables into your current shell.
source ~/.bashrc
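One way to confirm the variables were picked up is to echo them after sourcing. The sketch below sources a stand-alone copy of the variable block (example paths from the listing above) so it can be tried without touching the real ~/.bashrc:

```shell
# Write the variable block to a scratch file and source it, mimicking
# what `source ~/.bashrc` does. Paths are the examples from Step 4.
VARS="$(mktemp)"
cat > "$VARS" <<'EOF'
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
export HADOOP_HOME=/home/lee/Project/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
EOF
. "$VARS"

# If sourcing worked, HADOOP_HOME is now set in this shell.
echo "$HADOOP_HOME"
```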
Step 5 : Make a directory where the Hadoop data will reside.
mkdir hadoop_store
Make a directory in hadoop_store.
mkdir hdfs
Make another 2 directories in hdfs.
mkdir namenode
mkdir datanode
Make a directory in the hadoop-2.7.3 folder.
mkdir tmp
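The directory layout of Step 5 can also be created in one command with `mkdir -p`. A sketch, using a scratch base directory in place of the real location (mine is /home/lee/Project):

```shell
# Create the whole storage layout at once. BASE stands in for the
# directory where hadoop-2.7.3 lives (e.g. /home/lee/Project).
BASE="$(mktemp -d)"
mkdir -p "$BASE/hadoop_store/hdfs/namenode" \
         "$BASE/hadoop_store/hdfs/datanode" \
         "$BASE/hadoop-2.7.3/tmp"

# List the created tree.
find "$BASE" -type d | sort
```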
Step 6 : Modify Hadoop Config Files
We are going to modify the following files :
hadoop-env.sh
hdfs-site.xml
core-site.xml
mapred-site.xml (created from mapred-site.xml.template)
Note : All files reside in hadoop-2.7.3/etc/hadoop
hadoop-env.sh :
open file : gedit hadoop-env.sh
Add the below line to the end of the file :
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
hdfs-site.xml :
open file : gedit hdfs-site.xml
Add the below lines between the <configuration> </configuration> tags.
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:######NAMENODE_FOLDER_PATH######</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:######DATANODE_FOLDER_PATH######</value>
</property>
Note : Set the paths of your namenode and datanode in the <value> tags.
In my case, it's like this :
For namenode : <value>file:/home/lee/Project/hadoop_store/hdfs/namenode</value>
For datanode : <value>file:/home/lee/Project/hadoop_store/hdfs/datanode</value>
core-site.xml :
open file : gedit core-site.xml
Add the below lines between the <configuration> </configuration> tags.
<property>
<name>hadoop.tmp.dir</name>
<value>######TMP_FOLDER_PATH######</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>
mapred-site.xml :
Before opening this file, run the below command :
cp mapred-site.xml.template mapred-site.xml
open file : gedit mapred-site.xml
Add the below lines between the <configuration> </configuration> tags.
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
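The config edits of Step 6 can also be scripted with heredocs. A sketch that writes stand-alone copies of core-site.xml and mapred-site.xml into a scratch directory (the real files live in hadoop-2.7.3/etc/hadoop; the ports and paths follow the examples above):

```shell
CONF="$(mktemp -d)"   # stand-in for hadoop-2.7.3/etc/hadoop

# core-site.xml : default filesystem and temp dir (example paths).
cat > "$CONF/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/lee/Project/hadoop-2.7.3/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>
EOF

# mapred-site.xml : job tracker address from the step above.
cat > "$CONF/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
  </property>
</configuration>
EOF

ls "$CONF"
```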
Step 7 :
- Format the Hadoop filesystem
Before we start Hadoop, we need to format the Hadoop filesystem.
$hadoop namenode -format
- Start Hadoop
$start-all.sh
- Run jps to see the running processes
$jps
- Stop Hadoop
$stop-all.sh
Visit localhost:50070 in a browser to see the NameNode web UI.