Umesh Jadhav
2 min read · Sep 16, 2016
Image source : http://www.kdnuggets.com/wp-content/uploads/big-data-visualization.jpg

Installation of Hadoop 2.7.3

Prerequisites for Hadoop :

Java

Install JDK 8 : sudo apt-get install openjdk-8-jdk
Set JAVA_HOME : gedit /etc/environment
Put the below line at the end of the file :
JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64"
Check : java -version
If it shows a version number, the installation was successful.
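If you prefer to skip gedit, the same edit can be scripted. A minimal sketch, writing to a local test copy rather than the real /etc/environment (on a real machine point ENV_FILE at /etc/environment and run with sudo):

```shell
# Append JAVA_HOME non-interactively (ENV_FILE is a stand-in for /etc/environment)
ENV_FILE=./environment.test
touch "$ENV_FILE"
echo 'JAVA_HOME="/usr/lib/jvm/java-8-openjdk-amd64"' >> "$ENV_FILE"
grep JAVA_HOME "$ENV_FILE"   # confirm the line landed
```

Note the straight quotes around the path; curly quotes pasted from a web page will not work in shell files.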

SSH

Install SSH : sudo apt-get install ssh
Generate a public key :

ssh-keygen -t rsa -P ""

Make the generated public key authorized :

cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
Check : ssh localhost
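The two key commands above can be combined into one re-runnable script. A sketch, using a scratch directory (SSH_DIR) so it is safe to experiment with; on a real machine use "$HOME/.ssh" instead:

```shell
# Generate and authorize a passphrase-less RSA key non-interactively
SSH_DIR=./ssh-test                       # stand-in for "$HOME/.ssh"
mkdir -p "$SSH_DIR" && chmod 700 "$SSH_DIR"
ssh-keygen -t rsa -P "" -f "$SSH_DIR/id_rsa" -q   # -q suppresses the banner
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"     # sshd rejects group/world-writable key files
ls "$SSH_DIR"
```

The chmod steps matter: sshd silently ignores authorized_keys with loose permissions, which is a common reason `ssh localhost` still asks for a password.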

Now we'll start the installation of Hadoop.

Step 1 : Download Hadoop
http://hadoop.apache.org/releases.html

Step 2 : Unpack the downloaded file.

tar -xvzf [path and name of file with extension]
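The same tar flags can be demonstrated on a tiny stand-in archive (the real download would be something like hadoop-2.7.3.tar.gz):

```shell
# Build a small sample tarball, then unpack it with the flags from step 2
mkdir -p hadoop-2.7.3 && echo demo > hadoop-2.7.3/README.txt
tar -czf hadoop-demo.tar.gz hadoop-2.7.3   # create the sample archive
rm -r hadoop-2.7.3
tar -xvzf hadoop-demo.tar.gz               # x=extract, v=verbose, z=gunzip, f=file
ls hadoop-2.7.3
```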

Step 3 : Move the folder 'hadoop-2.7.3' to your 'home' directory.

Step 4 : Configure the Hadoop variables in the '.bashrc' file

Open the bashrc file : gedit ~/.bashrc

At the end, paste the below lines :

# Start of Hadoop Variables

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
export HADOOP_HOME=/home/lee/Project/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

#End of Hadoop Variables

Note : You have to check only 2 things here :
1. JAVA_HOME : make sure it matches your JDK path.
2. HADOOP_HOME : make sure it points to the folder where you placed Hadoop.
Change them accordingly.

Note : Run the below command to load the Hadoop variables into your current session.
source ~/.bashrc
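A quick way to confirm the variables took effect in the current shell (the paths below are the ones from this guide; substitute your own HADOOP_HOME):

```shell
# Re-create the two critical exports and verify PATH now includes Hadoop's bin dirs
export HADOOP_HOME=/home/lee/Project/hadoop-2.7.3
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
echo "HADOOP_HOME=$HADOOP_HOME"
echo "$PATH" | tr ':' '\n' | grep hadoop-2.7.3   # should print the bin and sbin entries
```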

Step 5 : Make a directory where Hadoop will store its data.

mkdir hadoop_store

Make a directory in hadoop_store.

mkdir hdfs

Make another 2 directories in 'hdfs'.

mkdir namenode
mkdir datanode

Make a directory in hadoop-2.7.3 folder.

mkdir tmp
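The whole layout from step 5 can be created in two commands with `mkdir -p`, which builds any missing parent directories. BASE is relative here for illustration; on a real machine use the absolute path where you keep Hadoop (e.g. /home/lee/Project):

```shell
# Create hadoop_store/hdfs/{namenode,datanode} and the tmp dir in one go
BASE=.
mkdir -p "$BASE/hadoop_store/hdfs/namenode" "$BASE/hadoop_store/hdfs/datanode"
mkdir -p "$BASE/hadoop-2.7.3/tmp"
find "$BASE/hadoop_store" -type d   # list the layout that was created
```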

Step 6 : Modify Hadoop Config Files

We are going to modify the following files:

hadoop-env.sh

hdfs-site.xml

core-site.xml

mapred-site.xml (copied from mapred-site.xml.template)

Note : All of these files reside in hadoop-2.7.3/etc/hadoop

hadoop-env.sh :

Open the file : gedit hadoop-env.sh

Add the below line at the end of the file :

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/
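This edit, too, can be done from the command line instead of gedit. A sketch using a local test copy; the real file is hadoop-2.7.3/etc/hadoop/hadoop-env.sh:

```shell
# Append the JAVA_HOME export to hadoop-env.sh (test copy used here)
HADOOP_ENV=./hadoop-env.sh
touch "$HADOOP_ENV"
echo 'export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/' >> "$HADOOP_ENV"
tail -n 1 "$HADOOP_ENV"   # show the line that was added
```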

hdfs-site.xml :

Open the file : gedit hdfs-site.xml

Add the below lines in between the <configuration> </configuration> tags.

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:######NAMENODE_FOLDER_PATH######</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:######DATANODE_FOLDER_PATH######</value>
</property>

Note : Set your own namenode and datanode paths in the <value> tags.
In my case, it looks like this :
For namenode : <value>file:/home/lee/Project/hadoop_store/hdfs/namenode</value>
For datanode : <value>file:/home/lee/Project/hadoop_store/hdfs/datanode</value>

core-site.xml :

Open the file : gedit core-site.xml

Add the below lines in between the <configuration> </configuration> tags.

<property>
<name>hadoop.tmp.dir</name>
<value>######TMP_FOLDER_PATH######</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
</property>

mapred-site.xml :

Before opening this file, run the below command :

cp mapred-site.xml.template mapred-site.xml

Open the file : gedit mapred-site.xml

Add the below lines in between the <configuration> </configuration> tags.

<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
</property>
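After editing any of these XML files, a quick well-formedness check can save a confusing startup failure later, since a stray or unclosed tag will stop Hadoop from reading the file. A sketch using python3's standard-library parser on a sample fragment; point it at your real hdfs-site.xml, core-site.xml, and mapred-site.xml instead:

```shell
# Write a sample config fragment, then verify it parses as XML
cat > sample-site.xml <<'EOF'
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF
python3 -c "import xml.etree.ElementTree as ET; ET.parse('sample-site.xml'); print('XML OK')"
```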

Step 7 : Run Hadoop

  • Format the Hadoop filesystem

Before we start Hadoop, we need to format the Hadoop filesystem.

$hdfs namenode -format

  • Start Hadoop

$start-all.sh

  • Run jps to see the running processes

$jps

  • Stop Hadoop

$stop-all.sh

Visit localhost:50070 in your browser to see the NameNode web UI.

Hadoop installation completes here :)