How to Install Hadoop (2.7.3) on Ubuntu 14.04

  1. sudo apt-get update "updates the list of available packages & their versions"
  2. sudo apt-get upgrade "installs newer versions of the packages you have"
  3. "to install Oracle Java 8"
  • sudo add-apt-repository ppa:webupd8team/java
  • sudo apt-get update
  • sudo apt-get install oracle-java8-installer
  • java -version "if it's installed correctly it will show you the version of Java you installed"

4. "to add a hadoop user"

  • sudo addgroup hadoop
  • sudo adduser --ingroup hadoop hduser "fill in the details; it's optional though"
  • "now open another terminal & in that"
  1. sudo su root
  2. cd
  3. sudo gedit /etc/sudoers (editing /etc/sudoers directly is risky; sudo visudo is the safer way to open it)
  4. (a text file will open; write the line below under the user privilege specification)

hduser ALL=(ALL:ALL) ALL

"now save the text file and close the new terminal & text file"

5. "to install & configure openssh"

  • sudo apt-get install openssh-server
  • sudo su hduser
  • cd
  • ssh-keygen -t rsa -P "" "if asked for a file name just press enter"
  • cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
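One thing that trips people up here is file permissions: sshd refuses passwordless login if ~/.ssh or authorized_keys is too open. A minimal sketch of the authorization step, rehearsed in a throwaway directory (the key line is a placeholder, not a real key):

```shell
# Hedged sketch: rehearse the key-authorization step in a throwaway
# directory, since sshd silently ignores keys with loose permissions.
SSH_DIR="$(mktemp -d)/.ssh"
mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"
# Stand-in public key; ssh-keygen -t rsa -P "" creates the real one.
echo "ssh-rsa AAAAB3...placeholder hduser@ubuntu" > "$SSH_DIR/id_rsa.pub"
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
chmod 600 "$SSH_DIR/authorized_keys"
stat -c '%a %n' "$SSH_DIR/authorized_keys"
```

After doing this for the real ~/.ssh, running ssh localhost as hduser should log in without asking for a password.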

6. sudo gedit /etc/sysctl.conf

(put the code below in the text file)

#disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1

(save & close the text file)

  • sudo reboot
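The edit above can be dry-run safely before touching the real /etc/sysctl.conf. A sketch that appends the same lines to a temporary copy and confirms they landed:

```shell
# Hedged sketch: append the ipv6-disabling lines to a temp copy of
# sysctl.conf instead of the real file, then confirm they are present.
SYSCTL_COPY=$(mktemp)
cat >> "$SYSCTL_COPY" <<'EOF'
#disable ipv6
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
EOF
grep -c 'disable_ipv6 = 1' "$SYSCTL_COPY"
```

After the reboot, cat /proc/sys/net/ipv6/conf/all/disable_ipv6 should print 1 if IPv6 is actually disabled.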

7. After the reboot is done, download Hadoop 2.7.3 from the official Hadoop website.

8. After the download is done, copy the archive to the desktop & extract it there. Open a terminal:

  • sudo su hduser
  • cd
  • sudo mv '<location of your extracted folder>' /usr/local/hadoop
  • sudo chown hduser:hadoop -R /usr/local/hadoop
  • sudo mkdir -p /usr/local/hadoop_tmp/hdfs/namenode
  • sudo mkdir -p /usr/local/hadoop_tmp/hdfs/datanode
  • sudo chown hduser:hadoop -R /usr/local/hadoop_tmp/
  • sudo gedit .bashrc
  • "append the following code at the end of the file"

#-- HADOOP ENVIRONMENT VARIABLES START --#
export JAVA_HOME=/usr/lib/jvm/java-8-oracle
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
#-- HADOOP ENVIRONMENT VARIABLES END --#

9. source .bashrc "to load the new environment variables into the current shell; they otherwise apply only to newly opened terminals"

10. hadoop version "if the PATH is set correctly it will print the installed hadoop version"
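The exports above only take effect in a re-sourced or freshly opened shell. A self-contained way to check them, using a temp file in place of the real .bashrc (only the two PATH-relevant exports are reproduced here):

```shell
# Hedged sketch: write the key exports to a throwaway file, source it in
# a subshell, and confirm the variables resolve as expected.
TMP_RC=$(mktemp)
cat > "$TMP_RC" <<'EOF'
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
EOF
( . "$TMP_RC"
  echo "HADOOP_HOME=$HADOOP_HOME"
  case ":$PATH:" in
    *":$HADOOP_HOME/bin:"*) echo "bin on PATH" ;;
  esac )
```

The same two checks against the real .bashrc tell you whether step 19's start-dfs.sh / start-yarn.sh will be found without typing full paths.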

11. cd /usr/local/hadoop/etc/hadoop

12. sudo gedit hadoop-env.sh

(change the Java implementation to use, like below)

#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/usr/lib/jvm/java-8-oracle

(save & close the file)

13. sudo gedit core-site.xml

(write the following code inside the configuration tag of the xml file)

<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
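In Hadoop 2.x this property is officially named fs.defaultFS; fs.default.name is the older, deprecated alias but is still accepted in 2.7.3. A sketch that writes the snippet to a temp file (standing in for the real core-site.xml) and pulls the HDFS URI back out as a sanity check:

```shell
# Hedged sketch: write the core-site.xml snippet to a temp file and
# sanity-check the HDFS URI before trusting it on a real cluster.
CORE_SITE=$(mktemp)
cat > "$CORE_SITE" <<'EOF'
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
EOF
grep -o 'hdfs://[^<]*' "$CORE_SITE"
```

This URI (localhost:9000) is what later commands like hdfs dfs -ls will connect to.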

14. sudo gedit hdfs-site.xml

(write the following code inside the configuration tag)

<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_tmp/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_tmp/hdfs/datanode</value>
</property>
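The two dir properties must point at the directories created in step 8, or the NameNode and DataNode will fail to start. A sketch that extracts the file: values from an hdfs-site.xml fragment and checks the directories exist, with a temp root standing in for /usr/local/hadoop_tmp:

```shell
# Hedged sketch: extract the dfs.*.dir values from an hdfs-site.xml
# fragment and check that each directory exists (temp root stands in
# for /usr/local/hadoop_tmp).
ROOT=$(mktemp -d)
mkdir -p "$ROOT/hdfs/namenode" "$ROOT/hdfs/datanode"
HDFS_SITE=$(mktemp)
cat > "$HDFS_SITE" <<EOF
<configuration>
<property><name>dfs.namenode.name.dir</name><value>file:$ROOT/hdfs/namenode</value></property>
<property><name>dfs.datanode.data.dir</name><value>file:$ROOT/hdfs/datanode</value></property>
</configuration>
EOF
grep -o 'file:[^<]*' "$HDFS_SITE" | while read -r v; do
  dir=${v#file:}
  [ -d "$dir" ] && echo "exists: $dir" || echo "MISSING: $dir"
done
```

Run the same grep-and-check against the real file if the namenode format in step 19 complains about storage directories.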

15. sudo gedit yarn-site.xml

(write the following code inside the configuration tag)

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

16. cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml

17. sudo gedit mapred-site.xml

(write following code into the configuration tag)

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
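Steps 16 and 17 (copy the template, then add the property) can be scripted instead of done by hand in gedit. A sketch on temp files, using GNU sed to inject the property right after the opening configuration tag:

```shell
# Hedged sketch: mimic steps 16-17 on temp files -- copy a bare
# template, then insert the mapreduce.framework.name property.
TEMPLATE=$(mktemp)
printf '<configuration>\n</configuration>\n' > "$TEMPLATE"
SITE=$(mktemp)
cp "$TEMPLATE" "$SITE"
# GNU sed: \n in the replacement expands to a newline
sed -i 's|<configuration>|<configuration>\n<property>\n<name>mapreduce.framework.name</name>\n<value>yarn</value>\n</property>|' "$SITE"
grep -A1 'mapreduce.framework.name' "$SITE"
```

Without this property MapReduce jobs run in the default local mode instead of on YARN.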

18. cd

19. "now to run hadoop"

  • hdfs namenode -format "if you see a pile of log output on screen then it is set up correctly"
  • cd /usr/local/hadoop
  • start-dfs.sh "it may ask whether you want to continue; type yes"
  • start-yarn.sh "to start the mapreduce services"
  • now to see if hadoop started correctly, use the command 'jps' "it should list NameNode, DataNode, SecondaryNameNode, ResourceManager & NodeManager (plus Jps itself)"
  • if you see all of these, hadoop has started correctly
  • now go to your browser & in the url type 'localhost:8088'; it will take you to the cluster page of hadoop where you can monitor your hadoop resources
  • open a new browser tab & in the url type 'localhost:50070'; it will take you to the namenode page of your hadoop system
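The jps check above can be turned into a small loop. A sketch with the jps output hard-coded as a sample (process IDs are made up) so the loop itself can be exercised without a running cluster; on a real system replace the sample with the output of jps:

```shell
# Hedged sketch: verify that every expected daemon appears in `jps`
# output. Sample output is hard-coded so this runs without a cluster.
JPS_OUTPUT='2481 NameNode
2650 DataNode
2873 SecondaryNameNode
3022 ResourceManager
3156 NodeManager'
for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
  if echo "$JPS_OUTPUT" | grep -qw "$daemon"; then
    echo "running: $daemon"
  else
    echo "missing: $daemon"
  fi
done
```

If DataNode is the one missing, the usual culprit is a stale /usr/local/hadoop_tmp/hdfs/datanode left over from an earlier namenode -format.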

And with that, the installation of Hadoop is done.
