Installing Hadoop on Ubuntu 20.04
Below I wrap up the installation process. This setup is good for experiments, NOT for production at all.
What you need to do
- Install Java
- Download Hadoop
- Set environment
- Edit Hadoop XML
If it succeeds you will see
- localhost:8088 → See Hadoop icon screen
- localhost:9870 → See cluster status screen
Update and search for the new JDK.
If you are not familiar with Java, ignore its terminology; we only need the JDK.
sudo apt update
sudo apt-cache search openjdk
The latest LTS at the time of writing is 11, so I will install 11.
sudo apt install openjdk-11-jdk
java -version
Visit the link below. On the command line, use wget <link> to download the archive, then extract it to your home directory.
Choose the newer version. Here it is 3.3.1, then choose the
We suggest the following site for your download: https://dlcdn.apache.org/hadoop/common/ Alternate download locations…
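A minimal sketch of the download step, assuming the tarball sits under the URL above in the usual hadoop-<version>/hadoop-<version>.tar.gz layout. The helper names hadoop_tarball_url and fetch_hadoop are made up for illustration, and older releases may move off this site over time, so open the link in a browser first to confirm the file is there.

```shell
# Hypothetical helper: build the tarball URL for a given Hadoop version.
# Assumes the layout <base>/hadoop-<version>/hadoop-<version>.tar.gz.
hadoop_tarball_url() {
    echo "https://dlcdn.apache.org/hadoop/common/hadoop-$1/hadoop-$1.tar.gz"
}

# Hypothetical helper: download the tarball and extract it into the home directory.
fetch_hadoop() {
    url=$(hadoop_tarball_url "$1")
    wget "$url" -O "/tmp/hadoop-$1.tar.gz" &&
        tar -xzf "/tmp/hadoop-$1.tar.gz" -C "$HOME"
}

# Print the URL that would be downloaded for 3.3.1; nothing is fetched here.
hadoop_tarball_url 3.3.1
```

Calling fetch_hadoop 3.3.1 afterwards should leave a hadoop-3.3.1 directory in your home directory.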
Set the variables for Hadoop, and also the path for convenient calling of Hadoop commands, in
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64/
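The exports above use $HADOOP_HOME, so it has to be defined before them. A sketch of the two missing lines, assuming Hadoop was extracted to ~/hadoop-3.3.1 (my assumption, adjust it to your own path) and that you keep these lines in a shell startup file such as ~/.bashrc:

```shell
# Assumed install location; change it if you extracted Hadoop elsewhere.
export HADOOP_HOME="$HOME/hadoop-3.3.1"
# bin holds hadoop and hdfs, sbin holds start-dfs.sh, start-yarn.sh and friends.
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```

Open a new terminal, or source the file you edited, so the current shell picks up the change.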
At this point you should be able to call the following binary from anywhere
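A quick way to check this, sketched with a made-up helper named on_path. It only looks the commands up on PATH and does not run them; the four names are the ones this guide relies on later.

```shell
# Hypothetical helper: succeed if a command can be found on PATH.
on_path() {
    command -v "$1" >/dev/null 2>&1
}

# Report each Hadoop entry point; nothing is executed, only looked up.
for cmd in hadoop hdfs start-dfs.sh start-yarn.sh; do
    if on_path "$cmd"; then
        echo "found:   $cmd"
    else
        echo "missing: $cmd"
    fi
done
```

If anything is reported missing, recheck the PATH line and reopen the terminal.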
Edit Hadoop XML and Start
I think the Hadoop page is good enough already. Link below.
A quick overview will make it easier to read:
- Hadoop will ssh to localhost, so you need to set up an SSH key
- You need pseudo-distributed mode
- Copy paste XML from Hadoop guide
- Start DFS
- Start YARN
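The first three points can be sketched as below. This is my reading of the Hadoop single-node guide, not a replacement for it: the helper names are made up, the values hdfs://localhost:9000 and replication 1 are the ones I recall from the guide, and the config directory is assumed to be $HADOOP_HOME/etc/hadoop. Note that write_pseudo_conf overwrites the two XML files.

```shell
# Hypothetical helper: create a passphrase-less key and authorize it for localhost.
setup_localhost_ssh() {
    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
    [ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -P '' -f "$HOME/.ssh/id_rsa"
    cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
    chmod 0600 "$HOME/.ssh/authorized_keys"
}

# Hypothetical helper: write the two pseudo-distributed config files into a
# Hadoop config directory (assumed to be $HADOOP_HOME/etc/hadoop).
write_pseudo_conf() {
    conf_dir=$1
    # core-site.xml: point the default filesystem at a local HDFS namenode.
    cat > "$conf_dir/core-site.xml" <<'EOF'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
EOF
    # hdfs-site.xml: a single node can only hold one replica of each block.
    cat > "$conf_dir/hdfs-site.xml" <<'EOF'
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
EOF
}

# Only touch the real config when HADOOP_HOME points at an install.
if [ -n "${HADOOP_HOME:-}" ] && [ -d "$HADOOP_HOME/etc/hadoop" ]; then
    write_pseudo_conf "$HADOOP_HOME/etc/hadoop"
fi
```

Run setup_localhost_ssh once, then confirm that ssh localhost logs in without asking for a password. The guide's YARN section lists further XML files if you want to run MapReduce jobs on YARN.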
Hadoop: Setting up a Single Node Cluster.
This document describes how to set up and configure a single-node Hadoop installation so that you can quickly perform…
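The start sequence from that guide, sketched with a dry-run switch so you can preview it first. The run wrapper, the DRY_RUN variable and start_pseudo_cluster are my own illustrative names; the four commands themselves come with Hadoop and the JDK.

```shell
# Hypothetical wrapper: with DRY_RUN set, print each command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: first-time start of a pseudo-distributed node.
start_pseudo_cluster() {
    run hdfs namenode -format &&   # formats HDFS metadata, first start only
        run start-dfs.sh &&        # starts the HDFS daemons
        run start-yarn.sh &&       # starts the YARN daemons
        run jps                    # lists the running Java processes
}

# Preview the sequence without touching anything.
(DRY_RUN=1; start_pseudo_cluster)
```

Call start_pseudo_cluster without DRY_RUN to run it for real. On later starts, skip the format step and call start-dfs.sh and start-yarn.sh directly.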
Common error: JAVA_HOME not found
JAVA_HOME needs to be set in
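A sketch of that fix, assuming the file in question is Hadoop's own environment file, etc/hadoop/hadoop-env.sh under the Hadoop directory, which is where the single-node guide sets JAVA_HOME. The helper name set_java_home is made up. A likely reason the export in your shell is not enough is that the daemons are started over ssh in a non-interactive shell, though I have not verified that on every setup.

```shell
# Hypothetical helper: append a JAVA_HOME export to a Hadoop env file.
set_java_home() {
    env_file=$1
    java_home=$2
    # Skip if an uncommented JAVA_HOME export is already present.
    if grep -q '^export JAVA_HOME=' "$env_file" 2>/dev/null; then
        return 0
    fi
    echo "export JAVA_HOME=$java_home" >> "$env_file"
}

# Apply it only when HADOOP_HOME points at an install.
if [ -n "${HADOOP_HOME:-}" ] && [ -f "$HADOOP_HOME/etc/hadoop/hadoop-env.sh" ]; then
    set_java_home "$HADOOP_HOME/etc/hadoop/hadoop-env.sh" /usr/lib/jvm/java-11-openjdk-amd64/
fi
```

The helper is safe to run twice: it leaves the file alone once an export line exists.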
Common error 2: Cannot start YARN
resourcemanager is running as process 48888. Stop it first and ensure /tmp/hadoop-hadoop-resourcemanager.pid file is empty before retry
The resource manager is still running despite stop-dfs.sh, so you need to stop ALL
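One way to do that, sketched with two made-up helpers. stop-yarn.sh and stop-dfs.sh ship in Hadoop's sbin directory; the pid file path is copied from the error message above and contains the user name (hadoop here), so yours may differ.

```shell
# Hypothetical helper: succeed if the pid stored in a pid file is still alive.
# kill -0 sends no signal; it only checks a process you are allowed to signal.
pid_file_alive() {
    pid=$(cat "$1" 2>/dev/null)
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Hypothetical helper: stop YARN and HDFS, then report a leftover ResourceManager.
stop_everything() {
    stop-yarn.sh
    stop-dfs.sh
    # Pid file path taken from the error message; adjust the user name part.
    if pid_file_alive /tmp/hadoop-hadoop-resourcemanager.pid; then
        echo "resourcemanager is still running"
    else
        echo "resourcemanager is stopped"
    fi
}
```

Run stop_everything, and start YARN again only once it reports that the resource manager is stopped.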
Just leave the processes running like that; it seems we do not need to run them with service as we usually do.
Check the status page
Finally, you should see the two status pages at localhost:8088 and localhost:9870.
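If you are on a machine without a browser, a rough check from the terminal is sketched below. It assumes curl is installed; page_up is a made-up helper name, and the two ports are the ones listed at the top of this post.

```shell
# Hypothetical helper: succeed if a URL answers with a non-error HTTP status.
page_up() {
    curl -sf -o /dev/null --max-time 3 "$1"
}

# Probe both status pages and report each one.
for url in http://localhost:8088 http://localhost:9870; do
    if page_up "$url"; then
        echo "up:   $url"
    else
        echo "down: $url"
    fi
done
```

This only tells you that something answers on the port, so still open the pages once to confirm they look right.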
Some tips if you deploy it on the server.
Use ssh to forward the ports down to your localhost, then open them in your browser.
ssh -L 9870:localhost:9870 -nNT ubuntu@<your-server-ip>
ssh -L 8088:localhost:8088 -nNT ubuntu@<your-server-ip>
Hope this helps!