The post has been updated to install the latest version of Cloudera Manager 6.3.1
Part 1
- Creating a cluster with 4 nodes on GCP
- Installing JAVA
- Firewall Configurations
- Installing Cloudera Manager
- Cloudera Manager — Cluster Installation
- Cloudera Manager — Cluster Configuration
Assumptions:
- You have a Google Cloud account. If not, click here to create a free-tier Google Cloud account. This will give you USD 300 of free credit
- Manual installation of Cloudera Manager without Google’s Dataproc functionality
Creating a cluster with 4 nodes on GCP
Once you have created a Google Cloud account, navigate to the console and hit the drop-down for “Select a project”
Now on the top-right, hit “NEW PROJECT”. Add a “project name” and click save. Leave “organization” as-is
From the navigation menu on the left, select Compute Engine -> VM Instances as shown
“Create” a new VM Instance
Add a generic name for the instance. I generally do instance-1 or instance-001 and continue the numbers consecutively
Select “us-central1 (Iowa)” region with the “us-central1-a” zone. This seems to be the cheapest option available
The “n1” series of general-purpose machine type is the cheapest option
Under machine type, select “Custom” with 2 vCPU cores and 12 GB of RAM. Please note that the free-tier usage policy limits the number of cores and the total RAM
Under “Boot disk”, select CentOS 7 as the OS and 100 GB as storage
Under Identity and API access, leave the access scopes as-is
Under Firewall, select both boxes to enable HTTP and HTTPS traffic
Repeat the steps above to create 4 nodes each with the same configuration
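For readers who prefer the command line over the console, the steps above can be sketched with the gcloud CLI. This is a rough equivalent rather than what the post itself uses: it assumes gcloud is installed and pointed at your project, that the centos-7 image family in the centos-cloud project is still available, and that the http-server and https-server network tags stand in for the two firewall checkboxes. The run helper and the DRY_RUN variable are made up for this sketch.

```shell
#!/usr/bin/env bash
# Sketch: create 4 identical nodes with the settings described above.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to execute them.
ZONE="us-central1-a"

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

create_nodes() {
  for i in 1 2 3 4; do
    # 2 vCPUs, 12 GB RAM, CentOS 7 image, 100 GB boot disk, HTTP/HTTPS tags
    run gcloud compute instances create "instance-$i" \
      --zone="$ZONE" \
      --custom-cpu=2 --custom-memory=12GB \
      --image-family=centos-7 --image-project=centos-cloud \
      --boot-disk-size=100GB \
      --tags=http-server,https-server
  done
}

create_nodes
```

Review the printed commands first, then rerun with DRY_RUN=0 once they look right for your project.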
In the SSH drop-down, select “Open in browser window”. Repeat for all nodes. Enter the commands:
sudo su -
vi /etc/selinux/config
Inside the config file, set SELINUX=disabled
vi /etc/ssh/sshd_config
Under Authentication, change
PermitRootLogin yes
Once the SSH keys are set up in the steps below, this lets us log in to instance-2/3/4 from instance-1 as root without a password
Ensure that you’ve done the above steps on all nodes, then reboot all 4 nodes
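If editing the two files by hand in vi on every node gets tedious, the same changes can be made with sed. The sketch below is one possible way to do it, not part of the original walkthrough; the function names are made up here, and each call leaves a .bak copy of the file it touches.

```shell
#!/usr/bin/env bash
# Sketch: apply the two config edits described above without opening an editor.

set_selinux_disabled() {
  # $1: path to the SELinux config file (normally /etc/selinux/config).
  # Only the SELINUX= line is rewritten; SELINUXTYPE= is left alone.
  sed -i.bak 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

permit_root_login() {
  # $1: path to the sshd config file (normally /etc/ssh/sshd_config).
  # Rewrites an existing PermitRootLogin line, commented out or not.
  sed -E -i.bak 's/^#?[[:space:]]*PermitRootLogin[[:space:]].*/PermitRootLogin yes/' "$1"
}

# Example usage, as root on each node:
#   set_selinux_disabled /etc/selinux/config
#   permit_root_login /etc/ssh/sshd_config
```

Note that these functions only rewrite a line that already exists. If a file has no PermitRootLogin line at all, it still has to be added by hand.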
Log back in to instance-1 as the root user and enter:
ssh-keygen
Hit enter three times and your keys will be generated under /root/.ssh/
In instance-1, as root user:
cd /root/.ssh
cat id_rsa.pub
And copy the public key
In the Cloud Console menu, go to Metadata -> SSH Keys -> Edit -> Add item, paste the public key and save
Now, in the terminal, on all nodes:
service sshd restart
From instance-1:
ssh instance-2
Type “yes” to establish the connection
Repeat for instance-3 and 4
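After the first connection to each node has been accepted, a quick loop can confirm that passwordless login really works everywhere. This is a small sketch, not part of the original steps: the check_nodes function and the SSH_CMD variable are names invented here, and BatchMode=yes makes ssh fail instead of prompting for a password.

```shell
#!/usr/bin/env bash
# Sketch: verify passwordless SSH from instance-1 to the other nodes.
NODES="instance-2 instance-3 instance-4"
SSH_CMD="${SSH_CMD:-ssh}"   # overridable, mainly so the loop can be tried without a cluster

check_nodes() {
  failed=0
  for node in $NODES; do
    # BatchMode=yes: never prompt; a password prompt counts as a failure
    if $SSH_CMD -o BatchMode=yes "$node" hostname >/dev/null 2>&1; then
      echo "$node: ok"
    else
      echo "$node: FAILED"
      failed=1
    fi
  done
  return $failed
}

# Example usage, as root on instance-1:
#   check_nodes
```

A FAILED line usually points back at the key in the metadata or at the PermitRootLogin setting on that node.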
Cluster setup is completed for 4 nodes on Google Cloud Platform
Installing JAVA
To install Java, please visit this link. It walks through downloading and installing Java on instance-1
Let's install it on the other nodes now:
Copy the jdk….rpm to the other nodes:
scp jdk….rpm instance-2:/tmp
scp jdk….rpm instance-3:/tmp
scp jdk….rpm instance-4:/tmp
Let's navigate to instance-2 and run the following commands:
ssh instance-2
cd /tmp
rpm -ivh jdk….rpm
Repeat the same steps on instance-3/4
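The copy-and-install steps can also be wrapped in a loop so they do not have to be repeated by hand on each node. The sketch below assumes passwordless root SSH from instance-1 is already working. JDK_RPM is a placeholder that must be set to the actual file name of the rpm you downloaded (the jdk.rpm default is only a stand-in), and the run helper and DRY_RUN variable are made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: copy the JDK rpm to each remaining node and install it there.
# JDK_RPM is a placeholder; set it to the real file name downloaded on instance-1.
JDK_RPM="${JDK_RPM:-jdk.rpm}"
NODES="instance-2 instance-3 instance-4"

# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

install_java_on_nodes() {
  for node in $NODES; do
    run scp "$JDK_RPM" "$node:/tmp"
    run ssh "$node" "rpm -ivh /tmp/$(basename "$JDK_RPM")"
  done
}

install_java_on_nodes
```

Run it from the directory holding the rpm on instance-1, for example with JDK_RPM set to your file name and DRY_RUN=0.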
Java is installed on all 4 nodes on Google Cloud Platform
Installing Cloudera Manager
Head over to the Cloudera Manager Downloads page, enter your details and hit download. Copy the download link
To install CM, download the installer with wget, make it executable and run it:
wget <paste the copied link here>
chmod u+x cloudera-manager-installer.bin
sudo ./cloudera-manager-installer.bin
The installer window will now open. Hit Next and accept all licenses
This launches the Cloudera Manager Login Page. Use admin/admin as credentials
Here’s the Cloudera Manager Homepage
Accept the licenses
There are other options, but the 60-day Enterprise trial seems to be the best choice
Firewall Configurations
Please review this link to set up the required firewall rules
This completes part 1 of the Cloudera Manager Installation on Google Cloud Platform. For part 2, please visit this link