Integrating LVM with Hadoop and providing elasticity to DataNode storage on AWS

We start by launching an AWS EC2 instance on which we will carry out the practical.

After that, configure Hadoop on the instance by downloading and installing the Java and Hadoop RPM files. You also need to make the required changes to hdfs-site.xml and core-site.xml. These are the prerequisites.
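A minimal sketch of the two configuration files, assuming a Hadoop 1.x RPM install, a NameNode at a placeholder address, and /datanodes as the DataNode directory (the mount point used later in this practical):

```xml
<!-- core-site.xml: point the cluster at the NameNode.
     <namenode-ip> is a placeholder for your NameNode's address. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://<namenode-ip>:9001</value>
  </property>
</configuration>

<!-- hdfs-site.xml (on the DataNode): the directory where HDFS
     stores its blocks; this is where we will mount the LVM volume. -->
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/datanodes</value>
  </property>
</configuration>
```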

You also need to attach two EBS volumes of the desired size to the EC2 instance; I have attached one of 3 GB and another of 5 GB.

Let’s start with the practical:

Here I have two disks attached to my system: /dev/xvdf of 3 GB and /dev/xvdg of 5 GB. Running fdisk -l lists the disks attached to the system.
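For example, on the instance (as root):

```shell
# List all block devices attached to the instance; the two new
# EBS volumes show up as /dev/xvdf (3 GB) and /dev/xvdg (5 GB)
fdisk -l

# lsblk gives a more compact summary of the same devices
lsblk
```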

Now we will create physical volumes from /dev/xvdf and /dev/xvdg using the pvcreate command.

pvcreate /dev/xvdf creates a physical volume from /dev/xvdf (repeat the command for /dev/xvdg).

pvdisplay /dev/xvdf displays the details of the physical volume.
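The physical-volume step for both disks, sketched as root:

```shell
# Initialize each disk as an LVM physical volume
pvcreate /dev/xvdf
pvcreate /dev/xvdg

# Verify: PV name, size, and whether it is allocated to a VG yet
pvdisplay /dev/xvdf
pvdisplay /dev/xvdg
```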

Now we will create a volume group that combines both physical volumes into one group, i.e. we get a total of 3 + 5 = 8 GB in the volume group.

vgcreate volume_group_1 /dev/xvdf /dev/xvdg creates a volume group named volume_group_1 with 8 GB of space, combining the space of /dev/xvdf and /dev/xvdg.
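The volume-group step, sketched as root:

```shell
# Combine both physical volumes into one ~8 GB volume group
vgcreate volume_group_1 /dev/xvdf /dev/xvdg

# Confirm the total size and free extents of the group
vgdisplay volume_group_1
```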

Now we have a total of 8 GB in our volume group, and we want to create a 6 GB logical volume from it.

lvcreate --size 6G --name Lvol1 volume_group_1 creates a logical volume of size 6 GB named Lvol1, carved out of the volume group volume_group_1.
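The logical-volume step, sketched as root:

```shell
# Create a 6 GB logical volume named Lvol1 inside volume_group_1
lvcreate --size 6G --name Lvol1 volume_group_1

# Inspect the new logical volume
lvdisplay /dev/volume_group_1/Lvol1
```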

To use this storage we need to format the logical volume and mount it on our DataNode directory, making the storage flexible to use.

mkfs.ext4 /dev/volume_group_1/Lvol1 formats the logical volume with an ext4 filesystem.

Then we configure the Hadoop DataNode to contribute this 6 GB of storage to the cluster.

mount /dev/volume_group_1/Lvol1 /datanodes mounts the logical volume on the /datanodes directory, providing the storage to our DataNode.
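The format-and-mount step, sketched as root (assuming /datanodes is the directory set in hdfs-site.xml):

```shell
# Format the logical volume with ext4 and mount it on the
# DataNode directory
mkfs.ext4 /dev/volume_group_1/Lvol1
mkdir -p /datanodes
mount /dev/volume_group_1/Lvol1 /datanodes

# Verify the mount point, filesystem type, and size
df -hT /datanodes
```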

df -hT shows the details of our filesystems, and we can see that 6 GB is provided to the DataNode directory.

Now we want to extend our storage from 6 GB to 7 GB: lvextend --size +1G /dev/volume_group_1/Lvol1

This extends the logical volume by 1 GB.

We can observe the change in the logical volume, but df -hT still shows the filesystem at 6 GB,

because we haven't yet resized the filesystem to cover the extended part. Running resize2fs /dev/volume_group_1/Lvol1 grows the ext4 filesystem to fill the logical volume, and now we can see the extended size of our DataNode directory.
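The online-extend step, sketched as root:

```shell
# Grow the logical volume by 1 GB, then grow the ext4 filesystem
# to fill it; ext4 can be grown online, so no unmount is needed
lvextend --size +1G /dev/volume_group_1/Lvol1
resize2fs /dev/volume_group_1/Lvol1

# /datanodes now reports the extended size
df -hT /datanodes
```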

Note: similarly, by using the lvreduce command together with resize2fs we can reduce the size of our logical volume. Unlike growing, shrinking ext4 must be done with the filesystem unmounted, and the filesystem must be shrunk before the logical volume.
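A sketch of the reverse operation, assuming we shrink back to 5 GB (an illustrative size); the order matters, because shrinking the LV below the filesystem size destroys data, so back up first:

```shell
# Shrink the DataNode volume from 7 GB to 5 GB
umount /datanodes                             # ext4 cannot be shrunk online
e2fsck -f /dev/volume_group_1/Lvol1           # check required before shrinking
resize2fs /dev/volume_group_1/Lvol1 5G        # shrink the filesystem FIRST
lvreduce --size 5G /dev/volume_group_1/Lvol1  # then shrink the LV to match
mount /dev/volume_group_1/Lvol1 /datanodes    # remount for the DataNode
```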

Thanks for reading!

Hope you find this helpful.