Loading a CSV Into an HBase Table in a Kerberized Hadoop Cluster

Deepesh Tripathi
2 min read · Jun 29, 2020


(Image credit: hbase.apache.org)

Looking for a quick, step-by-step method to bulk-load data into an HBase table in a Kerberos-enabled Hadoop cluster? This post walks through every step needed to load a CSV data file into a "kerberized" cluster.

While testing HDFS encryption, I recently needed to pump a batch of data into an HBase table in one go. I have documented the process here for anyone with the same requirement, so let's get started without wasting much time.

Copy the CSV Data to HDFS

First, download the CSV data file from your external or internal data source and copy it into HDFS, from where HBase can read it to load the data into the table.

hdfs dfs -copyFromLocal emp_data.csv /user/hbase/opsuser/data
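Note that in a kerberized cluster even these plain HDFS commands need a valid Kerberos ticket, and the target directory must exist. A minimal sketch of the surrounding steps (the keytab path and principal below are assumptions; substitute your own):

# assumed keytab path and principal; adjust for your cluster
kinit -k -t /etc/security/keytabs/hdfs.headless.keytab hdfs-kerbodemo@EXAMPLE.COM
# create the destination directory if it does not exist yet
hdfs dfs -mkdir -p /user/hbase/opsuser/data
# verify the file landed after the copy
hdfs dfs -ls /user/hbase/opsuser/data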

Create the HBase Table

First, log in as the "hbase" user.

Authenticate With the HBase Service Principal (Kerberos)

kinit -k -t /etc/security/keytabs/hbase.service.keytab hbase-kerbodemo@EXAMPLE.COM
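You can confirm the ticket was granted before starting the shell:

# should show the hbase service principal with a valid expiry time
klist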

Start the HBase Shell

hbase shell

Create the Table

create 'emp_data', {NAME => 'cf'}
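To double-check the table and its single column family from the same shell:

# the table should appear in the listing
list 'emp_data'
# shows the 'cf' column family and its settings
describe 'emp_data'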

Load Data Into the HBase Table

To load the CSV data into the HBase table, we use HBase's ImportTsv MapReduce (MR) job, which reads the file from HDFS and writes the rows into the table.

cd /usr/hdp/2.6.3.0-235/hbase/bin
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.separator=',' -Dimporttsv.columns='HBASE_ROW_KEY,cf:ename,cf:designation,cf:manager,cf:hire_date,cf:sal,cf:deptno' emp_data /user/hbase/opsuser/data/emp_data.csv
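The -Dimporttsv.columns option maps the CSV fields to HBase columns in order: the first field becomes the row key (HBASE_ROW_KEY) and the rest go to the named qualifiers in the cf family. A line of emp_data.csv would therefore look something like this (the values below are a made-up illustration, not data from the original file):

7369,SMITH,CLERK,7902,1980-12-17,800,20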
If everything is wired up correctly, the MR job completes successfully.

Scan the HBase Table

scan 'emp_data'
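Beyond a full scan, you can spot-check a single row or count the rows; the row key below is the hypothetical one from the sample line earlier:

# fetch one row by key (the key is a made-up example)
get 'emp_data', '7369'
# the row count should match the number of lines in the CSV
count 'emp_data'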

And there you go: all the data has been successfully loaded into the HBase table.
