Quick Hive Commands and Tricks

Mohamed Camara
Apr 15, 2019

While working in Hive, you will often need to enable or switch certain features, get more details about your schema, or run certain tasks directly from the Hive command line. Here are some commands you can use in those situations.

TABLES

SHOW CREATE TABLE table_name
This command prints the CREATE TABLE DDL statement to the console, along with additional information such as the location of your table. It is very useful during a migration, or when a script has to recreate Hive tables from one cluster on another.
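
As a minimal sketch, this is how the command looks when run against the new_customer_details table that is created further down in this post. The exact output depends on your Hive version and on how the table was defined, so none is shown here.

```sql
-- Print the full CREATE TABLE statement, including the table's LOCATION.
-- new_customer_details is the example table defined later in this post.
SHOW CREATE TABLE new_customer_details;
```

The printed statement can be copied into a script and replayed on the target cluster, usually after adjusting the LOCATION clause.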

DESCRIBE FORMATTED table_name
Prints table details to the console in a more readable layout than DESCRIBE EXTENDED table_name. It also provides more detail than the SHOW CREATE TABLE command, grouped into sections that include:

# Detailed Table Information: database details, user details (the user who created the table), table type.

# Storage Information: input format, bucket columns if any, number of buckets, partition columns.

DESCRIBE EXTENDED table_name can also take a partition specification, PARTITION(partitioned_column=partition_value). In that case it shows the details for that partition, including the actual directory where the partition's data is stored.
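
The two forms can be sketched side by side as follows. The partitioned table call_logs and its partition column call_date are hypothetical names used only for illustration; substitute a partitioned table from your own schema.

```sql
-- Table-level details, grouped into labelled sections.
DESCRIBE FORMATTED new_customer_details;

-- Partition-level details, including the partition's directory.
-- call_logs and call_date are hypothetical names for a partitioned table.
DESCRIBE EXTENDED call_logs PARTITION (call_date='2019-04-15');
```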

Hive CLI

set hive.cli.print.header=true
Prints the column names as a header row in query output.

set hive.metastore.warehouse.dir
Use this command to find out which directory your Hive warehouse lives in.

set hive.execution.engine
Displays the execution engine Hive is currently using, for example Tez or MapReduce. To switch from one to the other, set the property to the new value after the equals sign:

Query Execution using TEZ: set hive.execution.engine=tez

Query Execution using MR: set hive.execution.engine=mr
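
A short session sketch of checking and then switching the engine. It assumes Tez is installed and configured on the cluster; a value set this way applies to the current session only.

```sql
-- With no "=value", SET prints the property's current value.
SET hive.execution.engine;

-- Switch this session to Tez (assumes Tez is available on the cluster).
SET hive.execution.engine=tez;

-- Print the value again to confirm the change took effect.
SET hive.execution.engine;
```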

set hive.cli.print.current.db=true
Shows the name of the current database in the prompt:

before : hive>

after : hive (database_name)>

source
Executes a script directly from the Hive CLI. By convention, script files use a ‘.hql’ or ‘.q’ extension.

Contents of the script, fileone.hql

create table new_customer_details (
phone_number string,
plan string,
rec_date string,
status string,
balance string,
imei string,
region string
)
clustered by (plan) into 5 buckets
row format delimited fields terminated by ','
lines terminated by '\n';
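
With the script saved, running it from the Hive CLI might look like the sketch below. The directory /home/user/scripts is a hypothetical location chosen for illustration; use the path where fileone.hql actually lives.

```sql
-- Run the script from inside the Hive CLI.
-- /home/user/scripts is a hypothetical directory, not a required location.
source /home/user/scripts/fileone.hql;

-- Check that the table defined in the script now exists.
SHOW TABLES LIKE 'new_customer_details';
```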

It is also possible to interact directly with HDFS from the Hive CLI, using the dfs command followed by the usual HDFS arguments. As with other CLI commands, end each one with a semicolon:

dfs -ls /path;
dfs -cat /path/file_name;

Shell

hive -e
This command lets you execute queries from the shell without entering the Hive CLI.

To get rid of the extra lines in the output, such as “Logging initialized using…Time taken”, add the -S flag: hive -S -e 'query;'. This is known as Silent Mode.

There is also a Verbose Mode, enabled with -v, which echoes the executed SQL to the console along with the output.

hive -v -e "query;"

An alternative to specifying “use database_name” in the query is to qualify the table name with its database, written as database_name.table_name.
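
The two approaches can be sketched as the query text below; either form can be passed to hive -e as a single quoted string. database_name and table_name are placeholders, and the LIMIT is there only to keep the illustration small.

```sql
-- Option 1: switch to the database first, then query the table.
USE database_name;
SELECT * FROM table_name LIMIT 10;

-- Option 2: qualify the table with its database, no USE needed.
SELECT * FROM database_name.table_name LIMIT 10;
```

The qualified form keeps a one-line hive -e call self-contained, since it does not depend on which database the session starts in.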

hive -f
This is the shell equivalent of the source command: it runs a script without entering the Hive CLI. It is used as such: hive -f <filepath/filename>. As with the source command, script files conventionally use a “.hql” or “.q” extension.

In this case I am already in the directory where the file resides, so there is no need to include the file path.

hive -v -f file_name.hql
