Best way to Export Hive table to a CSV file

This post explains different options to export Hive Table (ORC, Parquet, or Text) to CSV File.

Ganesh Chandrasekaran
Analytics Vidhya

--

Expected output: CSV File with comma delimiter and header

Method 1 :

hive -e 'select * from table_orc_data;' | sed 's/[[:space:]]\+/,/g' > ~/output.csv

Pros : Simple to use, output column header but default output is tab.
Cons : Need to convert Tab delimiter to ‘,’ which could be time consuming when exporting large file.

Method 2:

$ hadoop fs -cat hdfs://servername/user/hive/warehouse/databasename/table_csv_export_data/* > ~/output.csvCopy

Pros : Simple, with comma as delimiter in CSV output.
Cons : No column headers.

Method 3: (My personal favorite)

— Step 3a: Create a CSV table with a dummy header column as the first row.

CREATE TABLE table_csv_export_data
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED as textfile
AS
select
'id' as id
,'first_name' as first_name
,'last_name' as last_name
,'join_date' as join_date;

— Step 3b: Now insert the actual data into the table

INSERT INTO table_csv_export_data…

--

--

Ganesh Chandrasekaran
Analytics Vidhya

Big Data Solution Architect | Adjunct Professor. Thoughts and opinions are my own and don’t represent the companies I work for.