Best way to Export Hive table to a CSV file
This post explains different options to export Hive Table (ORC, Parquet, or Text) to CSV File.
Published in
2 min readJun 9, 2018
Expected output: CSV File with comma delimiter and header
Method 1 :
hive -e 'select * from table_orc_data;' | sed 's/[[:space:]]\+/,/g' > ~/output.csv
Pros : Simple to use, output column header but default output is tab.
Cons : Need to convert Tab delimiter to ‘,’ which could be time consuming when exporting large file.
Method 2:
$ hadoop fs -cat hdfs://servername/user/hive/warehouse/databasename/table_csv_export_data/* > ~/output.csvCopy
Pros : Simple, with comma as delimiter in CSV output.
Cons : No column headers.
Method 3: (My personal favorite)
— Step 3a: Create a CSV table with a dummy header column as the first row.
CREATE TABLE table_csv_export_data
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED as textfile
AS
select
'id' as id
,'first_name' as first_name
,'last_name' as last_name
,'join_date' as join_date;
— Step 3b: Now insert the actual data into the table
INSERT INTO table_csv_export_data…