Guide to install and run Hive 3.1.2 on Windows 10
--
Hive is a data warehouse system that runs on top of Hadoop and provides an SQL-like interface (HiveQL) for querying and analyzing the data stored there. In this article I’ve compiled the steps to install and run Hive with Hadoop on Windows 10.
1. Pre-requisites:
Install Hadoop by following this guide: https://medium.com/republic-of-coders-india/guide-to-install-and-run-hadoop-on-windows-a0b64fe447b6
Download Apache Derby Binaries:
Hive requires a relational database such as Apache Derby to host its Metastore, where it stores all metadata.
Download the derby tar file from the following link:
https://downloads.apache.org//db/derby/db-derby-10.14.2.0/db-derby-10.14.2.0-bin.tar.gz
Extract it to the location where you have installed Hadoop
2. Download Hive binaries:
Download Hive binaries from the following link:
https://downloads.apache.org/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
Extract it to the location where you have installed Hadoop
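If no archive tool that understands .tar.gz files is installed, the extraction of both downloads can be done with Python's standard tarfile module. This is only a minimal sketch: the function name extract_archive is made up for illustration, and it assumes the first entry in the archive sits under the archive's single top-level folder.

```python
import tarfile
from pathlib import Path

def extract_archive(archive, dest):
    """Extract a .tar.gz archive into dest and return the top-level folder path."""
    archive, dest = Path(archive), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        # Assumption: the first member lives under the archive's top-level folder.
        top_level = tar.getnames()[0].split("/")[0]
        tar.extractall(dest)
    return dest / top_level
```

For example, extract_archive(r"E:\Downloads\apache-hive-3.1.2-bin.tar.gz", r"E:\hadoop-3.1.0") would return E:\hadoop-3.1.0\apache-hive-3.1.2-bin. The Downloads path here is hypothetical, so substitute the folder where you saved the file.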
3. Setting up Environment variables:
Type ‘environment’ in Windows Search Bar
Click on Environment Variables
Click on New
Add the following variables:
HIVE_HOME: E:\hadoop-3.1.0\apache-hive-3.1.2-bin
DERBY_HOME: E:\hadoop-3.1.0\db-derby-10.14.2.0-bin
HIVE_LIB: E:\hadoop-3.1.0\apache-hive-3.1.2-bin\lib
HIVE_BIN: E:\hadoop-3.1.0\apache-hive-3.1.2-bin\bin
HADOOP_USER_CLASSPATH_FIRST: true
In the Path variable under User Variables, add the following paths:
%HIVE_BIN%
%DERBY_HOME%\bin
Now in System Variables add the following:
HADOOP_USER_CLASSPATH_FIRST: true
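A mistyped variable name or path is easy to miss at this stage and only surfaces later as a confusing startup error. The following sketch checks, from a newly opened terminal, that the variables listed above are visible and that the path-valued ones point to existing folders. The helper names missing_variables and bad_directories are hypothetical and not part of Hive or Hadoop.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = ["HIVE_HOME", "DERBY_HOME", "HIVE_LIB", "HIVE_BIN",
                 "HADOOP_USER_CLASSPATH_FIRST"]
# Of these, the ones whose value should be an existing directory.
DIRECTORY_VARS = ["HIVE_HOME", "DERBY_HOME", "HIVE_LIB", "HIVE_BIN"]

def missing_variables(env):
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

def bad_directories(env):
    """Return directory variables that are set but do not point to a folder."""
    return [name for name in DIRECTORY_VARS
            if env.get(name) and not os.path.isdir(env[name])]

# Check the environment of the current process and print a short report.
print("Missing variables:", missing_variables(os.environ))
print("Not a directory:", bad_directories(os.environ))
```

Both lists should come back empty. Windows only passes new environment variables to processes started after the change, so run the check in a terminal opened after saving the variables.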
4. Configuring Hive:
Copy Derby Libraries:
Copy all the jar files from the Derby library directory:
E:\hadoop-3.1.0\db-derby-10.14.2.0-bin\lib
And paste them in Hive libraries directory:
E:\hadoop-3.1.0\apache-hive-3.1.2-bin\lib
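The copy can also be scripted, which helps if you repeat the setup on another machine. Here is a small sketch using Python's standard library; the function name copy_jars is made up for illustration, and it overwrites any file of the same name that already exists in the target folder.

```python
import shutil
from pathlib import Path

def copy_jars(src_lib, dst_lib):
    """Copy every .jar file from src_lib into dst_lib and return the copied names."""
    src_lib, dst_lib = Path(src_lib), Path(dst_lib)
    copied = []
    for jar in sorted(src_lib.glob("*.jar")):
        # copy2 keeps file timestamps; an existing file with the same name is replaced.
        shutil.copy2(jar, dst_lib / jar.name)
        copied.append(jar.name)
    return copied
```

With the paths used in this article, the call would be copy_jars(r"E:\hadoop-3.1.0\db-derby-10.14.2.0-bin\lib", r"E:\hadoop-3.1.0\apache-hive-3.1.2-bin\lib").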
5. Configuring Hive-site.xml:
Create a new file with the name hive-site.xml in E:\hadoop-3.1.0\apache-hive-3.1.2-bin\conf
Add the following lines in the file
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby://localhost:1527/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.apache.derby.jdbc.ClientDriver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<description>Enable user impersonation for HiveServer2</description>
<value>true</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>NONE</value>
<description> Client authentication types. NONE: no authentication check LDAP: LDAP/AD based authentication KERBEROS: Kerberos/GSSAPI authentication CUSTOM: Custom authentication provider (Use with property hive.server2.custom.authentication.class) </description>
</property>
<property>
<name>datanucleus.autoCreateTables</name>
<value>True</value>
</property>
<property>
<name>hive.server2.active.passive.ha.enable</name>
<value>true</value> <!-- change false to true -->
</property>
</configuration>
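Text pasted from a web page often carries curly quotes (” instead of "), which make the file invalid XML. One way to catch this before starting Hive is to parse the file and list the properties it defines. The sketch below uses Python's built-in XML parser; read_properties is a hypothetical helper, not a Hive tool.

```python
import xml.etree.ElementTree as ET

def read_properties(path):
    """Parse a Hadoop-style configuration file and return {name: value}."""
    # ET.parse raises xml.etree.ElementTree.ParseError if the file is not well-formed.
    root = ET.parse(path).getroot()
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props
```

Calling read_properties(r"E:\hadoop-3.1.0\apache-hive-3.1.2-bin\conf\hive-site.xml") should return a dictionary containing javax.jdo.option.ConnectionURL and the other names from the listing above. A ParseError usually points to a stray character or an unclosed tag at the reported line.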
6. Starting Services:
Start Hadoop Services:
Open a terminal, change to the sbin directory of your Hadoop installation, and run the following command:
start-all.cmd
Start Derby Network Server:
Start the Derby Network Server with the following command:
StartNetworkServer -h 0.0.0.0
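The schema initialization in the next step fails if Derby is not yet accepting connections, so it can help to confirm that the server is listening first. The sketch below simply tries to open a TCP connection; port 1527 comes from the ConnectionURL in hive-site.xml, and port_is_open is a hypothetical helper name.

```python
import socket

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

Once the Derby server is up, port_is_open("localhost", 1527) should return True. The same check works for HiveServer2 later on, which listens on port 10000 by default, assuming you have not changed hive.server2.thrift.port.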
Initialize Hive Metastore:
Run the following command to initialize the Hive Metastore schema in Derby:
hive --service schematool -dbType derby -initSchema
Start Hive Server:
hive --service hiveserver2 start
Start Hive:
Start the Hive command-line shell with the following command:
hive
Installation Done ✌