Troubleshooting for Installation of Apache Hive with Docker on Windows 10 Home

Hongri Jia
Passion for Data Science
4 min read · Apr 15, 2018

Apache Hive is a data warehouse software project built on top of Apache Hadoop that supports data summarization, querying, and analysis of large data sets. In this blog, I will cover solutions to several issues you may face while installing Docker and working with Apache Hive inside Docker containers on Windows 10 Home.

Issues during the Docker Installation for Windows

Generally, you can download the Docker for Windows installer from the official website. However, the current version of Docker for Windows requires 64-bit Windows 10 Pro, Enterprise, or Education with Hyper-V available, while many Windows users only have Windows 10 Home. On Home, the installer will fail and tell you that your system doesn't meet the requirements.
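Before downloading, you can confirm which Windows edition you have from a Command Prompt. This is a quick sketch using the built-in `systeminfo` and `findstr` commands:

```shell
REM Print the Windows edition (e.g. "Microsoft Windows 10 Home")
systeminfo | findstr /B /C:"OS Name"

REM Hyper-V requirements are listed near the end of the systeminfo output
systeminfo | findstr /C:"Hyper-V"
```

If the OS name ends in "Home", plan on Docker Toolbox rather than Docker for Windows.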

Figure-1 Installation of Docker for Windows failed

If this is your case, you have to install Docker Toolbox instead, which uses Oracle VirtualBox rather than Hyper-V.

Figure-2 Docker Toolbox for Windows Installer

During the installation, make sure VirtualBox is checked on the Select Components page if you have not installed Oracle VirtualBox before. Docker does not start automatically after installation; double-click the Docker Quickstart Terminal shortcut on the desktop to start it. Usually, Docker runs the configuration process automatically and is ready to use after that.

Figure-3 Command Prompt of Docker for Windows

If you get errors after starting Docker Toolbox and the configuration cannot finish, it might be because a copy of Oracle VirtualBox was already installed before. In that case, uninstall both VirtualBox and Docker Toolbox and redo the installation.

Insufficient Memory Issue while Running Hive in Docker

After you can see the cute whale in Docker Toolbox, the next step is to install Apache Hadoop and Hive in a Docker container. There are many tutorials about this, so I won't show the installation process here. Instead, I will give the solution to a common problem: the out-of-memory error.

Sometimes, even though you finish the entire installation and configuration process properly, you still get an out-of-memory error when you run complex jobs in Hive. This means the memory allocated to the Docker container is insufficient for the work. In that case, you need to update the Hadoop Java options manually.

First, open the Hadoop environment configuration file inside the Docker container.

Then add the increased Java heap options to the file and save it.
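As a sketch, the edit might look like the following. The file path assumes a typical Hadoop 2.x layout with `HADOOP_HOME` set, and the 2048 MB heap size is an illustrative value; tune both to your own installation and workload:

```shell
# Open the Hadoop environment configuration file
# (adjust the path if your HADOOP_HOME differs)
vi $HADOOP_HOME/etc/hadoop/hadoop-env.sh

# Add (or edit) lines like these inside hadoop-env.sh, then save:
export HADOOP_HEAPSIZE=2048                                 # daemon heap size in MB
export HADOOP_CLIENT_OPTS="-Xmx2048m $HADOOP_CLIENT_OPTS"   # client-side JVM heap
```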

In most cases this solves the problem, and you can restart Hive and rerun your jobs. However, sometimes you might get another error telling you that memory cannot be allocated. This happens because the virtual machine running Docker was created with too little memory in Oracle VirtualBox, so you need to open the VirtualBox Manager and change the setting manually.

Figure-4 Virtual Box Manager

In the VirtualBox Manager, you can find the running virtual machine for Docker. You have to power it off before you can change its settings. A very important hint: you will lose all of your Docker containers when you power off the virtual machine, so make sure there is nothing important inside them before powering off.

After shutting down the virtual machine, go to Settings -> System and, in the Motherboard tab, change the base memory from the default 1024 MB to at least 2048 MB.

Figure-5 Motherboard Setting Interface

Then click the Processor tab and increase the processor count from 1 to at least 2.

Figure-6 Processor Setting Interface
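If you prefer the command line, the same changes can be sketched from the Docker Quickstart Terminal with `docker-machine` and `VBoxManage`. This assumes the Toolbox virtual machine has the default name `default`:

```shell
# Stop the Docker Toolbox VM (its containers may be lost, as noted above)
docker-machine stop default

# Raise the VM's base memory to 2048 MB and give it 2 processors
VBoxManage modifyvm default --memory 2048 --cpus 2

# Start the VM again and point the current shell at it
docker-machine start default
eval "$(docker-machine env default)"
```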

When you complete the setting changes, save them and restart the virtual machine. You then have to create a new Docker container and install Hadoop and Hive again. Don't forget to update the Hadoop Java options as mentioned above. After all these steps are done correctly, you will be able to work with Hive in Docker smoothly.

If you are interested in my work or have questions about it, please feel free to contact me. In the meantime, if you want to learn more about big data technologies, check out the WeCloudData website.
