BeeJU: Hive Metastore testing simplified

Hotels.com Engineering
The Hotels.com Technology Blog
Nov 30, 2017
Image: Beekeeper keeping bees by Michael Gäbler, licensed under CC BY 3.0

In the Hotels.com Big Data Platform team, we frequently need to integrate with the Apache Hive Metastore. We’ve even written tools, such as Waggle Dance, to help us interact with it. Waggle Dance and many of our ETL jobs call the Metastore’s Thrift API to update metadata about our datasets. The Hive Thrift API is a third-party dependency in those projects. It is extensive, and working out the correct call often takes some trial and error. This makes the API hard to mock and our integrations hard to test.

We love automated testing (see Plunger, our testing framework for Cascading) and build our automated test suites with JUnit. However, testing Hive Metastore API integrations is non-trivial. To make this easier, we wrote BeeJU.

BeeJU is a small project containing a couple of JUnit Rules: simple-to-use resources that spin up a Thrift Hive Metastore server that can be injected into your code for testing. You don’t need to mock the API; the rules behave just like the real thing. Of course, mocking is much faster, and setting up and tearing down a rule adds some overhead (we’ve found this to be on the order of seconds). We consider this overhead acceptable because it greatly reduces the amount of debugging we have to do in our test environments. Automating the tests lets us catch regressions, and the JUnit test suites double as documentation of the code.
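The lifecycle that such a rule manages can be sketched in plain Java. The example below is not BeeJU's implementation and does not use its API: `EmbeddedServerResource`, `PingClient`, and the one-line "pong" protocol are hypothetical stand-ins for a Metastore server and the code that talks to it. The `before()` and `after()` methods mirror the hooks of JUnit 4's `ExternalResource`, which a real rule would typically extend. The sketch uses only the standard library so that it runs without Hive or JUnit on the classpath.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.URI;
import java.nio.charset.StandardCharsets;

class RuleLifecycleSketch {

    /** Hypothetical stand-in for a rule-managed server (assumption, not BeeJU code). */
    static class EmbeddedServerResource {
        private ServerSocket server;

        /** Runs before a test: bind to a free local port and start serving. */
        void before() throws IOException {
            // Port 0 asks the OS for any free port, so parallel tests do not clash.
            server = new ServerSocket(0, 50, InetAddress.getByName("127.0.0.1"));
            Thread acceptor = new Thread(() -> {
                while (!server.isClosed()) {
                    try (Socket s = server.accept()) {
                        s.getOutputStream().write("pong\n".getBytes(StandardCharsets.UTF_8));
                    } catch (IOException e) {
                        // accept() fails once the server socket is closed; the loop then ends.
                    }
                }
            });
            acceptor.setDaemon(true);
            acceptor.start();
        }

        /** Runs after a test: release the port. */
        void after() throws IOException {
            server.close();
        }

        /** The connection URI a test injects into the code under test. */
        URI connectionUri() {
            return URI.create("tcp://127.0.0.1:" + server.getLocalPort());
        }

        boolean isRunning() {
            return server != null && !server.isClosed();
        }
    }

    /** Hypothetical code under test: it only knows the URI it was given. */
    static class PingClient {
        static String ping(URI uri) throws IOException {
            try (Socket s = new Socket(uri.getHost(), uri.getPort());
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8))) {
                return in.readLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        EmbeddedServerResource resource = new EmbeddedServerResource();
        resource.before();
        try {
            // The client talks to a real in-process server, not a mock.
            System.out.println(PingClient.ping(resource.connectionUri())); // prints "pong"
        } finally {
            resource.after(); // tear down even if the test body throws
        }
    }
}
```

The point of the pattern is that the code under test receives a real connection URI and exercises a real server, while setup and teardown stay out of the test body. The cost is the start-up time of that server on every test.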

BeeJU also benefits those wishing to experiment with the Hive Metastore: developers can quickly test code locally without having to configure and spin up a Metastore instance. They can try out different API calls, which leads to better code, faster. It’s a way of reverse-engineering the Hive Thrift API.

To clarify, BeeJU is not intended for testing Hive SQL scripts. Excellent tools already exist for that purpose, such as HiveRunner (developed and contributed by Klarna).

BeeJU is just a small project, but it has served us well and we wanted to share it with others. Have a look at the examples in the README and let us know how you’re using it!
