4 Steps To Setting Up The Perfect Elasticsearch Test Environment

Effortlessly test Elasticsearch queries using this test environment, including test data and terrific GUIs.

It’s not hard to create an awesome test environment

When using Elasticsearch, you often need to do some experimenting. Especially when it comes to the more exotic or dangerous queries, like boolean search queries, update queries, and delete queries, you’ll want a test environment that you can safely use instead of abusing your production deployment.

In this quick guide, you’ll create a test environment that you can fire up and preload with some test data with minimal effort.

A word of warning: I assume you’re working in a Unix-like environment, e.g. Linux or macOS.

1. Install Elasticsearch

Let’s start by firing up a terminal (is yours awesome yet?). Create a directory that will serve as the base of your test environment. Any directory will do; for example:
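$ mkdir -p ~/elastic-test   # just an example path, use whatever you like
$ cd ~/elastic-test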

The next step is to download Elasticsearch into this directory. You can download it from https://www.elastic.co/downloads/elasticsearch

I recommend the “Linux” or “Mac” .tar.gz links, not the .deb or .rpm versions. This way you can isolate your environment to the directory we just created and be sure you won’t have to mess with system-wide configs and data directories.

At the time of writing this article, the latest version is 7.4.2, so that’s what you’ll see in these examples. You should simply download the latest version available.

Now untar the .tar.gz file, with something like:
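$ tar -xzf elasticsearch-7.4.2-linux-x86_64.tar.gz

(The exact file name depends on the version and platform you downloaded; the Mac tarball says darwin instead of linux.)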

If all went well, you now have a directory called elasticsearch-7.4.2. The beauty of this approach is that you can easily install multiple versions side by side, each in its own versioned directory.

Elasticsearch has a terrific out-of-the-box setup. You don’t need to edit config files.

To run Elasticsearch, start it with:
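$ elasticsearch-7.4.2/bin/elasticsearch

(Run this from your test directory; adjust the path if you downloaded a different version.)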

You can verify that Elasticsearch is running by visiting the REST API through your browser: http://localhost:9200/

You should see something like:

{
  "name" : "mbp.local",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "F5VboRxcTuu7MSEPGH-LXQ",
  "version" : {
    "number" : "7.4.2",
    "build_flavor" : "default",
    "build_type" : "tar",
    "build_hash" : "2f90bbf7b93631e52bafb59b3b049cb44ec25e96",
    "build_date" : "2019-10-28T20:40:44.881551Z",
    "build_snapshot" : false,
    "lucene_version" : "8.2.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}

Once you start experimenting with queries, it might be nice to know that you can tail the log files in the logs directory inside your Elasticsearch installation.
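The main log file is named after the cluster, which defaults to elasticsearch, so tailing it looks something like this:

$ tail -f elasticsearch-7.4.2/logs/elasticsearch.log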

For more installation options, check out the official documentation.

2. Single node installation advice

You are running on a single node, which requires some extra care.

First of all, always create your indices with a single shard and no replicas.

A single shard saves you resources. For testing purposes, a single shard is usually enough. Using no replicas prevents Elasticsearch from going into a ‘yellow’ state because it wants to assign replicas to other nodes, which don’t exist in a single-node setup.
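For example, this creates an index with those settings (the index name is just an example):

$ curl -s -X PUT "localhost:9200/my-test-index" -H 'Content-Type: application/json' -d'
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  }
}'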

If you are low on disk space (I, for one, am always low on disk space), you might want to add the following setting to config/elasticsearch.yml inside your Elasticsearch directory:
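# disable the disk watermark checks (assumed to be the threshold meant here)
cluster.routing.allocation.disk.threshold_enabled: false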

This disables the safety thresholds that are very useful on production deployments, but unneeded on a small test setup.

3. Running queries: pick your poison

There are lots of ways to fire a query at a REST API. It comes down to personal preference and the situation at hand. I will list a few convenient ways so you can pick your poison.

Command-line

Using the command line to work with JSON and REST APIs is not the most user-friendly. However, sometimes it’s all you have at hand. In those cases, you’ll be glad you know your way around tools like curl and jq.

I find curl the easiest tool to use since it comes preinstalled on most Linux distributions. To get the same page we’ve seen before, but now on the command line, you can use curl -s localhost:9200. The -s means ‘silent’; it keeps curl from filling up your terminal with useless progress bars and such.

If you’ve never heard of jq, it’s time to fire up apt-get, yum, brew or whatever your OS uses to install packages. jq is a powerful command-line JSON processor. I use it mainly for syntax highlighting, but it can do a lot more. If you ever need to do some bash scripting in combination with JSON, you should definitely read up on its features.

The output of curl -s localhost:9200 | jq
Using curl and jq together to create a beautiful JSON output
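jq can also extract specific fields. For instance, to print just the version number from the response we saw earlier:

$ curl -s localhost:9200 | jq '.version.number'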

Curl can perform all the HTTP operations with the -X parameter. To demonstrate, let’s create an index with a PUT request:

Creating an index with curl -X PUT
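The request in the screenshot boils down to something like this (the index name is just an example):

$ curl -X PUT localhost:9200/test-index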

Web interfaces

If you have more at hand than the command line, you may be better off with a nice GUI. Here are just two favorites of mine. If you know anything better, please share it with all of us in the comments section!

Kibana
Elastic, the company behind Elasticsearch, has created many tools for the Elasticsearch ecosystem. One of the more prominent ones is Kibana, a powerful web application that lets you create visualizations and dashboards and explore your data.

You can download Kibana here: https://www.elastic.co/downloads/kibana

My advice, again, is to download the tarball (tar.gz) version for your operating system and install it in its own directory in your test environment. Kibana’s version number follows that of Elasticsearch, so in my case, I ended up with a versioned kibana-7.4.2 directory (the exact name includes your platform). To start it, run:
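$ cd kibana-7.4.2-darwin-x86_64   # or the matching Linux directory name
$ ./bin/kibana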

A Kibana Dashboard showing insights into the e-commerce sample data

After it has booted, which can take a while, you can visit the webpage at http://localhost:5601

Kibana will offer to load some sample data for you. Go ahead and let it; this lets you experiment with all the features Kibana offers.

In the example screenshot, you can see the e-commerce dashboard.

If you go into Kibana’s ‘Dev Tools’ section, you can fire JSON requests manually.

Indexing a document using Kibana’s Dev Tools
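For example, a request like the following indexes a single document (the index name and fields are made up for illustration):

PUT test-index/_doc/1
{
  "title": "Hello Elasticsearch",
  "created": "2019-11-01T12:00:00Z"
}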

Cerebro
This is my personal favorite. Cerebro is lightweight and it gives a nice overview of the running cluster nodes, the indices, server load, disk usage, et cetera. On top of that, it allows you to create and delete indices, manage aliases, manage index templates, and much more.

Above all, it has a nice REST interface in which you can create arbitrary queries of all types (POST, PUT, GET, DELETE). This interface includes a JSON syntax checker, a curl command generator and a convenient query history.

Using Cerebro’s REST interface to PUT some data in a test index

To install Cerebro: go to the downloads page, download the .tgz file and extract it just like we did with the Elasticsearch tarball. You can then run it with the command:

$ ./bin/cerebro

Visit http://localhost:9000 to use the interface.

4. A good set of test data

What does good test data entail? Let’s consider two situations.

First situation: If you have specific data on which you want to perform specific queries, it’s best to just load a (partial) copy of that production data into your test environment.

Second situation: for fiddling around, you want to have something at hand already for all types of queries, so you don’t have to spend time creating a data set each time.

Here’s a non-exhaustive list of stuff you might want to fiddle with:

  • Inner documents vs. nested documents
  • Parent-child relationships
  • Arrays
  • Numbers
  • Dates and times
  • Text and keywords
  • Geo data (geo points, hashes)
  • Geo shapes
  • IP addresses

I could try to craft a very nifty test set that exercises all these features. Instead, I plan on writing more articles, including test data and example queries, to explain the more advanced Elasticsearch features in more detail. So for now, I would like to refer you to the test sets we loaded earlier when we installed Kibana.

If you inspect them closely, you’ll see that they contain all kinds of data. The e-commerce test set contains numbers (including floats), geo points, inner documents, and timestamps. In addition, the sample logs test set contains IP addresses.

There’s more than enough to start experimenting!

Final notes

You now have a working Elasticsearch installation with test data at hand. One that you can fire up whenever you need it, and that you can easily move around. You could even run it from a USB stick.

I deliberately didn’t mention Docker in this article, since it is one extra dependency you’d need to install and learn about. If you already use Docker, this is definitely a nice alternative to create a test environment. There is a thorough guide on running Elasticsearch on Docker here. By using Docker, it’s relatively easy to set up a multi-node cluster too.

Further reading

You are now ready to continue to my Elasticsearch Hands-on tutorial:

I also wrote a more introductory article on Elasticsearch:


If you liked this article and want to be notified when I post new stuff, please subscribe to my mailing list.
