ElasticSearch: Getting started

Uday Shankar
Apr 7, 2015 · 6 min read

Part 1

When I started looking at ElasticSearch and a good front-end tech stack for a project I am working on, I didn't expect it to be so difficult. When I finally figured it out, I couldn't believe I had struggled with something this simple.

There are so many tutorials and code examples out there, but I couldn't find one that covered the basics for someone coming from a SQL/RDBMS background. So here I am, writing this out so that dummies like me can easily get started with ElasticSearch.

Since this post is going to be long, I am splitting it into 2 parts:

  1. Working with ElasticSearch & Basics
  2. Front End for Search and Results

Part 1: Working with ElasticSearch

ElasticSearch is a distributed search engine designed for high-performance search.
Its main features are:

  1. Very powerful search (built on Lucene indexes)
  2. Distributed (data can be spread across thousands of nodes)
  3. Real-time (data is available almost immediately after inserts)
  4. Easy to scale (can grow to thousands of nodes easily)

Some people confuse ElasticSearch with a NoSQL database, but they are not the same. ES does provide some of the features of NoSQL, but it is primarily a search and analytics engine. [source]

And no, you can't use ElasticSearch as a database. There are many posts on the internet that explain why it's a bad idea. Check out these for starters.

From what I understand, you can create 'rivers' that connect ElasticSearch with other databases. A river is a pluggable service running within the ElasticSearch cluster, pulling data (or being pushed data) that is then indexed into the cluster [source]. Check out some available rivers. You can probably create your own as well.

So, basically you need to have some data (any data) already indexed in your Elasticsearch instance to start working.

Let’s get ElasticSearch running on your boxes. Get the download instructions and binaries from Elastic.co’s download page.

Postman

A clean, nice GUI is very helpful when you are trying to figure out the basic concepts of a technology, and unfortunately there are no good GUIs for ElasticSearch.

The only decent option available is Postman. Postman is a powerful API testing suite that has become a must-have tool for many developers. And since ElasticSearch exposes its data through REST calls, we can use Postman to work with it. Postman is a Chrome app and is pretty useful. Download and install it right away!

Postman has a neat feature called Collections. A Collection is a set of requests that you save and can run again, pretty much like macros. In this post, I'll be talking about an index called TEAM. This index will contain ~10K user profiles. You can download this collection from here.

Basics

Some basic stuff that you might want to know if you are really new to ElasticSearch:

  1. ElasticSearch has indexes. An index is very similar to a table.
  2. An index has properties. Properties are very similar to a table's structure definition.
  3. Data access and manipulation is done by calling REST URLs and sending data objects over POST, GET, PUT, etc.

Disclaimer: I am probably oversimplifying these concepts. To get a deeper understanding, check out Elastic's documentation.

To start with, let's check if your ElasticSearch is up and running OK. The easiest way is to point your browser to http://localhost:9200/. If your server is up and running, you should see something like this in your browser:

{
  "status": 200,
  "name": "Gigantus",
  "cluster_name": "elasticsearch_udayms",
  "version": {
    "number": "1.4.4",
    "build_hash": "c88f77ffc81301dfa9dfd81ca2232f09588bd512",
    "build_timestamp": "2015-02-19T13:05:36Z",
    "build_snapshot": false,
    "lucene_version": "4.10.3"
  },
  "tagline": "You Know, for Search"
}
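If you want to script this health check instead of eyeballing the browser, the response above can be parsed and verified programmatically. A minimal sketch (the JSON below is the sample response shown above, not fetched live):

```python
import json

# The root-endpoint response shown above, embedded as a sample string.
response_body = """
{
  "status": 200,
  "name": "Gigantus",
  "cluster_name": "elasticsearch_udayms",
  "version": {
    "number": "1.4.4",
    "lucene_version": "4.10.3"
  },
  "tagline": "You Know, for Search"
}
"""

info = json.loads(response_body)

# A healthy server answers with status 200 and reports its version.
assert info["status"] == 200
print(info["cluster_name"], info["version"]["number"])
```

In a real script you would GET http://localhost:9200/ and parse the body the same way.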

Creating an Index

Fire a PUT action to http://localhost:9200/team

Check if an Index exists

Fire a GET action to http://localhost:9200/team

Closing an Index

You need to close an index when you want to modify its properties (structure).

Fire a POST action to localhost:9200/team/_close

Mapping Columns (or defining Index structure)

Fire a PUT action to http://localhost:9200/team/_mapping/member

with the following object:

{
  "properties": {
    "id": { "type": "string" },
    "name": { "type": "string" },
    "email": { "type": "string" },
    "age": { "type": "integer" },
    "phone": { "type": "string" },
    "image": { "type": "string" },
    "description": { "type": "string" },
    "technologies": { "type": "string" }
  }
}
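A typo in the mapping body only surfaces when ElasticSearch rejects the PUT, so it can help to sanity-check the JSON locally first. A small sketch (the field names come from the mapping above; the checking script itself is my own addition):

```python
import json

# The mapping body from the PUT request above.
mapping = {
    "properties": {
        "id": {"type": "string"},
        "name": {"type": "string"},
        "email": {"type": "string"},
        "age": {"type": "integer"},
        "phone": {"type": "string"},
        "image": {"type": "string"},
        "description": {"type": "string"},
        "technologies": {"type": "string"},
    }
}

# Every property needs a "type"; catch missing ones before sending.
for field, definition in mapping["properties"].items():
    assert "type" in definition, f"field {field!r} is missing a type"

# Serialize to the exact string to paste into Postman's raw body.
body = json.dumps(mapping, indent=2)
print(body)
```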

Bulk Inserting data into the Index

Fire a PUT action to http://localhost:9200/team/member/_bulk

with the following data:

{"create": { "_id": 1, "_type": "member" }}
{"id": "5510ce4ee174054836ef3c5a","name": "Vargas Rosa","email": "vargasrosa@zizzle.com","age": 25,"phone": "+1 (807) 530-3567","image": "http://api.randomuser.me/portraits/men/78.jpg","description": "enim Lorem upidatat et nostrud ut irure qui qui nulla qui deserunt fugiat laborum elit","technologies": "ios javascript python"}
{"create": { "_id": 2, "_type": "member" }}
{"id": "5510ce4e24ecdab88fe18d06","name": "Navarro Thornton","email": "navarrothornton@zizzle.com","age": 34,"phone": "+1 (896) 579-3364","image": "http://api.randomuser.me/portraits/men/59.jpg","description": "sit enim velit cillum magna commodo tempor","technologies": "swift erlang java"}
{"create": { "_id": 3, "_type": "member" }}
{"id": "5510ce4e6e7bbdbc120c9a89","name": "Francine Aguirre","email": "francineaguirre@zizzle.com","age": 30,"phone": "+1 (963) 492-3402","image": "http://api.randomuser.me/portraits/men/82.jpg","description": "cu et sit ullamco tempor Lorem excepteur magna pariatur","technologies": "javascript ionic ruby"}
{"create": { "_id": 4, "_type": "member" }}
{"id": "5510ce4ebd2a509edd8c6b50","name": "Krystal Simmons","email": "krystalsimmons@zizzle.com","age": 40,"phone": "+1 (857) 418-2040","image": "http://api.randomuser.me/portraits/women/10.jpg","description": "ea dolor ex proident eiusmod et ut irure esse","technologies": "ruby c c"}
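The bulk format is easy to get wrong by hand: each document needs an action line followed by a source line, and the whole body must end with a newline. Here is a small sketch (my own helper, not part of any ElasticSearch client) that builds a bulk body from plain Python dicts:

```python
import json

def to_bulk_body(members):
    """Build an ElasticSearch _bulk payload: one action line plus one source line per doc."""
    lines = []
    for i, member in enumerate(members, start=1):
        lines.append(json.dumps({"create": {"_id": i, "_type": "member"}}))
        lines.append(json.dumps(member))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

members = [
    {"name": "Vargas Rosa", "age": 25, "technologies": "ios javascript python"},
    {"name": "Navarro Thornton", "age": 34, "technologies": "swift erlang java"},
]

body = to_bulk_body(members)
print(body)
```

Send the resulting string as the raw body of the request to http://localhost:9200/team/member/_bulk.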

I used JSON Generator to create this dummy data.

Listing all data (or Select * from table)

Fire a GET action to 
localhost:9200/team/_search?size=500&pretty=true&q=*:*
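The query string above packs three parameters into the URL. If you are scripting these calls rather than typing them into Postman, it is safer to let a library do the encoding. A quick sketch (host and index name taken from the examples above):

```python
from urllib.parse import urlencode

# size, pretty and q are the same parameters used in the URL above;
# urlencode percent-escapes the *:* wildcard for us.
params = {"size": 500, "pretty": "true", "q": "*:*"}
url = "http://localhost:9200/team/_search?" + urlencode(params)
print(url)
```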

Searching for specific data

Fire a POST action to localhost:9200/team/_search?pretty=true

and pass the following object:

{
  "size": 100,
  "query": {
    "match": {
      "_all": {
        "query": "jackson golang",
        "operator": "and"
      }
    }
  }
}
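The same query body can be built programmatically. The sketch below (a hypothetical helper of my own) assembles a match query against the _all field; with "operator" set to "and", a document must contain every term, so "jackson golang" only matches members whose profile mentions both words:

```python
import json

def match_all_fields(text, size=100, operator="and"):
    """Build a _search body matching `text` against the catch-all _all field."""
    return {
        "size": size,
        "query": {
            "match": {
                "_all": {"query": text, "operator": operator}
            }
        },
    }

search_body = match_all_fields("jackson golang")
print(json.dumps(search_body, indent=2))
```

POST the serialized body to localhost:9200/team/_search?pretty=true as shown above.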

Deleting an Index

Fire a DELETE action to http://localhost:9200/team

Generating some dummy data

Generating sufficient dummy data is one of the first things to do if you are trying to play around with ElasticSearch for the first time.

I used the nifty tool provided by the folks at JSON-Generator to generate ample data. Once the JSON was generated, I used Sublime Text's powerful find-and-replace features to get the generated JSON into the right format for ElasticSearch. I specifically mention Sublime Text because the large amount of data the generator produces might crash other editors when put into a file.

JSON-Generator

Json-Generator is a powerful tool that allows you to generate dummy JSON data based on a simple schema.

I used the following script multiple times to generate up to 15000 rows of data.

[
  '{{repeat(5000)}}',
  {
    create: { '_id': '{{index(10001)}}', '_type': 'member' },
    id: '{{objectId()}}',
    name: '{{firstName()}} {{surname()}}',
    email: '{{email()}}',
    age: '{{integer(20, 40)}}',
    phone: '+1 {{phone()}}',
    image: 'http://api.randomuser.me/portraits/{{random("men", "women")}}/{{integer(1, 96)}}.jpg',
    description: '{{lorem(200, "words")}}',
    technologies: '{{random("c", "python", "javascript", "shell", "java", "ruby", "erlang", "lisp", "golang", "android", "ios", "swift", "ionic", "delphi", "php", "laravel", "assembly")}} {{random("c", "python", "javascript", "shell", "java", "ruby", "erlang", "lisp", "golang", "android", "ios", "swift", "ionic", "delphi", "php", "laravel", "assembly")}} {{random("c", "python", "javascript", "shell", "java", "ruby", "erlang", "lisp", "golang", "android", "ios", "swift", "ionic", "delphi", "php", "laravel", "assembly")}}'
  }
]
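Instead of doing the find-and-replace in Sublime Text, the conversion from JSON-Generator's output (a plain JSON array, with each object carrying its own create action) to ElasticSearch's line-delimited bulk format can be scripted. A sketch of that step (my own script; the sample input mimics two rows from the template above):

```python
import json

# A tiny stand-in for JSON-Generator's output: an array of objects,
# each embedding its "create" action as produced by the template above.
generated = json.loads("""
[
  {"create": {"_id": "10001", "_type": "member"},
   "id": "5510ce4ee174054836ef3c5a", "name": "Vargas Rosa", "age": 25},
  {"create": {"_id": "10002", "_type": "member"},
   "id": "5510ce4e24ecdab88fe18d06", "name": "Navarro Thornton", "age": 34}
]
""")

lines = []
for doc in generated:
    # Lift the embedded action onto its own line, as _bulk expects.
    action = {"create": doc.pop("create")}
    lines.append(json.dumps(action))
    lines.append(json.dumps(doc))

bulk_body = "\n".join(lines) + "\n"  # _bulk requires a trailing newline
print(bulk_body)
```

The resulting string is ready to paste into Postman as the bulk request body.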

Now, all you have to do is use Postman to insert the generated data into ElasticSearch and your setup is ready for experiments.

Using Postman

Let’s now get this data into your ElasticSearch index using Postman. The below screenshots should give you a pretty clear idea of how to insert data into ElasticSearch using the Postman interface.

What you see when you fire a GET request to localhost:9200.
Creating a new index called TEAM using PUT
Using the RAW section to attach json data as body to the PUT request being fired

It's pretty straightforward. You pick the correct action (PUT, GET, POST, etc.) from the dropdown beside the URL field, enter the correct URL, and click the Send button. For posting data there are multiple ways; if you already have the data, you can use the Raw section as shown in the screenshots.

OR… you can just import the collection I have uploaded here and run it one request at a time.

So, here we are — at the end of the part 1 of this post. We have covered:

  1. Downloading ElasticSearch
  2. Setting up ElasticSearch
  3. Installing Postman to use as ElasticSearch Client
  4. Generating Dummy Data using Json-Generator
  5. Using Postman to insert the generated data into ElasticSearch Index

That sets us up well for the next part of this post: creating a sample app to access and use this indexed data in a front-end layer.
