Hands-on Introduction to Elasticsearch — II

In the last instalment of this series, we learnt how to install Elasticsearch and confirm that it’s up and running. In this post, let’s look at how you can insert data into your Elasticsearch cluster.

Indices

Before we try to insert documents into Elasticsearch, we need to understand that every document is stored in an index (backed by an inverted index under the hood). Let’s find out if your machine has any indices.

$ curl -XGET 'localhost:9200/_cat/indices?v&pretty'

If you are like me, the output might look something like this:

health status index uuid pri rep docs.count docs.deleted store.size pri.store.size

Yeah, only headers and no output. Let’s fix this.

Creating an index

I will go with superheroes (both Marvel and DC) for our example; if you are not a fan, become one.

$ curl -XPUT 'localhost:9200/superheroes?pretty'

That’s it. Your output might look like this:

{
"acknowledged" : true,
"shards_acknowledged" : true,
"index" : "superheroes"
}

You can confirm that the command worked and our superheroes have a home by using the same indices command from above.

$ curl -XGET 'localhost:9200/_cat/indices?v&pretty'
Output:
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open superheroes aK55VmlxSiyRVHGWPeP1hw 5 1 0 0 1.1kb 1.1kb

I would really like the health to be green. Yellow means the primary shards are allocated but the replicas are not, which is expected on a single-node cluster. We will solve that problem (replicas) some other day; stay with me here.

At this point, you can create indices for recipes, books, an e-commerce product catalogue or any other documents you like. We are about to insert some documents.

Adding a document

Our trusted curl command with the PUT HTTP verb will come in handy here as well:

curl -H "Content-Type: application/json" -XPUT 'localhost:9200/superheroes/avenger/1?pretty' -d'
{
"name": "Steve Rogers",
"super_hero_name": "Captain America",
"primary_weapon": "Vibranium Shield",
"quotes": ["I can do this all day."]
}
'

Yes, that’s your first Avenger. Output should look something like this:

{
"_index" : "superheroes",
"_type" : "avenger",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}

This changes everything. If you list the indices again, the docs.count value for superheroes should show up as 1. Try it on your own.
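If you would rather not scan the whole table, the _cat/indices endpoint also accepts an index name and an h parameter that picks the columns to show. Here is a minimal sketch, assuming the same localhost:9200 node used so far. The names cat_index_url and show_doc_count are hypothetical helpers made up for this example; they are not part of Elasticsearch or curl.

```shell
# Hypothetical helpers for this post; only curl and the _cat/indices API are real.
ES_HOST="localhost:9200"   # assumption: the single local node used throughout this post

# Build a _cat/indices URL restricted to one index and two columns.
cat_index_url() {
  printf '%s/_cat/indices/%s?v&h=index,docs.count\n' "$ES_HOST" "$1"
}

# Fetch the table from the cluster. This needs a running node,
# so it is only defined here and not called.
show_doc_count() {
  curl -XGET "$(cat_index_url "$1")"
}

# Print the URL that show_doc_count would request.
cat_index_url superheroes
```

With a node running, show_doc_count superheroes should print a two-column table. The count can lag briefly behind a write, so give it a second if it still shows 0.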

I feel like adding another Avenger here. Steve Rogers has already spent many lonely years frozen.

curl -H "Content-Type: application/json" -XPUT 'localhost:9200/superheroes/avenger/2?pretty' -d'
{
"name": "Tony Stark",
"super_hero_name": "Ironman",
"primary_weapon": "Armor Suit",
"quotes": ["I am ironman", "Not the drycleaning kind"]
}
'

Output:

{
"_index" : "superheroes",
"_type" : "avenger",
"_id" : "2",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 2
}

What is happening here?

There are a lot of small details here, so I will break them down and explain them one by one.

  • -H "Content-Type: application/json" is important. When you send data with -d, curl defaults to application/x-www-form-urlencoded unless told otherwise, and you don’t want to see an error about that content type when submitting a document.
  • An Elasticsearch cluster can have multiple indices, so we add the index name to the URL along with the document type: /superheroes and /avenger respectively.
  • The URL ends with a document id, /1 or /2.
  • There is no predefined schema; the submitted document is JSON and can contain nested arrays.
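The URL anatomy described above can be sketched as a pair of small shell functions. This is only an illustration, assuming the same localhost:9200 node; doc_url and put_doc are hypothetical names invented for this post, and the only real tool involved is curl.

```shell
# Hypothetical helpers; the names doc_url and put_doc are made up for illustration.
ES_HOST="localhost:9200"   # assumption: same local node as in the commands above

# doc_url INDEX TYPE ID: index name, then document type, then document id.
doc_url() {
  printf '%s/%s/%s/%s?pretty\n' "$ES_HOST" "$1" "$2" "$3"
}

# put_doc INDEX TYPE ID JSON: send the JSON body with an explicit Content-Type.
# This needs a running node, so it is only defined here and not called.
put_doc() {
  curl -H 'Content-Type: application/json' -XPUT "$(doc_url "$1" "$2" "$3")" -d "$4"
}

# Print the URLs used for the two Avengers in this post.
doc_url superheroes avenger 1
doc_url superheroes avenger 2
```

Calling put_doc with an index, a type, an id and a JSON string sends the same kind of request as the curl commands earlier in this post.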

Is all this working?

We have given homes to these superheroes, but let’s check if we can retrieve their details.

curl -XGET 'localhost:9200/superheroes/avenger/1?pretty'

Output:

{
"_index" : "superheroes",
"_type" : "avenger",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : {
"name" : "Steve Rogers",
"super_hero_name" : "Captain America",
"primary_weapon" : "Vibranium Shield",
"quotes" : [
"I can do this all day."
]
}
}
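
The response wraps the document in metadata (_index, _type, _version and so on). If you only want the document itself, Elasticsearch versions that use the /index/type/id URL shape shown in this post also expose a _source endpoint under the document URL. Here is a hedged sketch, assuming the same local node; source_url and get_source are hypothetical helper names, and newer Elasticsearch versions arrange this URL differently.

```shell
# Hypothetical helpers; source_url and get_source are names made up for this post.
ES_HOST="localhost:9200"   # assumption: same local node as in the commands above

# Build the URL that returns only the stored JSON of one document.
source_url() {
  printf '%s/%s/%s/%s/_source?pretty\n' "$ES_HOST" "$1" "$2" "$3"
}

# Fetch just the document body. This needs a running node,
# so it is only defined here and not called.
get_source() {
  curl -XGET "$(source_url "$1" "$2" "$3")"
}

# Print the URL that get_source would request for Captain America.
source_url superheroes avenger 1
```

Against a live node, get_source superheroes avenger 1 should return only the JSON that appears under _source in the output above.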

I will cover more topics in future posts, including updating documents and the many ways to retrieve and search them. Do let me know if there are any specific topics that future posts should cover.

Links to posts in this series:

  1. Installation
  2. Inserting indices and documents (current post)
  3. Updating and deleting documents