Installing Elasticsearch and first steps to the API

Phase 01 — Introduction to Elasticsearch — Blog 03

Published in elasticsearch, Dec 9, 2017

So far in this series, I have written about Elasticsearch in general and about the components of the Elastic Stack. From this article onwards we will start diving into the Elasticsearch APIs. In this article we will primarily focus on installing Elasticsearch and then learn how to use the basic CRUD APIs it provides. We will also install a third party application called elasticsearch-head to view the changes in the UI.

1. Installing Elasticsearch

To start, let us install and configure Elasticsearch on our system. For this tutorial, I have used Ubuntu 16.04 as the OS on a machine with 8GB of RAM.

1.1 Java installation

As we saw in the previous blogs, Elasticsearch is built on top of a library called Lucene, which in turn is written in Java. So Java is a prerequisite for installing Elasticsearch. The following steps install Java on the machine:

sudo add-apt-repository ppa:webupd8team/java -y
sudo apt-get update
sudo apt-get install oracle-java8-installer
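
Once the installer finishes, it is worth confirming that Java is actually on the PATH before moving on. The following is only a small sketch: the helper names have_cmd and check_java are made up for this example, while java -version is the usual way to print the installed version.

```shell
# Succeeds if the given command can be found on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Prints the first line of `java -version`, or a hint if Java is missing.
# (java writes its version banner to stderr, hence the 2>&1.)
check_java() {
  if have_cmd java; then
    java -version 2>&1 | head -n 1
  else
    echo "java not found on PATH"
    return 1
  fi
}
```

Running check_java after the installation should print a line containing the Java version; if it prints the "not found" hint, the Java installation needs another look before continuing.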

1.2 Elasticsearch installation

Let us see how to install Elasticsearch as a service.

Download the “.deb” file of the latest version of Elasticsearch (5.6.3 at the time of writing this blog) here
Type in sudo dpkg -i elasticsearch-5.6.3.deb
Once the installation is complete, type in sudo service elasticsearch start to start the service.

This installs and starts Elasticsearch as a service in your local environment.

By default, Elasticsearch listens on port 9200. To check whether it is running, type the command below in the terminal:

curl localhost:9200

The above command returns a response like the one shown:

{
  "name" : "9CCT_A1",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "QqZcNgcdRDW8sWMaLNf-Jg",
  "version" : {
    "number" : "5.6.3",
    "build_hash" : "1a2f265",
    "build_date" : "2017-10-06T20:33:39.012Z",
    "build_snapshot" : false,
    "lucene_version" : "6.6.1"
  },
  "tagline" : "You Know, for Search"
}
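
The service can take a few seconds to come up, so a curl issued right after starting it may fail at first. Here is a rough sketch of a polling helper for that case. The function name wait_for_es and its defaults (30 attempts, one second apart) are assumptions of this example, not anything Elasticsearch ships.

```shell
# Polls the HTTP endpoint until it answers or the attempts run out.
# $1 = host:port (default localhost:9200), $2 = max attempts (default 30)
wait_for_es() {
  host="${1:-localhost:9200}"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s: no progress output, -f: treat HTTP error codes as failure
    if curl -sf "$host" >/dev/null; then
      echo "Elasticsearch is up at $host"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "Elasticsearch did not respond at $host" >&2
  return 1
}
```

Calling wait_for_es with no arguments right after sudo service elasticsearch start waits up to about 30 seconds for localhost:9200 to respond.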

1.3 Configuration Files

Configuring Elasticsearch properly is one of the most important parts of working with it. There are two important configuration files that we should be familiar with:

1.3a elasticsearch.yml

This file holds many of the configuration options, such as changing the port Elasticsearch listens on, defining the nodes in the cluster, and addressing CORS issues.

The file is located in the folder “/etc/elasticsearch”.

1.3b jvm.options

This file holds the settings for the Java virtual machine, such as memory management. It was introduced in 5.x; earlier versions had no such file. Although it sits next to elasticsearch.yml, it is a plain options file and not YAML. It is also located under /etc/elasticsearch. We will delve into it in detail in future blogs.
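
Both files ship with many commented-out lines, so it can be hard to see which settings are actually in effect. As a small sketch, the helper below prints only the active lines of a config file. The name active_settings is made up for this example, and it assumes comments start with #, which is how the default files are written.

```shell
# Prints only the active lines of a config file,
# skipping blank lines and lines that start with '#'.
# $1 = path to the config file
active_settings() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}
```

For example, active_settings /etc/elasticsearch/elasticsearch.yml lists the settings currently set in that file. Depending on the file permissions on your machine, you may need to run it with sudo.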

2. Index, type and document in Elasticsearch

At this point we have Elasticsearch installed on our system. Now let us get familiar with its basic data storage model. As mentioned in the previous blog, Elasticsearch is a NoSQL database. So instead of the database, table and row hierarchy of the SQL world, the closest analogy here is index, type and document. This means that when a document (which must be in JSON format) is saved in Elasticsearch, its address is made up of three parts:

index name: This is similar to the database name in the SQL world, and it is mandatory. Elasticsearch may contain many indices, so a document to be stored must be supplied with an index name; without it the request fails, because Elasticsearch cannot figure out which index the document belongs to. Index names cannot contain uppercase letters or certain special characters.

type name: Types in Elasticsearch are similar to the tables under a database in the SQL world. An index can have multiple types under it, and each type can have multiple documents under it. If we do not give a type name with a document, Elasticsearch will still index it under a default type name.

document ID: A unique id for the document. It can be provided by the user who is putting the document into Elasticsearch; if it is not provided, Elasticsearch auto-generates a unique value.

Note: the combination of “index name+type name+document id” will be unique for each document in elasticsearch
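
To make the addressing concrete, here is a small sketch that joins the three parts into the URL form used by the requests later in this article. The function name doc_url and the ES_HOST variable are assumptions of this example; ES_HOST falls back to localhost:9200 when it is not set.

```shell
# Builds the address of a document from its three parts.
# $1 = index name, $2 = type name, $3 = document id
# ES_HOST is an assumed variable for this sketch (default: localhost:9200).
doc_url() {
  printf '%s/%s/%s/%s\n' "${ES_HOST:-localhost:9200}" "$1" "$2" "$3"
}
```

For example, doc_url test_index_01 test_type_01 1 prints localhost:9200/test_index_01/test_type_01/1.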

The following diagram shows what a typical Elasticsearch database looks like, with multiple indices holding documents.

3. CRUD operations: command line

Now we have a basic idea about the data hierarchy in Elasticsearch. In this section, let us perform some of the basic CRUD operations in Elasticsearch using the command line interface.

3.1 Create an index

From the previous section, we know that to store a document in Elasticsearch we need to specify the index name. So let us first create an index before storing any document. We can create an index named “test_index_01” from the terminal as below:

curl -XPUT localhost:9200/test_index_01

The above command will yield a response as shown below:

{
 "acknowledged": true,
 "shards_acknowledged": true,
 "index": "test_index_01"
}
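
Since index names do not support uppercase letters, a quick check before sending the request can save a round trip. The sketch below is only illustrative: is_lowercase_name and create_index are names made up for this example, and the check covers only the uppercase rule, not the special characters.

```shell
# Succeeds only if the name contains no uppercase letters.
# This covers just the uppercase rule, not the special characters.
is_lowercase_name() {
  case "$1" in
    *[[:upper:]]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Creates the index only if the name passes the check above.
# $1 = index name
create_index() {
  if is_lowercase_name "$1"; then
    curl -XPUT "localhost:9200/$1"
  else
    echo "index name must be lowercase: $1" >&2
    return 1
  fi
}
```

With this, create_index test_index_01 sends the same PUT request as above, while create_index Test_Index_01 stops with a message before anything is sent to Elasticsearch.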

3.2 Create a document

Now that we have created an index, we can index a document to elasticsearch.

In this case, we are indexing (storing) a document with a document id equal to 1. This can be done as below:

curl -XPUT localhost:9200/test_index_01/test_type_01/1 -d '{
 "name": "ArunMohan",
 "age": 32
}'

The request above breaks down into the following pieces of information passed to Elasticsearch:

index name: test_index_01
type name: test_type_01
document id: 1
document: {
  "name": "ArunMohan",
  "age": 32
}

The above request will result in the response as below:

{
  "_index": "test_index_01",
  "_type": "test_type_01",
  "_id": "1",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  },
  "created": true
}

In the response we can again see the index name (“_index”), the type name (“_type”) and the document id (“_id”). The status of the operation appears in the “created” field; the value true indicates that the document was indexed successfully.
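
As mentioned in section 2, the document id is optional. When it is left out, the request is sent as a POST to the index and type only, and Elasticsearch generates the id. Here is a minimal sketch of that variant; the function name index_auto_id is made up for this example.

```shell
# Indexes a document without an id, so that Elasticsearch generates one.
# $1 = index name, $2 = type name, $3 = the JSON document
index_auto_id() {
  curl -XPOST "localhost:9200/$1/$2" -d "$3"
}
```

For example, index_auto_id test_index_01 test_type_01 '{"name":"ArunMohan","age":32}' stores the same document as before, and the generated id comes back in the “_id” field of the response.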

3.3 Read a document

A document can be retrieved from Elasticsearch using a GET request with the index name, type name and document id specified in it. Together these act as the exact address of the document (provided all three pieces of information are accurate), and Elasticsearch will fetch the document for us. Let us see how to retrieve the document we just indexed.

curl -XGET localhost:9200/test_index_01/test_type_01/1

The above request will return a response like below:

{
  "_index": "test_index_01",
  "_type": "test_type_01",
  "_id": "1",
  "_version": 1,
  "found": true,
  "_source": {
    "name": "ArunMohan",
    "age": 32
  }
}

In the above response the document itself is under the “_source” object. The remaining fields are metadata, including the status of the retrieval in “found”.
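
If only the document itself is needed, without the metadata around it, the same address can be requested with /_source appended. The sketch below wraps that in a helper; the name get_source is made up for this example, and the pretty parameter just asks for indented JSON.

```shell
# Fetches only the stored document, without the metadata wrapper.
# $1 = index name, $2 = type name, $3 = document id
get_source() {
  curl -s -XGET "localhost:9200/$1/$2/$3/_source?pretty"
}
```

Calling get_source test_index_01 test_type_01 1 should return just the name and age fields of the document we indexed.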

3.4 Update a document

What if we need to update a field of an already indexed document? Elasticsearch provides an update API for this operation. In our example, let us say I want to update the age field with a new value of 31. The request for this is shown below:

curl -XPOST localhost:9200/test_index_01/test_type_01/1/_update -d '{"doc":{"age":31}}'

As you might have noted, the request carries only the field to change and its new value ({"age":31}), wrapped in an object named “doc”. The request URL also contains all the information needed to identify the document (index name, type name and document id), so that Elasticsearch can find that document and change that specific field (this is not exactly how it works internally, but for the time being we are not diving deep). If no such field exists in the document, Elasticsearch adds it to the document.

For the above request we will get the below response:

{
  "_index": "test_index_01",
  "_type": "test_type_01",
  "_id": "1",
  "_version": 2,
  "result": "updated",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}

In the “result” field of the above response we can see the status “updated”, which indicates that the update was successful. The “_version” has also increased to 2.
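
Writing the “doc” wrapper by hand gets repetitive when several fields need updating one after another. As a rough sketch, the helpers below build the body for a single field and send it to the update API. The names update_body and update_field are made up for this example, and the new value is expected to be already written as JSON (31, true, or a quoted string).

```shell
# Builds the partial-update body for one field.
# $1 = field name, $2 = new value, already written as JSON
update_body() {
  printf '{"doc":{"%s":%s}}\n' "$1" "$2"
}

# Sends the partial update for one field of one document.
# $1 = index name, $2 = type name, $3 = document id, $4 = field, $5 = value
update_field() {
  curl -XPOST "localhost:9200/$1/$2/$3/_update" -d "$(update_body "$4" "$5")"
}
```

For example, update_field test_index_01 test_type_01 1 age 31 sends the same request as the curl command shown earlier in this section.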

3.5 Delete a document

Deleting is similar to the read operation described earlier. Just give the index name, type name and document id of the document to be deleted in a DELETE request, like below:

curl -XDELETE localhost:9200/test_index_01/test_type_01/1

This would get us a response like below:

{
  "found": true,
  "_index": "test_index_01",
  "_type": "test_type_01",
  "_id": "1",
  "_version": 3,
  "result": "deleted",
  "_shards": {
    "total": 2,
    "successful": 1,
    "failed": 0
  }
}
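
To confirm that the document is really gone, we can ask only for the HTTP status of its address instead of the full body. The sketch below uses a HEAD request for that; the function name doc_status is made up for this example, and it relies on Elasticsearch answering 200 for an existing document and 404 for a missing one.

```shell
# Prints the HTTP status for a document address:
# 200 if the document exists, 404 if it does not.
# $1 = index name, $2 = type name, $3 = document id
doc_status() {
  # -I sends a HEAD request, -o /dev/null discards the headers,
  # -w prints only the status code.
  curl -s -o /dev/null -I -w '%{http_code}\n' "localhost:9200/$1/$2/$3"
}
```

After the delete above, doc_status test_index_01 test_type_01 1 is expected to print 404.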

Conclusion

In this article we have seen how to install Elasticsearch and how to perform the basic CRUD operations on it. In the next blog in this series we will see how to work with multiple Elasticsearch instances on the same system.