What is Elasticsearch? Why Elasticsearch? Advantages of Elasticsearch!

Jul 31, 2018

What is Elasticsearch?

Elasticsearch is a search engine built on Apache Lucene. It is open source and developed in Java. It is a distributed, near real-time search and analytics engine that supports many kinds of search. It achieves fast search responses because, instead of scanning the text directly, it searches an index. It also supports full-text search, and it is built around documents rather than tables or schemas.
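To make the idea of searching an index rather than the text itself more concrete, here is a minimal sketch of an inverted index in Python. It is a toy illustration, not how Lucene is implemented: the function names and the sample documents are hypothetical, and tokenization is reduced to lowercasing and splitting on whitespace.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents that contain every token of the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

# Hypothetical sample documents, invented for illustration only.
docs = {
    1: "Paul Smith Business Analyst",
    2: "Anna Jones Data Analyst",
    3: "Paul Jones Java Developer",
}
index = build_inverted_index(docs)
print(sorted(search(index, "analyst")))       # [1, 2]
print(sorted(search(index, "paul analyst")))  # [1]
```

A lookup touches only the entries for the query terms, so its cost depends on how many documents match rather than on the total amount of text stored.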

Why Elasticsearch?

● You can perform and combine various kinds of searches irrespective of data type, including structured, unstructured, geo, and metric data.

● Queries can retrieve data in whatever form is required.

● It is possible to analyze billions of records in a few seconds.

● It also provides aggregations, which let you explore trends and patterns in the data.
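As a rough illustration of the aggregation point, the sketch below builds the JSON body of a terms aggregation, which counts documents per distinct value of a field. The helper name and the aggregation name "jobs" are hypothetical. The field "job_description.keyword" assumes the sample document used later in this article and the keyword sub-field that default dynamic mapping creates for strings.

```python
import json

def terms_aggregation_body(field, name="by_value", top=10):
    """Build a search body that returns only bucket counts (size 0 skips the hits)."""
    return {
        "size": 0,
        "aggs": {name: {"terms": {"field": field, "size": top}}},
    }

# Assumed field name, taken from the example document in this article.
body = terms_aggregation_body("job_description.keyword", name="jobs")
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of a request to an index's _search endpoint, for example localhost:9200/information_technology/_search.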

Advantages of Elasticsearch

1. Scalability

Elasticsearch is built to scale. It will run perfectly fine on a single machine or in a cluster containing hundreds of nodes, and the experience is almost identical. Growing from a small cluster to a large cluster is almost entirely automatic and painless. Growing from a large cluster to a very large cluster requires a bit more planning and design, but it is still relatively painless. Scalability is considered along the following dimensions:

● Index size: Being able to manage huge indexes (on the order of hundreds of gigabytes, up to petabytes).

● Throughput: Being able to handle a large number of simultaneous searches within a given response time.

● Cluster size: The number of nodes in the system.

2. Fast performance

By using distributed inverted indices, Elasticsearch quickly finds the best matches for your full-text searches from even very large data sets.

3. Multilingual

The ICU plugin, an Elasticsearch plugin based on the Lucene implementation of the Unicode text segmentation standard, is used to index and tokenize multilingual content. Based on character ranges, it decides whether to break on a space or on a character. Multilingual content is therefore supported in Elasticsearch.

4. Document oriented (JSON)

Elasticsearch uses JavaScript Object Notation, or JSON, as the serialization format for documents. JSON serialization is supported by various programming languages, and has become the standard format used by the NoSQL movement. It is simple, concise, and easy to read.

5. Auto-completion and instant search

The completion suggester provides autocomplete/search-as-you-type functionality. This is a navigational feature that guides users to relevant results as they type, improving search precision. It is not meant for spell correction or did-you-mean functionality, which the term and phrase suggesters provide.
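A request to the completion suggester might look roughly like the sketch below, which only builds the JSON involved and does not contact a server. The field name "suggest", the suggestion name "name_suggest", and the helper function are hypothetical choices. The field must be mapped with the type "completion" before documents are indexed.

```python
import json

# Mapping fragment: the suggester needs a field of type "completion".
# The field name "suggest" is a hypothetical choice.
suggest_field_mapping = {"suggest": {"type": "completion"}}

def completion_query(prefix, field="suggest", name="name_suggest", size=5):
    """Build a _search body asking for completions of what the user has typed so far."""
    return {
        "suggest": {
            name: {
                "prefix": prefix,
                "completion": {"field": field, "size": size},
            }
        }
    }

# What a client could send after the user types "pa".
print(json.dumps(completion_query("pa")))
```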

6. Schema free

Elasticsearch does not require definitions such as the index, type, and field types to exist before indexing. When an object is later indexed with a new property, that property is automatically added to the mapping definitions.
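The behaviour described here can be pictured with a small simulation. This is a simplified stand-in, not Elasticsearch code: the function names are hypothetical, the "age" and "active" fields are invented for illustration, and the real type detection covers more cases (dates, nested objects, keyword sub-fields for strings).

```python
def infer_type(value):
    """Very rough stand-in for dynamic field type detection."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    return "text"

def index_document(mapping, doc):
    """Add previously unseen fields of doc to the mapping; keep existing entries."""
    for field, value in doc.items():
        mapping.setdefault(field, infer_type(value))
    return mapping

mapping = {}
index_document(mapping, {"name": "Paul", "lastname": "Smith"})
# A later document introduces two new properties (hypothetical fields).
index_document(mapping, {"name": "Paul", "age": 41, "active": True})
print(mapping)
# {'name': 'text', 'lastname': 'text', 'age': 'long', 'active': 'boolean'}
```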

Basic Concepts

● Near Real Time: Elasticsearch is a near real-time search platform. A document becomes searchable shortly after it is indexed, normally within about one second.

● Cluster: A cluster is a collection of one or more nodes that together hold all of the data. It provides federated indexing and search capabilities across all nodes and is identified by a unique name (by default it is ‘elasticsearch’).

● Node: A node is a single server that is part of a cluster, stores data, and participates in the cluster’s indexing and search capabilities.

● Index: An index is a collection of documents with similar characteristics and is identified by a name. This name is used to refer to the index while performing indexing, search, update, and delete operations against the documents in it.

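● Type: A type is a logical category or partition of an index whose semantics are entirely up to you. It is defined for documents that have a set of common fields. Older versions allowed more than one type per index; indices created in Elasticsearch 6.0 or later can contain only one type, and types were deprecated in 7.0 and removed in 8.0.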

● Document: A document is the basic unit of information that can be indexed. It is expressed in JSON, a ubiquitous internet data interchange format.

● Shards: Elasticsearch provides the ability to subdivide an index into multiple pieces called shards. Each shard is in itself a fully functional and independent “index” that can be hosted on any node within the cluster.

● Replicas: Elasticsearch allows you to make one or more copies of your index’s shards, which are called replica shards, or replicas for short.
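The shard and replica concepts can be sketched in a few lines of Python. Elasticsearch picks the shard for a document by hashing its routing value (the document id by default) modulo the number of primary shards. The sketch below uses zlib.crc32 purely as a stand-in hash, so the shard numbers it prints will not match a real cluster, and the helper names are hypothetical. The settings keys number_of_shards and number_of_replicas are the real index settings.

```python
import zlib

def shard_for(doc_id, number_of_primary_shards):
    """Pick a shard from a hash of the routing value (stand-in hash, not the real one)."""
    return zlib.crc32(str(doc_id).encode("utf-8")) % number_of_primary_shards

def total_shard_copies(number_of_shards, number_of_replicas):
    """Each primary shard has number_of_replicas additional copies."""
    return number_of_shards * (1 + number_of_replicas)

# Index settings as they could be supplied when creating an index.
settings = {"settings": {"number_of_shards": 3, "number_of_replicas": 1}}
s = settings["settings"]

for doc_id in ["1", "2", "3", "4"]:
    print(doc_id, "-> shard", shard_for(doc_id, s["number_of_shards"]))

print(total_shard_copies(s["number_of_shards"], s["number_of_replicas"]))  # 6
```

Because the shard is derived from the number of primary shards, that number is fixed when the index is created, while the number of replicas can be changed later.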

Installation

● Install the latest Java version, or check your current version by running “java -version” at the command prompt (Elasticsearch 5.x and 6.x require Java 8 or later).

● Set the Java environment variable (JAVA_HOME).

● Download the Elasticsearch zip file from “https://www.elastic.co/downloads/elasticsearch”.

● Unzip the file

● Go to bin folder

● Double click on “elasticsearch.bat” file

● Open a browser and go to “localhost:9200”. It will show the node name, the cluster name, and other information in JSON format.

Example

1. Add Document

Documents in Elasticsearch are represented in JSON format. Documents are added to indices, and each document has a type. Here, “information_technology”, “person”, and “1” are the index, type, and id respectively. Since the index does not exist yet, Elasticsearch will create it automatically.

POST localhost:9200/information_technology/person/1
{
  "name" : "Paul",
  "lastname" : "Smith",
  "job_description" : "Business Analyst"
}

2. Get Document

Now that the document exists, we can retrieve it using the API below.

GET localhost:9200/information_technology/person/1

3. Update Document

We can update it using the API below. Only the fields listed under “doc” are changed; the rest of the document is kept.

POST localhost:9200/information_technology/person/1/_update
{
  "doc" : {
    "job_description" : "Data analyst"
  }
}

4. Delete Document

We can delete it using the API below.

DELETE localhost:9200/information_technology/person/1
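
The four document operations above can also be issued from a program. The following is a minimal sketch using only the Python standard library. It builds the requests without sending them, so it runs without a node; the helper name build_request is hypothetical, and the paths follow the typed URLs used in this article, which newer Elasticsearch versions no longer accept in this form.

```python
import json
import urllib.request

BASE = "http://localhost:9200"  # default local node address used in this article

def build_request(method, path, body=None):
    """Create (but do not send) an HTTP request with an optional JSON body."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if body is not None else {}
    return urllib.request.Request(BASE + path, data=data, headers=headers, method=method)

doc_path = "/information_technology/person/1"
requests_to_send = [
    # 1. Add document
    build_request("POST", doc_path, {"name": "Paul", "lastname": "Smith",
                                     "job_description": "Business Analyst"}),
    # 2. Get document
    build_request("GET", doc_path),
    # 3. Update document (partial update under "doc")
    build_request("POST", doc_path + "/_update",
                  {"doc": {"job_description": "Data analyst"}}),
    # 4. Delete document
    build_request("DELETE", doc_path),
]

for req in requests_to_send:
    print(req.get_method(), req.full_url)
    # To send it against a running node:
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp))
```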

5. Search

We can search using either “/_search?q=something” or by specifying the field to search in, as in “/_search?q=field:something”.

GET localhost:9200/_search?q=Paul
OR
GET localhost:9200/_search?q=job_description:java
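
When search URLs like the two above are built in code, the query text should be URL-encoded. The sketch below shows one way to do that with the Python standard library; the helper name uri_search_url and its parameters are hypothetical, and nothing is sent to a server.

```python
from urllib.parse import urlencode

BASE = "http://localhost:9200"  # default local node address used in this article

def uri_search_url(query, index=None, field=None):
    """Build a URI search URL; field restricts the query to a single field."""
    q = f"{field}:{query}" if field else query
    path = f"/{index}/_search" if index else "/_search"
    query_string = urlencode({"q": q})  # encodes spaces, colons, and so on
    return f"{BASE}{path}?{query_string}"

print(uri_search_url("Paul"))
# http://localhost:9200/_search?q=Paul
print(uri_search_url("java", field="job_description"))
# http://localhost:9200/_search?q=job_description%3Ajava
```

The encoded colon (%3A) is decoded by the server, so the second URL is equivalent to the field search shown above.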

For any queries, feel free to reach us at marketing@aimdek.com