Kalpana Kanade
Petabytz
Published in
5 min readSep 14, 2019

--

Elasticsearch Service on Elastic Cloud

Elasticsearch

Elasticsearch is an Apache Lucene-based search server. It was developed by Shay Banon and published in 2010. It is now maintained by Elasticsearch BV. Its latest version is 7.0.0.

Elasticsearch is a real-time distributed and open source full-text search and analytics engine. It is accessible from RESTful web service interface and uses schema less JSON (JavaScript Object Notation) documents to store data. It is built on Java programming language and hence Elasticsearch can run on different platforms. It enables users to explore very large amount of data at very high speed.

General Features

The general features of Elasticsearch are as follows −

  • Elasticsearch is scalable up to petabytes of structured and unstructured data.
  • Elasticsearch can be used as a replacement of document stores like MongoDB and RavenDB.
  • Elasticsearch uses denormalization to improve the search performance.
  • Elasticsearch is one of the popular enterprise search engines, and is currently being used by many big organizations like Wikipedia, The Guardian, StackOverflow, GitHub etc.
  • Elasticsearch is an open source and available under the Apache license version 2.0.

Key Concepts

The key concepts of Elasticsearch are as follows −

Node

It refers to a single running instance of Elasticsearch. Single physical and virtual server accommodates multiple nodes depending upon the capabilities of their physical resources like RAM, storage and processing power.

Cluster

It is a collection of one or more nodes. Cluster provides collective indexing and search capabilities across all the nodes for entire data.

Index

It is a collection of different type of documents and their properties. Index also uses the concept of shards to improve the performance. For example, a set of document contains data of a social networking application.

Document

It is a collection of fields in a specific manner defined in JSON format. Every document belongs to a type and resides inside an index. Every document is associated with a unique identifier called the UID.

Shard

Indexes are horizontally subdivided into shards. This means each shard contains all the properties of document but contains less number of JSON objects than index. The horizontal separation makes shard an independent node, which can be store in any node. Primary shard is the original horizontal part of an index and then these primary shards are replicated into replica shards.

Replicas

Elasticsearch allows a user to create replicas of their indexes and shards. Replication not only helps in increasing the availability of data in case of failure, but also improves the performance of searching by carrying out a parallel search operation in these replicas.

Advantages

  • Elasticsearch is developed on Java, which makes it compatible on almost every platform.
  • Elasticsearch is real time, in other words after one second the added document is searchable in this engine
  • Elasticsearch is distributed, which makes it easy to scale and integrate in any big organization.
  • Creating full backups are easy by using the concept of gateway, which is present in Elasticsearch.
  • Handling multi-tenancy is very easy in Elasticsearch when compared to Apache Solr.
  • Elasticsearch uses JSON objects as responses, which makes it possible to invoke the Elasticsearch server with a large number of different programming languages.
  • Elasticsearch supports almost every document type except those that do not support text rendering.

Disadvantages

  • Elasticsearch does not have multi-language support in terms of handling request and response data (only possible in JSON) unlike in Apache Solr, where it is possible in CSV, XML and JSON formats.
  • Occasionally, Elasticsearch has a problem of Split brain situations.

Comparison between Elasticsearch and RDBMS

In Elasticsearch, index is similar to tables in RDBMS (Relation Database Management System). Every table is a collection of rows just as every index is a collection of documents in Elasticsearch.

Elasticsearch — Installation

In this chapter, we will understand the installation procedure of Elasticsearch in detail.

To install Elasticsearch on your local computer, you will have to follow the steps given below −

Step 1 − Check the version of java installed on your computer. It should be java 7 or higher. You can check by doing the following −

In Windows Operating System (OS) (using command prompt)−

> java -version

In UNIX OS (Using Terminal) −

$ echo $JAVA_HOME

Step 2 − Depending on your operating system, download Elasticsearch from www.elastic.co as mentioned below −

  • For windows OS, download ZIP file.
  • For UNIX OS, download TAR file.
  • For Debian OS, download DEB file.
  • For Red Hat and other Linux distributions, download RPN file.
  • APT and Yum utilities can also be used to install Elasticsearch in many Linux distributions.

Step 3 − Installation process for Elasticsearch is simple and is described below for different OS −

  • Windows OS− Unzip the zip package and the Elasticsearch is installed.
  • UNIX OS− Extract tar file in any location and the Elasticsearch is installed.
$wget
https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch7.0.0-linux-x86_64.tar.gz
$tar -xzf elasticsearch-7.0.0-linux-x86_64.tar.gzUsing APT utility for Linux OSDownload and install the Public Signing Key$ wget -qo - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo
apt-key add -

Save the repository definition as shown below −

$ echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" |
sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list

Run update using the following command −

$ sudo apt-get update

Now you can install by using the following command −

$ sudo apt-get install elasticsearch

Using YUM utility for Debian Linux OS

$ rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch

  • ADD the following text in the file with .repo suffix in your “/etc/yum.repos.d/” directory. For example, elasticsearch.repo
elasticsearch-7.x]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
  • You can now install Elasticsearch by using the following command
sudo yum install elasticsearch

Step 4 − Go to the Elasticsearch home directory and inside the bin folder. Run the elasticsearch.bat file in case of Windows or you can do the same using command prompt and through terminal in case of UNIX rum Elasticsearch file.

In Linux

$ cd elasticsearch-2.1.0/bin
$ ./elasticsearch

Step 5 − The default port for Elasticsearch web interface is 9200 or you can change it by changing http.port inside the elasticsearch.yml file present in bin directory. You can check if the server is up and running by browsing http://localhost:9200. It will return a JSON object, which contains the information about the installed Elasticsearch in the following manner −

{
"name" : "Brain-Child",
"cluster_name" : "elasticsearch", "version" : {
"number" : "2.1.0",
"build_hash" : "72cd1f1a3eee09505e036106146dc1949dc5dc87",
"build_timestamp" : "2015-11-18T22:40:03Z",
"build_snapshot" : false,
"lucene_version" : "5.3.1"
},
"tagline" : "You Know, for Search"
}

--

--