Introduction to Elasticsearch

Yunlong Wang
Feb 6 · 4 min read

What is Elasticsearch?

Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. — Wikipedia

Cluster

An Elasticsearch cluster consists of one or more nodes (machines) that host your data, which is stored as documents. The default name of a cluster is “elasticsearch”.

Shard

Elasticsearch allows you to divide your dataset into pieces called shards and store them on separate nodes.
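
As a rough illustration of how a document ends up on a particular shard: Elasticsearch picks the shard from a hash of the document's routing value (its ID by default), modulo the number of primary shards. The Python sketch below mimics that idea. `zlib.crc32` is only a stand-in for the real hash function (Elasticsearch uses Murmur3), and `pick_shard` is a hypothetical helper name, so the shard numbers it prints will not match a real cluster.

```python
import zlib


def pick_shard(routing: str, num_primary_shards: int) -> int:
    """Map a routing value (by default the document ID) to a shard number.

    Assumption: crc32 is only a stand-in for the hash Elasticsearch uses.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


# The same ID always lands on the same shard, which is how a document
# can be found again without asking every shard.
for doc_id in ["doc-1", "doc-2", "doc-3", "doc-4"]:
    print(doc_id, "-> shard", pick_shard(doc_id, 5))
```

This is also why the number of primary shards is normally fixed when an index is created: changing it would change where existing documents are expected to live.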

Here is a diagram that might give you a better idea of the relationship between cluster, node and shard:

Kibana

Kibana is a web interface where you can write REST requests to interact with an Elasticsearch cluster.

Schema/Mapping

When people talk about defining a schema in Elasticsearch, they usually use the word “mapping”. The usual explanation, which I find hard to follow, goes like this:

The schema in Elasticsearch is a mapping that describes the fields in the JSON documents along with their data type, as well as how they should be indexed in the Lucene indexes that lie under the hood. Because of this, in Elasticsearch terms, we usually call this schema a “mapping”.

So an Elasticsearch mapping is basically the same as a database schema: it defines the fields of a document and their data types.

Here is an example:

{
  "properties": {
    "land_size": {
      "type": "integer"
    },
    "num_of_bedroom": {
      "type": "integer"
    },
    "num_of_bathroom": {
      "type": "integer"
    },
    "address": {
      "properties": {
        "street_num": {
          "type": "integer"
        },
        "street_name": {
          "type": "keyword"
        },
        "suburb": {
          "type": "keyword"
        }
      }
    }
  }
}
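
One way to read a nested mapping like this is as a flat list of field paths, which is how fields are addressed in queries (for example `address.suburb`). The following Python sketch walks the mapping and prints each path with its data type. `flatten_mapping` is a helper made up for this illustration, not part of any Elasticsearch client.

```python
mapping = {
    "properties": {
        "land_size": {"type": "integer"},
        "num_of_bedroom": {"type": "integer"},
        "num_of_bathroom": {"type": "integer"},
        "address": {
            "properties": {
                "street_num": {"type": "integer"},
                "street_name": {"type": "keyword"},
                "suburb": {"type": "keyword"},
            }
        },
    }
}


def flatten_mapping(mapping: dict, prefix: str = "") -> dict:
    """Return {dotted field path: data type} for a mapping dict."""
    fields = {}
    for name, spec in mapping.get("properties", {}).items():
        path = prefix + name
        if "properties" in spec:
            # Object field: recurse into its sub-fields.
            fields.update(flatten_mapping(spec, prefix=path + "."))
        else:
            fields[path] = spec["type"]
    return fields


for path, data_type in flatten_mapping(mapping).items():
    print(path, "->", data_type)
```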

Now let’s create a mapping.

Index

Before we can create a mapping, we first need to create an index:

PUT real-estate
{}

Response from Elasticsearch:

{
  "acknowledged": true,
  "shards_acknowledged": true
}
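
Requests written in the Kibana console are shorthand for plain HTTP calls. As a sketch of what that means, the Python below turns a console-style snippet into a `urllib.request.Request` without sending it. The base URL `http://localhost:9200` is an assumption (9200 is the default HTTP port, but your cluster may live elsewhere and may require authentication), and `console_to_request` is a hypothetical helper.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: a local, unsecured cluster


def console_to_request(snippet: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Convert 'METHOD path' plus an optional JSON body into an HTTP request."""
    first_line, _, body_text = snippet.strip().partition("\n")
    method, path = first_line.split(maxsplit=1)
    body = body_text.strip()
    data = None
    if body:
        # Round-trip through json to fail early on a malformed body.
        data = json.dumps(json.loads(body)).encode("utf-8")
    return urllib.request.Request(
        url=base_url + "/" + path.lstrip("/"),
        data=data,
        method=method.upper(),
        headers={"Content-Type": "application/json"},
    )


request = console_to_request("PUT real-estate\n{}")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```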

Now let’s create a mapping for the document type property:

PUT real-estate/_mapping/property
{
  "properties": {
    "land_size": {
      "type": "integer"
    },
    "num_of_bedroom": {
      "type": "integer"
    },
    "num_of_bathroom": {
      "type": "integer"
    },
    "address": {
      "properties": {
        "street_num": {
          "type": "integer"
        },
        "street_name": {
          "type": "keyword"
        },
        "suburb": {
          "type": "keyword"
        }
      }
    }
  }
}

We can list all indices to check if the real-estate index has been created:

GET /_cat/indices?v
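
With `?v` the cat API returns a plain-text table whose first line holds the column names. If you need the result in a script, one option is to split that table into rows, as in this Python sketch. The `sample` text is a trimmed, made-up example (it is not output captured from a real cluster), and `parse_cat_table` is a hypothetical helper.

```python
def parse_cat_table(text: str) -> list:
    """Parse whitespace-separated cat output (with ?v headers) into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    headers = lines[0].split()
    return [dict(zip(headers, line.split())) for line in lines[1:]]


# Made-up sample shaped like `GET /_cat/indices?v` output, with fewer columns.
sample = """\
health status index       pri rep docs.count
yellow open   real-estate 5   1   0
"""

for row in parse_cat_table(sample):
    print(row["index"], row["health"], row["docs.count"])
```

The cat APIs can also return JSON when called with `format=json`, which avoids parsing text altogether.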

Document type and Data type

Before we move on, let’s briefly take a look at document type and data type.

To me, a document type is the equivalent of a table name in a relational database. Usually the type name is the name of the entity the document holds. For example, if the index is real-estate, one document type could be property and another agent. Note that document types were deprecated in Elasticsearch 6 and removed in later versions, so this applies to older clusters only.

A data type is the type of a field in a document. Elasticsearch has types such as text, keyword, integer, and date.

Here is an example which shows that a node can have multiple indices (e.g. real-estate) and an index has multiple document types (e.g. property) with fields defined with data types (e.g. keyword and integer).

Now let’s create a document of type property in the real-estate index:

POST real-estate/property
{
  "land_size": 650,
  "num_of_bedroom": 3,
  "num_of_bathroom": 2,
  "address": {
    "street_num": 1,
    "street_name": "Beats Street",
    "suburb": "Yolo"
  }
}

Response from ES:

{
  "_index": "real-estate",
  "_type": "property",
  "_id": "AWi9qUAqkCC778f8cqUO",
  "_version": 1,
  "result": "created",
  "_shards": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "created": true
}
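
Because the request did not specify an ID, Elasticsearch generated one and returned it in `_id`. Here is a small Python sketch of reading that response and building the path for fetching the same document by ID. It reuses the response shown above; `document_path` is a hypothetical helper.

```python
import json

response_text = """
{
  "_index": "real-estate",
  "_type": "property",
  "_id": "AWi9qUAqkCC778f8cqUO",
  "_version": 1,
  "result": "created",
  "_shards": {"total": 2, "successful": 2, "failed": 0},
  "created": true
}
"""


def document_path(response: dict) -> str:
    """Build the index/type/id path of the document an index response refers to."""
    return "/".join([response["_index"], response["_type"], response["_id"]])


response = json.loads(response_text)
# Only trust the ID if the write was reported as created with no failed shards.
if response["result"] == "created" and response["_shards"]["failed"] == 0:
    print("GET", document_path(response))
```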

We can search for the document using an Elasticsearch query:

GET real-estate/property/_search
{
  "query": {
    "term": {
      "address.suburb": {
        "value": "Yolo"
      }
    }
  }
}

Here is the response from ES:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "real-estate",
        "_type": "property",
        "_id": "AWi9qUAqkCC778f8cqUO",
        "_score": 0.2876821,
        "_source": {
          "land_size": 650,
          "num_of_bedroom": 3,
          "num_of_bathroom": 2,
          "address": {
            "street_num": 1,
            "street_name": "Beats Street",
            "suburb": "Yolo"
          }
        }
      }
    ]
  }
}
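
A `term` query looks for an exact value, which suits `keyword` fields such as `address.suburb` because they are indexed as-is rather than analyzed. To make that behaviour concrete, here is an in-memory Python sketch of the same filter. It only imitates the matching rule; it says nothing about how Elasticsearch executes the query or computes `_score`, and `get_field` and `term_match` are hypothetical helpers.

```python
documents = [
    {
        "land_size": 650,
        "num_of_bedroom": 3,
        "num_of_bathroom": 2,
        "address": {"street_num": 1, "street_name": "Beats Street", "suburb": "Yolo"},
    }
]

_MISSING = object()  # sentinel for "field not present in this document"


def get_field(doc: dict, path: str):
    """Follow a dotted path such as 'address.suburb' into a nested dict."""
    value = doc
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return _MISSING
        value = value[key]
    return value


def term_match(docs: list, field: str, value) -> list:
    """Keep documents whose field equals the value exactly (case-sensitive)."""
    return [doc for doc in docs if get_field(doc, field) == value]


print(len(term_match(documents, "address.suburb", "Yolo")))
print(len(term_match(documents, "address.suburb", "yolo")))
```

The second call uses a lowercase value on purpose: on a keyword field, "yolo" and "Yolo" are different terms.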

Summary

There you go: a very introductory explanation of Elasticsearch with a basic example. I don’t expect you to like my terrible hand drawings, but overall I hope you like the content.

Stay tuned, there will be more blog posts coming up that dive deeper into the power of Elasticsearch.

Written by Yunlong Wang, Software Engineer @ REA Group
