Search engine using Elasticsearch: Basic Elasticsearch

Renaldi
Published in Tunaiku Tech
Nov 12, 2020 (6 min read)
Advanced Searching and Grouping Illustration

Are you familiar with the cases above? Or have you ever been curious about how to handle them? Let me tell you that there’s something that can do it better than relational SQL.

Elasticsearch

Meet Elasticsearch, a search engine that can make your complex search queries much simpler. It exposes an HTTP web interface, stores schema-free JSON documents, and is developed in Java. To make it easier to understand, you can think of it as an additional database that assists your main database, such as MySQL or Postgres. You can store the data that requires complex searching features in Elasticsearch.

Elasticsearch with any database illustration

Installation

Installing Elasticsearch is quite simple: just go to its official site and download it.

Download the elasticsearch

Then scroll down the page to find the installation guide.

Installation Steps section

By default, Elasticsearch runs on port 9200.

You don’t need to change any environment variable or move the Elasticsearch folder anywhere; just run bin/elasticsearch.bat (bin/elasticsearch on Linux and macOS) and you are good to go.

Index

In Elasticsearch, we store data in something called an index. If you are familiar with the term “table” in a relational database, you can think of an index as roughly the equivalent of a table. However, Elasticsearch has no such thing as a database to group indices. So if you want to separate one application's indices from another's, you can add the application name as a prefix to the index name, such as

[application_name]_[index_name]
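The naming convention above can be captured in a tiny helper. This is only a sketch: the function name and the "Shop" application are made up for illustration. The lowercasing is there because Elasticsearch does not accept uppercase letters in index names.

```python
def prefixed_index(application_name: str, index_name: str) -> str:
    """Return '[application_name]_[index_name]' in lowercase."""
    # Elasticsearch rejects index names that contain uppercase letters.
    return f"{application_name}_{index_name}".lower()


# Hypothetical application with two indices.
products_index = prefixed_index("Shop", "product")
orders_index = prefixed_index("Shop", "order")
```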

To create an index, send an HTTP request to localhost:9200/{index_name} using the PUT HTTP method.

Create an index in the elasticsearch using postman

To drop an index, simply change the HTTP method from PUT to DELETE.
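If you prefer a script over Postman, the same two requests can be built with Python's standard library. This is a minimal sketch, assuming a local node on the default port; the index name shop_product and the helper name index_request are hypothetical. The requests are only constructed here, not sent.

```python
import urllib.request

BASE_URL = "http://localhost:9200"  # default port mentioned in the text


def index_request(index_name: str, method: str) -> urllib.request.Request:
    """Build the request that creates (PUT) or drops (DELETE) an index."""
    if method not in ("PUT", "DELETE"):
        raise ValueError("use PUT to create an index or DELETE to drop it")
    return urllib.request.Request(f"{BASE_URL}/{index_name}", method=method)


create = index_request("shop_product", "PUT")
drop = index_request("shop_product", "DELETE")
```

To actually send one of them against a running node, pass it to urllib.request.urlopen, for example urllib.request.urlopen(create).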

Insert

As I mentioned before, Elasticsearch is an HTTP web interface, so to insert data all you have to do is send an HTTP request using the POST HTTP method to localhost:9200/{index_name}/_doc/ with your JSON data in the request body.

If it succeeds, you will see "result": "created" in the response. The structure is not strict, but once a field has been indexed its data type is fixed, so keep each field's type consistent across documents. For example, if price was first inserted as a number in the product index, a later document with a non-numeric string as price will be rejected.
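Here is a rough sketch of that insert in Python, using only the standard library. The product document and the helper names are assumptions for illustration; the fields name, category and price simply mirror the ones used in the queries later in this article. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # default port mentioned in the text


def insert_request(index_name: str, document: dict) -> urllib.request.Request:
    """Build a POST to /{index_name}/_doc/ with the document as JSON."""
    return urllib.request.Request(
        f"{BASE_URL}/{index_name}/_doc/",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def was_created(response: dict) -> bool:
    """Check the parsed JSON response for "result": "created"."""
    return response.get("result") == "created"


# Hypothetical product document.
product = {"name": "Gaming Laptop", "category": "Electronics", "price": 5000}
request = insert_request("product", product)
```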

Search

You can get the data from a certain index using the search API. Simply hit localhost:9200/{index_name}/_search using the GET HTTP method. This is similar to SELECT * FROM {index_name} in a relational SQL database, except that only the first 10 hits are returned by default.

Search API using postman

You can see the data inside the hits attribute of the JSON response.
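To make the shape of that response more concrete, here is a small sketch that pulls the stored documents out of it. The sample response is hand-written and trimmed (a real one carries more metadata such as took and _shards), and the helper name is hypothetical.

```python
def extract_sources(response: dict) -> list:
    """Return the original documents stored under hits.hits[]._source."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


# Hand-written, trimmed example of a search response.
sample_response = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "hits": [
            {"_index": "product", "_id": "1",
             "_source": {"name": "Gaming Laptop", "price": 5000}},
            {"_index": "product", "_id": "2",
             "_source": {"name": "Frying Pan", "price": 4500}},
        ],
    }
}

documents = extract_sources(sample_response)
```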

Match Query

Now, what if you have a certain condition for your search instead of just getting all the data? Relational SQL can do that too, of course. In Elasticsearch, you can use a match query in the search API like this:

GET localhost:9200/{index_name}/_search
{
  "query":{
    "match":{
      "name":"laptop"
    }
  }
}

This means that you want to search the product index for documents whose name contains “laptop”, regardless of upper/lowercase. The downside of the match query is that it covers only a single condition. If you have more than one condition, you should use the bool query.
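Why does the case not matter? Text is analyzed into tokens before it is compared. The sketch below builds the match body and imitates that step with a deliberately simplified stand-in (lowercase, then split into words); the real standard analyzer does more than this, so treat the helper names and the matching rule as an illustration only.

```python
import re


def match_query(field: str, text: str) -> dict:
    """Build the request body for a single-field match query."""
    return {"query": {"match": {field: text}}}


def simple_analyze(text: str) -> set:
    """Simplified analysis: lowercase the text and split it into words."""
    return set(re.findall(r"\w+", text.lower()))


def would_match(field_value: str, query_text: str) -> bool:
    """True if at least one query token equals a token of the field."""
    return bool(simple_analyze(field_value) & simple_analyze(query_text))


body = match_query("name", "laptop")
```

Note that the comparison is per word in this sketch: would_match("Gaming LAPTOP 15 inch", "laptop") is True, while would_match("Laptops", "laptop") is False because "laptops" is a different token.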

Bool Query

If you have more than one condition for your search, you should use a bool query. Note that the occurrence clauses go inside a bool object under query.

GET localhost:9200/{index_name}/_search
{
  "query":{
    "bool":{
      "should":[
        {
          "match":{
            "category":"Utensils"
          }
        },
        {
          "match":{
            "name":"Frying Pan"
          }
        }
      ],
      "filter":[
        {
          "range":{
            "price":{
              "gt":3000,
              "lt":6000
            }
          }
        }
      ]
    }
  }
}

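From the example above, you can see that each occurrence type takes an array of queries such as match. The occurrence types are:

  • should
    Similar to OR in a relational database query. When the bool query also contains must or filter clauses, should clauses are optional by default and only raise the score of the documents that match them.
  • must
    Similar to AND in a relational database query.
  • must_not
    The clause must not match, similar to NOT in a relational database query.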

You may also combine these occurrence types. Each entry in an array is its own JSON object:

"bool":{
  "must":[
    { "match":{ ... } },
    ...
  ],
  "must_not":[
    { "match":{ ... } },
    ...
  ],
  "should":[
    { "match":{ ... } },
    ...
  ]
}

The first example above also uses filter, a fourth occurrence type, to narrow the result down by a condition (here a price range). Filter clauses must match, like must, but they do not affect the score.
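Writing these nested bodies by hand gets error-prone quickly, so here is a minimal sketch of a builder for them. The function name and its parameters are my own invention; the one fixed rule it encodes is that all occurrence clauses sit inside a bool object under query.

```python
def bool_query(must=None, should=None, must_not=None, filter_=None) -> dict:
    """Build a bool query body, leaving out occurrence types not given."""
    clauses = {
        "must": must,
        "should": should,
        "must_not": must_not,
        "filter": filter_,
    }
    return {"query": {"bool": {k: v for k, v in clauses.items() if v}}}


# Same conditions as the utensils example in this section.
body = bool_query(
    should=[
        {"match": {"category": "Utensils"}},
        {"match": {"name": "Frying Pan"}},
    ],
    filter_=[{"range": {"price": {"gt": 3000, "lt": 6000}}}],
)
```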

Aggregation

Like relational database queries, Elasticsearch also has aggregation queries. In Elasticsearch, aggregations are separated into 3 different categories:

  • Metric aggregation
    Aggregations that calculate metrics, such as a sum or average, from field values.
  • Bucket aggregation
    Aggregations that group documents into buckets, also called bins, based on field values, ranges, or other criteria.
  • Pipeline aggregation
    Aggregations that take input from other aggregations instead of documents or fields.

Metric Aggregation Example:

GET localhost:9200/{index_name}/_search
{
  "aggs": {
    "{name_of_aggregation}": {
      "avg": {
        "field": "price"
      }
    }
  }
}
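As a rough sketch, this is how the metric aggregation body could be built and its result read back in Python. The aggregation name avg_price, the helper names and the sample response are assumptions. Setting "size": 0 is optional; it just tells Elasticsearch to skip the hits and return only the aggregation.

```python
def avg_aggregation(name: str, field: str) -> dict:
    """Build a search body that returns only the average of one field."""
    return {"size": 0, "aggs": {name: {"avg": {"field": field}}}}


def read_metric(response: dict, name: str):
    """Read a single-value metric from aggregations.{name}.value."""
    return response["aggregations"][name]["value"]


body = avg_aggregation("avg_price", "price")

# Hand-written, trimmed example of the matching response.
sample_response = {"aggregations": {"avg_price": {"value": 4750.0}}}
average = read_metric(sample_response, "avg_price")
```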

Bucket Aggregation Example:

{
  "aggs":{
    "categories":{
      "terms":{
        "field":"category.keyword"
      }
    }
  }
}
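The bucket example groups on category.keyword rather than category because analyzed text fields cannot be used for a terms aggregation by default, while the keyword sub-field that dynamic mapping adds for strings can. A minimal sketch of building the body and turning the buckets into a plain dictionary follows; the helper names and the sample response are made up for illustration.

```python
def terms_aggregation(name: str, field: str) -> dict:
    """Build a search body that groups documents by the field's values."""
    return {"size": 0, "aggs": {name: {"terms": {"field": field}}}}


def bucket_counts(response: dict, name: str) -> dict:
    """Map each bucket key to its document count."""
    buckets = response["aggregations"][name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


body = terms_aggregation("categories", "category.keyword")

# Hand-written, trimmed example of the matching response.
sample_response = {
    "aggregations": {
        "categories": {
            "buckets": [
                {"key": "Utensils", "doc_count": 3},
                {"key": "Electronics", "doc_count": 1},
            ]
        }
    }
}
counts = bucket_counts(sample_response, "categories")
```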

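Pipeline aggregation Example (a moving average is a parent pipeline aggregation, so it has to be nested inside a histogram or date_histogram; moving_fn is used here because moving_avg was deprecated in 6.4 and removed in 8.0):

{
  "aggs": {
    "price_ranges": {
      "histogram": { "field": "price", "interval": 1000 },
      "aggs": {
        "the_sum": {
          "sum": { "field": "price" }
        },
        "the_movavg": {
          "moving_fn": {
            "buckets_path": "the_sum",
            "window": 3,
            "script": "MovingFunctions.unweightedAvg(values)"
          }
        }
      }
    }
  }
}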

You can find more detail about aggregation queries in the official Elasticsearch documentation.

Delete

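To delete a document in Elasticsearch, you simply send an HTTP request using the DELETE HTTP method to localhost:9200/{index_name}/_doc/{id_of_document}.

By doing that, you mark the document with that id as deleted, so it no longer shows up in searches. It is more like a soft delete at first: the data is physically removed later, when Elasticsearch merges segments in the background. The 60-second default (the index.gc_deletes setting) is only how long the version number of a deleted document stays available for further versioned operations.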
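A minimal sketch of the delete request in Python, assuming a local node on the default port. The document id abc123 and the helper names are hypothetical, and the request is only built, not sent. A successful delete answers with "result": "deleted" in the response body.

```python
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:9200"  # default port mentioned in the text


def delete_request(index_name: str, doc_id: str) -> urllib.request.Request:
    """Build a DELETE to /{index_name}/_doc/{id_of_document}."""
    # Percent-encode the id so characters such as '/' stay inside one segment.
    safe_id = quote(str(doc_id), safe="")
    return urllib.request.Request(
        f"{BASE_URL}/{index_name}/_doc/{safe_id}", method="DELETE"
    )


def was_deleted(response: dict) -> bool:
    """Check the parsed JSON response for "result": "deleted"."""
    return response.get("result") == "deleted"


request = delete_request("product", "abc123")
```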

Update

There are several ways to update a document in Elasticsearch. You can update the whole document, or you can update it partially.

Update the Whole Document

PUT localhost:9200/{index_name}/_doc/{id_of_document}

Update the Document Partially

POST localhost:9200/{index_name}/_update/{id_of_document}

When you update a document, Elasticsearch first marks the old document as deleted, then inserts the updated document with a newer version.
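The two update styles differ mainly in their request body: the PUT carries the complete new document, while the partial update wraps only the changed fields in a doc object. Here is a rough sketch of both, with hypothetical helper names, document id and field values; the requests are built but not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # default port mentioned in the text


def _json_request(url: str, payload: dict, method: str) -> urllib.request.Request:
    """Build a request that carries the payload as a JSON body."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )


def replace_request(index_name: str, doc_id: str, document: dict):
    """Whole-document update: the body is the complete new document."""
    return _json_request(f"{BASE_URL}/{index_name}/_doc/{doc_id}", document, "PUT")


def partial_update_request(index_name: str, doc_id: str, changes: dict):
    """Partial update: only the changed fields, wrapped in "doc"."""
    url = f"{BASE_URL}/{index_name}/_update/{doc_id}"
    return _json_request(url, {"doc": changes}, "POST")


full = replace_request(
    "product", "1", {"name": "Frying Pan", "category": "Utensils", "price": 4500}
)
partial = partial_update_request("product", "1", {"price": 4000})
```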

Conclusion

Advanced searching and grouping of data can be complicated if you only use relational database queries in MySQL, Postgres, etc. With Elasticsearch, things can be easier and you can save some time.

