Searching and Filtering with the Elasticsearch Node.js Client
ES: Elasticsearch
Prerequisites
In our previous articles on Elasticsearch, we set up an ES client and indexed documents. Now we will use that client to search, sort, paginate, and filter our indexed documents. Learning these features lets us get more out of the search and analysis capabilities of Elasticsearch in our projects. Let’s dive in and see how these functions work in action.
Search Method
This returns search hits that match the query defined in the request.
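// Thin wrapper around the client's search API.
async search({
  q: { query, sort },
  pageSize = 10,
  offset = 0,
  index,
  scroll,
}: EsSearchBody) {
  try {
    // `await` here so that a rejected request actually reaches the catch block.
    return await this.client.search({
      index,
      sort,
      query,
      scroll,
      from: offset,   // number of hits to skip
      size: pageSize, // number of hits to return
    });
  } catch (err) {
    throw err;
  }
}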
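The method above relies on an EsSearchBody type that this article does not show. Below is a minimal sketch of what it might look like, with the parameter mapping pulled out into a pure function so it can be checked without a running cluster. The field types and the toSearchParams helper are assumptions made for illustration; they are not part of the client library.

```typescript
// Hypothetical shape of the argument accepted by the search method above.
// The article does not define EsSearchBody, so these field types are assumptions.
interface EsSearchBody {
  q: {
    query: Record<string, unknown>;
    sort?: Array<Record<string, unknown>>;
  };
  pageSize?: number;
  offset?: number;
  index: string;
  scroll?: string;
}

// Illustrative helper: maps EsSearchBody onto the parameters handed to client.search().
function toSearchParams({
  q: { query, sort },
  pageSize = 10,
  offset = 0,
  index,
  scroll,
}: EsSearchBody) {
  return { index, sort, query, scroll, from: offset, size: pageSize };
}

const defaultParams = toSearchParams({
  index: "test_index",
  q: { query: { match_all: {} } },
});
console.log(defaultParams.from, defaultParams.size); // prints: 0 10
```

Keeping the mapping separate from the network call makes the defaults (offset 0, page size 10) easy to verify on their own.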
Response Structure
Running the query above returns a response similar to this:
{
  "took" : 571,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "quotes",
        "_type" : "_doc",
        "_id" : "PZkvy3ABKNkWakGIzsCc",
        "_score" : 1.0,
        "_source" : {
          "author" : "Farrah Gray",
          "content" : "Build your own dreams, or someone else will hire you to build theirs.",
          "year" : 1984
        }
      },
      {
        "_index" : "quotes",
        "_type" : "_doc",
        "_id" : "QZkwy3ABKNkWakGIbMDh",
        "_score" : 1.0,
        "_source" : {
          "author" : "Abraham Lincoln",
          "content" : "It’s not the years in your life that count. It’s the life in your year.",
          "year" : 1809
        }
      },
      {
        "_index" : "quotes",
        "_type" : "_doc",
        "_id" : "Bwb72XABdLCfAS5iYiHx",
        "_score" : 1.0,
        "_source" : {
          "author" : "Albert Einstein",
          "content" : "Strive not to be a success, but rather to be of value.",
          "year" : 1881
        }
      }
    ]
  }
}
In the response above, note the following:
- _shards property: shows how many shards were searched for our data, and how many of them succeeded, were skipped, or failed.
- hits object: contains the number of results (total.value) and how accurate that count is (total.relation).
- max_score: the highest score among the results. In this query it is equal to 1.0, but in a different query this value can be different.
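The fields described above can be read programmatically. Here is a small sketch that summarizes a response of the shape shown earlier. The interface covers only the fields used in this article, and the summarize helper is a hypothetical name, not something the client provides.

```typescript
// Minimal subset of the response shape shown above; field names follow that JSON.
interface QuoteHit {
  _id: string;
  _score: number | null;
  _source: { author: string; content: string; year: number };
}

interface QuoteSearchResult {
  timed_out: boolean;
  _shards: { total: number; successful: number; skipped: number; failed: number };
  hits: {
    total: { value: number; relation: "eq" | "gte" };
    max_score: number | null;
    hits: QuoteHit[];
  };
}

// Hypothetical helper: pulls out the fields discussed above.
function summarize(result: QuoteSearchResult) {
  return {
    total: result.hits.total.value,
    // The count is exact only when relation is "eq"; "gte" means it is a lower bound.
    totalIsExact: result.hits.total.relation === "eq",
    // Results may be partial if any shard failed or the request timed out.
    complete: result._shards.failed === 0 && !result.timed_out,
    authors: result.hits.hits.map((hit) => hit._source.author),
  };
}

const sampleResult: QuoteSearchResult = {
  timed_out: false,
  _shards: { total: 3, successful: 3, skipped: 0, failed: 0 },
  hits: {
    total: { value: 1, relation: "eq" },
    max_score: 1.0,
    hits: [
      {
        _id: "Bwb72XABdLCfAS5iYiHx",
        _score: 1.0,
        _source: {
          author: "Albert Einstein",
          content: "Strive not to be a success, but rather to be of value.",
          year: 1881,
        },
      },
    ],
  },
};

console.log(summarize(sampleResult));
```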
Example 1:
Here is how to fetch all indexed documents in an ES index using a match_all query. Note that a search returns only the first 10 hits by default; use from and size (see Pagination below) to retrieve more.
this.client.search({
  index: 'test_index',
  query: {
    match_all: {}
  }
})
Example 2:
Let’s assume that we want to show only results containing the word “success”. Instead of match_all, we use a match query on the content field. The final query looks like this:
this.client.search({
  index: 'test_index',
  query: {
    match: {
      content: "success"
    }
  }
})
Example 3:
Let’s display only the authors of quotes from 1900 or earlier that include the word “success”.
this.client.search({
  index: "test_index",
  query: {
    bool: {
      must: [
        {
          range: {
            year: {
              lte: 1900
            }
          }
        },
        {
          match: {
            content: "success"
          }
        }
      ]
    }
  },
  _source: "author"
})
Here is the explanation of the query above:
- The query parameter indicates query context.
- The bool, must, range, and match clauses are used in query context, which means that they are used to score how well each document matches the query.
- In the range query, we specify that we are looking for documents from 1900 or earlier.
- In the match query, we indicate that the result should include the word “success”.
- In the _source property, we define which fields the query should return. Here we can pass one field as a string, or an array of fields.
The result will look like this:
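To make the structure of this query easier to reuse, here is a sketch of a small builder that produces the same request for any word and cutoff year. The function name and its parameters are hypothetical; only the query shape comes from the example above.

```typescript
// Hypothetical helper: builds the Example 3 request for any word and cutoff year.
function authorsUpToYear(index: string, word: string, maxYear: number) {
  return {
    index,
    query: {
      bool: {
        // Both clauses must match, and both contribute to the score.
        must: [
          { range: { year: { lte: maxYear } } },
          { match: { content: word } },
        ],
      },
    },
    // Return only the author field of each matching document.
    _source: ["author"],
  };
}

const exampleRequest = authorsUpToYear("test_index", "success", 1900);
console.log(JSON.stringify(exampleRequest, null, 2));
// The object could then be passed to the client, e.g. this.client.search(exampleRequest)
```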
(...)
"hits" : [
  {
    "_index" : "quotes",
    "_type" : "_doc",
    "_id" : "Bwb72XABdLCfAS5iYiHx",
    "_score" : 2.0226655,
    "_source" : {
      "author" : "Albert Einstein"
    }
  },
  {
    "_index" : "quotes",
    "_type" : "_doc",
    "_id" : "CAb72XABdLCfAS5i7SGR",
    "_score" : 1.6451601,
    "_source" : {
      "author" : "Florence Nightingale"
    }
  }
]
(...)
Score
In the query results above, you can see that each result has its own _score property. It describes the relevance of the document, that is, how well the document matches the query. The higher the score, the more relevant the document. How the score is calculated depends on the query type and on whether a clause runs in query or filter context.
Here we are looking for the documents most relevant to the text “not to be a success”. A match query analyzes this text into individual terms, and by default a document matches if it contains any of them.
this.client.search({
  index: "test_index",
  query: {
    bool: {
      must: [
        {
          match: {
            content: "not to be a success"
          }
        },
      ]
    }
  },
})
By default, the most relevant documents appear first, ordered by _score descending. If the request specifies a sort, the results are ordered by that instead.
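Because raw _score values depend on the query, it can help to compare each hit against max_score from the same response. The sketch below does that. The relativeScores helper is a hypothetical name, and it treats a missing or zero max_score as "no meaningful score", which is an assumption about how you might want to handle sorted or filter-only results.

```typescript
// Only the fields needed here; names follow the response JSON shown earlier.
interface ScoredHit {
  _id: string;
  _score: number | null;
}

// Hypothetical helper: expresses each hit's score as a fraction of max_score.
function relativeScores(maxScore: number | null, hits: ScoredHit[]) {
  return hits.map((hit) => {
    const score = hit._score;
    // Scores can be null (for example when sorting by a field), and max_score
    // is 0 for filter-only queries, so guard the division.
    if (maxScore === null || maxScore <= 0 || score === null) {
      return { id: hit._id, relative: 0 };
    }
    return { id: hit._id, relative: score / maxScore };
  });
}

// Scores taken from the Example 3 result above.
const ranked = relativeScores(2.0226655, [
  { _id: "Bwb72XABdLCfAS5iYiHx", _score: 2.0226655 },
  { _id: "CAb72XABdLCfAS5i7SGR", _score: 1.6451601 },
]);
console.log(ranked);
```

The top hit always gets a relative value of 1, which makes it easier to see how far behind the other hits are within one response. These values are still not comparable across different queries.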
Pagination
When a feature has to display a large amount of data, we need a pagination mechanism. In ES it is quite simple. We use two properties:
- from: the number of hits to skip before the first returned result (the offset into the result set)
- size: the maximum number of results to return
this.client.search({
  index: 'test_index',
  query: {
    match_all: {}
  },
  from: 0,
  size: 10,
})
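User interfaces usually think in page numbers rather than offsets. Here is a minimal sketch of the conversion, assuming 1-based page numbers. The pageToRange name is hypothetical. The 10,000 limit reflects the default index.max_result_window setting, which may be configured differently on your index.

```typescript
// Default value of the index.max_result_window setting; an index may override it.
const MAX_RESULT_WINDOW = 10000;

// Hypothetical helper: converts a 1-based page number into from/size.
function pageToRange(page: number, pageSize = 10) {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be an integer >= 1");
  }
  const from = (page - 1) * pageSize;
  // With default settings, Elasticsearch rejects requests where from + size
  // exceeds the result window, so fail early with a clearer message.
  if (from + pageSize > MAX_RESULT_WINDOW) {
    throw new RangeError("page is beyond the default result window");
  }
  return { from, size: pageSize };
}

console.log(pageToRange(3)); // prints: { from: 20, size: 10 }
```

The returned object can be spread into the search request next to index and query.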
Sorting
To sort by relevance, we need to represent relevance as a value. In Elasticsearch, the relevance score is the floating-point number returned in the search results as _score, so the default sort order is _score descending.
Sometimes, though, we don’t have a meaningful relevance score, or we want to sort on a different property after getting the relevant search results.
this.client.search({
  index: 'test_index',
  query: {
    match_all: {}
  },
  // Sort by the createdAt field, oldest first.
  sort: {
    createdAt: { order: 'asc' }
  },
  from: 0,
  size: 10,
})
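When several documents share the same value in the sort field, a second sort key keeps the order meaningful. The sketch below builds a sort array that orders by a field first and falls back to relevance for ties. The sortByField helper is a hypothetical name; createdAt is the field used in the example above.

```typescript
type SortDirection = "asc" | "desc";

// Hypothetical helper: primary sort on a field, then relevance as a tie-breaker.
function sortByField(field: string, order: SortDirection = "asc") {
  // A sort array is applied in order, and "_score" sorts by relevance.
  return [{ [field]: { order } }, "_score"];
}

const sortClause = sortByField("createdAt");
console.log(JSON.stringify(sortClause));
// prints: [{"createdAt":{"order":"asc"}},"_score"]
```

The result can be passed as the sort property of the search request in place of the single object shown above.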
Filtering
In all the examples above, we saw how query context works: it measures how well a document matches the query clause.
In a filter context, we get a yes/no answer to the question “Does this document match this query clause?”
To find quotes by Albert Einstein from between 1800 and 1900, we can use the filtered query below:
this.client.search({
  index: 'test_index',
  query: {
    bool: {
      filter: [
        {
          match: { author: "Albert" }
        },
        {
          range: {
            year: {
              gte: 1800,
              lte: 1900
            }
          }
        }
      ]
    }
  },
  from: 0,
  size: 10,
})
Now the score in the response is 0.0, and the result looks like this:
"max_score" : 0.0,
"hits" : [
  {
    "_index" : "quotes",
    "_type" : "_doc",
    "_id" : "Bwb72XABdLCfAS5iYiHx",
    "_score" : 0.0,
    "_source" : {
      "author" : "Albert Einstein",
      "content" : "Strive not to be a success, but rather to be of value.",
      "year" : 1881
    }
  }
]
A benefit of filter context is that Elasticsearch does not need to calculate scores and can cache frequently used filters in the node query cache, which improves performance.
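Query context and filter context can also be combined in one bool query: clauses that should affect ranking go in must, and pure yes/no conditions go in filter. Here is a minimal sketch using the fields from this article. The function name and parameters are hypothetical.

```typescript
// Hypothetical helper: scores on the text match, filters on the year range.
function scoredWithinYears(word: string, fromYear: number, toYear: number) {
  return {
    bool: {
      // Query context: contributes to _score.
      must: [{ match: { content: word } }],
      // Filter context: only includes or excludes documents, no scoring.
      filter: [{ range: { year: { gte: fromYear, lte: toYear } } }],
    },
  };
}

const combinedQuery = scoredWithinYears("success", 1800, 1900);
console.log(JSON.stringify(combinedQuery, null, 2));
// Could be used as: this.client.search({ index: "test_index", query: combinedQuery })
```

With this shape the hits keep a meaningful _score from the match clause, while the year range only narrows the result set.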
I hope you enjoyed reading about Elasticsearch and how it can be used to optimize search and analysis in various applications. If you found this article helpful or have any further questions, please don’t hesitate to reach out to me through the comments.
For more updates and insights on the latest tech trends, be sure to follow me on Twitter or LinkedIn. Thanks for reading, and I look forward to connecting with you on social media.
Twitter: https://twitter.com/geekfarmer_
LinkedIn: https://www.linkedin.com/in/geekfarmer