Elasticsearch Tutorial Part 4: Searching through Index

Abhishek Bairagi
4 min read · Dec 29, 2023

If you have followed the previous posts in this tutorial series, you already know what an Elasticsearch index is and how to ingest documents into it. You should also have an Elasticsearch index named ‘library’ with data for 5 books ingested in it. In this post, we will see how to search through that index.

Search Query

Just as we sent PUT and POST requests to Elasticsearch with an appropriate body to create the index and ingest the documents, we will send a GET request to the URL localhost:9200/index_name/_search, with a query in the body, to search through an Elasticsearch index.

Here is one example of a search query:

GET http://localhost:9200/library/_search
{
  "query": {
    "match": {
      "title": "The Catcher in the Rye"
    }
  }
}
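If you would rather send this request from a script than from a REST client, here is a minimal Python sketch that uses only the standard library. The helper name build_search_request is made up for illustration, and the code assumes Elasticsearch is running locally at localhost:9200 without authentication. It uses POST instead of GET; the _search endpoint accepts a request body with either method.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local, unsecured Elasticsearch


def build_search_request(index, query):
    """Build (but do not send) a _search request for the given index."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{ES_URL}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # _search accepts a body on both GET and POST
    )


request = build_search_request(
    "library", {"match": {"title": "The Catcher in the Rye"}}
)

# Sending needs a running Elasticsearch, so it is left commented out here:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything goes over the wire.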

The realm of search queries in Elasticsearch is vast and encompasses many components, but don't fear the complexity: we’ll focus on the practical, frequently used ones.

But before we dive into that, let’s quickly recap our journey thus far. We have an index named ‘library’ with fields title, author, description, published_date, and published_url, and within the index, we have data for 5 books.

Let’s get started with queries now!!

Match Query

Imagine you have a snippet of information about any one of these fields (perhaps an author’s name or the title of a book) and you wish to search the index for relevant results. How do we accomplish this? Enter the ‘match’ query. True to its name, the match query compares the provided text with the values of the corresponding field in the ingested documents and returns the matching documents, ranked by relevance.

Here is one example:

GET http://localhost:9200/library/_search
{
  "query": {
    "match": {
      "author": "J.D. Salinger"
    }
  }
}

You will get a result like this:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 2.5226097,
    "hits": [
      {
        "_index": "library",
        "_type": "_doc",
        "_id": "nyZxtowBZ-ufcDY0rlCB",
        "_score": 2.5226097,
        "_source": {
          "title": "The Catcher in the Rye",
          "author": "J.D. Salinger",
          "description": "A classic novel about teenage angst.",
          "published_date": "1951-07-16",
          "url": "https://example.com/book/catcher-in-the-rye"
        }
      }
    ]
  }
}
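The parts of this response you will usually care about are hits.total.value (how many documents matched) and the hits.hits list, where each entry carries a relevance _score and the original document under _source. Here is a small Python sketch of pulling those out. The function name summarize_hits is hypothetical, and the sample dictionary is a trimmed copy of the response above.

```python
def summarize_hits(response):
    """Return (total, [(score, title, author), ...]) from a _search response."""
    hits = response["hits"]
    rows = [
        (hit["_score"], hit["_source"]["title"], hit["_source"]["author"])
        for hit in hits["hits"]
    ]
    return hits["total"]["value"], rows


# Trimmed copy of the response shown above.
sample_response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "max_score": 2.5226097,
        "hits": [
            {
                "_index": "library",
                "_id": "nyZxtowBZ-ufcDY0rlCB",
                "_score": 2.5226097,
                "_source": {
                    "title": "The Catcher in the Rye",
                    "author": "J.D. Salinger",
                },
            }
        ],
    }
}

total, rows = summarize_hits(sample_response)
print(total, rows)
```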

Multi-Match Query

Now, let’s explore the scenario where you want to search multiple fields simultaneously. This is where the multi-match query becomes a valuable tool. Without further ado, let’s delve into an example:

GET http://localhost:9200/library/_search
{
  "query": {
    "multi_match": {
      "query": "dystopian future",
      "fields": ["title", "author", "description"]
    }
  }
}

However, the capabilities of the multi-match query don’t end there. You have the flexibility to assign different weights to individual fields. For instance, if you want to prioritize documents where the title closely matches the query, you can assign a higher weight to the “title” field by appending a caret (^) followed by the desired weight. Here’s an example:

GET http://localhost:9200/library/_search
{
  "query": {
    "multi_match": {
      "query": "dystopian future",
      "fields": ["title^2", "author", "description"]
    }
  }
}

By default the weight for each field is 1.
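If you build queries in code, writing the caret suffixes by hand gets tedious. Here is a small Python sketch that generates the fields list from a mapping of field names to weights. The helper name multi_match and its parameters are made up for illustration; it leaves the suffix off when the weight is 1, since that is the default anyway.

```python
def multi_match(query, field_weights):
    """Build a multi_match clause from a {field: weight} mapping."""
    fields = [
        field if weight == 1 else f"{field}^{weight}"
        for field, weight in field_weights.items()
    ]
    return {"multi_match": {"query": query, "fields": fields}}


clause = multi_match(
    "dystopian future", {"title": 2, "author": 1, "description": 1}
)
print(clause)
```

The resulting clause goes under the "query" key of the request body, just like the hand-written version.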

Filter Query

Let’s delve into the Filter Query. While the multi-match query is excellent for searching and ranking documents by relevance, a filter only decides whether a document is included: a document is part of the result set only if it satisfies every condition specified in the filter.

Because filter clauses do not compute relevance scores, and Elasticsearch can cache frequently used filters, this can noticeably improve performance when all you need is a yes/no condition.

Here’s an example:

GET http://localhost:9200/library/_search
{
  "query": {
    "bool": {
      "filter": [
        { "match": { "author": "J.D. Salinger" }},
        { "range": { "published_date": { "gte": "1950-01-01" }}}
      ]
    }
  }
}

In this example, the filter query is embedded within a “bool” query, which allows for the combination of multiple filter conditions. Here, we’re filtering documents where the author is “J.D. Salinger” and the published date is on or after January 1, 1950.
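To reuse this filter with different authors and dates, you could wrap it in a function. The sketch below is one way to do that in Python. The name author_since is hypothetical, and the date check is an extra safeguard added here; it is not something Elasticsearch requires on the client side.

```python
from datetime import date


def author_since(author, earliest):
    """Build a filter-only bool query: author match plus a date lower bound."""
    date.fromisoformat(earliest)  # raises ValueError if not a valid ISO date
    return {
        "bool": {
            "filter": [
                {"match": {"author": author}},
                {"range": {"published_date": {"gte": earliest}}},
            ]
        }
    }


query = author_since("J.D. Salinger", "1950-01-01")
print(query)
```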

Bool Query

The Bool Query is a versatile and powerful tool that allows you to combine multiple query clauses to construct more complex searches. Its two most commonly used clauses are “must” for mandatory conditions and “should” for optional conditions; it also accepts the “filter” clause we just saw and a “must_not” clause for excluding documents. This makes it a go-to choice when you need precise control over your search logic.

Understanding the Bool Query Components:

1. Must Clause: Conditions specified in the “must” clause are mandatory for a document to be considered a match.

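2. Should Clause: Conditions within the “should” clause are optional. A document does not have to match them, but each one it does match raises its relevance score. (If a bool query has no “must” or “filter” clause, at least one “should” condition has to match.)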

Let’s look at some examples:

Example 1: Combining two match queries

{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "Elasticsearch" }},
        { "match": { "author": "John Doe" }}
      ]
    }
  }
}

Example 2: Combining match with filter

{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "Elasticsearch" }}
      ],
      "filter": [
        { "range": { "published_date": { "gte": "2022-01-01" }}}
      ]
    }
  }
}

Example 3: Combining must and should

{
  "query": {
    "bool": {
      "should": [
        { "multi_match": { "query": "OpenAI", "fields": ["title", "description"] }},
        { "multi_match": { "query": "GPT-3.5", "fields": ["title", "author"] }}
      ],
      "must": [
        { "match": { "author": "John Doe" }}
      ]
    }
  }
}
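All of these examples share the same shape: a “bool” object holding one list of conditions per clause. If you assemble queries in code, a tiny helper can build that shape and leave out the clauses you don’t use. This is only a sketch; the function name bool_query and the parameter name filters are made up for illustration.

```python
def bool_query(must=None, should=None, filters=None):
    """Build a bool query, leaving out any clause that has no conditions."""
    clauses = {"must": must, "should": should, "filter": filters}
    return {"bool": {name: conds for name, conds in clauses.items() if conds}}


# Rebuilds the must + filter query from the examples above.
query = bool_query(
    must=[{"match": {"title": "Elasticsearch"}}],
    filters=[{"range": {"published_date": {"gte": "2022-01-01"}}}],
)
print(query)
```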

Here’s an assignment for you: try writing a bool query that combines should, must, and filter. You can check the answer here.

I think this is enough about search for one day. We will continue our learning in the next blog.

Abhishek Bairagi

NLP Data Scientist. Crafting solutions, exploring possibilities. 🚀✨