Terms set query in Elasticsearch

Sagar Patel
3 min read · Oct 13, 2022

What is the terms set query?

The terms set query returns documents that contain at least a minimum number of exact terms in a given field.

How is the terms set query different from the terms query?

The only difference is that the terms set query lets you define the minimum number of terms that must match for a document to be returned, whereas the terms query returns a document as soon as one of the terms matches.
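
To make the difference concrete, here is a minimal Python sketch of the two matching rules. It only models the behaviour described above and is not how Elasticsearch implements it; the function names terms_match and terms_set_match are hypothetical, and a document is reduced to a plain list of field values.

```python
def terms_match(field_values, query_terms):
    """terms query rule: match if at least one query term is present."""
    return any(term in field_values for term in query_terms)


def terms_set_match(field_values, query_terms, minimum_should_match):
    """terms_set rule: match if enough distinct query terms are present."""
    matched = sum(1 for term in set(query_terms) if term in field_values)
    return matched >= minimum_should_match


tags = ["apple", "mobile"]
print(terms_match(tags, ["apple", "iphone"]))         # True: one common term is enough
print(terms_set_match(tags, ["apple", "iphone"], 2))  # False: only 1 of 2 required terms
print(terms_set_match(tags, ["apple", "mobile"], 2))  # True: both required terms present
```

The only extra input to the second rule is the required count, which is exactly what the two minimum_should_match parameters supply.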

What is the minimum_should_match_field parameter?

The name of a numeric field in the document whose value is used as the minimum number of terms that must match for the document to be returned.

What is the minimum_should_match_script parameter?

A custom script that determines the minimum number of terms that must match for a document to be returned. It is helpful when the required number of matching terms has to be set dynamically.

Example of minimum_should_match_field

Let’s first create the index:

PUT product
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "tags": {
        "type": "keyword"
      },
      "tags_count": {
        "type": "long"
      }
    }
  }
}

Index sample documents:

POST product/_doc/prod1
{
  "name": "Iphone 13",
  "tags": ["apple", "iphone", "mobile"],
  "tags_count": 3
}

POST product/_doc/prod2
{
  "name": "Iphone 12",
  "tags": ["apple", "iphone"],
  "tags_count": 2
}

POST product/_doc/prod3
{
  "name": "Iphone 11",
  "tags": ["apple", "mobile"],
  "tags_count": 2
}

Query with minimum_should_match_field parameter:

Use case 1: The query below returns all 3 documents. Three terms are passed in the query, and each document matches at least as many of them as its tags_count requires: 3 for prod1, 2 for prod2 and 2 for prod3.

POST product/_search
{
  "query": {
    "terms_set": {
      "tags": {
        "terms": [ "apple", "iphone", "mobile" ],
        "minimum_should_match_field": "tags_count"
      }
    }
  }
}

Use case 1 result:

"hits": [
  {
    "_index": "product",
    "_id": "prod1",
    "_score": 1.4010588,
    "_source": {
      "name": "Iphone 13",
      "tags": [
        "apple",
        "iphone",
        "mobile"
      ],
      "tags_count": 3
    }
  },
  {
    "_index": "product",
    "_id": "prod2",
    "_score": 0.7876643,
    "_source": {
      "name": "Iphone 12",
      "tags": [
        "apple",
        "iphone"
      ],
      "tags_count": 2
    }
  },
  {
    "_index": "product",
    "_id": "prod3",
    "_score": 0.7876643,
    "_source": {
      "name": "Iphone 11",
      "tags": [
        "apple",
        "mobile"
      ],
      "tags_count": 2
    }
  }
]

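Use case 2: The query below returns only one document, prod3. Only 2 terms are passed in the query, and both of them match prod3, whose tags_count is 2. prod1 is not returned because its tags_count is 3 while only 2 terms are passed in the query. prod2 is not returned because only one of the query terms (apple) matches its tags while its tags_count is 2.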

POST product/_search
{
  "query": {
    "terms_set": {
      "tags": {
        "terms": [ "apple", "mobile" ],
        "minimum_should_match_field": "tags_count"
      }
    }
  }
}

Use case 2 result:

"hits": [
  {
    "_index": "product",
    "_id": "prod3",
    "_score": 0.7876643,
    "_source": {
      "name": "Iphone 11",
      "tags": [
        "apple",
        "mobile"
      ],
      "tags_count": 2
    }
  }
]
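
The reasoning behind both use cases can be checked with a small Python sketch that applies the minimum_should_match_field rule to the three sample documents in memory. This is only a model of the matching rule, not Elasticsearch code: the helper search_with_min_field is a hypothetical name, and scoring is ignored.

```python
# The three sample documents from the product index.
PRODUCTS = {
    "prod1": {"tags": ["apple", "iphone", "mobile"], "tags_count": 3},
    "prod2": {"tags": ["apple", "iphone"], "tags_count": 2},
    "prod3": {"tags": ["apple", "mobile"], "tags_count": 2},
}


def search_with_min_field(docs, field, terms, min_field):
    """Return ids of documents whose matched-term count reaches doc[min_field]."""
    hits = []
    for doc_id, doc in docs.items():
        matched = len(set(terms) & set(doc[field]))
        if matched >= doc[min_field]:
            hits.append(doc_id)
    return hits


# Use case 1: three terms in the query.
print(search_with_min_field(PRODUCTS, "tags", ["apple", "iphone", "mobile"], "tags_count"))
# ['prod1', 'prod2', 'prod3']

# Use case 2: two terms in the query.
print(search_with_min_field(PRODUCTS, "tags", ["apple", "mobile"], "tags_count"))
# ['prod3']
```

In use case 2, prod1 matches 2 terms but needs 3, and prod2 matches 1 term but needs 2, so only prod3 remains.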

Example of minimum_should_match_script:

Let’s now look at how the same index data can be retrieved using a dynamic value for minimum should match.

In the example below, the total number of terms supplied in the query is used as the minimum should match value. We use params.num_terms, which holds the number of terms provided in the terms field. The required number of terms to match cannot exceed params.num_terms.

POST product/_search
{
  "query": {
    "terms_set": {
      "tags": {
        "terms": [ "apple", "iphone" ],
        "minimum_should_match_script": {
          "source": "params.num_terms"
        }
      }
    }
  }
}

Response:

The query returns prod1 and prod2. The minimum should match value is set to 2 because we passed only 2 terms in the query, and both documents contain apple and iphone.

"hits": [
  {
    "_index": "product",
    "_id": "prod1",
    "_score": 0.7876643,
    "_source": {
      "name": "Iphone 13",
      "tags": [
        "apple",
        "iphone",
        "mobile"
      ],
      "tags_count": 3
    }
  },
  {
    "_index": "product",
    "_id": "prod2",
    "_score": 0.7876643,
    "_source": {
      "name": "Iphone 12",
      "tags": [
        "apple",
        "iphone"
      ],
      "tags_count": 2
    }
  }
]

Now consider a scenario where the required number of matches should be the smaller of tags_count and the number of terms in the query. In that case, the following query is helpful:

POST product/_search
{
  "query": {
    "terms_set": {
      "tags": {
        "terms": [ "apple", "iphone" ],
        "minimum_should_match_script": {
          "source": "Math.min(params.num_terms, doc['tags_count'].value)"
        }
      }
    }
  }
}
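
To see how the two scripts behave on the sample data, here is a rough Python model of minimum_should_match_script. It is a sketch under simplifying assumptions, not Elasticsearch or Painless code: the helper search_with_min_script is hypothetical, and each script is represented by a Python function that receives the number of query terms (standing in for params.num_terms) and the document.

```python
# The three sample documents from the product index.
PRODUCTS = {
    "prod1": {"tags": ["apple", "iphone", "mobile"], "tags_count": 3},
    "prod2": {"tags": ["apple", "iphone"], "tags_count": 2},
    "prod3": {"tags": ["apple", "mobile"], "tags_count": 2},
}


def search_with_min_script(docs, field, terms, script):
    """script(num_terms, doc) stands in for minimum_should_match_script."""
    num_terms = len(terms)
    hits = []
    for doc_id, doc in docs.items():
        required = script(num_terms, doc)
        matched = len(set(terms) & set(doc[field]))
        if matched >= required:
            hits.append(doc_id)
    return hits


def all_terms(num_terms, doc):
    # Models "params.num_terms": every query term must match.
    return num_terms


def capped(num_terms, doc):
    # Models "Math.min(params.num_terms, doc['tags_count'].value)".
    return min(num_terms, doc["tags_count"])


two = ["apple", "iphone"]
print(search_with_min_script(PRODUCTS, "tags", two, all_terms))  # ['prod1', 'prod2']
print(search_with_min_script(PRODUCTS, "tags", two, capped))     # ['prod1', 'prod2']

three = ["apple", "iphone", "mobile"]
print(search_with_min_script(PRODUCTS, "tags", three, all_terms))  # ['prod1']
print(search_with_min_script(PRODUCTS, "tags", three, capped))     # ['prod1', 'prod2', 'prod3']
```

With two query terms both scripts select the same documents in this model. The difference shows up with three terms: requiring all of them keeps only prod1, while capping the requirement at tags_count lets prod2 and prod3 match with 2 terms each.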

Terms set query Elasticsearch Java client

The code below is useful for implementing the terms set query with the Elasticsearch Java client.

Using new Java API Client

Using Java High Level Client (Deprecated)
