Elasticsearch sort by manual/input order

Elasticsearch (ES) is a powerful open-source search engine. You can host ES yourself or use the AWS-managed ES service, which is easier to configure.

Our customer asked for search results to be returned in manual order. Manual order here means that if the search query specifies a set of ids in a particular order, the results come back in that same order. Our search functionality is built on Elasticsearch. While investigating how to do this in ES, I found that the sort order can be customized with script-based sorting. It was tricky, but I got it working after a few attempts. Let me share my experience.

What is script-based sorting?

Manual order is not something Elasticsearch supports by default, so we need to implement it with script-based sorting. ES lets us write custom scripts that define our own sorting logic. It supports several scripting languages; we chose Painless because it is easy to use and is the language recommended by ES.
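To make the idea concrete, the ordering rule can be sketched in plain Python. This is only an illustration of the logic, not Elasticsearch code: the function name manual_order, the sample documents and the short ids are assumptions made for this example, and the fallback rank simply mirrors the 100000 used in the ES query later in this post.

```python
# Plain-Python illustration of the ordering rule; this is not Elasticsearch code.
FALLBACK_RANK = 100000  # rank given to ids that are missing from the requested order


def manual_order(documents, ordered_ids):
    """Return documents sorted to follow ordered_ids; unlisted ids go last."""
    # Map each requested id to its position: first id -> 0, second -> 1, ...
    rank = {doc_id: position for position, doc_id in enumerate(ordered_ids)}
    return sorted(documents, key=lambda doc: rank.get(doc["id"], FALLBACK_RANK))


# Hypothetical sample data, stored in an arbitrary order.
videos = [
    {"id": "c", "name": "third"},
    {"id": "x", "name": "not requested"},
    {"id": "a", "name": "first"},
    {"id": "b", "name": "second"},
]

ordered = manual_order(videos, ["a", "b", "c"])
print([doc["id"] for doc in ordered])  # prints ['a', 'b', 'c', 'x']
```

The script we hand to ES does the same thing: it looks up a rank for each document and falls back to a large number when the id is not in the map.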

Technical Problem statement

ES Version: 5.5

ES Index mapping:

{
  "my_videos": {
    "mappings": {
      "doc": {
        "properties": {
          "createdAt": {
            "type": "date"
          },
          "description": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "duration": {
            "type": "long"
          },
          "id": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "lastUpdatedByUserId": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "status": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "tags": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "updatedAt": {
            "type": "date"
          },
          "url": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

We want to apply manual sorting on the id field. Let us see how to do this in ES using script-based sorting.

Let's dive into the implementation

ES Query

GET my_videos/doc/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "id.keyword": "01cm7kr0px0tmyzkmsjb55xd3a"
          }
        },
        {
          "match": {
            "id.keyword": "01cktwwyfnyt9d2nqj9ycwxcme"
          }
        },
        {
          "match": {
            "id.keyword": "01chyvzv678r1h0y0rx4e4bv8t"
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "sort": [
    {
      "_script": {
        "type": "number",
        "script": {
          "lang": "painless",
          "inline": "if(params.scores.containsKey(doc['id.keyword'].value)) { return params.scores[doc['id.keyword'].value];} return 100000;",
          "params": {
            "scores": {
              "01cm7kr0px0tmyzkmsjb55xd3a": 0,
              "01cktwwyfnyt9d2nqj9ycwxcme": 1,
              "01chyvzv678r1h0y0rx4e4bv8t": 2
            }
          }
        },
        "order": "asc"
      }
    }
  ],
  "from": 0,
  "size": 10
}

The above query returns the results in the order given by the scores dictionary/map in the _script portion of the query's sort section.

The search query looks complicated, but don't worry: once I explain it, it is easy to understand.

The query section matches the given ids against the existing videos in ES. Beyond that, you don't need to understand more about this section.

The sort section is the one we need to understand in detail, as it is the main focus of this blog. To use script-based sorting, we use _script as the identifier that tells ES to run a script while sorting the results. The type field in the _script portion indicates the type of the value the script returns; we return integer scores here, so we use number. The order field is self-explanatory: it is the direction in which the sort is applied. The script field contains lang, which indicates the script language; params, which holds the data we want to pass to the script; and inline, which contains the actual code that is executed on each document to compute its score.

Make sure to put validations inside the code block, otherwise ES will report failures. Here the script checks containsKey before the lookup and returns 100000 for ids that are not in the map, so those documents sort last. So essentially we hardcode the score for each id in params.scores and use it inside the inline code block to return the score for the matching id of each document. params.scores can be renamed to anything we want, such as params.scoring.
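Since the scores map is just the list of ids with their positions, the whole request body can be generated from an ordered list of ids instead of being written by hand. The following is a minimal sketch in Python, not part of the original implementation: the function name build_manual_sort_body and its script_key and size parameters are assumptions made for this example, while the Painless code is the same as in the query above.

```python
import json

# Same Painless code as in the query above.
SCRIPT_CODE = (
    "if(params.scores.containsKey(doc['id.keyword'].value)) "
    "{ return params.scores[doc['id.keyword'].value];} return 100000;"
)


def build_manual_sort_body(ordered_ids, script_key="inline", size=10):
    """Build a search body whose results follow the order of ordered_ids.

    Hypothetical helper: script_key is "inline" for ES 5.5 as used in this post.
    """
    # One match clause per id, as in the handwritten query.
    should = [{"match": {"id.keyword": doc_id}} for doc_id in ordered_ids]
    # Position in the input list becomes the score: first id -> 0, second -> 1, ...
    scores = {doc_id: position for position, doc_id in enumerate(ordered_ids)}
    return {
        "query": {"bool": {"should": should, "minimum_should_match": 1}},
        "sort": [
            {
                "_script": {
                    "type": "number",
                    "script": {
                        "lang": "painless",
                        script_key: SCRIPT_CODE,
                        "params": {"scores": scores},
                    },
                    "order": "asc",
                }
            }
        ],
        "from": 0,
        "size": size,
    }


body = build_manual_sort_body(
    [
        "01cm7kr0px0tmyzkmsjb55xd3a",
        "01cktwwyfnyt9d2nqj9ycwxcme",
        "01chyvzv678r1h0y0rx4e4bv8t",
    ]
)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the GET my_videos/doc/_search request shown earlier. Passing the ids through params rather than pasting them into the script text keeps the script itself identical between requests.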

Script-based sorting is not limited to this use case; it can be used for various other use cases as well.

We can also include multiple _script portions inside the sort section, in the following way.

{
  "sort": [
    {
      "_script": {
        ...
      }
    },
    {
      "_script": {
        ...
      }
    }
  ]
}
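ES applies the entries of the sort array in order, so a second _script entry only decides between documents that get the same value from the first one. As a rough sketch of how two entries could be put together, here is a small Python example. The helper script_sort_clause and the second script, which sorts by the duration field from the mapping, are hypothetical and only serve as an illustration.

```python
def script_sort_clause(code, params=None, order="asc", script_key="inline"):
    """Build one _script entry for the sort array (hypothetical helper)."""
    script = {"lang": "painless", script_key: code}
    if params:
        script["params"] = params
    return {"_script": {"type": "number", "script": script, "order": order}}


# First entry: the manual rank from this post.
manual_rank = script_sort_clause(
    "if(params.scores.containsKey(doc['id.keyword'].value)) "
    "{ return params.scores[doc['id.keyword'].value];} return 100000;",
    params={
        "scores": {
            "01cm7kr0px0tmyzkmsjb55xd3a": 0,
            "01cktwwyfnyt9d2nqj9ycwxcme": 1,
        }
    },
)

# Second entry (illustrative assumption): among documents with the same rank,
# put shorter videos first by returning the duration field.
by_duration = script_sort_clause("doc['duration'].value")

sort_section = {"sort": [manual_rank, by_duration]}
```

With this sort section, documents whose ids are not in the scores map all receive 100000 from the first script and are then ordered among themselves by duration.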

Conclusion

We have seen how to use script-based sorting in Elasticsearch and, most importantly, how to implement manual sort in ES. Please note that in the latest versions of ES the inline field is replaced with source. Try this and let me know if you face any issues implementing it.
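If the same code has to run against both older and newer clusters, the inline/source difference can be isolated in one place. The following is a small sketch, not a tested compatibility layer: the function names are made up for this example, and the cut-off at major version 6 is an assumption, so please check the release notes of the version you are running.

```python
def script_body_key(es_major_version):
    """Return the field name that holds the script code.

    Assumption for this sketch: "source" from major version 6 onwards,
    "inline" before that (this post uses ES 5.5).
    """
    return "source" if es_major_version >= 6 else "inline"


def painless_script(code, params, es_major_version):
    """Build the script object of a _script sort entry (hypothetical helper)."""
    return {
        "lang": "painless",
        script_body_key(es_major_version): code,
        "params": params,
    }


old_style = painless_script("return params.scores.size();", {"scores": {}}, 5)
new_style = painless_script("return params.scores.size();", {"scores": {}}, 7)
print(sorted(old_style))  # prints ['inline', 'lang', 'params']
print(sorted(new_style))  # prints ['lang', 'params', 'source']
```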