Elasticsearch: Matching documents having a field of multiple choices
Elasticsearch treats single values and arrays of values in the same way (a feature that is useful for things like aliases), so a query on a multiple-choice field does not always return the results you would intuitively expect. In this post, I’d like to introduce two solutions to this problem using Elasticsearch 5.5.
TL;DR
- Bool Query with “must” and “must_not” occurrence types
- Script Query with “painless”, the built-in general-purpose scripting language
The Mapping and Documents
Before jumping into the weird world of queries, let’s prepare a mapping and some sample data. Suppose you want to index survey results that include a multiple-choice field with the options “A”, “B”, and “C”.
Here is a mapping for the answers. Using the “keyword” type for the “choice” property is important: it disables the analysis used for full-text search and reduces the size of the index.
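The mapping itself is not reproduced here, so the following is only a minimal sketch of what it might look like, written as a Python dict that serializes to the JSON body of an index-creation request. The index name `survey` and the mapping type name `answer` are assumptions made for illustration; only the `choice` property and its `keyword` type come from the text.

```python
import json

# Hypothetical names: the post does not state the index or type name.
INDEX_NAME = "survey"
TYPE_NAME = "answer"

# In Elasticsearch 5.x, properties are nested under a mapping type name.
mapping_body = {
    "mappings": {
        TYPE_NAME: {
            "properties": {
                # "keyword" indexes each value as one un-analyzed term.
                "choice": {"type": "keyword"}
            }
        }
    }
}

# JSON body for a request such as: PUT /survey
print(json.dumps(mapping_body, indent=2))
```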
Now we can insert documents. Elasticsearch accepts an array as a field value and, for search purposes, treats it like a “Set” data structure (the order of the values is ignored). This makes it a good fit for representing multiple-choice data.
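As a rough sketch, the sample documents could be loaded with a single bulk request. The document ids “1” to “3” are the ones discussed in this post, but the exact choices in each document are my assumption, picked to be consistent with the results described later. The index and type names and the helper `to_bulk_lines` are hypothetical.

```python
import json

# Assumed sample data: the choices per document are a guess that
# fits the query results described in the post.
documents = {
    "1": {"choice": ["A", "B", "C"]},
    "2": {"choice": ["A", "B"]},
    "3": {"choice": ["B"]},
}

def to_bulk_lines(docs, index="survey", doc_type="answer"):
    """Build the newline-delimited body expected by the _bulk endpoint."""
    lines = []
    for doc_id, source in sorted(docs.items()):
        # Each document needs an action line followed by its source line.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # A bulk body must end with a newline character.
    return "\n".join(lines) + "\n"

bulk_body = to_bulk_lines(documents)
print(bulk_body)
```

The resulting string is meant to be sent as the body of a `POST /_bulk` request.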
The Problem
This query matches all three documents, “1”, “2”, and “3”, but the result I want is only “3”.
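To make the problem concrete, here is a sketch of the kind of query that behaves this way: a plain term query on “B”. Both the query and the sample documents are assumptions on my part. The small `term_matches` function is only a local emulation of the rule that a term query on a multi-valued field matches when any one of the values equals the term; it does not talk to Elasticsearch.

```python
import json

# Assumed query: a plain term query on the choice "B".
term_query = {"query": {"term": {"choice": "B"}}}
print(json.dumps(term_query))

# Assumed sample documents; the choices are a guess consistent with the post.
documents = {
    "1": {"choice": ["A", "B", "C"]},
    "2": {"choice": ["A", "B"]},
    "3": {"choice": ["B"]},
}

def term_matches(values, term):
    """Local emulation: the field matches if any of its values equals the term."""
    return term in values

matched = sorted(
    doc_id for doc_id, doc in documents.items()
    if term_matches(doc["choice"], "B")
)
print(matched)  # ['1', '2', '3']
```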
Let’s see how to build a query that returns the results we want.
Solution #1: Bool Query
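One way to express “only B” with a Bool Query is to require “B” in a “must” clause and exclude every other option in a “must_not” clause. The sketch below assumes the options are “A”, “B”, and “C” as in the survey example; the sample documents and the `bool_matches` emulation are illustrative assumptions, not output from a real cluster.

```python
import json

# Require "B" and reject any document that also contains "A" or "C".
bool_query = {
    "query": {
        "bool": {
            "must": [{"term": {"choice": "B"}}],
            "must_not": [{"terms": {"choice": ["A", "C"]}}],
        }
    }
}
print(json.dumps(bool_query, indent=2))

# Assumed sample documents; the choices are a guess consistent with the post.
documents = {
    "1": {"choice": ["A", "B", "C"]},
    "2": {"choice": ["A", "B"]},
    "3": {"choice": ["B"]},
}

def bool_matches(values, must, must_not):
    """Local emulation of the must / must_not logic on one field."""
    has_all_required = all(term in values for term in must)
    has_excluded = any(term in values for term in must_not)
    return has_all_required and not has_excluded

matched = sorted(
    doc_id for doc_id, doc in documents.items()
    if bool_matches(doc["choice"], must=["B"], must_not=["A", "C"])
)
print(matched)  # ['3']
```

The drawback of this approach is that every unwanted option has to be listed explicitly, so the query grows with the number of choices.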
Solution #2: Script Query
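With a Script Query, the idea is to match “B” and additionally check how many values the field holds. The sketch below is an assumption about how such a query could be written for Elasticsearch 5.5: it combines a term query with a “painless” script in the filter clause. I use the `inline` key for the script text because that is the name used in 5.5 (later releases call it `source`). The sample documents and `script_matches` are a local emulation only.

```python
import json

# Match "B", then keep only documents whose field holds exactly one value.
script_query = {
    "query": {
        "bool": {
            "must": [{"term": {"choice": "B"}}],
            "filter": {
                "script": {
                    "script": {
                        "lang": "painless",
                        # Number of values stored for the field in this document.
                        "inline": "doc['choice'].values.length == 1",
                    }
                }
            },
        }
    }
}
print(json.dumps(script_query, indent=2))

# Assumed sample documents; the choices are a guess consistent with the post.
documents = {
    "1": {"choice": ["A", "B", "C"]},
    "2": {"choice": ["A", "B"]},
    "3": {"choice": ["B"]},
}

def script_matches(values, term, length):
    """Local emulation: term is present and the field has `length` distinct values."""
    return term in values and len(set(values)) == length

matched = sorted(
    doc_id for doc_id, doc in documents.items()
    if script_matches(doc["choice"], "B", 1)
)
print(matched)  # ['3']
```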
If you want to match answers that have two choices, one of which is “B”, change the length condition from 1 to 2.
This query will match only “2”.
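Since only the number changes between the two variants, one option is to pass the expected length as a script parameter instead of editing the script text. The helper `choice_count_query` below is hypothetical, as are the sample documents; it is a sketch under those assumptions rather than the exact query used in this post.

```python
import json

def choice_count_query(choice, count):
    """Build a query for documents that contain `choice` and hold `count` values."""
    return {
        "query": {
            "bool": {
                "must": [{"term": {"choice": choice}}],
                "filter": {
                    "script": {
                        "script": {
                            "lang": "painless",
                            # The expected length is read from params.
                            "inline": "doc['choice'].values.length == params.count",
                            "params": {"count": count},
                        }
                    }
                },
            }
        }
    }

two_choice_query = choice_count_query("B", 2)
print(json.dumps(two_choice_query, indent=2))

# Assumed sample documents; the choices are a guess consistent with the post.
documents = {
    "1": {"choice": ["A", "B", "C"]},
    "2": {"choice": ["A", "B"]},
    "3": {"choice": ["B"]},
}

# Local emulation of the same condition, not a call to Elasticsearch.
matched = sorted(
    doc_id for doc_id, doc in documents.items()
    if "B" in doc["choice"] and len(set(doc["choice"])) == 2
)
print(matched)  # ['2']
```

Keeping the script text constant and varying only `params` also lets Elasticsearch reuse the compiled script across requests.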
Conclusion
Both Bool Query and Script Query can be used to build a query that matches documents on a multiple-choice field. Script Query is more flexible than Bool Query, but it can cause performance problems when there are a large number of documents, because the script has to be evaluated for each candidate document.
