A Simple Explanation of ElasticSearch Query DSL
ElasticSearch is a distributed search engine developed on top of Lucene. It provides a number of search and analysis capabilities on your data over a RESTful interface.
The query DSL for ElasticSearch is a bit different to Lucene and SQL and may take a little getting used to, but it is quite intuitive once you get the hang of it.
Single clause query
ElasticSearch queries are submitted as formatted JSON in the body of an HTTP request. A simple query looks like this:
POST departments/employees/_search
{
  "query": {
    "match": { "name": "John" }
  }
}
The query itself always sits under the top-level query field, with the query type you want to execute one level under it. In this case, we are executing a match query, which will find all the documents whose name field contains the term John.
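As a minimal sketch, the same request body can be built in Python as a plain dictionary and serialised to JSON. The helper name match_query is hypothetical (it is not part of any ElasticSearch client library); only the standard library is used and nothing is sent over the network.

```python
import json


def match_query(field, text):
    """Build the body of a single-clause match query (hypothetical helper)."""
    # The query type ("match") sits one level under the top-level "query" key.
    return {"query": {"match": {field: text}}}


body = match_query("name", "John")

# Serialise to the JSON that would go in the body of the POST request.
print(json.dumps(body, indent=2))
```

Building the body as a dictionary and serialising it avoids hand-written JSON mistakes such as mismatched quotes or braces.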
Query URI
ElasticSearch also lets us narrow down the search pool using the URI. The URI I queried was departments/employees/_search. This means I am querying the departments index and searching only the employees type. You can specify multiple indices and types in the query URI by separating them with commas.
A search URI of departments,teams/employees,customers/_search will search the types employees and customers in the departments and teams indices.
To search over all types or all indices, leave that part of the URI out or use _all.
E.g. _all/employees,customers/_search
Searches the employees and customers types in every index
E.g. departments/_search
Searches all types in the departments index
E.g. _search
Searches all types in every index
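The URI rules above can be sketched as a small Python function. The name search_uri and its parameters are assumptions made for illustration, not part of ElasticSearch; the function only builds the string and does not contact a cluster.

```python
def search_uri(indices=(), types=()):
    """Build a search URI from index and type names (hypothetical helper).

    When types are given without indices, the index part falls back to
    _all, because the index segment cannot be omitted if a type segment
    follows it.
    """
    parts = []
    if indices:
        parts.append(",".join(indices))
    elif types:
        parts.append("_all")
    if types:
        parts.append(",".join(types))
    parts.append("_search")
    return "/".join(parts)


print(search_uri(["departments", "teams"], ["employees", "customers"]))
print(search_uri(types=["employees", "customers"]))
print(search_uri(["departments"]))
print(search_uri())
```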
Multiple clause query
That is all well and good, but what if we want to do a more complicated query with more search parameters? Say we wanted to find employees who have the first name John or Mark and a job title of developer. The following example shows how that can be done.
POST departments/employees/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "name": "John" } },
        { "match": { "name": "Mark" } }
      ],
      "minimum_should_match": 1,
      "must": [
        { "match": { "title": "developer" } }
      ]
    }
  }
}
To send multiple match queries, we use a top-level bool query. A document will only match the query and be returned if all the clauses in the must clause are satisfied and at least one of the clauses in the should clause is satisfied (as we have set minimum_should_match equal to 1).
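A bool query like the one above can also be assembled programmatically. The following is a sketch only: bool_query and match are hypothetical helper names, not functions from an ElasticSearch client, and the result is just a dictionary that serialises to the request body.

```python
import json


def match(field, text):
    """A single match clause (hypothetical helper)."""
    return {"match": {field: text}}


def bool_query(must=(), should=(), minimum_should_match=None):
    """Assemble a bool query body from lists of clauses (hypothetical helper)."""
    bool_body = {}
    if must:
        bool_body["must"] = list(must)
    if should:
        bool_body["should"] = list(should)
    if minimum_should_match is not None:
        bool_body["minimum_should_match"] = minimum_should_match
    return {"query": {"bool": bool_body}}


# Developers whose name matches John or Mark.
body = bool_query(
    must=[match("title", "developer")],
    should=[match("name", "John"), match("name", "Mark")],
    minimum_should_match=1,
)
print(json.dumps(body, indent=2))
```

Keeping must and should as lists, even when they hold a single clause, makes it easy to add further clauses later without changing the shape of the body.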
One thing to note is that if a document matches multiple clauses in a should clause, ElasticSearch will score it higher, as it is considered a more relevant search result. While it is unlikely in this example for an employee to be named both Mark and John, if the query contained a should clause like this:
"should": [
  { "match": { "name": "John" } },
  { "match": { "title": "developer" } }
],
"minimum_should_match": 1
An employee that has either the title developer or the name John would be matched and returned by ElasticSearch. Any Johns who are developers would be returned with a higher score than employees that match only one of the clauses.
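The effect of matching more should clauses can be illustrated with a toy model in Python. This is not how ElasticSearch computes relevance: here the number of matched clauses stands in for the score, and a match clause is approximated by a case-insensitive whole-word test. The function names and the sample employees are made up for the illustration.

```python
def field_has_term(doc, field, term):
    """Toy stand-in for a match clause: case-insensitive whole-word test."""
    return term.lower() in str(doc.get(field, "")).lower().split()


def rank(docs, should, minimum_should_match=1):
    """Return (matched_clause_count, doc) pairs, best first.

    The count is only a proxy for relevance; it is NOT ElasticSearch's score.
    """
    hits = []
    for doc in docs:
        matched = sum(field_has_term(doc, f, t) for f, t in should)
        if matched >= minimum_should_match:
            hits.append((matched, doc))
    # sorted() is stable, so documents with equal counts keep their input order.
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


# Made-up sample data.
employees = [
    {"name": "Mark", "title": "developer"},
    {"name": "John", "title": "developer"},
    {"name": "John", "title": "manager"},
    {"name": "Alice", "title": "designer"},
]

ranked = rank(employees, [("name", "John"), ("title", "developer")])
for count, doc in ranked:
    print(count, doc["name"], doc["title"])
```

In this toy model the developer named John matches both clauses and is listed first, the two employees matching a single clause follow, and the employee matching neither clause is not returned.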
The examples shown in this article are fairly simple. There are a huge number of ways to customize your queries to make ElasticSearch do what you want. I suggest going out and spinning up a free trial on elastic.io to get a feel for what ElasticSearch can do; when used correctly ElasticSearch is a very powerful tool.