Go: Elasticsearch Clients Case Study

Choosing a package to consume Elasticsearch is one of the first steps when setting up a project that uses it. This choice can impact the performance of your application if Elasticsearch is an important part of your workflow. There are only a few options in Go: olivere/elastic is the most popular one, and elastic/go-elasticsearch is the official client. Let's review the pros and cons of each package.
Design
The libraries are designed in quite different ways and, as consumers, we need to feel comfortable with the design provided. Let's take a query as an example and build it with both packages to see the difference. Here is the query:
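The query itself is missing from this copy of the article; a representative example, in the spirit of the simple search described in the benchmarks section, could be a match query like this (the field name is illustrative):

```json
{
  "query": {
    "match": {
      "name": "test"
    }
  }
}
```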

The package olivere/elastic provides a full Query DSL (Domain Specific Language) that allows a total abstraction of the real query. Here is the built query:
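The snippet is elided in this copy; as a sketch, building such a match query with olivere/elastic's DSL could look like the following (client setup, index name, and field are illustrative):

```go
package main

import (
	"context"

	"github.com/olivere/elastic/v7"
)

func search(client *elastic.Client) (*elastic.SearchResult, error) {
	// The DSL abstracts the JSON away: NewMatchQuery generates the
	// {"match": {"name": "test"}} clause through its Source() method.
	query := elastic.NewMatchQuery("name", "test")
	return client.Search().
		Index("documents").
		Query(query).
		Do(context.Background())
}
```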

And here is the query built with the official client:

This version is more verbose but also more explicit: building the payload for Elasticsearch amounts to translating the original JSON query into a Go map.
Both designs can suit different kinds of developers. Developers who do not want to learn how Elasticsearch queries work may prefer the first approach, while those who want to control the exact query sent, without learning an extra abstraction layer, may go with the second. Let's now review the performance of each.
Benchmarks
For this example, I have created a simple structure to store in the index, along with a simple query to run against it. Here is an example of the stored documents:

The query used is the one introduced in the previous section. The benchmarks will cover:
- building the query
- decoding the results
Here are the results:
olivere/elastic:
name time/op
QuerySearch-4 4.14ms ± 5%
name alloc/op
QuerySearch-4 160kB ± 1%
name allocs/op
QuerySearch-4 1.65k ± 0%

official client:
name time/op
QuerySearch-4 3.88ms ± 5%
name alloc/op
QuerySearch-4 118kB ± 0%
name allocs/op
QuerySearch-4 514 ± 0%
The official client is more efficient: it performs about three times fewer allocations and uses roughly a quarter less memory. Overall performance is also better, and the difference could be even larger in an application that uses the client intensively, since fewer allocations mean the garbage collector runs less often.
Optimizations
The first step in optimizing a program is to profile it. Let's run pprof on the olivere/elastic benchmarks and see what could be wrong. Here are the top 5 functions by allocated objects:
Showing top 5 nodes out of 47
flat% sum%
29.83% 29.83% reflect.New
29.43% 59.26% encoding/json.(*decodeState).literalStore
8.41% 67.66% encoding/json.(*decodeState).object
8.15% 75.82% encoding/json.Unmarshal
5.61% 81.42% encoding/json.(*RawMessage).UnmarshalJSON
The encoding/json decoding phase accounts for more than 80% of the allocations, and the reflection calls also come from this package:

The first possible optimization is to change the JSON decoder and use easyjson:
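The snippet is elided in this copy; as a sketch, easyjson replaces reflection with generated code (the Document type and file names are illustrative, and the generator must be run first):

```go
// Annotate the type, then run the generator:
//   easyjson -all document.go
// This produces document_easyjson.go with generated
// MarshalEasyJSON/UnmarshalEasyJSON methods.

//easyjson:json
type Document struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// Decoding then goes through the generated code instead of reflection:
//   var doc Document
//   err := easyjson.Unmarshal(data, &doc) // github.com/mailru/easyjson
```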

Running the benchmark again slightly improves the number of allocations (-12%):
olivere/elastic:
name          old time/op    new time/op    delta
QuerySearch-4 4.14ms ± 5%    4.12ms ± 7%    ~
name          old alloc/op   new alloc/op   delta
QuerySearch-4 160kB ± 1%     158kB ± 1%     -1.57%
name          old allocs/op  new allocs/op  delta
QuerySearch-4 1.65k ± 0%     1.45k ± 0%     -12.15%
Now that this part is optimized, we can run pprof again to see if there is more low-hanging fruit. pprof lets us hide the paths we have already inspected with the command go tool pprof -alloc_objects -hide="decodeState|.UnmarshalJSON|Unmarshal|Decode" mem.out:

The creation of the query now accounts for most of the allocations. The package olivere/elastic provides a Source() function that allows the developer to send the query directly as a string. Here are the new results with this optimization:
name time/op
QuerySearch-4 4.12ms ± 5%
OptimizedQuerySearch-4 4.07ms ± 4%
name alloc/op
QuerySearch-4 158kB ± 1%
OptimizedQuerySearch-4 152kB ± 1%
name allocs/op
QuerySearch-4 1.45k ± 0%
OptimizedQuerySearch-4 1.35k ± 0%
Here again we save some time and allocations, but we are still far from the official client.
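The exact optimized call is not shown in this copy; one way to send the query as a raw string with olivere/elastic is its RawStringQuery helper, whose Source() method returns the string unchanged (the index name and query are illustrative):

```go
// RawStringQuery satisfies the Query interface; its Source() method hands
// the string back as-is, skipping the builder allocations.
raw := elastic.RawStringQuery(`{"match": {"name": "test"}}`)
res, err := client.Search().
	Index("documents").
	Query(raw).
	Do(context.Background())
```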
Impact on production
Updating this library in a project that uses Elasticsearch intensively can have a real impact on your production environment. Here is the result for one application that consumes Elasticsearch heavily:

The yellow and blue lines show the metrics with the new package, while the grey lines represent the old Elasticsearch client. Although I tried to compare the metrics under the same load, the load after the deployment was higher than before, which actually makes the result even better: the application handles more requests with fewer resources.
This change reduced the number of garbage collection pauses by ~15-20% and lowered CPU usage by ~20%.