Go: ElasticSearch Clients Study Case

Vincent Blanchon
Nov 1, 2019 · 5 min read
Illustration created for “A Journey With Go”, made from the original Go Gopher, created by Renee French.

Choosing a package to talk to ElasticSearch is one of the first steps when setting up a project that uses it. This choice can affect the performance of your application if ElasticSearch is an important part of your workflow. There are a few options in Go: olivere/elastic is the most popular one, and elastic/go-elasticsearch is the official client. Let’s review the pros and cons of each package.

Design

The libraries are designed in quite different ways and, as consumers, we need to feel comfortable with the design we pick. Let’s take a query as an example and build it with both libraries to see the difference. Here is the query:

[Image: the example query]

The package olivere/elastic provides a full Query DSL (Domain Specific Language) that completely abstracts the underlying JSON query. Here is the query built with it:

[Image: the query built with olivere/elastic]

And here is the query built with the official client:

[Image: the query built with the official client]

This version is more verbose but also more explicit. Building the payload for ElasticSearch amounts to translating the original JSON into a Go map.
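As a minimal sketch of that JSON-to-map translation, the block below builds a query as nested maps and encodes it with the standard library only. The query itself and its field names (`title`, `published`) are assumptions made for illustration; they are not the query used in this article.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// buildQuery returns a hypothetical bool query as nested maps.
// The field names ("title", "published") are placeholders, not the
// article's actual query.
func buildQuery(title string) map[string]interface{} {
	return map[string]interface{}{
		"query": map[string]interface{}{
			"bool": map[string]interface{}{
				"must": []interface{}{
					map[string]interface{}{
						"match": map[string]interface{}{"title": title},
					},
				},
				"filter": []interface{}{
					map[string]interface{}{
						"term": map[string]interface{}{"published": true},
					},
				},
			},
		},
	}
}

// encodeQuery serializes the map into a JSON request body.
func encodeQuery(q map[string]interface{}) (*bytes.Buffer, error) {
	var buf bytes.Buffer
	if err := json.NewEncoder(&buf).Encode(q); err != nil {
		return nil, err
	}
	return &buf, nil
}

func main() {
	body, err := encodeQuery(buildQuery("golang"))
	if err != nil {
		panic(err)
	}
	// encoding/json writes map keys in sorted order.
	fmt.Print(body.String())
}
```

With the official client, a buffer like this is what would typically be passed as the request body, for instance through the `Search.WithBody` option. The Go structure mirrors the JSON one level at a time, which is why this style is verbose but leaves no doubt about what is sent.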

Both designs can suit different kinds of developers. Developers who do not want to deal with the details of ElasticSearch queries may prefer the first approach, while developers who want to control the exact query sent, without learning an extra layer, may prefer the second. Let’s now review the performance of each.

Benchmarks

For this example, I have created a simple structure to store along with a simple query to run against the index. Here is an example of the documents stored:

The query used is the one introduced in the previous section. The benchmarks will cover:

  • building of the query
  • decoding of the results
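A benchmark covering these two steps could be sketched as follows. This is not the benchmark used for the numbers in this article: the `hit` and `searchResponse` types, the query, and the canned response are all assumptions, and no real cluster is contacted, so the timings will not resemble the ones reported here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// hit and searchResponse are hypothetical shapes, loosely modeled on a
// search response; they are not the article's actual types.
type hit struct {
	ID     string          `json:"_id"`
	Source json.RawMessage `json:"_source"`
}

type searchResponse struct {
	Hits struct {
		Hits []hit `json:"hits"`
	} `json:"hits"`
}

// cannedResponse stands in for the bytes a server would return.
var cannedResponse = []byte(`{"hits":{"hits":[{"_id":"1","_source":{"name":"a"}},{"_id":"2","_source":{"name":"b"}}]}}`)

// querySearch performs the two measured steps: build the query body,
// then decode a response. It returns the number of decoded hits.
func querySearch() (int, error) {
	query := map[string]interface{}{
		"query": map[string]interface{}{
			"match": map[string]interface{}{"name": "a"},
		},
	}
	if _, err := json.Marshal(query); err != nil {
		return 0, err
	}
	var res searchResponse
	if err := json.Unmarshal(cannedResponse, &res); err != nil {
		return 0, err
	}
	return len(res.Hits.Hits), nil
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	result := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if _, err := querySearch(); err != nil {
				b.Fatal(err)
			}
		}
	})
	fmt.Println(result.String(), result.MemString())
}
```

In a real project the same loop would normally live in a `_test.go` file as a `BenchmarkQuerySearch(b *testing.B)` function and run with `go test -bench=. -benchmem`, which reports time/op, alloc/op, and allocs/op.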

Here are the results:

olivere/elastic:
name           time/op
QuerySearch-4  4.14ms ± 5%

name           alloc/op
QuerySearch-4  160kB ± 1%

name           allocs/op
QuerySearch-4  1.65k ± 0%

official client:
name           time/op
QuerySearch-4  3.88ms ± 5%

name           alloc/op
QuerySearch-4  118kB ± 0%

name           allocs/op
QuerySearch-4  514 ± 0%

The official client is more efficient: it makes roughly 3 times fewer allocations (514 vs. 1.65k) and uses about 26% less memory (118kB vs. 160kB). It is also slightly faster overall, and the gap could matter more in an application that queries ElasticSearch intensively, since the garbage collector can run less often when fewer objects are allocated.

Optimizations

The first step to optimize a program is to profile it. Let’s run pprof on the olivere/elastic benchmark and see where the allocations come from. Here are the top 5 functions allocating objects:

Showing top 5 nodes out of 47
 flat%    sum%
29.83%  29.83%  reflect.New
29.43%  59.26%  encoding/json.(*decodeState).literalStore
 8.41%  67.66%  encoding/json.(*decodeState).object
 8.15%  75.82%  encoding/json.Unmarshal
 5.61%  81.42%  encoding/json.(*RawMessage).UnmarshalJSON

Decoding with encoding/json accounts for more than 80% of the allocations. The reflection calls also come from this package:

[Image: pprof graph of the reflection calls made by encoding/json]
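For readers who want to reproduce this kind of analysis, here is a minimal sketch that captures an allocation profile from plain Go code using `runtime/pprof`. The `doc` type and the payload are assumptions made only to generate some decoding work; for a benchmark, the more usual route is `go test -bench=. -memprofile mem.out`.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

// doc is a hypothetical document type used only to generate allocations.
type doc struct {
	Name string   `json:"name"`
	Tags []string `json:"tags"`
}

// decodeSample decodes a small payload many times, which exercises the
// reflection-based decoding path of encoding/json.
func decodeSample() error {
	payload := []byte(`[{"name":"a","tags":["x","y"]},{"name":"b","tags":["z"]}]`)
	for i := 0; i < 1000; i++ {
		var docs []doc
		if err := json.Unmarshal(payload, &docs); err != nil {
			return err
		}
	}
	return nil
}

// allocProfile runs work and returns the "allocs" profile in pprof format.
func allocProfile(work func() error) ([]byte, error) {
	runtime.MemProfileRate = 1 // record every allocation (slow, illustration only)
	if err := work(); err != nil {
		return nil, err
	}
	runtime.GC() // bring the profile statistics up to date
	var buf bytes.Buffer
	if err := pprof.Lookup("allocs").WriteTo(&buf, 0); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := allocProfile(decodeSample)
	if err != nil {
		panic(err)
	}
	if err := os.WriteFile("mem.out", data, 0o644); err != nil {
		panic(err)
	}
	fmt.Printf("wrote %d bytes to mem.out\n", len(data))
}
```

The resulting file can then be inspected with `go tool pprof -alloc_objects mem.out`, where `top` lists the functions that allocate the most objects.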

The first possible optimization is to replace the JSON decoder with easyjson:

[Image: decoding the results with easyjson]

Running the benchmark again shows a modest improvement in the number of allocations (-12%):

olivere/elastic:
name           old time/op    new time/op    delta
QuerySearch-4  4.14ms ± 5%    4.12ms ± 7%    ~

name           old alloc/op   new alloc/op   delta
QuerySearch-4  160kB ± 1%     158kB ± 1%     -1.57%

name           old allocs/op  new allocs/op  delta
QuerySearch-4  1.65k ± 0%     1.45k ± 0%     -12.15%

Now that this part is optimized, we can run pprof again to look for more low-hanging fruit. pprof lets us hide the paths we have already inspected with the command go tool pprof -alloc_objects -hide="decodeState|.UnmarshalJSON|Unmarshal|Decode" mem.out:

[Image: pprof graph with the decoding paths hidden]

Building the query now accounts for most of the allocations. The package olivere/elastic provides a Source() method that lets the developer send the query directly as a string. Here are the new results with this optimization:

name                    time/op
QuerySearch-4           4.12ms ± 5%
OptimizedQuerySearch-4  4.07ms ± 4%

name                    alloc/op
QuerySearch-4           158kB ± 1%
OptimizedQuerySearch-4  152kB ± 1%

name                    allocs/op
QuerySearch-4           1.45k ± 0%
OptimizedQuerySearch-4  1.35k ± 0%

Here again we can save some time and allocations, but we are still far away from the official client.
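To illustrate why sending a raw query string saves allocations compared with building the query on each call, here is a small sketch that uses only the standard library. It does not use olivere/elastic; the query and the `builtQuery`, `rawQuery`, and `compareAllocs` names are assumptions made for the comparison.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// builtQuery assembles the body from nested maps and marshals it on every call.
func builtQuery(name string) []byte {
	q := map[string]interface{}{
		"query": map[string]interface{}{
			"match": map[string]interface{}{"name": name},
		},
	}
	b, err := json.Marshal(q)
	if err != nil {
		panic(err)
	}
	return b
}

// rawQuery is the same body written once as a string.
const rawQuery = `{"query":{"match":{"name":"gopher"}}}`

// sink keeps results alive so the compiler cannot discard the work.
var sink []byte

// compareAllocs reports the average allocations per call for both approaches.
func compareAllocs() (built, raw float64) {
	built = testing.AllocsPerRun(1000, func() { sink = builtQuery("gopher") })
	raw = testing.AllocsPerRun(1000, func() { sink = []byte(rawQuery) })
	return built, raw
}

func main() {
	built, raw := compareAllocs()
	fmt.Printf("built: %.0f allocs/op, raw: %.0f allocs/op\n", built, raw)
}
```

The trade-off is that a fixed string only works as-is when the query has no dynamic parts. Once parameters must be injected, they need to be escaped correctly, which is part of what a query builder does for you.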

Impact on production

Switching libraries in a project that uses ElasticSearch intensively can have a positive impact on your production environment. Here is the case of one application that consumes ElasticSearch a lot:

[Image: monitoring after deployment of the new client]

The yellow and blue lines show the metrics with the new package, while the grey lines represent the old Elasticsearch client. Although I tried to compare the metrics under the same load, the load after deployment was higher than before, which makes the result even better: the application handles more requests with fewer resources.

Switching to this package reduced the number of garbage collection pauses by ~15-20% and lowered CPU usage by ~20%.
