Published in Teads Engineering

Practical Elasticsearch Performance Tuning on AWS

Understanding key Elasticsearch optimization features: the empirical way

The Use Case

Step 1 — Don’t Miss the Cache

Latency distribution (simplified)

Step 2 — Forget About Index Warming

Step 3 — Why Not Select Relevant Indices?

  • the request targets more than 128 shards
  • the request targets one or more read-only indices
  • the primary sort of the query targets an indexed field
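The three conditions above can be read as a simple "any of" check. The sketch below is only an illustration of that rule, not Elasticsearch's actual implementation: the function name and its parameters are hypothetical, and the 128 threshold is the default quoted in the list.

```python
# Illustrative only: mirrors the three documented conditions, not the real
# Elasticsearch code path. All names here are hypothetical.
DEFAULT_SHARD_THRESHOLD = 128


def pre_filter_expected(shard_count, has_read_only_index,
                        sorts_on_indexed_field,
                        threshold=DEFAULT_SHARD_THRESHOLD):
    """Return True when at least one pre-filter condition is met."""
    return bool(
        shard_count > threshold      # more than 128 shards targeted
        or has_read_only_index       # one or more read-only indices
        or sorts_on_indexed_field    # primary sort on an indexed field
    )
```

For a time series cluster this suggests why marking old indices as read-only matters: a query over a few dozen shards would not cross the shard threshold on its own, but a single read-only index among the targets is enough to satisfy the second condition.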

Step 4 — Merge Read-only Shards

Latency distribution (simplified)

Step 5 — Optimizing Queries to Recent Data

Optimization results and automation

  • We use Elastic date formats in our queries to leverage the node query cache.
  • Because our time series database is segmented into chronological indices, we can enforce the pre-filter shard mechanism by explicitly setting old indices as read-only.
  • We apply a force merge on all indices so that each shard is reorganized into a single segment. This was the killer feature: it avoids very long queries when the previous mechanisms cannot be leveraged.
  • We also close indices that are older than a year, which reduces the number of open shards and avoids wasting resources on data we no longer query.
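As a rough sketch of how this routine could be automated, the function below turns a list of daily indices into the REST calls to issue. It makes several assumptions that are not from our setup: the `logs-YYYY.MM.DD` naming scheme, the helper names, and the choice of `index.blocks.write` as the read-only setting are all hypothetical. It also skips the index still being written to and only builds the requests; sending them is left to whichever HTTP client you use.

```python
from datetime import date, timedelta


def parse_index_day(index_name, prefix="logs-"):
    """Extract the day from a hypothetical 'logs-YYYY.MM.DD' index name."""
    year, month, day = index_name[len(prefix):].split(".")
    return date(int(year), int(month), int(day))


def plan_maintenance(index_names, today, prefix="logs-", close_after_days=365):
    """Build (method, path, body) tuples for the maintenance steps."""
    requests = []
    for name in sorted(index_names):
        day = parse_index_day(name, prefix)
        if day >= today:
            continue  # assumed to still receive writes: leave untouched
        if today - day > timedelta(days=close_after_days):
            # Older than a year: close the index to free its shards.
            requests.append(("POST", f"/{name}/_close", None))
            continue
        # Block writes so the index counts as read-only (assumed setting).
        requests.append(
            ("PUT", f"/{name}/_settings", {"index.blocks.write": True})
        )
        # Reorganize each shard into a single segment.
        requests.append(
            ("POST", f"/{name}/_forcemerge?max_num_segments=1", None)
        )
    return requests
```

Called with `plan_maintenance(["logs-2021.03.10", "logs-2021.03.09", "logs-2020.01.01"], date(2021, 3, 10))`, this yields a close request for the 2020 index and a write block followed by a force merge for the previous day's index. Force merging is expensive, so in practice you would likely track which indices have already been merged rather than replaying the full plan every day.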

Takeaways

Elasticsearch needs good care and attention

The public documentation is right and up to date.

Open Distro isn’t exactly what we have on AWS

Acknowledgments
