The WHY of Our ElasticSearch Upgrade

Deepak Varshney
Feb 18 · 4 min read

For the last 6 years, the TopAds team at Tokopedia has been serving personalized recommendations seamlessly. Our ads fall into 2 broad categories: Browse Ads and Search Ads.

The secret ingredient behind all the fame we have gathered so far is Elasticsearch.

We use Elasticsearch to fetch all our ads content in real time.

By the end of this 2-part series, you will be able to create a practical, production-ready migration plan with zero downtime. As a big bonus, you will learn how the upgrade can cut your ES CPU usage by 2.5x and query latency by 2x.

Introduction

On a sunny day, I was doing my routine daily work and drinking coffee when suddenly an alert came in for high latency of Elasticsearch queries.

Alongside it came a high CPU usage alert for Elasticsearch, because the load was not balanced across the data nodes. These alerts kept buzzing at regular intervals for quite a long time.

Looking at the improvements ES 7 offers, we took up the challenge of upgrading our ES from version 5.6.3 to 7.7.1.

This blog series will have 2 stations:

  1. WHY did we decide to migrate?
  2. HOW did we migrate, and what WOW results did we achieve?

This is our first station, where we explain why we decided to migrate to Elasticsearch 7 and what the salient features of ES 7 are.

Our Initial ElasticSearch Setup

We had been using ES 5.6.4 for 4 years and had not upgraded it because of backward-incompatible changes.

Elasticsearch 7.7.1 was released in June 2020.

Why Upgrade?

  1. ES query time improvements.
  2. CPU usage improvements.
  3. Disk space usage improvements.
  4. Zero downtime.
  5. An Nginx cache hit rate graph on Datadog, so that we can optimize ES query performance by making queries more cache-friendly.

These are the amazing new features offered by ES 7.7.1.

  1. New Cluster coordination system
  2. New Circuit Breaker Support
  3. Lucene 8
  4. Faster Retrieval for top hits
  5. Bundled JDK in the Elasticsearch distribution

But what actually convinced us to upgrade to ES 7?

Some results using ARS (Adaptive Replica Selection):

2. New Circuit Breaker Support
- Under high load, we sometimes faced OOM on some ES nodes. Those nodes went down, which resulted in errors in ES queries.
- The key idea of the new circuit breaker support in ES 7 is to avoid an OutOfMemoryError by estimating upfront whether a request will push the node over its configured limit, and then rejecting the request instead of falling over.
- Under a high workload, earlier versions of Elasticsearch cannot sustain the load and run out of memory almost immediately, whereas the real memory circuit breaker in ES 7 pushes back so that Elasticsearch can sustain the load (see reference 2).
- This is the kind of error Elasticsearch returns when the circuit breaker (CB) trips on a particular request. The memory needed is estimated beforehand and the request is rejected, so that the OOM issue is prevented.

{
    'error': {
        'type': 'circuit_breaking_exception',
        'reason': '[parent] Data too large, data for [<http_request>] would be [123848638/118.1mb], which is larger than the limit of [123273216/117.5mb], real usage: [120182112/114.6mb], new bytes reserved: [3666526/3.4mb]',
        'bytes_wanted': 123848638,
        'bytes_limit': 123273216,
        'durability': 'TRANSIENT'
    },
    'status': 429
}
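
One way a client might react to a response like the one above is sketched below. This is a minimal illustration under assumptions, not our production code: the helper names `is_circuit_breaker_rejection` and `call_with_backoff`, the retry count, and the delays are all hypothetical, and `send` stands for whatever function issues the request and returns the parsed response body as a dict.

```python
import time


def is_circuit_breaker_rejection(body):
    """Return True if a parsed response body looks like the circuit breaker error."""
    error = body.get("error")
    if not isinstance(error, dict):
        return False
    return (
        body.get("status") == 429
        and error.get("type") == "circuit_breaking_exception"
    )


def call_with_backoff(send, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call send() and retry with exponential backoff while the breaker rejects it.

    `send` is a hypothetical zero-argument callable returning the parsed body.
    Only rejections marked TRANSIENT are retried; anything else is returned as-is.
    """
    body = send()
    for attempt in range(1, max_attempts):
        if not is_circuit_breaker_rejection(body):
            break
        if body["error"].get("durability") != "TRANSIENT":
            break
        # Wait base_delay, then 2 * base_delay, and so on before retrying.
        sleep(base_delay * 2 ** (attempt - 1))
        body = send()
    return body
```

Retrying only `TRANSIENT` rejections follows the `durability` field in the error: the node is telling the client that memory pressure may ease, so backing off briefly is a reasonable default.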

- The real memory circuit breaker keeps track of the total memory used by the JVM and rejects a request if it would cause the reserved plus actual heap usage to exceed 95% of the heap, preventing OOM issues.
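
The check described above can be sketched with the numbers from the error response. This only illustrates the arithmetic and is not Elasticsearch's implementation: the function name `would_trip` is hypothetical, and the heap size is an assumption back-calculated from the reported limit and the 95% threshold.

```python
def would_trip(real_usage, new_bytes, limit):
    """Return True if reserving new_bytes on top of real_usage would exceed limit."""
    return real_usage + new_bytes > limit


# Values copied from the error response (all in bytes).
real_usage = 120182112  # "real usage"
new_bytes = 3666526     # "new bytes reserved"
limit = 123273216       # "bytes_limit"

# Real usage plus the new reservation gives the "bytes_wanted" value in the error.
bytes_wanted = real_usage + new_bytes

# The request asks for more than the limit allows, so it is rejected with a 429.
rejected = would_trip(real_usage, new_bytes, limit)

# Assumption: the limit is 95% of the heap, which implies this heap size in bytes.
heap_bytes = limit * 100 // 95

print(bytes_wanted, rejected, heap_bytes)
```

The sum of the two reported figures matches `bytes_wanted` in the error exactly, which is a quick way to read such a message: the node was already close to its limit, and this request was the one that would have pushed it over.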

Sounds Exciting!!! Right??

Stay tuned for the next blog post where we will deep dive into our execution plan and results!!!

Please shower lots of 👏 👏 if you liked our initial Journey!!!

Edit: Next blog is out guys. Please read it here!!!

References

  1. https://www.elastic.co/blog/improving-response-latency-in-elasticsearch-with-adaptive-replica-selection
  2. https://www.elastic.co/blog/improving-node-resiliency-with-the-real-memory-circuit-breaker

Tokopedia Engineering

Story from people who build Tokopedia
