Published in Myntra Engineering

Adaptive Throttling of Indexing for Improved Query Responsiveness

How to reactively control indexing rate to preserve query performance for bursty traffic

Background

Present Setup using Solr and Redis clusters to serve products at Myntra

Motivation

Architecture

Architecture with Throttling Engine, Permitted Rate Store and Enhanced Spout for Indexing Throttling
  • The basic idea is to gather read performance metrics from data stores, calculate the permitted rate of updates and then enforce that rate in the indexing system.
  • The new system adds a throttling engine that periodically polls the data stores for their latest response times or similar metrics. In our case, the throttling engine queries Solr for the median response time of the cluster.
  • Once we have the read performance metrics, we calculate the permitted rate of updates a given datastore can process without adversely affecting its read performance. This rate is pushed to a permitted rate store at regular intervals.
  • The Kafka Spout implementation is enhanced to enable throttling. Each spout instance reads the permitted rate from the permitted rate store and regulates its emission of tuples to stay within that rate.
  • The permitted rate is allocated across the different kinds of updates based on their number of pending messages and their priority.
  • This protects read performance during heavy indexing by delaying the processing of updates. In extreme cases, such as prolonged periods of very low permitted rate, manual intervention is needed, so alerts are set up on indexing rates and response times.
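The enforcement step in the spout can be pictured as a small token bucket. The sketch below is illustrative only and is not the actual spout code: the class name `EmissionGate`, its methods, and the one-second burst cap are assumptions, and a real spout would call `set_rate` with the value it reads from the permitted rate store.

```python
import time
from typing import Callable


class EmissionGate:
    """Hypothetical token bucket: allows at most `rate` tuples per second."""

    def __init__(self, rate: float, clock: Callable[[], float] = time.monotonic):
        self._clock = clock          # injectable clock, handy for testing
        self._rate = max(0.0, rate)  # permitted tuples per second
        self._tokens = 0.0           # start empty so there is no initial burst
        self._last = clock()

    def _refill(self) -> None:
        now = self._clock()
        # Accrue tokens for the elapsed time, capped at roughly one second's
        # worth (assumed cap) so a long idle period cannot cause a large burst.
        cap = max(1.0, self._rate)
        self._tokens = min(cap, self._tokens + (now - self._last) * self._rate)
        self._last = now

    def set_rate(self, rate: float) -> None:
        """Apply a new permitted rate, e.g. after re-reading the rate store."""
        self._refill()               # settle tokens earned under the old rate
        self._rate = max(0.0, rate)

    def try_emit(self) -> bool:
        """Return True and consume a token if one tuple may be emitted now."""
        self._refill()
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

A spout's emit loop would call `try_emit()` before emitting each tuple and simply skip the emission when it returns `False`, leaving the message pending in Kafka until the permitted rate allows it.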

Deep Dive

1. In-house Limit Algorithm

  • Acceptable average and maximum response times are determined for the system using historical data.
  • The current response time is used, along with average and maximum values for the system, to calculate a load factor between 0 and 1.
  • This load factor is then converted into a percentage of tuples that should be emitted.
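The exact formula is not spelled out above, so the following is only one plausible reading of the three steps: a linear interpolation between the acceptable average and maximum response times, clamped to the range 0 to 1. The function names and the sample thresholds (40 ms and 80 ms) are hypothetical.

```python
def load_factor(current_ms: float, avg_ms: float, max_ms: float) -> float:
    """Map the current response time to a load factor in [0, 1].

    Assumed mapping: 0 at or below the acceptable average,
    1 at or above the acceptable maximum, linear in between.
    """
    if max_ms <= avg_ms:
        raise ValueError("max_ms must be greater than avg_ms")
    raw = (current_ms - avg_ms) / (max_ms - avg_ms)
    return min(1.0, max(0.0, raw))


def emit_percentage(load: float) -> float:
    """Convert a load factor into the percentage of tuples to emit.

    Assumed conversion: emit everything when unloaded, nothing when saturated.
    """
    return (1.0 - load) * 100.0


# Hypothetical thresholds: 40 ms acceptable average, 80 ms acceptable maximum.
pct = emit_percentage(load_factor(60.0, avg_ms=40.0, max_ms=80.0))
```

With these sample thresholds, a current response time halfway between the average and the maximum gives a load factor of 0.5, so half of the tuples would be emitted.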

2. AIMD Limit
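As general background, AIMD (additive increase, multiplicative decrease) raises the limit by a fixed step while response times stay healthy and cuts it by a constant factor when they do not. The sketch below shows that generic rule only. The class name, parameters, and default values are assumptions and do not reflect the configuration used in these tests.

```python
class AimdLimit:
    """Generic AIMD rule applied to a permitted update rate (illustrative)."""

    def __init__(self, initial: float, min_limit: float, max_limit: float,
                 step: float = 1.0, backoff: float = 0.9):
        self.limit = initial
        self.min_limit = min_limit
        self.max_limit = max_limit
        self.step = step        # additive increase per healthy sample
        self.backoff = backoff  # multiplicative factor applied on a breach

    def update(self, rtt_ms: float, threshold_ms: float) -> float:
        """Adjust the limit from one response-time sample and return it."""
        if rtt_ms > threshold_ms:
            # Response time too high: back off multiplicatively.
            self.limit = max(self.min_limit, self.limit * self.backoff)
        else:
            # Healthy response time: probe upward additively.
            self.limit = min(self.max_limit, self.limit + self.step)
        return self.limit
```

The asymmetry is the point of the design: recovery is gradual, while the reaction to a slow datastore is immediate.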

3. Gradient2 Limit

4. TCP Vegas Limit

Performance Test Results

AIMD Limit Algorithm: 60% of tuples processed, 21 mins in test setup, RTTs crossed 65ms at peak
Gradient2 Limit Algorithm: 40% of tuples processed, 23 mins in test setup, RTTs crossed 80ms at peak
TCP Vegas Algorithm: 100% of tuples processed, 16 mins in test setup, RTTs reached 60ms at peak
In-house Algorithm: 100% of tuples processed, 20 mins in test setup, RTTs reached 70ms at peak
Bursty Write Traffic before Throttling vs Controlled Flow after Throttling

Impact
