Advanced MongoDB Performance Tuning

The journey of refactoring MongoDB schema and queries to achieve a 40X performance boost

Sankarganesh Eswaran
idealo Tech Blog

--

Introduction

As depicted in the below diagram, our system has to balance between the writes done by the Event Processing apps and the reads done by the API Service apps. Whenever the Event Processing app needs to write more than average (2K events/sec), the API app had huge disruptions and response timeouts (P99>500ms) breaking our SLA.

A brief overview of the architecture. Event Processor cannot increase the throughput (2K) without disrupting the API’s reading performance.

This article explains how we identified & fixed the bottle-neck in MongoDB and achieved an incredible performance boost.

TL;DR

In the most performance gain order,
1. Replaced all Simple MongoDB indexes with
Compound indexes
2. Compressed the biggest JSON field in the MongoDB document
3. Adopted Kafka batching +
MongoDB BulkOperations for event processing
4. Controlled deletes with Mark-Sweep technique

1. Index optimization

We revisited the existing MongoDB index after we verified that the poor API response time is caused by increasing number of slow queries.

Slow Queries: Queries that takes more than 100ms to complete

Fortunately, Idealo had an incredible MongoDB profiler tool that monitors all our MongoDB replica-set which we can use for our investigation.

To know more about the tool built and OpenSourced by idealo,

But one can also use query explain and inspect the executionStats db.collection.find("{query}".explain()

The MongoDB profiler results revealed our existing indexes are sub-optimal. The totalDocsExamined to nReturned ratio 5:1
1. MongoDB reads 4 times more documents (from disk) than it actually returns
2. 80% of the documents are filtered only after reading from disk
3. This is a huge factor at production scale where everyday 21 Million Documents are examined from disk but only 4 Million of them are actually returned

So we replaced all existing simple indexes with compound indexes.

Not all fields in the query filter have a matching index.
All fields in the query filter have a matching index. Also the query is modified to match the compound index.

The new QueryPlan showed totalDocsExamined to nReturned ratio is 1:1. i.e: MongoDB returns all the Doc it reads from disk because all the necessary filtering is already covered by the new compound indexes .

QueryPlan“planSummary”: “IXSCAN { productId: 1, siteId: 1, segments: 1}, IXSCAN { parentProductId: 1, siteId: 1, segments: 1}”,
“inputStage”: { . . . ,
“nReturned”: 20, “docsExamined”:20, . .}

We rolled-out the app with new indexes and the results were dramatic.

99% of Slow Queries are gone!

Left: The bigger the green circle, the larger the occurrence of Slow Queries.
The higher the red circle, the slower the Slow Queries.
Right: 99% reduction on the occurrence while 1% rest under <~200ms.

2. Compressing the biggest JSON Field

While the index optimisation removed 99% of Slow Queries, still there were occasional API response delays & timeouts. Our performance test revealed an interesting finding.

P99 latency is 4X more than P90

Only a very few set of requests, repeatedly fallen in >P95 bucket. Deeper look on those JSON documents revealed the size of the Documents are huge.

Facts about the schema design

  1. The Document is self contained with all possible fields within it. Thus no more relation & joins with other collections when reading
  2. A single JSON Document on average had 3000+ lines and 90% of it belongs to a single field offerList
  3. MongoDB profiler tool showed a single query transferred up to 8MB! The slow queries alone accounted for 30GB of data transfer from MongoDB.
JSON Document on average had 3000+ lines. More the offer a product has, bigger the offerList array.

We realised that a JSON Document this big cannot move any faster. We stretched the “self-contained” Document DB nature to its limit.

Luckily, our queries did not filter based on this largest field. For a Collection with 13Million Documents, this single largest field is an elephant in the room. So we pondered on an app level Compression & Decompression of this largest field.

PoC on App level compression

Compression done before writing & Decompression done after reading, all in app-level.

Our next experiment was to pick one general purpose compression algorithm that is better suited for our data. So we wrote a small program to compare GZip vs LZ4_Fast vs Zstd with 3 different input size for the experiment.
Small: 103.25Kb | Medium: 4.02Mb |Large: 12.5Mb

Left: Compression ratio, higher the better. Right: Decompression time taken, lower the better

Our experiment resulted in favour of zstd which gave the best compression ratio with second best decompression speed.

Schema with Compression

{
"_id": "{...}",
"offerListId" : "XXXXXXXXXXX",
"parentProductId" : NumberLong(5380287),
"olBytes": BinData(0, "KLUv/WCRNv1KAFp23BU2MIu2AUBeWCiw......."),
"rawLen" : NumberInt(14225)
}

olBytes replaces the older plain JSON field offerList.Also note a new field rawLen is added to store the original length of the plain JSON. This will be used by the algorithm to allocate accurate buffer while decompressing, thus significantly improves the decompression time.

Repeating the performance test with compression this time, showed promising result.

Compressed version beats plain version by 2X-3X faster in all percentiles.

We rolled-out the compression improvement live and the results were marvellous.

  1. The normal event processing throughput increased from 2000 events/sec to 9000 events/sec (4X).
  2. Performance test throughput reached up to 80,000 events/sec what was impossible before. We witnessed the beast mode in action!
  3. All database queries (P99) are now completed in < 15ms.
  4. Reduce network I/O and much reduced JVM heap (4X less) usage by apps
99.96% of Slow Queries are gone with compression. The first 2 images are in daily mode & the last one in weekly mode.
API Response time P99 dropped to a constant <100ms
Reduce JVM heap usage by apps
Reduced network traffic between apps & MongoDB
DB queries completing in < 15ms

Note:
If you wonder why we did app-level compression and not used MongoDB’s in-built compression, here is why
1. MongoDB’s compression is only for the data at rest to save disk space. When compressed collection data is accessed from disk, it will be decompressed in memory/cache.
2. In addition, compressing a single field in a Document is not supported

3. Kafka batching + MongoDB BulkOperations

This section explains how the event processing app adopted batching to minimise the interaction with MongoDB.

BulkOperations bulkOps 
= mongoTemplate.bulkOps(BulkOperations.BulkMode.UNORDERED);
bulkOps.upsert(upsertList);
bulkOps.updateMulti(deletList);
return bulkOps.execute();

With Batching, what took 100 roundtrips to MongoDB to process 100 events, now took only one. Whats more? It did the same job but with 2X less apps (replicas). Batching improved the resource utilisation a lot better.

4. Controlled Deletes by Mark-Sweep

Deleting large number of Documents in very short time on a larger collection (13Million Documents) showed a bit of negative reading performance . This could be due to an expensive index calculations.

The incoming event represents an Update or Delete action and we processed as and then it arrives. We had no real control to throttle and spread the deletion rates. So we decided to have one.

Implementing the Mark-Sweep

In the Event Processing app, upon receiving a Delete event,

  1. Don’t delete the relevant Document
  2. Instead append a new delete marker field. ex: {‘del’: true}
  3. By a background scheduler, Delete the ‘marked’ Documents with limits using BulkMode
Left: Constant delete rate by scheduler irrespective of the count of incoming delete events. Right: Growing and shrinking ‘Delete’ marked Documents

Conclusion

  1. Compound indexes removed 99% of the Slow Queries. With better indexes, disk access is reduced to the absolute need.
  2. Compressing the largest field removed 99.9% of the Slow Queries and gave an incredible all-round performance boost.
  3. Batching largely improved the resource utilisation. It reduced
    the App ←to→ MongoDB interactions by 100X (Batch size)
  4. Mark-Sweep Delete approach improved the stability of the large MongoDB collection.

If you found this article useful, give me a high five 👏🏻 so others can find it too, and share it with your friends.

Do you love agile product development? Have a look at our vacancies

Thanks for reading!

--

--