Getting started with MongoDB explain()

Guy Harrison
MongoDB Performance Tuning
5 min readNov 12, 2017

The explain() method allows you to examine query plans. It’s an essential tool for tuning MongoDB performance

For almost all operations, there will be more than one way for MongoDB to retrieve the documents required. When MongoDB parses a statement, it must decide which approach will be fastest. The process of determining this “optimal” path to the data is referred to as query optimisation.

Basics of query optimization

For instance, consider the following query:

There are indexes on FirstName, LastName, Phone and dob. This gives MongoDB the following choices for resolving the query:

  1. Scan the entire collection looking for documents that match all the condition, then sort those documents by dob
  2. Use the Index on FirstName to find all the "RUTH"s, then filter out those documents based on LastName and Phone, then sort the remainder on dob.
  3. Use the Index on LastName to find all the "MARTINEZ"s, then filter out those documents based on FirstName and Phone, then sort the remainder on dob.
  4. Use the Index on Phone to find all documents with a matching phone number. Then eliminate any who are not RUTH MARTINEZ, then sort by dob.
  5. Use the Index on dob to sort the documents in dob order, and eliminate and documents that don't match the query criteria.

The MongoDB optimizer decides which of these approaches to take. The explain() method reveals the query optimizer's decision and - in some cases - lets you examine it's reasoning.

Getting started with explain()

To See the explain output we use the explain() method of the collection object and pass a find(),update(), insert(), drop() or aggregate() to that method. For instance to explain the query above we could do this:

The resulting explainDoc object is a cursor that returns a JSON document (after you call the next() method). The bit that is most important initially is the winningPlan section, which we can extract like this:

It’s still pretty complex — and I removed some stuff to simplify it. However, you see it contains multiple stages of query execution with the input to each stage (eg the previous step) nested as inputStage. So have to read the JSON from the inside-out to get the plan. If you want, you can use this simple function to print out the steps in the order then are execution (this works for most plans, but not for explains from sharded collections):

This script prints the execution plan in a very succinct format. Here’s an explanation of each step:

Step Explanation IXSCAN Phone_1 : Used the index to find documents with a matching value for Phone FETCH : Filtered out documents based on the Firstname and Lastname criteria SORT_KEY_GENERATOR : Emitted dob attributes for the next step SORT : sorted documents on the dob PROJECTION : Pushed address and dob into the output stream

There are a lot of different types of steps in explain output and there is no one place I know of to find all of the step descriptions (though https://docs.mongodb.com/manual/reference/explain-results/ is a good start). Probably the three most interesting steps are:

COLLSCAN — which means the document was scanned without an index.

IXSCAN — the use of an index to find documents

SORT — the sorting of documents without an index.

Alternate plans

Explain can tell you not just which plan was used, but which other plans were rejected. The rejected plans can be found within the array rejectedPlans within the queryPlanner section. So here is an example of looking at a rejected plan:

Execution Statistics

If you pass the argument executionStats to explain, then explain will actually run the statement and report on the performance of each step in the plan:

The execution statistics are included in the executionStages section of the resulting plan document:

The executionSteps subdocument contains overall execution statistics - such as executionTimeMillis - as well as an annotated execution plan in the executionStages document. executionStages is structured just like winningPlan, but it has statistics for each step. There are a lot of statistics, but perhaps the most significant ones are:

  • executionTimeMillisEstimate Estimate of the ms spent in each step.
  • keysExamined Number of index keys read
  • docsExamined Number of documents read by this step

It’s hard to read the execution stats — so I wrote this little function to print out the steps and key statistics in the same format as the quick_explain script:

We’ll use this function in the next section to tune a MongoDB query.

Using explain to tune a query

Now that we’ve learned how to use explain, let’s run through a short example of how to use it to tune a query. Here is the explain for the query we want to tune:

Let’s use our little function to extract the executionStats:

The COLLSCAN step comes first, and examines 120,477 documents. It only takes 902ms, but still maybe we can do better. There's also a SORT in there, and I'd like to see if I can avoid the sort using an index. So let's create an index that has the attributes from the filter clause (mailbox and subfolder), and the attribute from the sort operation (headers.Date). Here's the new index:

Now when we issue explain('executionStats') our output looks like this:

Using the index the query now returns almost instantaneously and the number of documents (keys) examined has reduced to 42. We’ve reduced the amount of data accessed by 97% and improved execution time by infinity :-). Note also that there is no longer and SORT step - MongoDB was able to use the index to return documents in sorted order without an explicit sort.

Conclusion

Explain() is one of the key “tools of the trade” that anyone wanting to tune MongoDB queries should learn. We have just scratched the surface in this blog post but this introduction should at least give you enough information to confidently examine the plans for your MongoDB queries. In a subsequent post, I’ll talk about how to gather cached execution plans and how to use explain() in conjunction with the profiler.

Using explain in dbKoda

dbKoda is our open source IDE for MongoDb. It contains lots of great stuff, but most relevantly to this article, it contains a graphical explain plan display that clearly illustrates the execution plan for your query and described how each step is executed in plain english. Get dbKoda for free at http://dbkoda.com/download.

Originally published at www.dbkoda.com on November 12, 2017.

--

--