Getting started with MongoDB explain()
The explain() method allows you to examine query plans. It’s an essential tool for tuning MongoDB performance
For almost all operations, there will be more than one way for MongoDB to retrieve the documents required. When MongoDB parses a statement, it must decide which approach will be fastest. The process of determining this “optimal” path to the data is referred to as query optimisation.
Basics of query optimization
For instance, consider the following query:
There are indexes on FirstName
, LastName
, Phone
and dob
. This gives MongoDB the following choices for resolving the query:
- Scan the entire collection looking for documents that match all the condition, then sort those documents by
dob
- Use the Index on
FirstName
to find all the "RUTH"s, then filter out those documents based onLastName
andPhone
, then sort the remainder ondob
. - Use the Index on
LastName
to find all the "MARTINEZ"s, then filter out those documents based onFirstName
andPhone
, then sort the remainder ondob
. - Use the Index on
Phone
to find all documents with a matching phone number. Then eliminate any who are not RUTH MARTINEZ, then sort by dob. - Use the Index on
dob
to sort the documents indob
order, and eliminate and documents that don't match the query criteria.
The MongoDB optimizer decides which of these approaches to take. The explain()
method reveals the query optimizer's decision and - in some cases - lets you examine it's reasoning.
Getting started with explain()
To See the explain output we use the explain() method of the collection object and pass a find()
,update()
, insert()
, drop()
or aggregate()
to that method. For instance to explain the query above we could do this:
The resulting explainDoc
object is a cursor that returns a JSON document (after you call the next()
method). The bit that is most important initially is the winningPlan
section, which we can extract like this:
It’s still pretty complex — and I removed some stuff to simplify it. However, you see it contains multiple stages of query execution with the input to each stage (eg the previous step) nested as inputStage
. So have to read the JSON from the inside-out to get the plan. If you want, you can use this simple function to print out the steps in the order then are execution (this works for most plans, but not for explains from sharded collections):
This script prints the execution plan in a very succinct format. Here’s an explanation of each step:
Step Explanation IXSCAN Phone_1
: Used the index to find documents with a matching value for Phone
FETCH
: Filtered out documents based on the Firstname
and Lastname
criteria SORT_KEY_GENERATOR
: Emitted dob attributes for the next step SORT
: sorted documents on the dob
PROJECTION
: Pushed address
and dob
into the output stream
There are a lot of different types of steps in explain output and there is no one place I know of to find all of the step descriptions (though https://docs.mongodb.com/manual/reference/explain-results/ is a good start). Probably the three most interesting steps are:
COLLSCAN — which means the document was scanned without an index.
IXSCAN — the use of an index to find documents
SORT — the sorting of documents without an index.
Alternate plans
Explain can tell you not just which plan was used, but which other plans were rejected. The rejected plans can be found within the array rejectedPlans
within the queryPlanner
section. So here is an example of looking at a rejected plan:
Execution Statistics
If you pass the argument executionStats
to explain, then explain will actually run the statement and report on the performance of each step in the plan:
The execution statistics are included in the executionStages
section of the resulting plan document:
The executionSteps
subdocument contains overall execution statistics - such as executionTimeMillis
- as well as an annotated execution plan in the executionStages
document. executionStages
is structured just like winningPlan
, but it has statistics for each step. There are a lot of statistics, but perhaps the most significant ones are:
executionTimeMillisEstimate
Estimate of the ms spent in each step.keysExamined
Number of index keys readdocsExamined
Number of documents read by this step
It’s hard to read the execution stats — so I wrote this little function to print out the steps and key statistics in the same format as the quick_explain
script:
We’ll use this function in the next section to tune a MongoDB query.
Using explain to tune a query
Now that we’ve learned how to use explain, let’s run through a short example of how to use it to tune a query. Here is the explain for the query we want to tune:
Let’s use our little function to extract the executionStats:
The COLLSCAN
step comes first, and examines 120,477 documents. It only takes 902ms, but still maybe we can do better. There's also a SORT
in there, and I'd like to see if I can avoid the sort using an index. So let's create an index that has the attributes from the filter clause (mailbox
and subfolder
), and the attribute from the sort operation (headers.Date
). Here's the new index:
Now when we issue explain('executionStats')
our output looks like this:
Using the index the query now returns almost instantaneously and the number of documents (keys) examined has reduced to 42. We’ve reduced the amount of data accessed by 97% and improved execution time by infinity :-). Note also that there is no longer and SORT
step - MongoDB was able to use the index to return documents in sorted order without an explicit sort.
Conclusion
Explain() is one of the key “tools of the trade” that anyone wanting to tune MongoDB queries should learn. We have just scratched the surface in this blog post but this introduction should at least give you enough information to confidently examine the plans for your MongoDB queries. In a subsequent post, I’ll talk about how to gather cached execution plans and how to use explain() in conjunction with the profiler.
Using explain in dbKoda
dbKoda is our open source IDE for MongoDb. It contains lots of great stuff, but most relevantly to this article, it contains a graphical explain plan display that clearly illustrates the execution plan for your query and described how each step is executed in plain english. Get dbKoda for free at http://dbkoda.com/download.
Originally published at www.dbkoda.com on November 12, 2017.