Introduction to MongoDB indexes

Published in Codewords · 4 min read · Jan 27, 2021

MongoDB indexing is similar to the index of a book. When we need to find a topic in a book, we first open the index page and look up the topic and its page number. MongoDB does the same when a collection has an index: it first checks the index, which points to the matching document, and then returns that document.

An index in MongoDB speeds up finding documents.

Creating an index

First, let’s see how to declare an index in MongoDB.

db.<collectionName>.createIndex({ <field>: <type> })

Here <field> is the name of the field to index and <type> sets the kind of index: 1 for ascending order, -1 for descending order, or the string "text" for a text-search index.
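As a minimal sketch of the three values in practice, the snippet below issues one createIndex call per specification on the books collection used later in this article. The helper name createBookIndexes is hypothetical, and it assumes only that the object passed in exposes createIndex(keys), as db.books does in mongosh.

```javascript
// Hypothetical helper: `collection` is any object exposing createIndex(keys),
// such as db.books in mongosh or a Collection from the Node.js driver.
function createBookIndexes(collection) {
  const specs = [
    { year: 1 },       // ascending index on year
    { author: -1 },    // descending index on author
    { title: "text" }, // text index for searching words in title
  ];
  // Issue one createIndex call per specification and collect the results.
  return specs.map((spec) => collection.createIndex(spec));
}
```

In mongosh this could be called as createBookIndexes(db.books). With the Node.js driver, createIndex returns promises, so the returned array would need to be awaited with Promise.all.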

How do indexes work under the hood?

Imagine we have a collection of books where we have a title, author, year, and other fields.

Let’s say we want to find books between the years 1995 to 2000.

If the ‘year’ field is not indexed, MongoDB scans the entire collection to find matching documents. This kind of scan is called a COLLSCAN (a table scan in relational databases).

If the ‘year’ field is indexed, MongoDB performs an IXSCAN. The index already keeps the year values sorted in the order given when the index was created, so MongoDB does not sort anything at query time: it goes straight to the index entries for 1995 to 2000 and reads only the documents they point to.

COLLSCAN scans the entire collection to find documents that match the query criteria.
IXSCAN scans only the index entries that fall within the queried key range, then fetches the documents they point to.
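The difference between the two scans can be sketched in plain JavaScript. This is a toy model, not MongoDB's implementation: the sample books, the yearIndex array, and the function names are all made up for illustration, and a sorted array with binary search stands in for the real index structure.

```javascript
// Toy model: documents live in an unordered array, and the "index" is a
// list of { key, pos } entries kept sorted by key.
const books = [
  { title: "A", year: 2003 },
  { title: "B", year: 1996 },
  { title: "C", year: 1989 },
  { title: "D", year: 2000 },
  { title: "E", year: 1995 },
  { title: "F", year: 2010 },
];

// Built once (like createIndex), not on every query.
const yearIndex = books
  .map((doc, pos) => ({ key: doc.year, pos }))
  .sort((a, b) => a.key - b.key);

// COLLSCAN: test every document against the range.
function collScan(docs, min, max) {
  let examined = 0;
  const results = [];
  for (const doc of docs) {
    examined += 1;
    if (doc.year >= min && doc.year <= max) results.push(doc);
  }
  return { results, examined };
}

// Binary search: first position in the sorted index whose key is >= min.
function lowerBound(index, min) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].key < min) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// IXSCAN: jump to the start of the range, then walk entries until key > max.
function ixScan(docs, index, min, max) {
  let examined = 0;
  const results = [];
  for (let i = lowerBound(index, min); i < index.length && index[i].key <= max; i += 1) {
    examined += 1;
    results.push(docs[index[i].pos]);
  }
  return { results, examined };
}

const viaCollScan = collScan(books, 1995, 2000);
const viaIxScan = ixScan(books, yearIndex, 1995, 2000);
console.log(viaCollScan.examined, viaIxScan.examined); // 6 3
```

Both functions return the same three books, but the collection scan touches all six documents while the index scan touches only the three entries inside the range.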

Index Types

MongoDB provides a number of different index types to support specific types of data and queries.

Single Field - user-defined ascending/descending index on a single field of a document. For a single-field index, the sort order of the key does not matter because MongoDB can traverse the index in either direction. For example, in our case, we created an index on the year field only.
Compound Index - user-defined index on more than one field. For instance, if a compound index consists of userId and year, the index sorts first by userId and then, within each userId value, by year.
Multikey Index - multikey indexes allow queries to select documents that contain arrays by matching on one or more elements of the arrays. MongoDB automatically creates a multikey index if the indexed field contains an array value; you do not need to explicitly specify the multikey type.
Geospatial Index - MongoDB provides two geospatial indexes: 2d indexes that use planar geometry when returning results and 2dsphere indexes that use spherical geometry to return results.
Text Indexes - MongoDB provides a text index type that supports searching for string content in a collection. To index a field that contains a string or an array of string elements, include the field and specify the string literal "text" in the index document.
Hashed Indexes - a hashed index indexes the hash of the value of a field, using a hashing function to compute it. [1] The hashing function collapses embedded documents and computes the hash for the entire value, but hashed indexes do not support multikey (i.e. array) fields.
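To make the compound and multikey descriptions above more concrete, here is a small sketch in plain JavaScript of how the entries of such indexes could be ordered. It is a toy model and not MongoDB's storage format: the sample documents, the tags field, and the function names are assumptions made for illustration.

```javascript
// Hypothetical sample documents; `tags` is an array field added for the
// multikey part of the sketch.
const docs = [
  { _id: 1, userId: "u2", year: 1999, tags: ["db", "index"] },
  { _id: 2, userId: "u1", year: 2001, tags: ["mongo"] },
  { _id: 3, userId: "u2", year: 1995, tags: ["index"] },
  { _id: 4, userId: "u1", year: 1997, tags: ["db"] },
];

// Three-way comparison that works for both strings and numbers.
function compare(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Compound index { userId: 1, year: 1 }: order by userId first, then by
// year within each userId value.
function compoundEntries(documents) {
  return documents
    .map((d) => ({ userId: d.userId, year: d.year, id: d._id }))
    .sort((a, b) => compare(a.userId, b.userId) || compare(a.year, b.year));
}

// Multikey index { tags: 1 }: one entry per array element, so a document
// with two tags appears twice in the index.
function multikeyEntries(documents) {
  return documents
    .flatMap((d) => d.tags.map((tag) => ({ tag, id: d._id })))
    .sort((a, b) => compare(a.tag, b.tag) || compare(a.id, b.id));
}

// Document ids in compound-index order: 4, 2, 3, 1
console.log(compoundEntries(docs).map((e) => e.id));
// Multikey entries in order: db:1, db:4, index:1, index:3, mongo:2
console.log(multikeyEntries(docs).map((e) => e.tag + ":" + e.id));
```

Because the compound entries are grouped by userId before year, this ordering suits queries that filter on userId, or on userId and year together, better than queries on year alone.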

How can an index optimize search operations?

When a field is indexed and a query filters on it, MongoDB uses an IXSCAN, which narrows down the set of documents it has to examine. This is called an index scan.

Let’s have a visual representation of the year index and its mapping.

We can see how MongoDB executes any query using the explain method. First, let’s look at the explain output for both an IXSCAN and a COLLSCAN. In our case, if the query criteria are on the year field, MongoDB will do an IXSCAN; if they are on other fields, it will use a COLLSCAN.

db.books.find({ year: { $gte: 1995, $lte: 2000 } }).sort({ year: 1 }).explain("executionStats")

If we run this query, the output shows an IXSCAN, and the ratio of documents scanned to documents returned is 1. We can compare this with a COLLSCAN.

db.books.find({ title: 'Downtown'}).explain("executionStats")

This is a query on a field that does not have an index, so the output shows a COLLSCAN, and the ratio of scanned documents to returned documents is 45,230, which is not good: MongoDB examines far more documents than it returns.
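The ratio discussed above can be read off the explain output with a few lines of JavaScript. The sketch below assumes the output has executionStats.totalDocsExamined and executionStats.nReturned, and a queryPlanner.winningPlan tree linked through inputStage or inputStages. Some newer server versions may nest the plan differently, in which case planStages would need adjusting. The function names and the sample object are made up for illustration.

```javascript
// Collect stage names from a winning plan by following inputStage /
// inputStages links.
function planStages(plan) {
  if (!plan) return [];
  const children = plan.inputStages || (plan.inputStage ? [plan.inputStage] : []);
  return [plan.stage, ...children.flatMap(planStages)];
}

// Reduce an explain("executionStats") result to the numbers that matter here.
function summarizeExplain(explain) {
  const stats = explain.executionStats;
  return {
    stages: planStages(explain.queryPlanner.winningPlan),
    examined: stats.totalDocsExamined,
    returned: stats.nReturned,
    // Documents examined per document returned; closer to 1 is better.
    ratio: stats.nReturned > 0 ? stats.totalDocsExamined / stats.nReturned : Infinity,
  };
}

// Hand-written sample in the shape of explain output; the numbers are
// hypothetical, not taken from a real run.
const sample = {
  queryPlanner: {
    winningPlan: { stage: "FETCH", inputStage: { stage: "IXSCAN", indexName: "year_1" } },
  },
  executionStats: { nReturned: 120, totalKeysExamined: 120, totalDocsExamined: 120 },
};

console.log(summarizeExplain(sample));
```

In mongosh, the same helper could be applied to a live result, for example summarizeExplain(db.books.find({ title: 'Downtown' }).explain("executionStats")).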

COLLSCAN scans the entire collection.