MongoDB Advanced Operations

The pdp
7 min read, May 25, 2019


The third part of the MongoDB tutorial

Hello, folks. This is the third and last part of this journey, and in it we go through some advanced operations of MongoDB. The last blog covered all the basics of MongoDB. So let’s start this awesome journey.

Indexing in MongoDB

Indexes in SQL databases are special data structures used to locate a record in a given table quickly, without having to traverse each and every record of the table. Indexes are built on one or more columns of a given table. As a note, the data structure used by an index is typically a B-tree, which is a balanced multi-way search tree and not a binary tree.

In MongoDB, indexes play a vital role in the efficient execution of queries. If no suitable index is defined, MongoDB has to scan every document of a given collection to answer a query. Hence, MongoDB uses indexes to reduce the number of documents to be scanned in a given collection. In fact, MongoDB’s indexes are more or less similar to the indexes used in relational databases.

MongoDB defines indexes at the collection level and supports indexing on any field of the documents in a collection.
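To make the difference between a full scan and an index lookup concrete, here is a minimal sketch in plain JavaScript. It is only a model of the idea: the in-memory array, the regNo values, and the helper names findByScan, buildIndex and findByIndex are all made up for illustration, and a Map stands in for the B-tree that a real index would use.

```javascript
// A tiny in-memory "collection" (hypothetical sample data).
const employees = [
  { _id: 1, regNo: 3014, name: "employee 1" },
  { _id: 2, regNo: 3015, name: "employee 2" },
  { _id: 3, regNo: 3016, name: "employee 3" },
];

// Without an index: examine documents one by one until a match is found.
function findByScan(docs, field, value) {
  let examined = 0;
  for (const doc of docs) {
    examined += 1;
    if (doc[field] === value) return { doc, examined };
  }
  return { doc: null, examined };
}

// Build a lookup structure once: field value -> position in the collection.
// Assumes the field values are unique, to keep the sketch short.
function buildIndex(docs, field) {
  const index = new Map();
  docs.forEach((doc, position) => index.set(doc[field], position));
  return index;
}

// With an index: go straight to the matching document.
function findByIndex(docs, index, value) {
  const position = index.get(value);
  return position === undefined
    ? { doc: null, examined: 0 }
    : { doc: docs[position], examined: 1 };
}

const regNoIndex = buildIndex(employees, "regNo");
console.log(findByScan(employees, "regNo", 3016).examined); // 3
console.log(findByIndex(employees, regNoIndex, 3016).examined); // 1
```

With three documents the saving is tiny, but the scan cost grows with the size of the collection while the lookup cost stays roughly the same.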

Default Index

MongoDB provides a default unique index on the _id field, which acts as a primary key to access any document in a collection. This index prevents the insertion of two documents with the same value for the _id field.
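The uniqueness rule can be pictured with a small plain JavaScript sketch. The TinyCollection class below is hypothetical and exists only for this illustration; it is not how MongoDB stores documents, and the error text is invented.

```javascript
// Toy collection that rejects duplicate _id values,
// similar in spirit to the default unique index on _id.
class TinyCollection {
  constructor() {
    this.byId = new Map(); // _id -> document
  }

  insertOne(doc) {
    if (this.byId.has(doc._id)) {
      throw new Error(`duplicate key: _id ${doc._id}`);
    }
    this.byId.set(doc._id, doc);
  }

  findById(id) {
    return this.byId.has(id) ? this.byId.get(id) : null;
  }
}

const people = new TinyCollection();
people.insertOne({ _id: 1, name: "employee 1" });

try {
  // Same _id again: the insert is refused.
  people.insertOne({ _id: 1, name: "someone else" });
} catch (err) {
  console.log(err.message); // duplicate key: _id 1
}
```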

Creating an Index using createIndex()

db.collection_name.createIndex({field : value })

To create an index on the field regNo for an employee collection, run the command db.employee.createIndex({regNo : 1}). The value 1 asks for an ascending index and -1 for a descending one.

The following output is shown upon running the above command:

{
"createdCollectionAutomatically": false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}

We can also create an index on multiple fields by running a single command. The command will be: db.employee.createIndex({regNo : 1, address : -1})

Now let us go deeper into the types of index:

  1. Single field index: An index on a single field. It can be user-defined, in addition to the default _id index.
  2. Compound index: A user-defined index on multiple fields.
  3. Multikey index: Used to index fields that hold arrays. MongoDB creates a separate index entry for each element in an array, and it automatically treats an index as multikey if the indexed field contains an array value.
  4. Geospatial index: Used to support queries on geospatial coordinate data.
  5. Text index: Used to search for string content in a collection.
  6. Hashed index: Used for hash-based sharding.
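Of the types above, the multikey index is probably the easiest to picture in code. The following plain JavaScript sketch builds one index entry per array element. The skills field, the sample documents and the buildMultikeyIndex helper are assumptions made up for this example, and a Map again stands in for the real index structure.

```javascript
// Hypothetical documents with an array field named skills.
const staff = [
  { _id: 1, name: "employee 1", skills: ["java", "sql"] },
  { _id: 2, name: "employee 2", skills: ["sql", "python"] },
];

// Build one index entry per array element: element -> list of matching _id values.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    // Treat a non-array value as a single-element array.
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    // A Set avoids indexing the same element twice for one document.
    for (const value of new Set(values)) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

const skillsIndex = buildMultikeyIndex(staff, "skills");
console.log(JSON.stringify(skillsIndex.get("sql"))); // [1,2]
console.log(JSON.stringify(skillsIndex.get("java"))); // [1]
```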

Sorting in MongoDB

Sorting is one of the vital operations in any database management system. MongoDB provides the sort() function to sort the data returned from a collection. The sort() function accepts a document that lists the fields to sort by, each with an integer value of 1 or -1 that states whether the results are to be sorted in ascending (1) or descending (-1) order.

Syntax for sort function:

db.collection_name.find().sort({KEY : 1})

Consider a collection named employee containing 3 records. Let us now see how the data can be sorted using the sort() function in MongoDB.

To list all the data in a collection, use the find() command. To create the same sample data as used here in the example, create a collection named employee and insert 3 documents with one field, name, and some value for it. In the next step we will run the sort command on this sample data.

> db.employee.find()

output --
{"_id": ObjectId("89biu089y8benb3380kjd9y8"), "name": "employee 1"}
{"_id": ObjectId("89biu089yj39nb3380kjnkh9"), "name": "employee 2"}
{"_id": ObjectId("89biu089y8benjkhouh8939e"), "name": "employee 3"}

Now run the query below to sort the data by the name field in ascending order:

> db.employee.find().sort({name : 1})

output --
{"_id": ObjectId("89biu089y8benb3380kjd9y8"), "name": "employee 1"}
{"_id": ObjectId("89biu089yj39nb3380kjnkh9"), "name": "employee 2"}
{"_id": ObjectId("89biu089y8benjkhouh8939e"), "name": "employee 3"}

Now run the query below to sort the data by the name field in descending order:

> db.employee.find().sort({name : -1})

output --
{"_id": ObjectId("89biu089y8benjkhouh8939e"), "name": "employee 3"}
{"_id": ObjectId("89biu089yj39nb3380kjnkh9"), "name": "employee 2"}
{"_id": ObjectId("89biu089y8benb3380kjd9y8"), "name": "employee 1"}
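If you are curious how a sort document such as {name : 1} turns into an ordering, the plain JavaScript sketch below may help. The sortDocs helper and the sample array are made up for illustration. It only compares values of the same type, whereas MongoDB has its own rules for comparing values of different types.

```javascript
// Hypothetical sample data, deliberately out of order.
const employeeDocs = [
  { name: "employee 2" },
  { name: "employee 3" },
  { name: "employee 1" },
];

// Sort a copy of docs using a spec such as { name: 1 } or { name: -1 }.
function sortDocs(docs, spec) {
  const keys = Object.entries(spec); // [[field, direction], ...]
  return [...docs].sort((a, b) => {
    for (const [field, direction] of keys) {
      if (a[field] < b[field]) return -direction;
      if (a[field] > b[field]) return direction;
    }
    return 0; // equal on every listed field
  });
}

const ascending = sortDocs(employeeDocs, { name: 1 }).map((d) => d.name);
const descending = sortDocs(employeeDocs, { name: -1 }).map((d) => d.name);
console.log(JSON.stringify(ascending)); // ["employee 1","employee 2","employee 3"]
console.log(JSON.stringify(descending)); // ["employee 3","employee 2","employee 1"]
```

Because the spec can list more than one field, the same helper also covers a sort such as { regNo: 1, address: -1 }, where later fields only break ties left by earlier ones.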

Aggregation in MongoDB

Aggregation in MongoDB is an operation that processes data and returns computed results. Aggregation groups the data from multiple documents and operates on the grouped data in order to return one combined result. In SQL, count(*) with GROUP BY is an equivalent of MongoDB aggregation.

The aggregate function groups the records in a collection and can be used to compute the total (sum), average, minimum, maximum, etc. for each group.

To perform aggregation in MongoDB, use the aggregate() function. Following is the syntax for aggregation:

db.collection_name.aggregate(aggregate_operation)

Let us now see how to make use of the aggregate function in MongoDB. Consider a collection named books containing 4 records, 2 of the type ebook and 2 of the type online.

NOTE: Create a collection named books with the field names title, price and type. Use db.books.find() to list the contents of the collection.

Now, from the above collection, let us use the aggregate function to group the books by their type, ebook or online. The following command will be used:

db.books.aggregate([{$group : {_id: "$type", category: {$sum : 1}}}])

output --
{"_id": "ebook", "category": 2}
{"_id": "online", "category": 2}

The above aggregate function returns a result that says there are 2 records of the type ebook and 2 records of the type online. So the above aggregation command has grouped our collection data based on type.
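What the $group stage with {$sum : 1} does can be mimicked in a few lines of plain JavaScript. This is only a sketch of the idea: the book titles and prices are invented sample data, and countByField is a hypothetical helper, not a MongoDB function.

```javascript
// Hypothetical sample data with the fields title, price and type.
const books = [
  { title: "Book A", price: 20, type: "ebook" },
  { title: "Book B", price: 35, type: "online" },
  { title: "Book C", price: 15, type: "ebook" },
  { title: "Book D", price: 40, type: "online" },
];

// Mimics {$group: {_id: "$type", category: {$sum: 1}}}:
// add 1 to the group's counter for every document that falls into it.
function countByField(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  return [...counts].map(([key, count]) => ({ _id: key, category: count }));
}

for (const group of countByField(books, "type")) {
  console.log(JSON.stringify(group));
}
// {"_id":"ebook","category":2}
// {"_id":"online","category":2}
```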

There are many functions that can be used with aggregation, such as:

  1. $sum: Sums the specified values from all the documents in a group
  2. $avg: Calculates the average of the specified values from all the documents in a group
  3. $min: Returns the minimum of the specified values from all the documents in a group
  4. $max: Returns the maximum of the specified values from all the documents in a group
  5. $addToSet: Adds values to an array in the resulting document, without duplicates
  6. $push: Adds values to an array in the resulting document
  7. $first: Returns the value from the first document in each group
  8. $last: Returns the value from the last document in each group
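To see these functions side by side, here is a rough plain JavaScript sketch that computes the same kind of values for each group. The catalog data, the summarizeByType helper and the output field names are all invented for this example. Note that first and last depend on the order in which the documents arrive, so in MongoDB they are usually meaningful only after a sort.

```javascript
// Hypothetical sample data; two ebooks share a price to show the $addToSet idea.
const catalog = [
  { title: "Book A", price: 20, type: "ebook" },
  { title: "Book B", price: 35, type: "online" },
  { title: "Book C", price: 20, type: "ebook" },
  { title: "Book D", price: 40, type: "online" },
];

function summarizeByType(docs) {
  // Step 1: collect the documents of each type, keeping arrival order.
  const groups = new Map();
  for (const doc of docs) {
    if (!groups.has(doc.type)) groups.set(doc.type, []);
    groups.get(doc.type).push(doc);
  }
  // Step 2: compute one result document per group.
  return [...groups].map(([type, members]) => {
    const prices = members.map((d) => d.price);
    const total = prices.reduce((a, b) => a + b, 0);
    return {
      _id: type,
      total, // like $sum on price
      average: total / prices.length, // like $avg
      cheapest: Math.min(...prices), // like $min
      dearest: Math.max(...prices), // like $max
      allPrices: prices, // like $push
      distinctPrices: [...new Set(prices)], // like $addToSet
      firstTitle: members[0].title, // like $first
      lastTitle: members[members.length - 1].title, // like $last
    };
  });
}

const summary = summarizeByType(catalog);
console.log(JSON.stringify(summary, null, 2));
```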

Data Backup and Restoration in MongoDB

Data backup is one of the vital and highly required processes for any database management system. The primary reason is that it is difficult to anticipate how and when data can be lost. So it is a best practice that whenever a database is set up, we ensure that it has a provision for data backup in case any loss event happens.

A backup is a copy of the data from the database, which helps in restoring the database in case any catastrophic event happens to the data.

Data Backup

To perform a data backup in MongoDB, use the command mongodump. This command dumps all the data stored in the MongoDB server into a dump directory. It can also back up data from remote servers.

To perform the data backup properly, follow the instructions below:

  1. Start the MongoDB server with the command mongod.
  2. Open a new command prompt. mongodump is a separate command-line tool, so it is run from the system command prompt and not from inside the mongo client.
  3. Run the command mongodump. To back up only the required database or collection, pass the --db and --collection options.

mongodump reads the data from the database and creates BSON files in which the data is dumped. mongodump writes only the documents from the database, not the index data, so the resultant backup is space efficient. By default, the backup is stored in a dump folder under the directory from which the command is run, for example bin\dump when it is run from MongoDB’s bin folder.

One disadvantage of mongodump is that it can have some performance impact when the data of a collection is larger than the available system memory.

Data Restore

Now let us learn how to restore the backup data in MongoDB. A backup is used to reconstruct the data in case of a loss event. MongoDB restores backup data through one of its utility tools, mongorestore, which is also the name of the command.
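As a rough sketch of how the two commands fit together, the plain JavaScript below only assembles the command lines and prints them; it does not run anything. The helper names mongodumpArgs and mongorestoreArgs, the database name test and the folder name backup are assumptions for this example. The options used (--host, --port, --db, --collection, --out) are standard mongodump options, and mongorestore takes the dump folder as its argument.

```javascript
// Build the argument list for mongodump from a small options object.
function mongodumpArgs({ host, port, db, collection, out } = {}) {
  const args = [];
  if (host) args.push("--host", host);
  if (port) args.push("--port", String(port));
  if (db) args.push("--db", db);
  if (collection) args.push("--collection", collection);
  if (out) args.push("--out", out); // folder to write the backup into
  return args;
}

// Build the argument list for mongorestore.
// The dump folder is passed as a positional argument; "dump" is the default name.
function mongorestoreArgs({ host, port, dir = "dump" } = {}) {
  const args = [];
  if (host) args.push("--host", host);
  if (port) args.push("--port", String(port));
  args.push(dir);
  return args;
}

const dumpCommand = ["mongodump", ...mongodumpArgs({ db: "test", collection: "employee", out: "backup" })].join(" ");
const restoreCommand = ["mongorestore", ...mongorestoreArgs({ dir: "backup" })].join(" ");
console.log(dumpCommand); // mongodump --db test --collection employee --out backup
console.log(restoreCommand); // mongorestore backup
```

The printed lines can be typed at the system command prompt. To run them from Node instead, the argument arrays could be handed to child_process.spawn.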

Sharding in MongoDB

When data grows beyond what is manageable on a single system, an ideal approach is a cluster that holds the data in replica sets. Hence, horizontal scaling of the data is required, and sharding does this in MongoDB. In simple terms, sharding adds more machines to handle sudden or rapid growth of data in an application.

Need for sharding in MongoDB:

  1. Vertical scaling is too expensive.
  2. Without sharding, all the data is written to the master node.
  3. Space on the local disk may not be large enough to handle the data growth.

Shards are used to store the actual data. In a production environment, each shard is a separate replica set.
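The idea of spreading documents over shards can be sketched in plain JavaScript as follows. This is a deliberately simplified model: toyHash is an invented hash function, the shard names are made up, and picking a shard with a modulo is an assumption of this sketch. MongoDB's hashed sharding uses its own hash and assigns ranges of hashed values to shards.

```javascript
// Toy string hash (not MongoDB's hashing): maps a shard key value to a non-negative integer.
function toyHash(value) {
  let hash = 0;
  for (const ch of String(value)) {
    // Multiply-and-add, kept in the unsigned 32-bit range.
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0;
  }
  return hash;
}

// Pick the shard that should store a document with the given shard key value.
function pickShard(shardKeyValue, shardNames) {
  return shardNames[toyHash(shardKeyValue) % shardNames.length];
}

const shardNames = ["shardA", "shardB", "shardC"];

// Route ten hypothetical regNo values and count how many land on each shard.
const placement = {};
for (let regNo = 3014; regNo < 3024; regNo += 1) {
  const shard = pickShard(regNo, shardNames);
  placement[shard] = (placement[shard] || 0) + 1;
}
console.log(placement);
```

The same key value always lands on the same shard, which is what lets a query on the shard key go to one machine instead of all of them.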

So here the third part of our journey ends. I hope you liked it, and I hope that you now have enough knowledge of MongoDB to handle these operations. So bye, guys. We will meet in the very next blog. Keep learning!
