Guide to MongoDB operations - Part 1

Arthi Murali
5 min read · Mar 31, 2023


What data types does MongoDB recognize?

PyMongo converts Python data types to their JSON-like (BSON) equivalents before storing them.

For example, when a dictionary is passed, MongoDB stores it as an object, just as it would appear in JSON.
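The conversion can be previewed with Python's standard json module. This is only an analogy: PyMongo actually encodes documents as BSON, which has extra types (such as ObjectId and binary data) that plain JSON lacks. The sample dictionary below is made up for illustration.

```python
import json

# Hypothetical sample document, used only for this illustration
sample = {
    "dict": {"day": 1},       # dict  -> JSON object
    "list": [1, 2, "three"],  # list  -> JSON array
    "tuple": (1, 2, 3),       # tuple -> JSON array (JSON has no tuple type)
    "bool": True,             # True  -> true
    "none": None,             # None  -> null
}

encoded = json.dumps(sample)
print(encoded)
```

Reading the printed string next to the dictionary shows which Python type ends up as which JSON type.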

MongoDB operations

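import pymongo

# Connect to the cluster; substitute your own connection string and credentials
client = pymongo.MongoClient("mongodb+srv://meetarthi:Farmer246@cluster0.gkstsrk.mongodb.net/?retryWrites=true&w=majority")
dv = client.e12            # database "e12"
collection = dv.superman   # collection "superman" inside that database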

Documents created will be stored in the collection superman.

d2 = {
    "_id": 151,
    "float": 1.0020,
    "list": [1, 2, "three"],
    "dict": {"neth": 1, "wd": "day"},
    "tuple": (1, 2, 3, 4),
    "name": "Arthi",
    "id": 123,
    "bool": True,
    "no": None
}

collection.insert_one(d2)

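A document d2 is created from key-value pairs and inserted into the collection. Since BSON has no tuple type, the tuple is stored as an array, just like the list.

Display of the document in MongoDB Atlas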

_id is a reserved key name that acts as the document's primary key. Its value must be unique within the collection, and it cannot be changed once the document has been inserted. If we try to insert another document with the same _id of 151, there will be a duplicate key error. If the user doesn’t give an _id to the document, an auto-generated ObjectId is assigned. (Refer to the pic below.)
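The two rules above (unique _id, auto-generated when missing) can be pictured with a tiny in-memory stand-in. This is a toy model only, not how MongoDB is implemented: the class name TinyCollection is hypothetical, a uuid4 string stands in for a real ObjectId, and KeyError stands in for the duplicate key error.

```python
import uuid


class TinyCollection:
    """Toy in-memory stand-in for a collection; NOT MongoDB's implementation."""

    def __init__(self):
        self._docs = {}  # maps _id -> document

    def insert_one(self, doc):
        doc = dict(doc)  # copy so the caller's dict is left untouched
        # Stand-in for the auto-generated ObjectId
        doc.setdefault("_id", uuid.uuid4().hex)
        if doc["_id"] in self._docs:
            raise KeyError(f"duplicate _id: {doc['_id']!r}")
        self._docs[doc["_id"]] = doc
        return doc["_id"]


toy = TinyCollection()
toy.insert_one({"_id": 151, "name": "Arthi"})
auto_id = toy.insert_one({"name": "no explicit id"})  # receives a generated id

try:
    toy.insert_one({"_id": 151, "name": "someone else"})
    duplicate_rejected = False
except KeyError:
    duplicate_rejected = True

print(duplicate_rejected)  # → True
```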

Auto-generated ObjectId

dict1 = {
    "name": input("enter your name"),
    "age": int(input("enter age")),
    "gender": input("enter gender")
}

collection.insert_one(dict1)

Instead of hard-coding the values for the keys, the name, age, and gender are taken from the user with input().
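Since int(input(...)) raises a ValueError when the user types something that is not a number, it may help to validate the raw text before building the document. A minimal sketch, assuming a hypothetical helper named build_person and made-up sample values:

```python
def build_person(name, age_text, gender):
    """Build a document from raw input strings; raise ValueError on bad input."""
    name = name.strip()
    if not name:
        raise ValueError("name must not be empty")
    age = int(age_text)  # raises ValueError for non-numeric text
    if age < 0:
        raise ValueError("age must not be negative")
    return {"name": name, "age": age, "gender": gender.strip()}


# Made-up sample values standing in for what a user might type
person = build_person(" Sample ", "30", "f")
print(person)  # → {'name': 'Sample', 'age': 30, 'gender': 'f'}
```

The returned dictionary could then be passed to collection.insert_one in place of dict1.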

mylist = [
    {"name": "guvi1", "address": "chennai"},
    {"name": "g2", "address": "bengaluru"},
    {"name": "g3", "address": "mumbai"},
    {"name": "g4", "address": "delhi"},
    {"_id": 1, "name": "John", "address": "chennai"}
]

collection.insert_many(mylist)

The list “mylist” holds 5 different documents. They are inserted into the database with the insert_many method, which goes through the list and stores each entry as an individual JSON document.

5 individual JSON documents have been created

try:
    mylist = [
        {"name": "guvi1", "address": "chennai"},
        {"name": "g2", "address": "bengaluru"},
        {"name": "g3", "address": "mumbai"},
        {"name": "g4", "address": "delhi"},
        {"_id": 111, "name": "John", "address": "chennai"}
    ]
    x = collection.insert_many(mylist)
except Exception:
    # Any failure during the insert ends up here, e.g. a duplicated _id
    print("insert stopped")

The try block attempts to insert the list “mylist” with its multiple documents. If the insert fails, the except block catches the exception, whatever its type, and prints “insert stopped”. A duplicated _id is one common cause of such an error.
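Duplicates inside the list itself can be spotted before calling insert_many. The sketch below assumes a hypothetical helper named duplicate_ids and _id values that are hashable (numbers or strings); it only checks the batch, so a clash with a document already stored in the collection would still raise an error at insert time.

```python
from collections import Counter


def duplicate_ids(docs):
    """Return the explicit _id values that occur more than once in docs."""
    counts = Counter(doc["_id"] for doc in docs if "_id" in doc)
    return [value for value, n in counts.items() if n > 1]


# Made-up batch for illustration
batch = [
    {"name": "g1", "address": "chennai"},               # no _id: one is generated
    {"_id": 111, "name": "John", "address": "chennai"},
    {"_id": 111, "name": "Jane", "address": "delhi"},   # clashes with the line above
]
print(duplicate_ids(batch))  # → [111]
```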



dict_2 = {
    "Name": input(),
    "Address": input(),
    "Emailid": input(),
    "phoneNo": int(input())
}

# insert_one returns an InsertOneResult, or raises an exception on failure
b = collection.insert_one(dict_2)
if b.acknowledged:
    print("Inserted Successfully!")
else:
    print("Documents not inserted properly")

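The result object returned by insert_one is stored in the variable b. If the insert succeeded, the output is Inserted Successfully!; otherwise the output is Documents not inserted properly. Note that a failed insert normally raises an exception instead of returning, so the else branch is rarely reached.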

Inserting and retrieving an image from MongoDB database

from PIL import Image
import io

im = Image.open("/content/village.jpg")

# Re-encode the image as PNG into an in-memory buffer
image_bytes = io.BytesIO()
im.save(image_bytes, format='PNG')

dict1 = {
    "_id": "Arthi",
    'image': image_bytes.getvalue()
}

collection.insert_one(dict1)

The image is saved in PNG format into an in-memory buffer called image_bytes. A document dict1 is created with the key _id set to Arthi and the key image set to the bytes held in image_bytes, and dict1 is inserted into the database.

Under Python 3, PyMongo stores a bytes value as binary data (BinData) rather than as a string. (If you are using Google Colab, make sure the image file has been uploaded to the Colab files.)
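If the image does not need to be converted to PNG, the file's raw bytes can be stored as they are, which keeps the original format and does not need PIL. A minimal sketch, assuming a hypothetical helper named image_document; a throwaway file with placeholder bytes stands in for a real image here.

```python
import tempfile
from pathlib import Path


def image_document(doc_id, path):
    """Build a document holding the file's raw bytes (no re-encoding)."""
    return {"_id": doc_id, "image": Path(path).read_bytes()}


# Demo with a throwaway file standing in for a real image
with tempfile.TemporaryDirectory() as tmp:
    fake_image = Path(tmp) / "village.jpg"
    fake_image.write_bytes(b"\xff\xd8\xff placeholder bytes")
    doc = image_document("demo", fake_image)

print(type(doc["image"]), len(doc["image"]))
```

The resulting dictionary has the same shape as dict1 and could be passed to collection.insert_one in the same way.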

from PIL import Image
import io

# Find the document that contains the image you want to retrieve
retrieved_doc = collection.find_one({"_id": "Arthi"})

# Read the image data from the retrieved document
image_data = retrieved_doc['image']

# Load the image using PIL
pil_img = Image.open(io.BytesIO(image_data))

# Display the image (optional)
pil_img.show()

The document with _id : Arthi is retrieved from the database using the find_one method and stored in the variable retrieved_doc. The value of its image key, which is the bytes of the image, is stored in the variable image_data. The image is then opened from those bytes and displayed as output.

Image retrieved from the Database

Find operation

x = collection.find_one()
print(x)

With no filter, find_one returns the first document in the collection, which is normally the first one inserted.

y = collection.find().limit(20)
for x in y:
    print(x)

The above code will display the first 20 documents in the collection.

# new document 'a' is inserted into the collection superman
a = {'name': 'bbc', 'age': 9}
collection.insert_one(a)
for x in collection.find({'address': 'chennai'}):
    print(x)

Every document in the collection where the address is chennai is returned. The new document a has no address, so it is not among them.

collection.find_one({'address':'chennai'})

The first document where the address is chennai is returned.

for x in collection.find({'address': 'chennai'}, {'_id': False}):
    print(x)

Every document in the collection where the address is chennai is returned without the _id.

for x in collection.find({"$or": [{"address": "chennai"}, {"address": "mumbai"}]}, {"_id": 0, 'address': 0}):
    print(x)

The documents with an address of chennai or mumbai are returned with _id and address excluded, which leaves only the name for these documents.
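When every branch of an $or tests the same key, the $in operator gives a shorter filter with the same meaning. The sketch below only builds the filter and projection dictionaries; the helper names address_filter and exclude_fields are hypothetical.

```python
def address_filter(cities):
    """Build a filter matching documents whose address is any of cities."""
    return {"address": {"$in": list(cities)}}


def exclude_fields(*names):
    """Build a projection that hides the given fields."""
    return {name: 0 for name in names}


query = address_filter(["chennai", "mumbai"])
projection = exclude_fields("_id", "address")
print(query)
print(projection)

# With a live collection these would be passed as:
#   collection.find(query, projection)
```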

Regular expression (regex)

Regular expressions are used to find patterns in a sequence. Regex is a fundamental concept in NLP, as it can be used to identify and extract patterns from text data.

b = collection.find({"address": {"$regex": "nai$"}})
for x in b:
    print(x)

Documents with addresses ending in “nai” are returned.

b = collection.find({"address": {"$regex": "^m.m"}})
for x in b:
    print(x)

Documents with addresses starting with the letter m, followed by any single character and then another m (for example, mumbai), are returned.
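The two patterns can be tried out locally with Python's re module before using them in a query. MongoDB uses its own regex engine, but for simple patterns like these the behaviour should be the same; the address list below simply reuses the sample cities from this guide.

```python
import re

addresses = ["chennai", "bengaluru", "mumbai", "delhi"]

# "nai$" : the text must end with "nai"
ends_with_nai = [a for a in addresses if re.search(r"nai$", a)]

# "^m.m" : starts with "m", then any one character, then "m"
m_any_m = [a for a in addresses if re.search(r"^m.m", a)]

print(ends_with_nai)  # → ['chennai']
print(m_any_m)        # → ['mumbai']
```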

Comparison operators in MongoDB

a = collection.find({"age": {"$gte": 10}})
for x in a:
    print(x)

Documents with age greater than or equal to 10 are returned.

a = collection.find({"age": {"$lte": 10}})
for x in a:
    print(x)

Documents with age less than or equal to 10 are returned.

a = collection.find({"age": {"$eq": 21}})
for x in a:
    print(x)

Documents with age equal to 21 are returned.
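Comparison operators can also be combined on the same key, for example to select a range of ages. A minimal sketch that builds such a filter; the helper name age_between and the bounds 10 and 21 are only illustrative.

```python
def age_between(low, high):
    """Build a filter for low <= age <= high (both ends inclusive)."""
    if low > high:
        raise ValueError("low must not exceed high")
    return {"age": {"$gte": low, "$lte": high}}


query = age_between(10, 21)
print(query)  # → {'age': {'$gte': 10, '$lte': 21}}

# With a live collection this would be passed as:
#   collection.find(query)
```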

a = collection.find({"age": {"$ne": 10}}).limit(5)
for x in a:
    print(x)

The first five documents with age not equal to 10 are returned. Some of the documents displayed have no “age” key at all: $ne matches every document except those where age is equal to 10, so documents without an age also satisfy the filter condition.
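The behaviour of $ne on documents that lack the key can be modelled in plain Python. This is a toy model for simple (non-array) values, with a hypothetical function name and made-up sample documents; the last line shows a filter that, as far as I know, restricts the match to documents that actually have an age by adding $exists.

```python
def ne_matches(doc, field, value):
    """Toy model of {field: {"$ne": value}}: a missing field also matches."""
    return field not in doc or doc[field] != value


# Made-up sample documents for illustration
docs = [
    {"name": "bbc", "age": 9},
    {"name": "John", "address": "chennai"},  # no age key
    {"name": "ten", "age": 10},
]

matched = [d["name"] for d in docs if ne_matches(d, "age", 10)]
print(matched)  # → ['bbc', 'John']

# Filter that also requires the age key to be present
age_present_and_not_10 = {"age": {"$ne": 10, "$exists": True}}
```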
