Django/djongo: How to migrate after adding a unique index in MongoDB

softmarshmallow
Published in 거대 공룡
3 min read · Feb 27, 2020
Photo by Faisal M on Unsplash

I use a Django/djongo/MongoDB/DRF stack for my server development. The server is a centralized crawling manager, so that multiple crawler instances can store crawled news data consistently.

In my case, I ran into the following problem. I defined a model with a field named article_url that was not configured as unique, and added a validator (in a DRF serializer) to keep article_url values from colliding. Because the validator is an expensive operation and I am handling a massive amount of data, I needed to enforce uniqueness at the database layer instead.

My original model looked like this:

from djongo import models


class RawNews(models.Model):
    class Meta:
        db_table = 'raw'

    _id = models.ObjectIdField()
    time = models.DateTimeField()
    title = models.CharField(max_length=200)
    article_url = models.URLField(max_length=200)
    body_html = models.TextField()

Then I changed it by adding unique=True to the article_url field:

from djongo import models


class RawNews(models.Model):
    class Meta:
        db_table = 'raw'

    _id = models.ObjectIdField()
    time = models.DateTimeField()
    title = models.CharField(max_length=200)
    article_url = models.URLField(max_length=200, unique=True)
    body_html = models.TextField()

Then:

python manage.py makemigrations -> works fine as usual

python manage.py migrate -> fails with:

Pymongo error: {'ok': 0.0, 'errmsg': 'E11000 duplicate key error collection: db.raw index: raw_article_url_uniq dup key: { article_url: "https://~…

To resolve this, you have to manually remove the documents that conflict on the key you want to make unique, and then run migrate again.

Let's go for it!

TL;DR

  • how to create a new unique index on a MongoDB collection
  • how to remove conflicting data so that the unique index can be created without errors
  • finally, how to migrate a Django/djongo/MongoDB project with the new unique model field

How to create a new unique index on a MongoDB collection

db.raw.ensureIndex( { article_url:1 }, { unique:true, dropDups:true } )

The above command only works if you are using MongoDB 2.x.

If you are using MongoDB 3.x or later, you will receive the message below.

{
    "ok" : 0,
    "errmsg" : "E11000 duplicate key error collection: db.raw index: article_url_1 dup key: { article_url: \"https://~~~\" }",
    "code" : 11000,
    "codeName" : "DuplicateKey",
    "keyPattern" : { "article_url" : 1 },
    "keyValue" : { "article_url" : "https://~~~" }
}

Since MongoDB 3.x, the dropDups (drop duplicates) option is no longer supported and is ignored.

The official documentation suggests it still works, but it does not (the documentation had not been updated as of February 2020). So the command above is obsolete, and you will have to run:

db.raw.ensureIndex( { article_url:1 }, { unique:true } )

You will still receive the same error: you first have to remove the conflicting documents manually before the unique index can be created (see the next section).
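
Before deleting anything, it can help to see which article_url values are actually duplicated. The following is a minimal sketch and not part of my original setup: the helper names duplicate_pipeline and duplicate_values are made up for illustration, and the commented usage assumes PyMongo, a local server, and a database named db (guessed from the db.raw namespace in the error message).

```python
from collections import Counter


def duplicate_pipeline(field):
    """Build an aggregation pipeline that lists values of `field` occurring more than once."""
    return [
        # Group documents by the field value and count each group.
        {"$group": {"_id": f"${field}", "count": {"$sum": 1}}},
        # Keep only values that appear in more than one document.
        {"$match": {"count": {"$gt": 1}}},
        # Most duplicated values first.
        {"$sort": {"count": -1}},
    ]


def duplicate_values(docs, field):
    """Pure-Python equivalent for small in-memory samples: value -> count, only where count > 1."""
    counts = Counter(doc[field] for doc in docs)
    return {value: n for value, n in counts.items() if n > 1}


# Usage against a live database (assumption: pymongo installed, server on localhost):
#   from pymongo import MongoClient
#   raw = MongoClient()["db"]["raw"]
#   for row in raw.aggregate(duplicate_pipeline("article_url"), allowDiskUse=True):
#       print(row["_id"], row["count"])
```

If the aggregation returns nothing, the unique index should be creatable as-is; otherwise each row names a URL that must be reduced to a single document first.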

How to remove conflicting data so that the unique index can be created without errors

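// Walk documents oldest-first; for each one, remove later documents with the same article_url.
db.raw.find({}, { article_url: 1 }).sort({ _id: 1 }).forEach(function (doc) {
    db.raw.remove({ _id: { $gt: doc._id }, article_url: doc.article_url });
});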

This operation might take a while depending on the size of your collection. Once this step is complete, you can create the unique index for the key (run the command from the previous section again).
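
The same cleanup can be sketched in Python. This is a variation, not the shell command itself: instead of issuing one remove per document, it makes a single pass that remembers which URLs were already seen and deletes the later copies in batches. The function names and the batch_size parameter are hypothetical, and the approach assumes the set of distinct URLs fits in memory. The collection argument is expected to behave like a PyMongo Collection (find, delete_many, create_index).

```python
def later_duplicate_ids(docs, field="article_url"):
    """Return the _id of every document whose `field` value was already seen.

    `docs` must be ordered by _id ascending so the earliest document per value is kept.
    Documents missing the field are treated as sharing the value None, which matches
    how a unique index treats them.
    """
    seen = set()
    to_delete = []
    for doc in docs:
        value = doc.get(field)
        if value in seen:
            to_delete.append(doc["_id"])
        else:
            seen.add(value)
    return to_delete


def dedupe_and_index(collection, field="article_url", batch_size=1000):
    """Delete later duplicates, then create the unique index. Returns the number deleted."""
    cursor = collection.find({}, {field: 1}).sort("_id", 1)
    ids = later_duplicate_ids(cursor, field)
    deleted = 0
    # Delete in batches so a single query never carries an unbounded $in list.
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        deleted += collection.delete_many({"_id": {"$in": batch}}).deleted_count
    # With the duplicates gone, the unique index can be built.
    collection.create_index([(field, 1)], unique=True)
    return deleted
```

As with the shell version, this permanently deletes documents, so it is worth trying on a copy of the collection first.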

Misc: additional commands

// list the indexes on the collection 'raw' (replace 'raw' with your collection name)
db.raw.getIndexes()
// drop an index by its name (names are listed by the command above)
db.raw.dropIndex(<index_name>)
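
For completeness, here is a rough Python counterpart of the two commands above. It is only a sketch: unique_index_names and drop_index_if_present are hypothetical helper names, and the collection argument is assumed to be a PyMongo Collection, whose index_information() returns a dict keyed by index name.

```python
def unique_index_names(index_info):
    """Given the dict returned by Collection.index_information(), list the unique index names."""
    return sorted(name for name, spec in index_info.items() if spec.get("unique"))


def drop_index_if_present(collection, name):
    """Drop the named index if it exists. Returns True when something was dropped."""
    if name in collection.index_information():
        collection.drop_index(name)
        return True
    return False


# Usage (assumption: `raw` is a pymongo Collection for the 'raw' collection):
#   print(unique_index_names(raw.index_information()))
#   drop_index_if_present(raw, "article_url_1")
```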

References

https://docs.mongodb.com/manual/tutorial/manage-indexes/
