Mongo replication with python & mongoose

Ahmed Charef
Jul 29, 2018 · 4 min read

MongoDB is an open source database that uses a document-oriented data model, classified as a NoSQL database type.

MongoDB is a popular open-source document-oriented database developed by 10gen, one of the top features is that it can stores data in flexible, JSON-like documents, meaning fields can vary from document to document and data structure can be changed over time.


Replication

Replication is the process of synchronizing data across multiple servers.

Replica sets provide redundancy and high availability, and are the basis for all production deployments.

Also replication provides redundancy and increases data availability. With multiple copies of data on different database servers, replication provides a level of fault tolerance against the loss of a single database server.

Replication also allows you to recover from hardware failure and service interruptions.

A replica set is a cluster of MongoDB database servers that implements master-slave (primary-secondary) replication. In a replica, one node is primary node that receives all write operations. All other instances, such as secondaries, apply operations from the primary so that they have the same data set.

source: docs.mongodb.com

Replica sets can fail, so if one of the members becomes unavailable, a new primary host is elected and your data is still accessible. That means, when a primary replica fails, the replica set automatically conducts an election process to determine which secondary should become the primary.

Why Replication?

  • To keep your data safe
  • High (24*7) availability of data
  • Disaster recovery
  • No downtime for maintenance (like backups, index rebuilds, compaction)
  • Read scaling (extra copies to read from)
  • Replica set is transparent to the application

Create a Replica Set:

If you want to start a mongod process as part of a Replica Set you can use this parameters:

./bin/mongod --port <PORT> --bind_ip <IP-ADRESS> --replSet <REPL-NAME> --rest

for example:

./bin/mongod --port 27017 --bind_ip localhost dbpath ./data1 --replSet ahmed

Before, execute the command line, make sure that mongo service is down.

The command line above starts each instance as a member of a replica set named ahmed, each running on a distinct port, and specifies the path to your data directory with the --dbpath setting. The instances bind to both the localhost and the ip address of the host.

We can add “–rest” option, to have a small web application showing us the status of our replica set. (open a browser and navigate to the ip address of the PRIMARY node).

Add Members to Replica Set

To add members to replica set and start mongod instances on multiple machines. First, start a mongo client and issue a command rs.add().

The basic syntax of rs.add() command is as follows:

>rs.add(HOST_NAME:PORT)

Or you can create instances using command line:

$ mongod --replSet rs1 --dbpath ./data2 --port 27018 --fork$ mongod --replSet rs1 --dbpath ./data3 --port 27019 --fork

To test if everything was fine.

mongors0:PRIMARY> rs.status()
{
"set" : "ahmed",
"date" : ISODate("2018-11-25T01:05:40.095Z"),
"myState" : 1,
"term" : NumberLong(1),
"heartbeatIntervalMillis" : NumberLong(2000),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1511294739, 1),
"t" : NumberLong(1)
},
"appliedOpTime" : {
"ts" : Timestamp(1511294739, 1),
"t" : NumberLong(1)
},
"durableOpTime" : {
"ts" : Timestamp(1511294739, 1),
"t" : NumberLong(1)
}
},
"members" : [
{
"_id" : 0,
"name" : "localhost:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 11198,
},
{
"_id" : 1,
"name" : "localhost:27018",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 11119,
},
{
"_id" : 2,
"name" : "localhost:27019",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 10935,
}
],
"ok" : 1
}

All set. We’ve successfully configured mongodb replicaset.


Mongoose

Mongoose is an Object Document Mapper (ODM). This means that Mongoose allows you to define objects with a strongly-typed schema that is mapped to a MongoDB document.

Mongoose is a JavaScript framework that is commonly used in a Node.js application with a MongoDB database.

I am going to use it in a Node.js application. If you already have Node.js installed, you can move on to the next step. If you do not have Node.js installed, I suggest you begin by visiting theNode.js Download page and selecting the installer for your operating system.

Next install Mongoose from the command line using npm:

$ npm install mongoose

Let’s connect to a MongoDB database. I’ve placed the following code inside an index.js file because I chose that as the starting point for my application:

var mongoose = require('mongoose');
mongoose.connect('mongodb://localhost:27017/db_name');

The first line of code includes the mongoose library. Next, I open a connection to a database that I've called db_name using the connect function.

To connect to a replica set you pass a comma delimited list of hosts to connect to rather than a single host.

mongoose.connect('mongodb://[username:password@]host1[:port1][,host2[:port2],...[,hostN[:portN]]][/[database][?options]]' [, options]);

For example:

mongoose.connect('mongodb://localhost:27017,localhost:27018,localhost:27019/db_name?replicaSet=ahmed');

PyMongo

PyMongo is a Python distribution containing tools for working with MongoDB, and is the recommended way to work with MongoDB from Python.

PyMongo makes it easy to write highly available applications whether you use a single replica set or a large sharded cluster.

I recommend using pip to install pymongo on all platforms:

$ python -m pip install pymongo

A connection to a replica set can be made using the MongoClient() constructor, specifying one or more members of the set, along with the replica set name. Any of the following connects to the replica set we just created:

>>> from pymongo import MongoClient>>>db=MongoClient('mongodb://localhost:27017,localhost:27018,localhost:27019/?replicaSet=foo').db_name

Note:

Sharding is the process of storing data records across multiple machines and it is MongoDB’s approach to meeting the demands of data growth.

  • Sharding partitions the data-set into discrete parts.
  • Replication duplicates the data-set.

By sharding, you split your collection into several parts but replicating your database means you make mirrors of your data-set.

These two things can stack since they’re different. Using both means you will shard your data-set across multiple groups of replicas.


I hope you can set the mongo replication into your servers.

Ahmed Charef

Written by

Computer Science Engineer, Big Data developer, IoT Trainer, Ethereum Developer, Full Stack web developer, DIY Hacker, Creative, Innovator, Ambitious.

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade