Move ReplicaSet DB from MLAB to AWS EC2

I have used MLAB for several applications for years. Their service is excellent, and the staff are responsive and expert at solving my problems. On MLAB I run a ReplicaSet with Arbiter architecture (for more information, please read docs.mongodb.com).

ReplicaSet with Arbiter (docs.mongodb.com)

Because of a new policy in one of our applications, I need to move its database off MLAB onto a self-managed server, and the application’s owner wants to use AWS.

UPGRADING MONGODB VERSION

Upgrading the MongoDB version on MLAB is much easier than doing it by hand, unless you know MongoDB's files and data very well. My previous version was 2.6 and I wanted to upgrade to 3.2. Here are the steps you need to know:

  1. Make sure your active application has SCRAM-SHA-1 authentication support.
  2. Backup your old version using EBS and MongoDump to S3
  3. Ensure you read the terms of agreement before starting
  4. Upgrade from version 2.6 to version 3.0
  5. Upgrade from version 3.0 to version 3.2
  6. The processing time depends on how big your data is

INSTALL EC2 PACKAGE

Be careful when choosing an AWS package, especially if your application is on a tight budget. You can use the AWS calculator to estimate your monthly or hourly costs.

With Amazon, I can easily pick a trusted MongoDB installation from the Marketplace when launching the EC2 instance, so I don't need to install from scratch. I prefer to use Percona.

AWS MarketPlace

Choose a package based on your traffic. For standard performance, you can choose Memory Optimized or General Purpose with EBS SSD (GP2). For high performance, you can use Compute Optimized with EBS Provisioned IOPS, or customize to your own demand and budget. PS: I have tried t2.small (2GB memory and 1 CPU), and it smoothly handles up to 2K concurrent DB connections. It all depends on your database indexes and server configuration.

On Configure Instance Details, you only need to set up 1 instance. Pay attention to the IAM role (if any), Shutdown behavior and Termination Protection, and Tenancy. You can follow the screenshot below, or if you want dedicated hardware, you can set tenancy to shared, dedicated instance, or dedicated instance on a dedicated host.

I recommend not storing your database on the root device. Use a separate volume instead; it helps when you scale storage or move data between volumes. Keep the root device small, 8–15GB: the installation takes ~3GB, and the remaining space is used only for logs and open files.

Keep the security settings and remaining services as provided by your Marketplace image, or use the default with port 22 open.

TWEAK UP SERVER

This part is very important. Once your instance is ready, log in to the machine. Elad Nava wrote a great article about a highly available MongoDB ReplicaSet, and I will repeat its steps here.

I skip setting the hostname because I prefer to use a static private IP from AWS (it does not change on restart). Note: you will be charged for bandwidth overuse and for connections between zones or VPC networks.

Increase OS Limits

When MongoDB opens a connection, it opens database files, and the operating system creates an entry that stores information about each opened file. This entry is called a file descriptor. Every user in the operating system has a limit on how many file descriptors it can create, and you need to increase that number.

vim /etc/security/limits.conf

You need to check the maximum number of open files that your memory allows

cat /proc/sys/fs/file-max
# The following table is estimate based on memory
# 1GB : 100,352 files
# 2GB : 200,704 files
# 4GB : 401,749 files
... and so on

then enter about 25% of the total maximum files in your system (the values below are 25% of 401,749, with nproc at half of nofile):

# 4GB Memory
# number of files open
* soft nofile 100437
* hard nofile 100437
# number of process
# must be lower than `nofile`
* soft nproc 50218
* hard nproc 50218
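
The example values above work out to a quarter of file-max for nofile and half of that for nproc. As a rough sketch of that arithmetic (the function name limits_for is hypothetical, and the ratios are only inferred from the numbers in this article, not an official recommendation):

```shell
#!/bin/sh
# limits_for FILE_MAX: print limits.conf lines for a given file-max value.
# Assumption: nofile = 25% of file-max, nproc = half of nofile.
limits_for() {
    file_max=$1
    nofile=$((file_max / 4))   # a quarter of the system-wide maximum
    nproc=$((nofile / 2))      # kept lower than nofile
    printf '* soft nofile %s\n* hard nofile %s\n' "$nofile" "$nofile"
    printf '* soft nproc %s\n* hard nproc %s\n' "$nproc" "$nproc"
}

# On a real server you would pass the live value instead:
#   limits_for "$(cat /proc/sys/fs/file-max)"
limits_for 401749
```

Review the printed lines before pasting them into /etc/security/limits.conf.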

Disable Transparent Huge Pages

In the administration section of MongoDB's performance documentation, the experts suggest disabling transparent huge pages (THP).

Transparent Huge Pages (THP) is a Linux memory management system that reduces the overhead of Translation Lookaside Buffer (TLB) lookups on machines with large amounts of memory by using larger memory pages.

However, database workloads often perform poorly with THP, because they tend to have sparse rather than contiguous memory access patterns. You should disable THP on Linux machines to ensure best performance with MongoDB.

If you are using tuned or ktune (for example, if you are running Red Hat or CentOS 6+), you must additionally configure them so that THP is not re-enabled. See Using tuned and ktune.

Create the init script /etc/init.d/disable-transparent-hugepages with the following content:

#!/bin/sh
### BEGIN INIT INFO
# Provides: disable-transparent-hugepages
# Required-Start: $local_fs
# Required-Stop:
# X-Start-Before: mongod mongodb-mms-automation-agent
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Disable Linux transparent huge pages
# Description: Disable Linux transparent huge pages, to improve
# database performance.
### END INIT INFO

case $1 in
  start)
    # Pick whichever THP directory this kernel exposes
    if [ -d /sys/kernel/mm/transparent_hugepage ]; then
      thp_path=/sys/kernel/mm/transparent_hugepage
    elif [ -d /sys/kernel/mm/redhat_transparent_hugepage ]; then
      thp_path=/sys/kernel/mm/redhat_transparent_hugepage
    else
      # No THP support: nothing to do (exit, since we are not in a function)
      exit 0
    fi

    echo 'never' > ${thp_path}/enabled
    echo 'never' > ${thp_path}/defrag

    unset thp_path
    ;;
esac

Make it executable:

sudo chmod 755 /etc/init.d/disable-transparent-hugepages

Set it to start automatically on boot:

sudo update-rc.d disable-transparent-hugepages defaults

Configure the Filesystem

Linux by default updates the last access time whenever a file is accessed. When MongoDB performs frequent reads and writes on the filesystem, this creates unnecessary overhead and performance degradation. We can disable this feature by editing the fstab file:

sudo nano /etc/fstab

Add the noatime flag directly after defaults:

LABEL=/   / ext4   defaults,noatime,discard        1 1

If you use a separate disk for the database (for example, /mnt is your DB disk):

/dev/sde /mnt auto noatime 0 0

Read Ahead Block Size

In addition, the default disk read ahead settings on EC2 are not optimized for MongoDB. The number of blocks to read ahead should be adjusted to approximately 32 blocks (or 16 KB) of data. We can achieve this by adding a crontab entry that will execute when the system boots up:

sudo crontab -e

Choose nano by pressing 2 if this is your first time editing the crontab, and then append the following to the end of the file:

@reboot /sbin/blockdev --setra 32 /dev/xvda1

Make sure that your EBS volume is attached as /dev/xvda1. If your MongoDB data is on a different disk, change the device path accordingly. Save the file and reboot the server:

sudo reboot

Once the reboot has finished, verify all your configurations.

ulimit -u # max number of processes
ulimit -n # max number of open file descriptors
# output will be: always madvise [never] 
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag
# output will be the mounted path
cat /proc/mounts | grep noatime
# output will be: 32
sudo blockdev --getra /dev/xvda1
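
If you want to check these values from a script instead of reading them by eye, here is a small sketch. The helper names thp_mode and readahead_kb are hypothetical; the only facts used are that the active THP mode is the word in square brackets and that blockdev counts read-ahead in 512-byte sectors.

```shell
#!/bin/sh
# thp_mode: read a THP status line on stdin and print the active mode,
# which is the word inside the square brackets.
thp_mode() {
    sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

# readahead_kb SECTORS: convert a blockdev read-ahead value
# (512-byte sectors) to kilobytes.
readahead_kb() {
    echo $(( $1 * 512 / 1024 ))
}

echo 'always madvise [never]' | thp_mode   # prints: never
readahead_kb 32                            # prints: 16

# On the server itself:
#   thp_mode < /sys/kernel/mm/transparent_hugepage/enabled
#   readahead_kb "$(sudo blockdev --getra /dev/xvda1)"
```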

CREATE MONGODB SECURITY

The main weaknesses of a MongoDB deployment are predictable ports, no password, open access, and relying on a password alone with no additional security key.

Change default port

You need to change the port to an unpredictable one (the maximum port is 65535)

net:
  bindIp: 0.0.0.0
  port: 45554

The bind IP is 0.0.0.0 instead of 127.0.0.1 because the database needs to be accessible from outside. Don't worry about security: you will add a security group so that only authorized machines can access the database. If you do not use AWS, you will need to create your own iptables rules.

Create Superadmin User

This is optional, because not every administrator needs this user. But it is useful when you need to analyze a MongoDB issue with full command access to the admin database, without changing the configuration and restarting the MongoDB server.

$ mongo localhost:45554/admin
MongoDB shell version: 3.2
connecting to: localhost:45554/admin
> roles = [{ role: "root", db: "admin" }];
> args = {user: "superadmin", pwd: "password", roles: roles}
> db.createUser(args);
> db.auth("superadmin","password");

Enable Internal Authentication

Enforcing access control on a replica set requires configuring:

  • Security between members of the replica set using Internal Authentication, and
  • Security between connecting clients and the replica set using User Access Controls.

The contents of the keyfile serve as the shared password for the members of the replica set. The content of the keyfile must be the same for all members of the replica set.

You can generate a keyfile using any method you choose. The contents of the keyfile must be between 6 and 1024 characters long. Go to the directory where your mongodb.conf is located (example: /etc/mongodb/)

openssl rand -base64 756 > security.key
chmod 400 security.key
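
As a quick sanity check of the 6 to 1024 character rule, the sketch below counts the characters in a keyfile. The helper names keyfile_chars and keyfile_ok are hypothetical, and ignoring whitespace when counting is my assumption about how MongoDB reads the file (openssl wraps its base64 output with newlines).

```shell
#!/bin/sh
# keyfile_chars FILE: count the characters in FILE, ignoring whitespace.
keyfile_chars() {
    echo $(( $(tr -d '[:space:]' < "$1" | wc -c) ))
}

# keyfile_ok FILE: succeed when the length is within 6..1024.
keyfile_ok() {
    n=$(keyfile_chars "$1")
    [ "$n" -ge 6 ] && [ "$n" -le 1024 ]
}

# Demo on a throwaway file; point it at security.key on a real server.
demo=$(mktemp)
printf 'abc\ndef\n' > "$demo"
keyfile_chars "$demo"                      # prints: 6
keyfile_ok "$demo" && echo "length ok"
rm -f "$demo"
```

For the command above, 756 random bytes encode to 1008 base64 characters, which stays under the 1024 limit.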

Now, update mongodb.conf

net:
  bindIp: 0.0.0.0
  port: 45554
storage:
  directoryPerDB: true
security:
  authorization: "enabled"
  keyFile: "/etc/mongodb/security.key"

Restart MongoDB with service mongodb restart, then test your authentication

$ mongo localhost:45554/admin
MongoDB shell version: 3.2
connecting to: localhost:45554/admin
> db.stats()
{
"ok" : 0,
"errmsg" : "not authorized on admin to execute command { dbstats: 1.0, scale: undefined }",
"code" : 13
}
> db.auth("superadmin", "password");
1
> db.stats();

Modify Security Group

To allow access only from authorized machines, either application machines or replica machines, we need to register them in a security group.

I created security group sg-74a88912 so that any machine in security group sg-74a88912 or sg-17a8982 can access port 45554; any other access is blocked.
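
If you prefer the AWS CLI to the web console, the same rule could be expressed roughly as below. This is only a sketch: it prints the commands for review instead of running them, the helper name print_ingress_rule is hypothetical, and it assumes the AWS CLI is installed and configured for your account.

```shell
#!/bin/sh
PORT=45554
TARGET_GROUP=sg-74a88912

# print_ingress_rule SOURCE_GROUP: echo the AWS CLI call that would let
# machines in SOURCE_GROUP reach the database port on TARGET_GROUP.
print_ingress_rule() {
    echo "aws ec2 authorize-security-group-ingress" \
         "--group-id $TARGET_GROUP --protocol tcp --port $PORT" \
         "--source-group $1"
}

# One rule per group that is allowed to connect
for source_group in sg-74a88912 sg-17a8982; do
    print_ingress_rule "$source_group"
done
```

Run the printed lines only after checking that the group IDs match your own account.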

PREPARE REPLICA SERVERS

You have just finished the configuration for the master MongoDB server. Now, let’s create an image from the master server.

Create Image (AMI) from Master DB
Select MasterDB Image for new Replica Instances
Create 2 Instances directly

When configuring the security group, choose the security group that you created for MasterDB, in this example sg-74a88912. Then wait until the servers are set up completely.

Take One-Time MLAB Backup

To move your data from MLAB to AWS, I suggest the EBS Snapshot method, because a backup with Mongodump will be quite slow if your data is big and also risks producing broken files.

Take One-Time Backup

Then open your AWS Snapshot area and select Private Snapshot

The MLAB snapshot will be 1 TB. Since the snapshot belongs to MLAB, its cost will not be charged to you. Select the snapshot, then create a volume.

Create Volume from Snapshot

Remember to select the same Availability Zone as your MasterDB. You cannot attach a volume across zones.

Select new volume from MLAB snapshot then click Actions to Attach Volume to MasterDB Instance.

Attach volume to MasterDB

Log in to your MasterDB over ssh, then mount the volume:

sudo mkdir -m 000 /mntf
echo "/dev/sdf /mntf auto noatime 0 0" | sudo tee -a /etc/fstab
sudo mount /mntf
sudo chgrp ec2-user /mntf
sudo chmod g+rwx /mntf

Now the MLAB data is in the directory /mntf/data. Because the MLAB data contains the previous cloud configuration, you need to clean up the local folder, but first back it up:

$ cd /mntf/data/rs-ds045554/
$ cd database-name
$ mkdir local.ori
$ cp -R local/ local.ori

Change your mongodb.conf to point to the MLAB database path

net:
  bindIp: 0.0.0.0
  port: 45554
storage:
  directoryPerDB: true
  dbPath: "/mntf/data/rs-ds045554/database-name"
security:
  authorization: "enabled"
  keyFile: "/etc/mongodb/security.key"

After restarting your mongodb service, open the mongo console to clean up the local database:

$ mongo localhost:45554/admin
MongoDB shell version: 3.2
connecting to: localhost:45554/admin
> db.auth('superadmin', 'password');
> use local;
> db.dropDatabase();
> use admin;
> db.shutdownServer();
> exit

Now check that the local folder in the database path is empty, then restart your mongodb service and test your data.

$ mongo localhost:45554/database-name -u mlabUser -p "mlabPassword"
MongoDB shell version: 3.2
connecting to: localhost:45554/database-name
> db.collection.findOne({})

You should be able to see data from MLAB related to collection.

SETUP SCALABLE MONGODB STORAGE

In the previous step you attached the volume from the MLAB snapshot. Now we attach a new empty volume to the secondary server only (the arbiter does not need one).

Make sure the availability zone is the same as the secondary server's, then attach the volume as device /dev/sdg

Log in to the secondary server over ssh, then run the following commands to make the volume accessible.

sudo mkfs /dev/sdg
sudo mkdir -m 000 /mntg
echo "/dev/sdg /mntg auto noatime 0 0" | sudo tee -a /etc/fstab
sudo mount /mntg

Allow mongodb user group to access your database path (/mntg)

sudo chgrp ec2-user /mntg
sudo chmod g+rwx /mntg
cd /mntg
mkdir mongodb

Repeat the same procedure for the Primary (MasterDB), so that MasterDB has 2 volumes: 1 from MLAB and 1 empty volume.

Now edit your secondary mongodb.conf then restart it

net:
  bindIp: 0.0.0.0
  port: 45554
storage:
  directoryPerDB: true
  dbPath: "/mntg/mongodb"
security:
  authorization: "enabled"
  keyFile: "/etc/mongodb/security.key"
replication:
  replSetName: "rs-ds045554"

Login to your arbiter server then edit the mongodb.conf to have replSetName and then restart it

net:
  bindIp: 0.0.0.0
  port: 45554
storage:
  directoryPerDB: true
security:
  authorization: "enabled"
  keyFile: "/etc/mongodb/security.key"
replication:
  replSetName: "rs-ds045554"

Now we want to release the MLAB volume and use our own, smaller volume. Create a second mongodb service on MasterDB with the following configuration, mongodb.mirror.conf

net:
  bindIp: 0.0.0.0
  port: 45555
storage:
  directoryPerDB: true
  dbPath: "/mntg/mongodb"
security:
  authorization: "enabled"
  keyFile: "/etc/mongodb/security.key"
replication:
  replSetName: "rs-ds045554"

Run that mongodb.mirror.conf as a service

mongod --config /etc/mongodb/mongodb.mirror.conf --fork

INITIATE REPLICASET

Now back to MasterDB server and initiate the replicaSet

$ mongo localhost:45554/admin
MongoDB shell version: 3.2
connecting to: localhost:45554/admin
> db.auth("superadmin", "password");
1
> primaryConfig = {"_id" : "rs-ds045554", "members" : [{"_id" : 0,"host" : "private.ip.master:45554"}]}
> rs.initiate(primaryConfig);
> rs.add("private.ip.master:45555");
> rs.add("private.ip.secondary:45554");
> rs.addArb("private.ip.arbiter:45554");

If everything is configured correctly, your console prompt will change, and you can check the replicaSet status

rs-ds045554:PRIMARY> rs.status();
{
"set" : "rs-ds045554",
"date" : "....",
"myState" : 1,
"term" : "....",
"heartbeatIntervalMillis" : "....",
"members" : [
{
"_id" : 0,
"name" : "private.ip.master:45554",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : "....",
"optime" : {
"ts" : "....",
"t" : "...."
},
"optimeDate" : "....",
"electionTime" : "....",
"electionDate" : "....",
"configVersion" : "....",
"self" : true
},
{
"_id" : 2,
"name" : "private.ip.master:45555",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : "....",
"optime" : {
"ts" : "....",
"t" : "...."
},
"optimeDate" : "....",
"lastHeartbeat" : "....",
"lastHeartbeatRecv" : "....",
"pingMs" : "....",
"syncingTo" : "private.ip.master:45554",
"configVersion" : "...."
},
{
"_id" : 3,
"name" : "private.ip.secondary:45554",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : "....",
"optime" : {
"ts" : "....",
"t" : "...."
},
"optimeDate" : "....",
"lastHeartbeat" : "....",
"lastHeartbeatRecv" : "....",
"pingMs" : "....",
"syncingTo" : "private.ip.master:45554",
"configVersion" : "...."
},
{
"_id" : 4,
"name" : "private.ip.arbiter:45554",
"health" : 1,
"state" : 7,
"stateStr" : "ARBITER",
"uptime" : "....",
"lastHeartbeat" : "....",
"lastHeartbeatRecv" : "....",
"pingMs" : "....",
"configVersion" : "...."
}
],
"ok" : 1
}
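
Rather than reading the whole status by eye, you could pipe it through a small filter that only looks at the stateStr lines. This is a sketch under assumptions: the function name members_settled is hypothetical, and it treats any state other than PRIMARY, SECONDARY, or ARBITER as "not finished yet".

```shell
#!/bin/sh
# members_settled: read rs.status() output on stdin and succeed only when
# at least one member is listed and every member is PRIMARY, SECONDARY,
# or ARBITER.
members_settled() {
    states=$(grep '"stateStr"')
    # No stateStr lines at all (e.g. a failed connection) is not a success
    [ -n "$states" ] || return 1
    # Fail if any line shows a state outside the settled set
    ! printf '%s\n' "$states" | grep -Eqv '"(PRIMARY|SECONDARY|ARBITER)"'
}

# Demo with sample lines shaped like the output above
sample='"stateStr" : "PRIMARY",
"stateStr" : "SECONDARY",
"stateStr" : "ARBITER",'
if printf '%s\n' "$sample" | members_settled; then
    echo "all members settled"
else
    echo "still syncing or unhealthy"
fi

# Against the live server it would look something like:
#   mongo --quiet localhost:45554/admin -u superadmin -p password \
#     --eval 'printjson(rs.status())' | members_settled
```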

After all sync processes have finished completely, you need to shut down the primary database.

$ mongo localhost:45554/admin
MongoDB shell version: 3.2
connecting to: localhost:45554/admin
rs-ds045554:PRIMARY> db.shutdownServer();

Then stop the mongodb service for private.ip.master:45554 with service mongodb stop; killing the PID is also enough, because you have already shut the server down safely from the console. The mirror at private.ip.master:45555 now holds a synced copy in /mntg/mongodb, so edit your mongodb.conf and change the storage path to /mntg/mongodb

net:
  bindIp: 0.0.0.0
  port: 45554
storage:
  directoryPerDB: true
  dbPath: "/mntg/mongodb"
security:
  authorization: "enabled"
  keyFile: "/etc/mongodb/security.key"
replication:
  replSetName: "rs-ds045554"

You need to stop the mirror server on port 45555

$ mongo localhost:45555/admin
MongoDB shell version: 3.2
connecting to: localhost:45555/admin
rs-ds045554:PRIMARY> db.shutdownServer();

Then kill the mongodb service on port 45555; now private.ip.secondary:45554 is the primary. Remove the mongodb.mirror.conf file and start the original MasterDB service with service mongodb start. You will see that private.ip.master:45554 is now Secondary.

Unmount the MLAB volume with umount -l /mntf and remove /mntf from /etc/fstab

Check rs.status() on the Primary database (the secondary server). If all servers are healthy (not in a repair or sync process), you can shut down the secondary server and then restart it with service mongodb restart. Now you should get the following state:

private.ip.master:45554 is Primary
private.ip.secondary:45554 is Secondary
private.ip.arbiter:45554 is Arbiter

You are now free to delete the volume created from the MLAB snapshot, because you are charged for it per GB. You cannot delete the MLAB snapshot itself, because it is owned by MLAB.

Cheers!
