Steps to Install MongoDB on AWS EC2 Instance

Calvin Hsieh
Jun 14, 2018

Updated 2/14/19: the contents of this post have been updated to install MongoDB v4.0

I found myself constantly digging around Google for answers on how to properly set up a MongoDB box on AWS EC2 with authentication and a replica set. There was no clear or straightforward answer on how to accomplish this, so I want to share what I’ve come up with.

Before we get started, you should have basic knowledge of vim and of launching an EC2 instance via the AWS console.

Launch EC2 Instance

Setup EC2 Instance

  1. Pick Amazon Linux 2 AMI
  2. Pick desired instance type
  3. Configure instance with proper VPC, subnet, etc
    Launch 3 instances if you’d like to set up a replica set. More details follow.
  4. Important: Add three additional EBS volumes to setup the correct file system for MongoDB
    - Device: /dev/sdf, Size: x GB (put desired size) [For data]
    - Device: /dev/sdg , Size: x GB (put desired size) [For journal]
    - Device: /dev/sdh , Size: x GB (put desired size) [For log]
    It’s recommended that the journal volume is 1/5 of the data size and the log volume is 1/2 of the journal size. You can always increase the EBS size later as needed, but you cannot decrease it. Leave the rest at the defaults or adjust to your needs.

(Screenshot: what step 4 should look like on the AWS console)

Finish up the rest of the steps to launch the instance
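
To make the sizing rule of thumb from step 4 concrete, here is a minimal shell sketch. The function name volume_sizes is hypothetical, and rounding up to whole GB is an assumption of this sketch, not something the rule itself prescribes.

```shell
# volume_sizes DATA_GB
# Prints suggested journal and log volume sizes for a given data volume size,
# using the rule of thumb: journal = 1/5 of data, log = 1/2 of journal.
# Rounding up to whole GB is an assumption of this sketch.
volume_sizes() {
  data_gb=$1
  journal_gb=$(( (data_gb + 4) / 5 ))   # ceil(data / 5)
  log_gb=$(( (journal_gb + 1) / 2 ))    # ceil(journal / 2)
  echo "data=${data_gb}GB journal=${journal_gb}GB log=${log_gb}GB"
}

volume_sizes 20   # prints: data=20GB journal=4GB log=2GB
```

A 20 GB data volume gives the 20G / 4G / 2G layout used in the rest of this post.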

Security Group

Create a security group, or pick an existing one, that allows inbound SSH from your IP

  • Create a new security group, e.g. mongodb-replica-access
  • Type: SSH; Port 22; Source: your IP (so you can access the box directly later)

Launch EC2

If you are planning to use a replica set, make sure the instances are created in the same security group so they can communicate with each other. After the instances are created, go to your security groups and find the mongodb-replica-access group that you just created. It should have a group ID (sg-abc123).

Add the following additional inbound rule:

  • Type: Custom TCP; Port 27017; Source: sg-abc123

(Screenshot: what the mongodb-replica-access security group should look like)
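
If you prefer the command line over the console, the same two inbound rules can probably be added with the AWS CLI. The sketch below assumes the CLI is installed and configured; the group ID and IP address are placeholders, and the run helper and DRY_RUN variable are hypothetical names used here so that the commands are only printed until you are ready to apply them.

```shell
# Placeholders: replace with your own security group ID and public IP address.
SG_ID="sg-abc123"
MY_IP="203.0.113.10"      # documentation-range address used as a stand-in

# run CMD...: prints the command when DRY_RUN=1 (the default here), runs it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# SSH from your IP only.
run aws ec2 authorize-security-group-ingress \
  --group-id "$SG_ID" --protocol tcp --port 22 --cidr "${MY_IP}/32"

# MongoDB port, open only to members of the same security group.
run aws ec2 authorize-security-group-ingress \
  --group-id "$SG_ID" --protocol tcp --port 27017 --source-group "$SG_ID"
```

Set DRY_RUN=0 to actually apply the rules once the printed commands look right.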

Download MongoDB

SSH into the instance

ssh -i path_to_keypair ec2-user@ip_address

Update packages

sudo yum -y update

Install MongoDB

Create a repo file so that yum can download MongoDB directly

sudo vi /etc/yum.repos.d/mongodb-org-4.0.repo

Copy/paste the following into the repo file

[mongodb-org-4.0]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/amazon/2/mongodb-org/4.0/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-4.0.asc

Install MongoDB packages

sudo yum -y install mongodb-org

Check you have MongoDB installed properly

which mongo  # should print /usr/bin/mongo

Configure MongoDB

Configure the File System

Format and mount each volume, then set ownership

sudo mkfs.xfs -L mongodata /dev/sdf
sudo mkfs.xfs -L mongojournal /dev/sdg
sudo mkfs.xfs -L mongolog /dev/sdh
sudo mkdir /data
sudo mkdir /journal
sudo mkdir /log
sudo mount -t xfs /dev/sdf /data
sudo mount -t xfs /dev/sdg /journal
sudo mount -t xfs /dev/sdh /log
sudo ln -s /journal /data/journal
sudo chown mongod:mongod /data
sudo chown mongod:mongod /log/
sudo chown mongod:mongod /journal/

Define disk partitions

sudo vi /etc/fstab

Append the following lines so the volumes are mounted again after a reboot

/dev/sdf /data    xfs defaults,auto,noatime,noexec 0 0
/dev/sdg /journal xfs defaults,auto,noatime,noexec 0 0
/dev/sdh /log     xfs defaults,auto,noatime,noexec 0 0
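
Editing /etc/fstab by hand is easy to get wrong, and running a setup twice can leave duplicate lines. As a rough sketch, the entries above could be appended idempotently like this. The function name add_fstab_entry is hypothetical, and the example works on a scratch file rather than the real /etc/fstab.

```shell
# add_fstab_entry FSTAB DEVICE MOUNTPOINT
# Appends an xfs entry with the options used above, unless MOUNTPOINT is already listed.
add_fstab_entry() {
  fstab=$1 device=$2 mountpoint=$3
  # The second whitespace-separated field of an fstab line is the mount point.
  if awk -v mp="$mountpoint" '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$fstab"; then
    echo "skip: $mountpoint already in $fstab"
  else
    printf '%s %s xfs defaults,auto,noatime,noexec 0 0\n' "$device" "$mountpoint" >> "$fstab"
    echo "added: $mountpoint"
  fi
}

# Try it on a scratch file first.
scratch=$(mktemp)
add_fstab_entry "$scratch" /dev/sdf /data
add_fstab_entry "$scratch" /dev/sdg /journal
add_fstab_entry "$scratch" /dev/sdh /log
add_fstab_entry "$scratch" /dev/sdf /data     # the repeated call changes nothing
cat "$scratch"
```

Once the output looks right, the same function can be pointed at /etc/fstab from a root shell.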

MongoDB needs to be able to create file descriptors when clients connect and to spawn a large number of processes in order to operate effectively. The default file and process limits shipped with Amazon Linux are too low for MongoDB.

Modify them by editing the limits.conf file:

sudo vi /etc/security/limits.conf

Add the following lines to the end of the file:

* soft nofile 64000
* hard nofile 64000
* soft nproc 32000
* hard nproc 32000

Next, create a file called 90-nproc.conf in /etc/security/limits.d/:

sudo vi /etc/security/limits.d/90-nproc.conf

Paste the following lines into the file:

* soft nproc 32000
* hard nproc 32000
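
The new limits only apply to sessions and processes started after the change. One way to see what a process actually received is to read its /proc/<pid>/limits file. The helper below is a small sketch; soft_limit is a hypothetical name, and the commented-out mongod example assumes the pid file path from the default mongod.conf.

```shell
# soft_limit LIMITS_FILE NAME
# Prints the soft value of limit NAME (e.g. "Max open files") from a file in
# /proc/<pid>/limits format, where each line ends with: <soft> <hard> <units>.
soft_limit() {
  grep "^$2 " "$1" | awk '{ print $(NF-2) }'
}

# Limits inherited from the current shell.
soft_limit /proc/self/limits "Max open files"
soft_limit /proc/self/limits "Max processes"

# For a running mongod:
# soft_limit "/proc/$(cat /var/run/mongodb/mongod.pid)/limits" "Max open files"
```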

Confirm disks have mounted properly

df -h

You should see something similar to the following:

Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        479M     0  479M   0% /dev
tmpfs           494M     0  494M   0% /dev/shm
tmpfs           494M   13M  482M   3% /run
tmpfs           494M     0  494M   0% /sys/fs/cgroup
/dev/xvda1      8.0G  1.3G  6.8G  16% /
tmpfs            99M     0   99M   0% /run/user/1000
/dev/xvdf        20G   33M   20G   1% /data
/dev/xvdg       4.0G   33M  4.0G   1% /journal
/dev/xvdh       2.0G   33M  2.0G   2% /log
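
Note that df lists /dev/xvdf, /dev/xvdg and /dev/xvdh even though the commands above used /dev/sdf, /dev/sdg and /dev/sdh. A likely explanation (an assumption, not something verified for every instance type) is that the /dev/sd* names are symlinks to the kernel's own device names. The sketch below shows one way to check; resolve_device is a hypothetical helper name.

```shell
# resolve_device PATH
# Prints the real device node behind PATH by following symlinks.
resolve_device() {
  readlink -f "$1"
}

for dev in /dev/sdf /dev/sdg /dev/sdh; do
  if [ -e "$dev" ]; then
    echo "$dev -> $(resolve_device "$dev")"
  else
    echo "$dev not present on this machine"
  fi
done
```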

Disable Transparent Huge Pages

Create the following file

sudo vi /etc/init.d/disable-transparent-hugepages

Copy/paste the following code to the file

#!/bin/bash
### BEGIN INIT INFO
# Provides: disable-transparent-hugepages
# Required-Start: $local_fs
# Required-Stop:
# X-Start-Before: mongod mongodb-mms-automation-agent
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Disable Linux transparent huge pages
# Description: Disable Linux transparent huge pages, to improve
# database performance.
### END INIT INFO
case $1 in
  start)
    if [ -d /sys/kernel/mm/transparent_hugepage ]; then
      thp_path=/sys/kernel/mm/transparent_hugepage
    elif [ -d /sys/kernel/mm/redhat_transparent_hugepage ]; then
      thp_path=/sys/kernel/mm/redhat_transparent_hugepage
    else
      # No THP support on this kernel, so there is nothing to disable.
      exit 0
    fi

    echo 'never' > ${thp_path}/enabled
    echo 'never' > ${thp_path}/defrag

    re='^[0-1]+$'
    if [[ $(cat ${thp_path}/khugepaged/defrag) =~ $re ]]
    then
      # RHEL 7
      echo 0 > ${thp_path}/khugepaged/defrag
    else
      # RHEL 6
      echo 'no' > ${thp_path}/khugepaged/defrag
    fi

    unset re
    unset thp_path
    ;;
esac

Make it executable and register it to run on boot

sudo chmod 755 /etc/init.d/disable-transparent-hugepages
sudo chkconfig --add disable-transparent-hugepages
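
After the script has run (for example after a reboot), you can confirm that Transparent Huge Pages are off by reading the kernel's status files, where the active setting is shown in square brackets. This is a small sketch; thp_active is a hypothetical helper name.

```shell
# thp_active FILE
# The kernel marks the active setting with brackets, e.g. "always madvise [never]".
# Prints just the bracketed value.
thp_active() {
  sed -n 's/.*\[\([a-z+]*\)\].*/\1/p' "$1"
}

for f in /sys/kernel/mm/transparent_hugepage/enabled \
         /sys/kernel/mm/transparent_hugepage/defrag; do
  if [ -r "$f" ]; then
    echo "$f: $(thp_active "$f")"    # "never" is the value the init script writes
  fi
done
```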

Setup Log Rotation

Logs rotate daily, and the last 7 rotated logs are kept. After each rotation, logrotate sends the USR1 signal to mongod so that it reopens its log file.

Create log rotation rule

sudo vi /etc/logrotate.d/mongodb

Copy/paste the following code to the file

/log/mongod.log {
  daily
  rotate 7
  compress
  missingok
  sharedscripts
  nodateext
  postrotate
    kill -USR1 $(cat /data/mongod.lock)
  endscript
}

You can find out more about logrotate configuration options here: https://gist.github.com/pagebrooks/6390198

Tip: You can test log rotation after MongoDB is set up by running: sudo logrotate -v -f /etc/logrotate.d/mongodb

Update MongoDB config

sudo vi /etc/mongod.conf

Update the following values

systemLog:
  logRotate: reopen # add this line
  path: /log/mongod.log # update
storage:
  dbPath: /data # update
net:
#  bindIp: 127.0.0.1 # comment this out
  bindIpAll: true # add this line

This is what mongod.conf should look like

# mongod.conf
# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/
# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  logRotate: reopen
  path: /log/mongod.log
# Where and how to store data.
storage:
  dbPath: /data
  journal:
    enabled: true
#  engine:
#  mmapv1:
#  wiredTiger:
# how the process runs
processManagement:
  fork: true # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
# network interfaces
net:
  port: 27017
#  bindIp: 127.0.0.1 # Enter 0.0.0.0,:: to bind to all IPv4 and IPv6 addresses or, alternatively, use the net.bindIpAll setting.
  bindIpAll: true
# and more...
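
Because indentation matters in this file, it can help to print the values you just changed before starting the server. The helper below is only a sketch: conf_value is a hypothetical name, and it does a plain text match on the key name rather than parsing YAML, so it ignores nesting.

```shell
# conf_value FILE KEY
# Prints the value of the first uncommented "KEY: value" line, without any trailing comment.
# This is a plain text match, not a YAML parser.
conf_value() {
  sed -n "s/^[[:space:]]*$2:[[:space:]]*\([^#[:space:]]*\).*/\1/p" "$1" | head -n 1
}

conf=/etc/mongod.conf
if [ -r "$conf" ]; then
  for key in path dbPath logRotate bindIpAll; do
    echo "$key = $(conf_value "$conf" "$key")"
  done
fi
```

With the configuration shown above, the four values should come out as /log/mongod.log, /data, reopen and true.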

Have MongoDB start automatically on boot

sudo chkconfig mongod on

Start MongoDB server

sudo service mongod start

If you’re setting up a replica set, repeat the above steps under “Download MongoDB”, “Install MongoDB”, and “Configure MongoDB” on each instance.

Create MongoDB User

If you’re setting up a replica set, you only need to create the user on the primary node.

Connect to MongoDB

mongo

Create a root user

use admin
db.createUser({ user: "admin", pwd: "password", roles: ["root"] })
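
The value "password" above is only a placeholder. One possible way to generate a stronger value on the instance is sketched below; random_hex is a hypothetical helper name, and hex output is just one convenient choice that avoids characters needing escaping in the shell.

```shell
# random_hex BYTES
# Prints BYTES random bytes from /dev/urandom as 2*BYTES hexadecimal characters.
random_hex() {
  head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

admin_pwd=$(random_hex 16)
echo "generated a ${#admin_pwd}-character password"
# Use the value of $admin_pwd in place of "password" in db.createUser(...),
# and store it somewhere safe, such as a password manager.
```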

Update security in the MongoDB config only after you have successfully created an admin user, and only if you don’t plan to use a replica set.

sudo vi /etc/mongod.conf

security:
  authorization: enabled

Restart MongoDB (sudo service mongod restart) for the change to take effect.

Tip: You can double-check that authorization is set up correctly by connecting to mongo without a user and trying to run a MongoDB query. It should give you an unauthorized error.

Replication

If you would like to create a replica set, you can do so by following the steps below. Make sure MongoDB is properly set up and configured on all the instances before continuing.

Create a keyfile on your local machine

openssl rand -base64 741 > keyfile

Copy the keyfile to all your replica instances

scp -i path_to_keypair keyfile ec2-user@ip_address:~/keyfile

Get on each of the instances

ssh -i path_to_keypair ec2-user@ip_address

Put the keyfile in the right place and set the right ownership and permissions

sudo mkdir -p /opt/mongod
sudo mv keyfile /opt/mongod/
sudo chown mongod:mongod /opt/mongod/keyfile
sudo chmod 600 /opt/mongod/keyfile
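
mongod refuses to start if the keyfile is readable by group or others, and the key must be between 6 and 1024 characters long. The sketch below checks both conditions before you restart the server. check_keyfile is a hypothetical helper name, stat -c is the GNU form (as found on Amazon Linux), and the demo runs against a scratch file with stand-in content rather than a real key.

```shell
# check_keyfile FILE
# Succeeds if FILE has mode 600 (or 400) and 6 to 1024 non-whitespace characters.
check_keyfile() {
  mode=$(stat -c '%a' "$1") || return 1            # GNU stat
  chars=$(tr -d '[:space:]' < "$1" | wc -c)
  case "$mode" in
    600|400) ;;
    *) echo "bad mode $mode: keyfile must not be readable by group or others"; return 1 ;;
  esac
  if [ "$chars" -lt 6 ] || [ "$chars" -gt 1024 ]; then
    echo "bad length $chars: expected 6 to 1024 characters"
    return 1
  fi
  echo "ok: mode $mode, $chars characters"
}

# Demo on a scratch file.
demo=$(mktemp)
printf 'c2FtcGxlS2V5Cg==\n' > "$demo"    # stand-in content, not a real key
chmod 600 "$demo"
check_keyfile "$demo"
```

On an instance, the same check would be run as root against /opt/mongod/keyfile.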

Update MongoDB config

sudo vi /etc/mongod.conf

Enable security and replication

security: # uncomment
#  authorization: enabled # comment out if exists
  keyFile: /opt/mongod/keyfile # add
replication: # uncomment
  replSetName: aName # add

Restart MongoDB server

sudo service mongod restart

After all instances have restarted successfully, SSH into one of the instances you set up. Use this node as the primary node going forward.

ssh -i path_to_keypair ec2-user@primary_node_ip_address

Connect to MongoDB with newly created user

mongo -u admin -p password admin

After you successfully connect to MongoDB, set up replication by initiating the replica set

rs.initiate()

Make sure the replica set member’s host is set to its IP address rather than a DNS name; otherwise, replication won’t work properly

var config = rs.config()
config.members[0].host = "<ip>:27017"
rs.reconfig(config)

Add the rest of the instances to replica set

rs.add("<ip>") // repeat for each remaining instance until all are added
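
With more than a couple of members, typing each rs.add() by hand gets tedious. As a sketch, the statements can be generated from a list of addresses in the shell. rs_add_script is a hypothetical helper name and the IP addresses are placeholders.

```shell
# rs_add_script HOST...
# Prints one rs.add() statement per host, using the default port 27017.
rs_add_script() {
  for host in "$@"; do
    printf 'rs.add("%s:27017")\n' "$host"
  done
}

# Placeholder private IPs of the two remaining instances.
rs_add_script 10.0.0.11 10.0.0.12

# To apply, pipe the statements into the mongo shell on the primary, e.g.:
# rs_add_script 10.0.0.11 10.0.0.12 | mongo -u admin -p password admin
```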

Confirm replica set configuration

rs.status()

If you have 3 nodes in the replica set, the members array should contain 3 elements: one primary (your current instance IP) and two secondaries (added via rs.add()).

Tip: Make sure the instances are in the same security group so that they can communicate with each other

Tip: If you want to look at data on a secondary node directly, connect to MongoDB on that node and run rs.slaveOk()

Calvin Hsieh

Co-Founder & CTO of StormX. Entrepreneur, engineer. Blockchain enthusiast