Creating a Mongo cluster in AWS with Ansible

Jeremy Duvall
7Factor Software
Oct 19, 2016 · 11 min read

Last time we visited AWS and Ansible our focus was on project organization, high level concepts, and the miscellaneous administrative things we need to do in order to happily secure access to an AWS account during execution. Next, I’d like to focus on actually building something using the framework referenced in the linked post: specifically a simple, replicated MongoDB cluster.

Google isn’t short on ways to solve this, and I’d like to differentiate a bit by focusing on how we can build repeatable deployments that can be taken to any region or AZ group with minimal effort. We’re going to be setting our project up the exact same way we did in the previous post. Here’s the directory structure we’re working with:

- ansible
|-keys
+-regions
|-mongodb_us-east-1_default-install.yml
+-roles
+-create_ec2_mongo_stack
+-tasks
|-main.yml
+-init_mongo_replication
+-tasks
|-main.yml
+-templates
|-init_replication.j2
+-init_vpc
+-tasks
+-main.yml
+-install_mongo
+-tasks
|-main.yml
+-templates
|-mongod.j2
|-mongodb-org-3.2.repo
+-vars
|-config.yml
|-credentials.yml
|-ansible.cfg
|-dbservers.yml
|-vault-password.txt
|-unlock-keys.sh
|-lock-keys.sh

Below is a diagram of what we’re going to build. It’s a fairly vanilla deployment designed to be available if you lose an AZ.

Go grab a copy of this project here. To make this Work on Your Machine™ follow the instructions in the readme carefully. And, again as stated, don’t ever check in an unencrypted credentials file. I’m not responsible for your AWS bill if you do.

So what happens?

Here’s a high-level state machine describing the steps Ansible takes during the playbook run.

To start, let’s examine the deployment blob:

mongodb_us-east-1_default-install.yml

---
deployment:
  group: "{{ deployment_group }}"
  region: "{{ target_region }}"
  environment: "{{ env }}"
  vpc_cidr: 172.24.0.0/16
  mongodb:
    ami_id: ami-c481fad3
    instance_type: t2.micro
    replica_set: "repl-{{ target_region }}"
    leader:
      zone: "us-east-1b"
      subnet_cidr: 172.24.1.0/24
    azs:
      - zone: "us-east-1c"
        ensure_count: 2
        subnet_cidr: 172.24.2.0/24
      - zone: "us-east-1b"
        ensure_count: 1
        subnet_cidr: 172.24.1.0/24

This should be fairly intuitive. In this deployment we’ve chosen to automagically build out the VPC and corresponding subnets during deployment. As discussed last time, you can alternatively replace the vpc_cidr and subnet_cidr bits with identifiers of networking elements you’ve already created. I like creating them on the fly; less work for me. We’re specifying that we’d like a leader node in us-east-1b and three minion nodes spread across the configured availability zones, each in its corresponding subnet.
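If you do want to reuse existing networking instead, the blob might look roughly like the sketch below. Note that the vpc_id and subnet_id keys are hypothetical; the roles in this project consume CIDRs as written, so you’d have to adjust them to look up these identifiers instead.

---
deployment:
  group: "{{ deployment_group }}"
  region: "{{ target_region }}"
  environment: "{{ env }}"
  vpc_id: vpc-0123abcd              # existing VPC (hypothetical key)
  mongodb:
    ami_id: ami-c481fad3
    instance_type: t2.micro
    replica_set: "repl-{{ target_region }}"
    leader:
      zone: "us-east-1b"
      subnet_id: subnet-11aa22bb    # existing subnet (hypothetical key)
    azs:
      - zone: "us-east-1c"
        ensure_count: 2
        subnet_id: subnet-33cc44dd
      - zone: "us-east-1b"
        ensure_count: 1
        subnet_id: subnet-11aa22bb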

Now consider the main playbook:

mongo-in-vpc.yml

---
- hosts: localhost
  gather_facts: False
  connection: local
  vars_files:
    - vars/config.yml
    - vars/credentials.yml
    - "regions/mongodb_{{ target_region }}_{{ deployment_group }}.yml"
  roles:
    - init_vpc

# Build the DB servers
- include: mongo.yml

We begin by creating a VPC via the init_vpc role. Next, we include the mongo.yml playbook, which executes the appropriate set of roles and tasks for creating the mongo stack. Here’s how the VPC is built:

roles/init_vpc/tasks/main.yml

---
- name: Create a VPC for the whole stack
  ec2_vpc:
    aws_access_key: "{{ access_key_id }}"
    aws_secret_key: "{{ secret_access_key }}"
    region: "{{ target_region }}"
    state: present
    cidr_block: "{{ deployment.vpc_cidr }}"
    resource_tags:
      Name: "MongoDBVpc_{{ deployment.environment }}"
      Application: "{{ application }}"
    internet_gateway: True
  register: mongo_vpc

- name: Create a subnet for the leader node
  ec2_vpc_subnet:
    aws_access_key: "{{ access_key_id }}"
    aws_secret_key: "{{ secret_access_key }}"
    region: "{{ target_region }}"
    vpc_id: "{{ mongo_vpc.vpc_id }}"
    az: "{{ deployment.mongodb.leader.zone }}"
    cidr: "{{ deployment.mongodb.leader.subnet_cidr }}"
    resource_tags:
      Name: "MongoDBLeaderSubnet"
      Application: "{{ application }}"
  register: mongo_leader_subnet

- name: Per AWS, create a subnet per AZ
  ec2_vpc_subnet:
    aws_access_key: "{{ access_key_id }}"
    aws_secret_key: "{{ secret_access_key }}"
    region: "{{ target_region }}"
    vpc_id: "{{ mongo_vpc.vpc_id }}"
    az: "{{ item.zone }}"
    cidr: "{{ item.subnet_cidr }}"
    resource_tags:
      Name: "MongoDBSubnet"
      Application: "{{ application }}"
  register: mongo_subnets
  with_items: "{{ deployment.mongodb.azs }}"

- name: Iterating over subnets and registering subnet IDs to AZs
  set_fact:
    mongo_subnets_to_azs: "{{ mongo_subnets_to_azs + [ { 'subnet_id': item.subnet.id, 'zone': item.subnet.availability_zone } ] }}"
  with_items: "{{ mongo_subnets.results }}"

After creating the VPC based on the CIDR block provided in the deployment blob, we create a specific subnet for the leader node. You could, if you wanted, have the leader share a subnet with one of the minion availability zones by giving it the same CIDR; that is left as an exercise for the reader. Next, we iterate through the set of availability zones configured in the deployment blob and create a subnet for each one based on its CIDR block. Finally, we register those subnets per availability zone and store them in a global fact declared in our common config, as seen below. This will come in handy later, so hang on for now.

config.yml

# global variables 

# used to map subnet values to availability zones.
# not sure of a better way to do this at the moment
mongo_subnets_to_azs: []
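After the init_vpc role finishes, that fact holds one entry per minion subnet, mapping subnet IDs to availability zones. For the example blob above it would look roughly like this (the subnet IDs here are placeholders):

mongo_subnets_to_azs:
  - subnet_id: subnet-0aa1b2c3      # placeholder ID
    zone: us-east-1c
  - subnet_id: subnet-0dd4e5f6      # placeholder ID
    zone: us-east-1b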

Building the infrastructure

Now that our basic networking is in place there are three things we need to automate in order to deploy a replicated mongo cluster.

  1. A leader node. Mongo requires a single node to handle initialization of a replica set. In the mongodb section of our deployment blob we have a leader key that describes where the leader node will be installed.
  2. Some number of replica nodes. These nodes are described in the remainder of the mongodb key in the deployment blob. In our example we have three additional nodes each in a different availability zone.
  3. Replication. We’ll need to configure all nodes to use a replica set and pass a templated JavaScript file into the mongo command on the leader node to initialize replication.

Let’s start at the top with the base mongo playbook.

mongo.yml

---
# create EC2 instances
- hosts: localhost
  gather_facts: False
  vars_files:
    - vars/config.yml
    - vars/credentials.yml
    - "regions/mongodb_{{ target_region }}_{{ deployment_group }}.yml"
  roles:
    - role: create_ec2_mongo_stack

# configure the mongo hosts
- hosts: mongo:mongo_leader
  gather_facts: False
  become: yes
  vars_files:
    - vars/config.yml
    - vars/credentials.yml
    - "regions/mongodb_{{ target_region }}_{{ deployment_group }}.yml"
  roles:
    - role: install_mongo

# initialize replication from the leader
- hosts: mongo_leader
  gather_facts: False
  vars_files:
    - vars/config.yml
    - vars/credentials.yml
    - "regions/mongodb_{{ target_region }}_{{ deployment_group }}.yml"
  roles:
    - init_mongo_replication

First, we create the appropriate EC2 instances via the create_ec2_mongo_stack role. Next, we install mongo on our leader and follower nodes and template mongod.conf with the replica set name. Finally, we initialize replication via the leader node. Building the EC2 instances is the most complex part of the project as seen below.

roles/create_ec2_mongo_stack/tasks/main.yml

---
# Create security group
- name: Create security groups for Mongo
  ec2_group:
    name: "mongo-access-{{ deployment.region }}"
    description: "Security group to control mongo access"
    aws_access_key: "{{ access_key_id }}"
    aws_secret_key: "{{ secret_access_key }}"
    region: "{{ deployment.region }}"
    vpc_id: "{{ mongo_vpc.vpc_id }}"
    rules:
      - proto: tcp
        from_port: 27017
        to_port: 27019
        cidr_ip: 0.0.0.0/0
      - proto: tcp
        from_port: 28017
        to_port: 28017
        cidr_ip: 0.0.0.0/0
      - proto: tcp
        from_port: 22
        to_port: 22
        cidr_ip: 0.0.0.0/0
    rules_egress:
      - proto: all
        cidr_ip: 0.0.0.0/0
  register: mongo_security

######################
# MongoDB leader
######################
- name: Create the single mongoDB leader node
  ec2:
    aws_access_key: "{{ access_key_id }}"
    aws_secret_key: "{{ secret_access_key }}"
    instance_type: "{{ deployment.mongodb.instance_type }}"
    image: "{{ deployment.mongodb.ami_id }}"
    wait: yes
    region: "{{ target_region }}"
    assign_public_ip: yes
    vpc_subnet_id: "{{ mongo_leader_subnet.subnet.id }}"
    group_id:
      - "{{ mongo_security.group_id }}"
    instance_tags:
      Application: "{{ application }}"
      Name: "MongoDBLeader"
      Environment: "{{ deployment.environment }}"
      Deployment-Group: "{{ deployment.group }}"
      Type: MongoDBLeader
    exact_count: 1
    count_tag:
      Type: MongoDBLeader
      Deployment-Group: "{{ deployment.group }}"
      Environment: "{{ deployment.environment }}"
    zone: "{{ deployment.mongodb.leader.zone }}"
    key_name: "{{ ssh_key_name }}"
  register: mongo_leader_box

# Wait for SSH to come up
- name: Check for SSH on the leader box
  wait_for:
    host: "{{ mongo_leader_box.tagged_instances.0.public_dns_name }}"
    port: 22
    timeout: 320
    state: started
    search_regex: OpenSSH
    delay: 10

# Add hosts to the list for processing.
- name: Add the leader box to host inventory
  add_host:
    hostname: "{{ mongo_leader_box.tagged_instances.0.public_dns_name }}"
    groups: mongo_leader
    private_ip: "{{ mongo_leader_box.tagged_instances.0.private_ip }}"
    ansible_user: "{{ ec2_ssh_user }}"
    ansible_ssh_private_key_file: "{{ ssh_key_path }}"

######################
# MongoDB minions
######################

# Create the database stack
- name: Create minion nodes
  ec2:
    aws_access_key: "{{ access_key_id }}"
    aws_secret_key: "{{ secret_access_key }}"
    instance_type: "{{ deployment.mongodb.instance_type }}"
    image: "{{ deployment.mongodb.ami_id }}"
    wait: yes
    region: "{{ target_region }}"
    assign_public_ip: yes
    vpc_subnet_id: "{{ item.1.subnet_id }}"
    group_id:
      - "{{ mongo_security.group_id }}"
    instance_tags:
      Application: "{{ application }}"
      Name: "MongoDBTier"
      Environment: "{{ deployment.environment }}"
      Deployment-Group: "{{ deployment.group }}"
      Type: MongoDBMinion
    exact_count: "{{ item.0.ensure_count }}"
    count_tag:
      Type: MongoDBMinion
      Deployment-Group: "{{ deployment.group }}"
      Environment: "{{ deployment.environment }}"
    zone: "{{ item.0.zone }}"
    key_name: "{{ ssh_key_name }}"
  register: mongo_boxes
  with_nested:
    - "{{ deployment.mongodb.azs }}"
    - "{{ mongo_subnets_to_azs }}"
  when: item.0.zone == item.1.zone

# Wait for SSH to come up
- name: Wait for SSH to come up on the minion boxes
  wait_for:
    host: "{{ item.1.public_dns_name }}"
    port: 22
    timeout: 320
    state: started
    search_regex: OpenSSH
    delay: 10
  with_subelements:
    - "{{ mongo_boxes.results }}"
    - tagged_instances

# Add hosts to the list for processing.
- name: Add the minion boxes to host inventory
  add_host:
    hostname: "{{ item.1.public_dns_name }}"
    groups: mongo
    private_ip: "{{ item.1.private_ip }}"
    ansible_user: "{{ ec2_ssh_user }}"
    ansible_ssh_private_key_file: "{{ ssh_key_path }}"
  with_subelements:
    - "{{ mongo_boxes.results }}"
    - tagged_instances

In summary, this task file creates a security group with the correct ports open for our mongo nodes, creates the leader node EC2 instance, and then waits for SSH to come up on the box before adding the leader to a special mongo_leader host group. If you run this, however, you’ll find that your local machine is never able to reach the leader node. So what gives? Although we attached an internet gateway when we created the VPC, our subnets have no route to it, so nothing can get in! We could easily automate this if we wanted, but I’d rather do this part by hand just so I know exactly when those boxes were exposed to the internet. Alternatively, we could create a bastion host to execute all our operational tasks, but that’s for another post. Use the AWS console to open up your VPC as seen below.
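If you’d rather automate that routing step anyway, a sketch along these lines should do it. This isn’t part of the project as written; it assumes the ec2_vpc_igw and ec2_vpc_route_table modules that ship with Ansible 2.x and would slot into the init_vpc role:

- name: Ensure an internet gateway is attached to the VPC
  ec2_vpc_igw:
    aws_access_key: "{{ access_key_id }}"
    aws_secret_key: "{{ secret_access_key }}"
    region: "{{ target_region }}"
    vpc_id: "{{ mongo_vpc.vpc_id }}"
    state: present
  register: mongo_igw

- name: Route outbound traffic from the mongo subnets through the gateway
  ec2_vpc_route_table:
    aws_access_key: "{{ access_key_id }}"
    aws_secret_key: "{{ secret_access_key }}"
    region: "{{ target_region }}"
    vpc_id: "{{ mongo_vpc.vpc_id }}"
    tags:
      Name: "MongoDBPublicRoutes"
    subnets: "{{ (mongo_subnets_to_azs | map(attribute='subnet_id') | list) + [ mongo_leader_subnet.subnet.id ] }}"
    routes:
      - dest: 0.0.0.0/0
        gateway_id: "{{ mongo_igw.gateway_id }}"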

If you re-run the base script it should contact the leader node successfully and continue on to create all of the minion EC2 nodes, associate them with their appropriate subnets by correlating the AZ to the subnet ID we built in the init_vpc role, wait for SSH to come up, and finally add each minion to the mongo host group.

Installing and configuring mongo

Now that our boxes are up and listening we need to install mongo on all our nodes.

roles/install_mongo/tasks/main.yml

---
# Copy the repo file to /etc/yum.repos.d
- name: Copy the repo file for mongo
  template:
    force: yes
    src: mongodb-org-3.2.repo
    dest: /etc/yum.repos.d/

# install and configure MongoDB
- name: Install MongoDB
  shell: yum install -y mongodb-org

- name: Copy the configuration file template to the remote host
  template:
    force: yes
    src: mongod.j2
    dest: /etc/mongod.conf

- name: Start the mongo daemon
  service: name=mongod state=restarted

The fine folks at MongoDB provide a yum repo definition, so we simply copy that to /etc/yum.repos.d/ and run yum install -y mongodb-org to pull down and install mongo. There are a couple of ways to start a mongo node with a replica set defined; we’re going to simply template mongod.conf with the appropriate values filled out and ship it to all our EC2 instances. I’ve extracted the relevant bits of the template below.

roles/install_mongo/templates/mongod.j2

# network interfaces
net:
  port: 27017
  bindIp: [127.0.0.1,{{ private_ip }}]

replication:
  replSetName: {{ deployment.mongodb.replica_set }}

It’s a good practice to ship configuration files like this to distributed and clustered systems because it gives you fine-grained control over the relevant properties of each node through jinja templating. In the snippet above we’re filling in configuration values based on facts gathered during the playbook’s execution. The replica set name is dropped in verbatim from the deployment blob, but notice we’re doing something more interesting with the bindIp property. Traffic between instances over private IP addresses is cheap inside AWS (and free within a single AZ), so you should always bind to internal addresses for communication between nodes in clustered and distributed systems. Here we’re telling the mongo daemon to do just that by listening only on localhost and the private IP assigned by AWS.
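To make the templating concrete: for a minion that happened to receive the (made-up) private IP 172.24.2.15 in the us-east-1 deployment above, the rendered /etc/mongod.conf fragment would come out as:

# network interfaces
net:
  port: 27017
  bindIp: [127.0.0.1,172.24.2.15]

replication:
  replSetName: repl-us-east-1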

We’re almost there. The last thing we need to do is start the mongo daemon via the service module and then ship our replication initialization script to the leader for execution. Starting mongo is covered in the install_mongo role as seen above, and initializing replication is handled by shipping a templated JavaScript file to the leader host.

roles/init_mongo_replication/tasks/main.yml

---
# This task executes the required steps to initialize mongo's
# replication on the hosts provided. It is very much recommended
# to limit this task to executing on a SINGLE mongodb box and not
# all of them. Else, you will get errors.
- name: Copy the initialization script to tmp
  template:
    src: init_replication.j2
    dest: /tmp/init_replication.js

- name: Execute the initialization script and add all replicants
  shell: mongo /tmp/init_replication.js

The jinja template for the initialization script looks a bit cryptic at first.

roles/init_mongo_replication/templates/init_replication.j2

rs.initiate()
sleep(13000)

{% if groups['mongo'] is defined %}
{% for host in groups['mongo'] %}
rs.add("{{ hostvars[host].private_ip}}")
sleep(8000)
{% endfor %}
{% endif %}
printjson(rs.status())

For each mongo minion in the mongo host group we build a call to rs.add() containing that minion’s private IP, which tells the leader who to include in the replica set after it’s initialized. Between each step we sleep for a bit to give everyone a chance to talk to each other and stabilize. Once that’s complete we’re done!
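If you want the playbook to confirm that replication really came up, a small check could be tacked onto the end of the init_mongo_replication role. This is just a sketch and not part of the project; the string match on SECONDARY is an assumption about what rs.status() prints once a member has joined:

- name: Check the replica set status from the leader
  shell: mongo --quiet --eval "printjson(rs.status())"
  register: repl_status

- name: Fail if no secondaries have joined the set
  fail:
    msg: "Replica set initialized but no SECONDARY members are reporting in."
  when: "'SECONDARY' not in repl_status.stdout"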

Pulling it all together

In true Linux Geek fashion the overall plan is to automate as much as possible. In this example I’ve built the hopefully cleverly named install-mongo-in script for kicking the entire deployment process off.

install-mongo-in

#!/bin/bash
if [ $# -ne 2 ]; then
  echo "Illegal number of parameters. AWS region and environment required!"
  echo "Usage: install-mongo-in us-east-1 production"
  exit 1
fi

# Array of required builds
declare -a BUILDS=("default-install")
MAIN_PLAYBOOK="mongo-in-vpc.yml"

# First let's make sure they all do in fact exist, otherwise we could waste
# a lot of time waiting on a partial build to complete.
for build in "${BUILDS[@]}"; do
  if ! [ -e "./regions/mongodb_$1_$build.yml" ]; then
    echo "Required file for region $1 and deployment $build doesn't exist. Cannot continue."
    exit 1
  fi
done

set -e
for build in "${BUILDS[@]}"; do
  ANSIBLE="ansible-playbook --extra-vars=\"target_region=$1 env=$2 deployment_group=$build\" --vault-password-file=vault-password.txt $MAIN_PLAYBOOK"
  echo "Running command: $ANSIBLE"
  eval "${ANSIBLE}"
done
set +e

This script is a simple wrapper around the ansible-playbook command that allows us to automate multiple deployments of the playbook as needed. Note in the implementation above we have an array of builds that are passed into the --extra-vars switch in a loop. This mechanism provides the ability to install multiple mongo stacks in the same region without conflicting with one another.

Scaling

Because Ansible attempts to be idempotent (and mostly succeeds), executing the same playbook against any region with the same arguments should result in zero changes to the stack. What if, however, we wanted to add or remove a node from our deployment? It’s as easy as modifying the appropriate ensure_count value for the availability zone you’re interested in, or even adding a brand new AZ to the mix; the scripts should reconfigure the stack correctly and rebuild the replica set. If you lose a node due to an outage, simply re-run the original script with no changes and Ansible will happily repair your cluster. Pretty cool.
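As an illustration (the counts and new CIDR are just examples), scaling the earlier blob out might look like this; re-running install-mongo-in afterward would create the extra instances and fold them into the replica set:

azs:
  - zone: "us-east-1c"
    ensure_count: 2
    subnet_cidr: 172.24.2.0/24
  - zone: "us-east-1b"
    ensure_count: 2             # bumped from 1 to add a node in this AZ
    subnet_cidr: 172.24.1.0/24
  - zone: "us-east-1d"          # brand new AZ with its own subnet
    ensure_count: 1
    subnet_cidr: 172.24.3.0/24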

For nicer syntax highlighting view this post on my own site.
