Expedia Group Tech — Software

Auto-Healing MongoDB cluster in AWS

How to set up a MongoDB cluster that is resilient to hardware failures and simplifies maintenance

Zubin Sharma
Jul 14, 2020
The word “Database” superimposed on three people holding computing devices. (Source: Adobe Stock)

Why is Auto-Healing required?

If you want to set up a database cluster in AWS, running it on EC2 instances is a popular way of doing so. But these instances are ephemeral and are occasionally lost or need to be replaced due to events such as:

  • Unplanned activities like instance termination by AWS to handle degraded performance or failed hardware

What makes it possible?

Most distributed databases and cloud providers offer broadly similar features, so this approach can be adapted to implement auto-healing for other distributed databases on other cloud providers.

MongoDB features

  • MongoDB servers in a replica set (RS) replicate data among themselves to provide redundancy. If a server reconnects after being disconnected for some time, the replica set detects how far behind the server is and automatically syncs the missing data.
  • Each individual shard/partition can be set up as its own replica set, so network partitioning in one shard’s RS does not impact any other shard (a minimal example follows this list).
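For illustration, here is a minimal sketch of initiating one shard’s three-member replica set from the mongo shell; the replica set name and hostnames are hypothetical and must match the replSetName configured on the mongod processes.

```
# Run once against any one member of the shard (names are illustrative).
mongo --host shard1-a.example.internal:27017 --eval '
rs.initiate({
  _id: "shard1",
  members: [
    { _id: 0, host: "shard1-a.example.internal:27017" },
    { _id: 1, host: "shard1-b.example.internal:27017" },
    { _id: 2, host: "shard1-c.example.internal:27017" }
  ]
})'
```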

AWS Features

  • Elastic Block Store (EBS) provides a way to segregate data from compute. Termination protection on EBS volumes prevents them from being deleted when the instance they are attached to gets terminated.
  • If an Auto Scaling group spans multiple Availability Zones (AZs), AWS ensures uniform distribution of servers across all of them. This means that a replacement for a server in us-west-2a is also launched in us-west-2a.
  • Route 53 is AWS’s DNS service. For each server in the cluster, we create an A record in Route 53 mapping its hostname to its IP address.
  • Tagging provides a simple way to associate properties with resources, which can also be used to identify those resources (see the sketch after this list).
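To illustrate the EBS and tagging points above, the commands below keep a data volume around when its instance terminates and tag it so it can be identified later; the device name, resource IDs, and tag values are hypothetical.

```
# Illustrative only: persist the data volume beyond instance termination
# and tag it for later identification. IDs and names are hypothetical.
aws ec2 modify-instance-attribute \
  --instance-id i-0123456789abcdef0 \
  --block-device-mappings '[{"DeviceName":"/dev/xvdf","Ebs":{"DeleteOnTermination":false}}]'

aws ec2 create-tags \
  --resources vol-0123456789abcdef0 \
  --tags Key=Shard,Value=shard1 Key=AZ,Value=us-west-2a
```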

How is auto-healing achieved?

The solution is presented here in three sections: the first two show how we set up the cluster topology across EC2 instances and then configure each instance; the last details how auto-recovery is achieved and how we reduced the recovery time.

Cluster topology setup

We used AWS CloudFormation to provision EC2 instances. The same may be achieved through other solutions like Terraform.

Arrangement of replica sets across Availability Zones
  • Each replica set has its own Auto Scaling Group (ASG) and Launch Configuration. The min and max size of each ASG are set to 3, so when an instance gets terminated, AWS Auto Scaling brings up its replacement.
  • Each of the 3 servers in an ASG is placed in a different Availability Zone (AZ): us-west-2a, b, and c (see the sketch after this list).
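We used CloudFormation for the real templates, but the same topology can be sketched with the AWS CLI; the group name, launch configuration name, and tag below are hypothetical.

```
# One Auto Scaling Group per shard: min = max = 3, pinned to three AZs,
# so a terminated instance is always replaced in the AZ it came from.
# (VPC/subnet settings are omitted for brevity; names are illustrative.)
aws autoscaling create-auto-scaling-group \
  --auto-scaling-group-name mongodb-shard1-asg \
  --launch-configuration-name mongodb-shard1-lc \
  --min-size 3 --max-size 3 --desired-capacity 3 \
  --availability-zones us-west-2a us-west-2b us-west-2c \
  --tags Key=Shard,Value=shard1,PropagateAtLaunch=true
```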

MongoDB process setup

Once an EC2 instance is launched (either when the cluster is first set up or during a replacement instance launch), the following workflow configures the instance.

MongoDB setup workflow on an EC2 instance
  1. AWS calls the initialization lifecycle hook (the UserData script in the Launch Configuration).
  2. The hook downloads a bootstrap script stored on S3.
  3. The bootstrap script installs Ansible and its dependencies.
  4. The bootstrap script configures the instance by running the Ansible playbook in local mode.
  • Each Launch Configuration passes its details, like the shard name, as parameters to the bootstrap script, which are then used in the Ansible playbook (a sketch of this flow follows).
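As a rough sketch of this workflow, a bootstrap script along the following lines could be run from UserData; the S3 bucket, playbook path, and variable names are hypothetical.

```
#!/usr/bin/env bash
# Hypothetical bootstrap sketch: download assets from S3, install Ansible,
# and run the playbook in local mode with parameters from the Launch Config.
set -euo pipefail

SHARD_NAME="shard1"   # baked into the UserData by the Launch Configuration

# Steps 1-2: fetch the bootstrap assets from S3
aws s3 cp s3://my-bootstrap-bucket/mongodb/ /opt/bootstrap/ --recursive

# Step 3: install Ansible and its dependencies
yum install -y python3-pip
pip3 install ansible

# Step 4: configure the instance by running the playbook in local mode
ansible-playbook /opt/bootstrap/mongodb.yml \
  --connection local --inventory localhost, \
  --extra-vars "shard_name=${SHARD_NAME}"
```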

Putting it together

Due to the arrangement of our topology and the fact that AWS Auto Scaling launches a replacement in the same AZ as the original, any newly launched server knows its coordinates in the cluster; by coordinates, I mean which shard and Availability Zone it belongs to. How this information is used is described below.
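For example, a newly launched server could discover those coordinates roughly as follows; the Shard tag name is an assumption and stands in for whatever tag the Launch Configuration propagates.

```
# Hypothetical sketch: a replacement instance working out its coordinates.
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
AZ=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)

# The "Shard" tag is assumed to be propagated by the Auto Scaling Group.
SHARD=$(aws ec2 describe-tags \
  --filters "Name=resource-id,Values=${INSTANCE_ID}" "Name=key,Values=Shard" \
  --query "Tags[0].Value" --output text)

echo "This instance belongs to shard ${SHARD} in ${AZ}"
```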

  • The Ansible playbook also updates the server’s IP address in its A record in Route 53 (a sketch of this update follows this list). Because the config for each RS contains the hostnames of its members rather than their IPs, any replacement server is automatically added to the relevant RS without manual intervention.
  • Because we use MongoDB replica sets, MongoDB does the job of replicating the missing data to the newly added instance. Once the sync completes, the instance starts to function as expected.
  • In the extremely unlikely event that all 3 members of a replica set get terminated simultaneously, EBS termination protection ensures that the data is not lost, and the cluster can be restored to the last known good state. Not using EBS volumes (with termination protection) would have resulted in loss of data.
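As a minimal sketch of that DNS update, the server’s Route 53 record could be upserted as shown below; the hosted zone ID and record name are placeholders, and the real setup performs this step from the Ansible playbook.

```
# Upsert this server's A record so the replica set config (which uses
# hostnames) resolves to the new instance. Zone ID and name are illustrative.
PRIVATE_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)

aws route53 change-resource-record-sets \
  --hosted-zone-id Z0EXAMPLE12345 \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "shard1-a.example.internal",
        "Type": "A",
        "TTL": 60,
        "ResourceRecords": [{"Value": "'"${PRIVATE_IP}"'"}]
      }
    }]
  }'
```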

The proof is in the pudding

We deployed this solution about 18 months ago. In that time, we’ve had approximately 6 instance failures in our cluster. Our deployment was able to handle all of these with ZERO downtime, ZERO data loss and ZERO manual intervention.
