Creating MongoDB Backups in S3

Emmanuel Lodovice
Uncaught Exception


If your application is deployed in AWS and you are using MongoDB as your database, there is a good chance that you are running and managing an EC2 instance that you set up manually. Unlike AWS RDS, which lets you configure how often you want to back up your database and then just does it for you, an EC2 instance running MongoDB does not have this capability out of the box. You have to set up the backups yourself. S3 is a good, reliable place to store your backups, and it is very easy to upload files to S3 using the AWS CLI.

Here are a few simple steps to generate a MongoDB backup and upload it to S3.

Setup

Make sure you have an EC2 instance that you can SSH into and that has access to the database. Ideally, you want a separate EC2 instance for this instead of the instance running the database, so you can easily reboot it when you mess up. Make sure mongo is installed on the EC2 instance so the mongodump command is available. Also make sure the AWS CLI is installed and configured with a user that has read and write access to the S3 bucket where you want to put your backups.
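
As a quick sanity check, you can install and verify both tools like this, assuming an Ubuntu instance (the package names are assumptions and depend on your distro and configured repositories):

    # Install the MongoDB database tools (provides mongodump) and the AWS CLI.
    sudo apt-get update
    sudo apt-get install -y mongodb-database-tools awscli

    # Verify both tools are available and the CLI has working credentials.
    mongodump --version
    aws sts get-caller-identity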

  1. Generate the backup using mongodump
    To generate a dump of a MongoDB database, run the command mongodump --host <ip> --port <port> --db <db_name> --username <username> --password "<password>"
    ip: the IP of the EC2 instance running MongoDB; this can be 127.0.0.1 if you are backing up your local database
    port: the port where MongoDB is running, typically 27017
    db_name: the name of the database you want to dump
    username: the username of the mongo user that has access to the database
    password: the password of the mongo user
    You will typically find all of these values in the connection string your application uses to connect to the database, since it is usually in the format mongodb://<username>:<password>@<ip>:<port>/<db_name>
    The command will create a folder named dump containing a BSON file and a JSON metadata file for each collection.
  2. Compress the backup to a single zip file
    You want to compress the dump folder so it is easier to upload to and download from S3. Name the archive something informative, like the current datetime. To get the datetime in YYYYMMDDHHMMSS format, run the command date +%Y%m%d%H%M%S. You can then run the command tar -czvf <datetime>.tar.gz dump to generate a compressed archive of the dump folder.
  3. Upload the compressed file to S3
    You can use the AWS CLI to upload the newly generated archive to S3 with the command aws s3 cp <datetime>.tar.gz s3://<bucket_name>/. Visit the S3 bucket in your AWS console to check that the file was actually uploaded.
  4. Clean up
    Delete the generated archive using the command rm <datetime>.tar.gz and the dump folder using the command rm -rf dump/. This is important because your EC2 instance has very limited storage.
  5. Write a script that does steps 1 to 4 and run it periodically
    Go learn a scripting language and how to run cron jobs :D A minimal sketch of such a script, along with a sample cron entry, is included after this list.
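
Putting steps 1 to 4 together, here is a minimal bash sketch. The host, credentials, and bucket name are all placeholders you need to replace with your own values:

    #!/bin/bash
    # backup_mongo.sh: dump a MongoDB database, compress it,
    # upload it to S3, and clean up. All values below are placeholders.
    set -e

    HOST="127.0.0.1"
    PORT="27017"
    DB_NAME="<db_name>"
    DB_USER="<username>"
    DB_PASS="<password>"
    BUCKET="<bucket_name>"

    DATETIME=$(date +%Y%m%d%H%M%S)

    # Step 1: generate the dump folder.
    mongodump --host "$HOST" --port "$PORT" --db "$DB_NAME" --username "$DB_USER" --password "$DB_PASS"

    # Step 2: compress the dump folder into a single archive.
    tar -czvf "$DATETIME.tar.gz" dump

    # Step 3: upload the archive to S3.
    aws s3 cp "$DATETIME.tar.gz" "s3://$BUCKET/"

    # Step 4: clean up to free disk space.
    rm "$DATETIME.tar.gz"
    rm -rf dump/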
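
To run the script periodically, you can add a crontab entry (crontab -e). For example, assuming the script lives at /home/ubuntu/backup_mongo.sh (the path is an assumption), this runs it every day at 2 AM and appends its output to a log file:

    0 2 * * * /home/ubuntu/backup_mongo.sh >> /home/ubuntu/backup.log 2>&1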
