Connecting to an AWS DocumentDB database from outside its VPC

Luiz Mai
Red Ventures Brasil - Tech
May 18, 2020

Introduction

If you ended up reading this article, you're probably interested in DocumentDB or you're having trouble connecting to it. In this article, I'll give you some context about our application and explain how we built the infrastructure for connecting to the database from outside its VPC.

Feel free to reach out if you have any questions, suggestions or issues while setting up your own environment.

Glossary

Throughout this article we'll use a few terms a lot, so here are some basic definitions in case you're not familiar with them.

  • AWS DocumentDB: MongoDB-compatible and fully managed document database service provided by AWS. It’s important to say here that DocumentDB IS NOT a MongoDB database: even though you can run most Mongo commands on it, there are some limitations that I highly suggest you read about before you start using it.
  • AWS Region: physical location around the world where Amazon clusters data centers.
  • AWS VPC: virtual network dedicated to an AWS account that is logically isolated from other virtual networks in the AWS Cloud.

Here's some context!

DocumentDB is a not-so-famous database and you may be wondering why we chose it for our application. First of all, our motivation is that all of our infrastructure currently runs on AWS and we didn't want to have anything outside it. Second, our application deals with user profiles, each one with a lot of different data based on its nature, so we needed a non-relational database.

Sure, so why not install Mongo on an EC2 instance?

Well, we didn't want to manage it ourselves — whenever a problem happened we'd need to stop and investigate what happened to our Mongo instance, so we decided to use a fully-managed solution: DocumentDB.

Before we continue, it's important that you understand our environment so the next sections make sense. All of our application resources are in the sa-east-1 AWS region, within an AWS account we're calling ApplicationAccount. On the other hand, we have a Spark cluster that we run inside Databricks, which runs in sa-east-1 as well, but in another account we're calling SparkAccount. That said, we'd need to connect to our database in us-east-1 (since there's no DocDB in sa-east-1) from a VPC in another region and from a VPC in another account.

First things first

Ok, after all this introduction, let's start creating our database! First of all, we go to the DocumentDB homepage within our ApplicationAccount and click Launch Amazon DocDB. But wait, there’s no DocumentDB in the sa-east-1 region, so we need to choose another region (we used us-east-1 here). Choose a name for your cluster, as well as the number of instances and their classes. Since we're keeping it simple, we'll choose the most basic configuration possible.

Specifying basic DocumentDB configurations

Also, choose a username and a password for your database and remember to save them somewhere so you don't forget them!

Setting up credentials for your cluster

If you're already familiar with AWS, you can check the Show advanced settings toggle so you can specify some parameters such as subnets, security groups etc. Otherwise, just leave everything with the default values.

By leaving everything as default, your DocumentDB cluster will be created inside the default security group and subnet group, which should allow traffic to and from your cluster without extra rules. To avoid facing problems with certificates when connecting, we'll disable TLS: go to Parameter Groups in the left sidebar, open your cluster's parameter group, select the tls parameter, click Edit right above it and set it to disabled.

NOTE: if your application needs encryption in transit, you're encouraged to leave it enabled!
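
If you'd rather script this step than click through the console, here's a minimal sketch in Python. It assumes boto3 is installed and that your cluster uses a custom parameter group (as far as I know, the default one can't be edited). The group name `my-docdb-params` and the helper names are made up for illustration. Keep in mind that a change to `tls` only takes effect after the instances reboot.

```python
def tls_parameter_update(group_name, enabled):
    """Build the keyword arguments for a cluster parameter group update.

    group_name is a hypothetical custom parameter group attached to the cluster.
    """
    return {
        "DBClusterParameterGroupName": group_name,
        "Parameters": [
            {
                "ParameterName": "tls",
                "ParameterValue": "enabled" if enabled else "disabled",
                # tls is a static parameter, so it is applied on the next reboot
                "ApplyMethod": "pending-reboot",
            }
        ],
    }


def apply_tls_setting(group_name, enabled, region="us-east-1"):
    """Send the update to AWS. Requires boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    client = boto3.client("docdb", region_name=region)
    return client.modify_db_cluster_parameter_group(
        **tls_parameter_update(group_name, enabled)
    )


if __name__ == "__main__":
    # Only prints the request; call apply_tls_setting(...) to actually send it.
    print(tls_parameter_update("my-docdb-params", enabled=False))
```

Splitting the request builder from the AWS call makes it easy to review what will be sent before touching a real cluster.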

Now, if you go to your cluster details screen, you'll see something like this:

Connection instructions
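
The connection string shown on that screen can be assembled roughly like this. This is just a sketch in Python using the standard library: the host name, credentials and the `build_docdb_uri` helper are hypothetical, and the only option I add by default is `retryWrites=false`, since DocumentDB doesn't support retryable writes. If you kept TLS enabled, pass the path to the CA bundle you downloaded from AWS.

```python
from urllib.parse import quote_plus, urlencode


def build_docdb_uri(username, password, host, port=27017, tls_ca_file=None):
    """Build a MongoDB-style connection URI for a DocumentDB cluster.

    username, password and host are placeholders supplied by the caller.
    tls_ca_file is only needed when TLS is enabled on the cluster.
    """
    options = {}
    if tls_ca_file is not None:
        options["tls"] = "true"
        options["tlsCAFile"] = tls_ca_file
    options["retryWrites"] = "false"

    # Credentials must be percent-encoded, otherwise characters such as
    # "@" or "/" in the password break the URI.
    credentials = f"{quote_plus(username)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{host}:{port}/?{urlencode(options)}"


if __name__ == "__main__":
    # Hypothetical cluster endpoint, for illustration only.
    print(build_docdb_uri("admin", "p@ss/word", "my-cluster.example.com"))
```

The resulting string is what you'd hand to your Mongo driver or connector of choice.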

Great, now we have an endpoint to connect to, so we go back to our application (or whatever you have that needs to connect to it), create a Mongo connector (which is outside the scope of this article), try to connect to the database and… ERROR. As mentioned above, our application runs in a different VPC than the DocumentDB cluster, which only allows access from within the same network, so here's how we managed to fix it.

VPC Peering onwards!

To allow traffic to flow between our application VPC and the DocumentDB VPC, we set up a peering connection between them in a few steps:

Configurations used when creating a VPC Peering request
  1. Send a peering request: from one of your VPCs, navigate to the VPC Dashboard and, in the left sidebar, click Peering Connections, then Create Peering Connection. On this screen, you'll need to specify details of the other VPC, such as its account ID, its region and its VPC ID.
  2. Accept the peering request: navigate to the other VPC (in the other account, region etc.), go to the Peering Connections menu as well and you'll see there's a peering request there. Select it, go to the Actions panel and click Accept Request.
  3. Update your route tables: in both VPCs (B-O-T-H!) you'll need to route traffic for the other VPC through the peering connection, so navigate to your VPC dashboard, go to the Route Tables menu in the left sidebar, select the table associated with your subnets, click the Routes tab right below it and then Edit Routes. There, you'll need to add a route with the other VPC's CIDR as the destination and the Peering Connection ID as the target (it will give you a list of current connections so you can choose it directly). To retrieve the CIDR, go to the other VPC's region/account, navigate to the VPC details screen and you'll see an IPv4 CIDR associated with it. That's what you have to put in your new route. Repeat the same steps for the other VPC and your route tables will be fine.
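
To make the route table step a bit more concrete, here's a small sketch in Python (standard library only) that works out which route each side needs. The CIDRs, the `pcx-...` ID and the `peering_routes` helper are made up for illustration. It also checks that the two CIDRs don't overlap, since VPC peering doesn't work between VPCs with overlapping address ranges.

```python
import ipaddress


def peering_routes(local_cidr, peer_cidr, peering_id):
    """Return the route each VPC's route table needs for a peering connection.

    Each side routes the OTHER VPC's CIDR through the same peering connection.
    """
    local_net = ipaddress.ip_network(local_cidr)
    peer_net = ipaddress.ip_network(peer_cidr)
    if local_net.overlaps(peer_net):
        raise ValueError(f"CIDRs overlap: {local_net} and {peer_net}")

    return {
        "local_route_table": {
            "DestinationCidrBlock": str(peer_net),
            "VpcPeeringConnectionId": peering_id,
        },
        "peer_route_table": {
            "DestinationCidrBlock": str(local_net),
            "VpcPeeringConnectionId": peering_id,
        },
    }


if __name__ == "__main__":
    # Hypothetical values: application VPC, DocumentDB VPC and peering ID.
    routes = peering_routes("10.0.0.0/16", "172.31.0.0/16", "pcx-0123456789abcdef0")
    for table, route in routes.items():
        print(table, route)
```

The dictionary keys mirror the fields you fill in on the Edit Routes screen: a destination CIDR and the peering connection as the target.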

Ok, so the peering connection is OK, the route tables are set up and if we try to connect to our database, IT WORKS! Even though it works perfectly, here are some issues I faced that may happen to you as well:

  1. If your database runs in a restricted security group, you'll also need to allow traffic from the other VPC's CIDR in the security group's inbound rules, on the port your database listens on (by default, 27017).
  2. When using an Infrastructure as Code tool such as Terraform, make sure to create the parameter group as well (when you create the cluster through the console, the parameter group is created for you).
  3. If you enable encryption in transit (TLS), connecting to your database will require additional steps to set up the certificate correctly.
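
When something in this list goes wrong, the first thing I'd check is whether the database port is reachable at all from the other VPC. Here's a minimal sketch in Python (standard library only) that you could run from a machine in the application VPC. The host name below and the `can_reach` helper are hypothetical. A `False` result usually points to the peering, route tables or security group rather than to credentials or TLS.

```python
import socket


def can_reach(host, port=27017, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical cluster endpoint, replace it with your own.
    endpoint = "my-cluster.example.com"
    print(f"{endpoint}:27017 reachable:", can_reach(endpoint))
```

This only tests the network path; it says nothing about whether your username, password or certificate are correct.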

tl;dr

Summing up everything we did here in a few steps:

  • Create the cluster with default configurations
  • Create a VPC Peering connection between both VPCs
  • Add a route table entry in both VPCs with the other one's CIDR
  • Connect to it!

Conclusion

Hopefully your DocumentDB cluster is running properly and you can connect to it and query it as you want. If you still have any problems with it, or if you feel something is missing from this article, feel free to reach out! I'd also love to hear from you if you faced any other issue not listed here so I can add it.
