Pairing Heroku with AWS RDS

April Dagonese
Extra Credit-A Tech Blog by Guild
5 min read · Jan 30, 2020

For the past several months, my team here at Guild has been working on breaking a user data application (the User Profile Service, or UPS) out of its legacy code and into a new microservice. Not only was that older codebase bloated and difficult to work with, but various applications were storing user data differently, which meant that the same information could be requested from a user multiple times on our site — obviously not a great user experience. We wanted a single source of truth that could be consumed by multiple applications, giving users a more consistent experience.

In terms of technical requirements, most of Guild’s existing product architecture is hosted on Heroku, but the company has wanted to push more architecture into AWS. For our databases in particular, AWS provides better privacy constraints, more granular access control, and more detailed logging. We also expect it to be cheaper for us as we grow, compared to the cost of Heroku’s Private Space. Plus, it’s really, really fast. For all these reasons, but especially because UPS deals with sensitive user data, keeping its database in AWS was a clear choice. We decided to host the application in a Heroku Private Space and back it with an AWS RDS database running the PostgreSQL-compatible flavor of Aurora.

Since we had very little hands-on experience with AWS networking or VPCs, peering our Heroku Private Space with the RDS instance turned out to be a lot of trial and error. (Of all the weeks for our resident DevOps engineer, Trey, to be out of office at the AWS re:Invent conference!) And, like so many trial-and-error experiences, this one looks a lot clearer in retrospect.

Amazon’s tutorial for setting up an RDS database is pretty straightforward, but it involves several tricky concepts if you’re new to AWS. Thankfully, Trey had gotten most of this setup in place before leaving, but that also meant we had to figure out where to pick things up. With enough clicking around, we understood that our RDS database had been created inside a private subnet of Guild’s production VPC, since the DB needed to be accessible only to the UPS application itself. The VPC already had a peering connection with Guild’s Private Space in Heroku as well. At that point, we figured our setup was simple: the RDS instance in a private subnet of the production VPC, reachable from the Heroku Private Space over the peering connection.

Our next move was to add the Postgres connection URL of our new RDS instance to the Heroku application. The database URL, which we pieced together from several different sources, ended up following this format:

postgres://DbUsername:DbPassword@RdsEndpoint:5432/DbName?sslCert=pathToCert&sslMode=verifyAll
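For a concrete sense of how those pieces fit together, here’s a minimal sketch of assembling that URL in Python. The credentials, endpoint, and certificate path below are all hypothetical, and the exact SSL parameter names depend on which Postgres client library you’re using:

```python
from urllib.parse import quote, urlencode

# Hypothetical values -- substitute your own RDS credentials and endpoint.
db_user = "ups_service"
db_password = "s3cret/pass"  # special characters must be URL-encoded
rds_endpoint = "ups-db.cluster-abc123.us-east-1.rds.amazonaws.com"
db_name = "ups_production"

# Query parameters mirror the format above; treat the SSL parameter names
# as placeholders, since they vary between client libraries.
params = urlencode({
    "sslCert": "/app/.postgresql/rds-combined-ca-bundle.pem",
    "sslMode": "verifyAll",
})

database_url = (
    f"postgres://{db_user}:{quote(db_password, safe='')}"
    f"@{rds_endpoint}:5432/{db_name}?{params}"
)
print(database_url)
```

URL-encoding the password matters here: characters like `/` or `@` in a raw password will break parsing of the connection string.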

RDS supports SSL encryption between source applications and its Postgres instances, and we wanted to take advantage of that feature. (That’s the “sslCert=” option on the end of the URL.) We found a buildpack that would automatically pull the latest AWS certificate whenever our app deployed, which we forked and included on the Heroku instance.

We then pasted our completed database url into our app’s environment variables, held our breath, and waited for the Heroku build to deploy… only to be greeted by our constant companion for the next 24 hours, the Postgres `Connection refused` error:

Error: psql: could not connect to server: Connection refused
        Is the server running on host <RDS ENDPOINT> and accepting TCP/IP connections on port 5432?

After lots of Googling, and the help of Heroku’s peering docs in particular, it appeared that we needed to work with our Route Tables and Security Groups. We grabbed our assigned block of IP addresses (a.k.a. CIDR block) from the Heroku Private Space and added a route for it to the main Route Table on the VPC, targeting the VPC peering connection. Then we created a new Security Group and allowed incoming traffic on the default Postgres port (5432) from that CIDR block. The result?
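A quick way to sanity-check that a Security Group rule like this will actually match traffic from your dynos is to test an observed source address against the space’s CIDR block. The addresses below are made up for illustration:

```python
import ipaddress

# Hypothetical values: the CIDR block Heroku assigns to the Private Space,
# and a dyno source IP you might see in connection logs.
heroku_space_cidr = ipaddress.ip_network("10.1.144.0/20")
dyno_ip = ipaddress.ip_address("10.1.151.23")

# The Security Group rule only admits traffic on port 5432 whose source
# address falls inside this block, so it's worth verifying the two match.
print(dyno_ip in heroku_space_cidr)  # -> True
```

If this check fails, the Security Group rule will silently drop the connection, and you’ll see exactly the `Connection refused` behavior described here.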

Error: psql: could not connect to server: Connection refused
        Is the server running on host <RDS ENDPOINT> and accepting TCP/IP connections on port 5432?

Over and over again.

By about the twentieth `Connection refused` error, we had roped in our colleague Ed as well, and we were getting pretty desperate. Just when we were ready to admit defeat and wait for Trey to come back the following week, Ed’s Googling turned up the idea that the main route table is ignored for any subnet explicitly associated with a non-main route table. This struck a chord with us, because one of our many experiments had included opening up the main route table to all traffic, just to test whether that would allow our database to connect. The result of that test, like so many others, had been our dark passenger:

Error: psql: could not connect to server: Connection refused
        Is the server running on host <RDS ENDPOINT> and accepting TCP/IP connections on port 5432?

That was when we looked at the database and its subnet more closely, and sure enough, the subnet had a non-main route table attached. That meant that none of the routes on the main route table were being applied, which is why opening it up to all traffic hadn’t changed anything. Once we added a route for the Heroku CIDR block, targeting the peering connection, to the subnet’s route table, it was goodbye, dark passenger!
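The gotcha generalizes: a subnet with an explicit route-table association never consults the main table at all. A toy model of that resolution logic (all IDs and routes here are hypothetical) makes the behavior we ran into obvious:

```python
# Simplified model of how a VPC decides which route table applies to a
# subnet: an explicit (non-main) association wins outright; only subnets
# without one fall back to the main route table.

def effective_route_table(subnet_id, explicit_associations, main_table):
    """Return the route table the VPC actually consults for this subnet."""
    return explicit_associations.get(subnet_id, main_table)

main_table = {"0.0.0.0/0": "igw-main"}                  # wide open: our failed experiment
subnet_table = {"10.1.144.0/20": "pcx-heroku-peering"}  # the fix: CIDR route lives here

# The RDS subnet has its own (non-main) route table attached.
explicit_associations = {"subnet-rds": subnet_table}

# The main table's routes never apply to the RDS subnet:
print(effective_route_table("subnet-rds", explicit_associations, main_table))
# A subnet with no explicit association falls back to the main table:
print(effective_route_table("subnet-other", explicit_associations, main_table))
```

In this model, any route added to `main_table` is invisible to `subnet-rds`, which is exactly why our open-everything experiment on the main route table changed nothing.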

Now that we’re done making changes and have improved our understanding, we’re pretty sure of the actual picture: traffic from the application in the Heroku Private Space crosses the peering connection, is directed by the subnet’s (non-main) route table, and is admitted by the Security Group into the RDS instance in its private subnet.

Full disclosure: this whole process took about 2.5 days for Tyler and me to figure out, and it was fairly frustrating along the way. But now that it’s done, we’ve been really happy with the results. This database is fast. And knowing the basics of networking across VPCs has been really helpful in understanding other architectural possibilities as we expand into AWS. Plus, we all know that the most satisfying technical wins are the ones that make you struggle. Let’s just say that many ‘lawd baby’ gifs were exchanged once that DB finally connected.
