Estimating AWS Infrastructure Cost from Terraform Templates

I had this idea while prototyping some infrastructure management with Terraform. Wouldn’t it be useful to know how expensive your infrastructure was going to be before you launched it? With tools like Terraform, CloudFormation, and others, you can “codify” your infrastructure, and having infrastructure that can be parsed with code should allow us to do exactly that price estimation. I had assumed AWS had a pricing API when I set out on this task, only to discover that I basically had to write my own.

AWS does not make price estimation easy for you. They do have something that they call a Price List API; however, it’s not something you can query directly. You can get all the pricing data for a single service, such as EC2, but that’s a relatively large data set. If you want the pricing data for, say, a particular instance size in a particular region, you can’t ask for just that. You have to download all the data, even if you’re only interested in a small part of it.
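To make the "download everything, then filter" problem concrete, here is a minimal sketch of the client-side filtering you end up doing. The sample rows, prices, and column names ("SKU", "Instance Type", "Location", "PricePerUnit") are made up for illustration and are assumptions about the file layout, not a copy of the real data. The skip_lines parameter is also hypothetical; it is there in case the real file carries metadata lines above the header row.

```python
import csv
import io


def filter_price_rows(csv_text, criteria, skip_lines=0):
    """Return the rows whose columns equal every value in `criteria`.

    `skip_lines` (hypothetical) drops any metadata lines that precede
    the header row.
    """
    buf = io.StringIO(csv_text)
    for _ in range(skip_lines):
        next(buf, None)
    reader = csv.DictReader(buf)
    return [
        row for row in reader
        if all(row.get(col) == val for col, val in criteria.items())
    ]


# Tiny made-up sample standing in for a full per-service file.
# Column names and prices are illustrative assumptions only.
SAMPLE = (
    '"SKU","Instance Type","Location","PricePerUnit"\n'
    '"AAA111","t2.micro","US East (N. Virginia)","0.01"\n'
    '"BBB222","t2.micro","EU (Ireland)","0.02"\n'
    '"CCC333","m4.large","US East (N. Virginia)","0.10"\n'
)

matches = filter_price_rows(
    SAMPLE,
    {"Instance Type": "t2.micro", "Location": "US East (N. Virginia)"},
)
print([row["SKU"] for row in matches])
```

With only three rows this is trivial, but against a full service file the same filter means downloading and scanning everything first, which is what pushed me toward a database.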

I ended up writing three projects to accomplish this. First, I needed a way to ingest the pricing data into a database. This simple Python script does that for me: https://github.com/Bjorn248/aws_pricing_data_ingestor. I used MariaDB instead of Postgres because, given the way the CSV provided by AWS was quoted, it was not directly compatible with Postgres’s bulk CSV import, whereas MariaDB’s LOAD DATA LOCAL INFILE does the job just fine. One thing I noticed while writing this was that AWS updates its pricing data quite frequently. They are constantly adding new services and adding columns to existing tables, so I had to implement schema generation inside the ingestion script to keep up. The benefit is that I won’t have to change the script or maintain any static schema going forward, and if AWS adds new services, my script should handle them without modification. The script runs daily in Lambda and ingests the latest pricing data into an RDS MariaDB instance.
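The schema generation idea can be sketched roughly as follows: read the CSV header and derive a CREATE TABLE statement from it, so new columns or services need no code changes. This is an illustration of the approach, not the code from the ingestor repository. The function names, the identifier cleanup rule, and the choice of VARCHAR(255) for every column are all assumptions.

```python
import csv
import io
import re


def column_name(header):
    """Turn a CSV header such as 'Instance Type' into a safe SQL identifier."""
    name = re.sub(r"[^0-9A-Za-z]+", "_", header).strip("_")
    return name or "col"


def create_table_sql(table, header_line):
    """Build a CREATE TABLE statement from one CSV header line.

    Every column is typed VARCHAR(255) for simplicity (an assumption).
    Headers that collapse to the same identifier are not deduplicated.
    """
    headers = next(csv.reader(io.StringIO(header_line)))
    cols = ",\n".join(f"  `{column_name(h)}` VARCHAR(255)" for h in headers)
    return f"CREATE TABLE `{table}` (\n{cols}\n);"


print(create_table_sql("AmazonEC2", '"SKU","Instance Type","PricePerUnit"'))
```

Because the table definition is rebuilt from whatever header the file has that day, a new column simply shows up as a new field on the next run.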

The second project I wrote was an API to expose the data in MariaDB. The code can be found here: https://github.com/Bjorn248/graphql_aws_pricing_api and the publicly accessible endpoint can be found here: https://fvaexi95f8.execute-api.us-east-1.amazonaws.com/Dev/graphql/. You can send POST requests to it with valid GraphQL request bodies and get real pricing data! It takes about 10 seconds for Lambda to warm up, so be patient with that first response. I decided to use GraphQL because it lets the client tell the API exactly which data it needs and in what shape. I also had to do GraphQL schema generation inside the API, which happens every time it is launched. I know the code here is not very well optimized, and I would appreciate any feedback, but it does the job of getting data for Terraform Cashier, the tool that started this whole journey. Another thing that came to mind as I finished the API was that anyone can use it for their own project. They could write a similar tool for CloudFormation, or use it in ways that I had never dreamed of. The API is probably the best and most generally usable thing I made during this process, despite its disappointing lack of performance optimization.
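For anyone who wants to try it, a request might look something like the sketch below. The query itself is hypothetical: the type, argument, and field names are placeholders I chose for illustration, so check the API repository for the real generated schema. Sending the body as JSON with a "query" key is the usual GraphQL-over-HTTP convention and is assumed here. The generous timeout allows for the Lambda warm-up.

```python
import json
import urllib.request

ENDPOINT = "https://fvaexi95f8.execute-api.us-east-1.amazonaws.com/Dev/graphql/"

# Hypothetical query: type, argument, and field names are placeholders,
# not taken from the API's actual schema.
QUERY = """
{
  AmazonEC2(Location: "US East (N. Virginia)", InstanceType: "t2.micro") {
    PricePerUnit
    Unit
  }
}
"""


def build_request(query, endpoint=ENDPOINT):
    """Wrap a GraphQL query string in a JSON POST request."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, timeout=30):
    """Send the query and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(query), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network call; the first response may be slow.
    print(run_query(QUERY))
```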

The third and final leg of the journey was Terraform Cashier, the tool that I wanted to write in the first place. The code for the cashier app can be found here: https://github.com/Bjorn248/terraform_cashier. It parses the output file of a terraform plan, as suggested by PatrikKarisch. My goal was to estimate cost before launch, not after. Many existing tools, and even AWS itself, tell you how much things cost after you launch them, so I figured the real value would come from being able to estimate cost before launching anything.
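The general shape of the estimation can be sketched as below. This is not how Terraform Cashier itself is implemented. It assumes the plan has been rendered to JSON with terraform show -json, which newer Terraform versions support, and it only looks at aws_instance resources being created. The hourly prices are made-up placeholders, and 730 hours per month is an assumed average.

```python
import json
from collections import Counter


def count_planned_instances(plan_json):
    """Count aws_instance resources to be created, grouped by instance type."""
    plan = json.loads(plan_json)
    counts = Counter()
    for rc in plan.get("resource_changes", []):
        change = rc.get("change", {})
        if rc.get("type") == "aws_instance" and "create" in change.get("actions", []):
            after = change.get("after") or {}
            counts[after.get("instance_type", "unknown")] += 1
    return counts


def estimate_monthly_cost(counts, hourly_prices, hours=730):
    """Multiply instance counts by hourly prices.

    Raises KeyError if a planned instance type has no price.
    """
    return sum(n * hourly_prices[t] * hours for t, n in counts.items())


# Made-up plan fragment in the JSON plan layout; values are illustrative.
SAMPLE_PLAN = json.dumps({
    "resource_changes": [
        {"type": "aws_instance",
         "change": {"actions": ["create"], "after": {"instance_type": "t2.micro"}}},
        {"type": "aws_instance",
         "change": {"actions": ["create"], "after": {"instance_type": "t2.micro"}}},
        {"type": "aws_instance",
         "change": {"actions": ["create"], "after": {"instance_type": "m4.large"}}},
        {"type": "aws_s3_bucket",
         "change": {"actions": ["create"], "after": {}}},
    ]
})

# Placeholder prices, not real AWS rates.
PRICES = {"t2.micro": 0.5, "m4.large": 0.25}

counts = count_planned_instances(SAMPLE_PLAN)
print(counts, estimate_monthly_cost(counts, PRICES))
```

In a real tool the prices would come from the pricing API rather than a hard-coded dictionary, and resources other than EC2 instances would need their own pricing rules.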

For now, my journey into AWS price estimation from Terraform templates has ended. I’ve decided that the three projects that came out of it are stable enough for my personal use, but I thought I’d share the journey and use this opportunity to gauge interest in the pricing API and in pre-launch price estimation from codified infrastructure.