
Part 3: Identifying EC2 Compute Costs

Christopher Harris
Understanding the AWS CUR
3 min read · May 28, 2024


My name is Christopher Harris, and I am a maintainer on the FinOps FOCUS project. I’ve built cost management products at Datadog and CloudHealth (Broadcom) over the past 8 years, and I’m excited to share some of what I’ve learned with you.

Virtual Machines, Dedicated Instances, Dedicated Hosts, Bare Metal Instances, Spot Instances, Reserved Instances … It’s pretty simple to identify EC2 compute costs within the AWS CUR, right?


It took several iterations before I trusted a SQL filter to capture all EC2 compute charges. Here is the process I went through.

Approaches

Below are some of my initial approaches before I finally reached out … yes, actually reached out to AWS’ Billing team for help with this definition. All examples below are written for Amazon Athena.

Iteration 1: AmazonEC2 usage rows ❌

WHERE
line_item_product_code = 'AmazonEC2' AND
line_item_line_item_type LIKE '%Usage'

This was my first, naive approach. The filter also captures non-compute resources like EBS volumes and non-compute operations like networking … moving on.

Iteration 2: EC2 Compute VM rows ❌

After filtering out non-compute line items, my definition transformed into:

WHERE
line_item_product_code = 'AmazonEC2' AND
line_item_line_item_type LIKE '%Usage' AND
line_item_usage_type LIKE '%BoxUsage%'

While this statement narrows the results down to EC2 compute line items, it captures only standard EC2 virtual machines; other EC2 compute offerings like Dedicated, Bare Metal, and Spot instances are omitted.

Iteration 3: More EC2 Compute rows

While BoxUsage remains the most widely used usage type (duh, VMs are the cheapest), I found additional usage types like SpotUsage, DedicatedUsage, and HostUsage, so my definition transformed into:

WHERE
line_item_product_code = 'AmazonEC2' AND
line_item_line_item_type LIKE '%Usage' AND
-- REGEXP_LIKE matches anywhere in the string, so no leading or trailing wildcard is needed
REGEXP_LIKE(line_item_usage_type, '(Box|Spot|Dedicated|Host)Usage')

Not convinced this definition was bulletproof, I did additional digging through AWS’ documentation and found even more usage types.
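A pattern like this can be tried locally on a handful of usage types before running it against a full CUR table. The sketch below uses Python's `re.search`, which, like `REGEXP_LIKE`, looks for the pattern anywhere in the string. The sample usage types are illustrative values in the usual optional-region-prefix shape, not rows from a real bill.

```python
import re

# Same alternation as the Iteration 3 filter. re.search matches anywhere
# in the string, mirroring the "contains" behavior of REGEXP_LIKE.
COMPUTE_PATTERN = re.compile(r"(Box|Spot|Dedicated|Host)Usage")

# Illustrative usage types (assumed shapes, not taken from a real CUR).
SAMPLE_USAGE_TYPES = [
    "BoxUsage:m5.large",
    "USW2-BoxUsage:c5.xlarge",
    "EUW1-SpotUsage:r5.large",
    "DedicatedUsage:m5.large",
    "HostUsage:m5",
    "EBS:VolumeUsage.gp3",
    "USW2-DataTransfer-Out-Bytes",
]


def is_compute(usage_type: str) -> bool:
    """Return True when the usage type contains one of the compute substrings."""
    return COMPUTE_PATTERN.search(usage_type) is not None


matched = [u for u in SAMPLE_USAGE_TYPES if is_compute(u)]
skipped = [u for u in SAMPLE_USAGE_TYPES if not is_compute(u)]
print("matched:", matched)
print("skipped:", skipped)
```

The region-prefixed values match because the search is not anchored to the start of the string, while the EBS and data transfer values fall through.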

Iteration 4: MORE EC2 Compute rows ⚠️

After digging around in AWS’ documentation, I found this article with additional lineItem/UsageType substrings:

  • HostBoxUsage
  • ReservedHostUsage
  • SchedUsage
  • UnusedBox

So, I amended my filter with the additional usage types:

WHERE
line_item_product_code = 'AmazonEC2' AND
line_item_line_item_type LIKE '%Usage' AND
-- UnusedBox is a substring of the usage type, so match it with LIKE rather than equality
(REGEXP_LIKE(line_item_usage_type, '(Box|Spot|Dedicated|Host|HostBox|ReservedHost|Sched)Usage') OR
line_item_usage_type LIKE '%UnusedBox%')

I still wasn’t convinced. Given that the article says “contains only”, I wasn’t sure what to do next.
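One thing worth checking at this stage is whether the longer alternation actually changes the result. Because the match is a substring search, `HostBoxUsage` already contains `BoxUsage` and `ReservedHostUsage` already contains `HostUsage`, so only `Sched` and `UnusedBox` should add coverage. The Python sketch below tests that on illustrative usage types (assumed shapes, including a region prefix and instance suffix, which an exact equality test on `UnusedBox` would miss).

```python
import re

# Shorter alternation: relies on substring matching to cover HostBox/ReservedHost.
SHORT = re.compile(r"(Box|Spot|Dedicated|Host|Sched)Usage")
# Longer alternation, listing every substring found in the documentation.
LONG = re.compile(r"(Box|Spot|Dedicated|Host|HostBox|ReservedHost|Sched)Usage")

# Illustrative usage types (assumed shapes, not taken from a real CUR).
SAMPLES = [
    "HostBoxUsage:m5.large",
    "ReservedHostUsage:m5",
    "USW2-SchedUsage:c5.large",
    "USW2-UnusedBox:m5.large",
    "EBS:VolumeUsage.gp3",
]


def matches(pattern: re.Pattern, usage_type: str) -> bool:
    """Regex hit anywhere in the string, or the UnusedBox substring."""
    return pattern.search(usage_type) is not None or "UnusedBox" in usage_type


short_hits = [u for u in SAMPLES if matches(SHORT, u)]
long_hits = [u for u in SAMPLES if matches(LONG, u)]

print("same result:", short_hits == long_hits)
print("hits:", short_hits)
# An equality test would not match a prefixed or suffixed value.
print("equality match:", "USW2-UnusedBox:m5.large" == "UnusedBox")
```

On these samples both patterns select the same rows, so the extra alternatives serve as documentation rather than extra coverage.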

Iteration 5: Top-down definition ✅

I finally contacted Chris Strzelczyk, an AWS TAM and a CUR expert, who pointed me to this definition from the AWS CUR Query Library for filtering all possible EC2 compute instance charges.

WHERE
line_item_product_code = 'AmazonEC2' AND
line_item_operation LIKE '%RunInstances%' AND
line_item_usage_type NOT LIKE '%DataXfer%' AND
product_servicecode <> 'AWSDataTransfer' AND
line_item_resource_id NOT LIKE 'arn:%:capacity-reservation/%'
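
A quick way to get a feel for what this filter keeps and drops is to run it against a few hand-written rows. The sketch below uses Python's built-in `sqlite3` module as a stand-in for Athena. The table name `cur` and every row value are hypothetical, and SQLite's `LIKE` is case-insensitive for ASCII by default, so treat it as an illustration of the logic rather than a faithful Athena run.

```python
import sqlite3

# Hypothetical rows: (product_code, operation, usage_type, servicecode, resource_id)
ROWS = [
    ("AmazonEC2", "RunInstances", "BoxUsage:m5.large", "AmazonEC2", "i-0aaa"),
    ("AmazonEC2", "RunInstances:SV001", "USW2-SpotUsage:c5.large", "AmazonEC2", "i-0bbb"),
    ("AmazonEC2", "CreateVolume-Gp3", "EBS:VolumeUsage.gp3", "AmazonEC2", "vol-0ccc"),
    ("AmazonEC2", "RunInstances", "USW2-DataXfer-Out", "AWSDataTransfer", "i-0aaa"),
    ("AmazonEC2", "RunInstances", "UnusedBox:m5.large", "AmazonEC2",
     "arn:aws:ec2:us-east-1:111122223333:capacity-reservation/cr-0ddd"),
    ("AmazonRDS", "CreateDBInstance", "InstanceUsage:db.m5.large", "AmazonRDS",
     "arn:aws:rds:us-east-1:111122223333:db:demo"),
]

# The WHERE clause is the top-down definition, unchanged.
QUERY = """
SELECT line_item_usage_type
FROM cur
WHERE
line_item_product_code = 'AmazonEC2' AND
line_item_operation LIKE '%RunInstances%' AND
line_item_usage_type NOT LIKE '%DataXfer%' AND
product_servicecode <> 'AWSDataTransfer' AND
line_item_resource_id NOT LIKE 'arn:%:capacity-reservation/%'
ORDER BY line_item_usage_type
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cur ("
    "line_item_product_code TEXT, line_item_operation TEXT, "
    "line_item_usage_type TEXT, product_servicecode TEXT, "
    "line_item_resource_id TEXT)"
)
conn.executemany("INSERT INTO cur VALUES (?, ?, ?, ?, ?)", ROWS)

kept = [row[0] for row in conn.execute(QUERY)]
print(kept)
conn.close()
```

Only the on-demand and Spot instance rows survive: the EBS row fails the operation check, the data transfer row fails the usage type and service code checks, and the capacity reservation row is excluded by its resource ID.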

Come on AWS!

The most widely used AWS service should have the most straightforward classification!

Is it too much to ask AWS to provide product dimensions like:

  • product/isComputeInstance → yes, no
  • product/computeFamilyName → Virtual Machine, Dedicated Host, etc.

Hopefully, this story saves someone else the same effort I went through.

Maybe someone at AWS reads this story too? 😃

If you’ve discovered additional heuristics for classifying only EC2 compute rows, please comment on this story!

Check out my other stories about Understanding the AWS CUR.
