AWS Cost-Savings Tips

Matt Weingarten
3 min read · Jan 5, 2024

This could all be yours

Note: This is a post written in collaboration with the people at Hevo Data. Definitely check out their platform if you’re interested!

Introduction

In my previous post, I discussed the role that data engineering plays in the FinOps journey. With applications that consume and produce large data sets, data engineers must be aware of best practices to ensure that their costs don’t explode.

I wanted to take some time to focus on some of the core DE services and tools themselves, and that’s why we’re looking at AWS today. A lot of the below points will reference a post from around this same time last year, but they’re certainly worth revisiting.

S3 Lifecycle Rules

When files are written to S3, they stay there until something deletes them (as they should). By default, these objects land in the Standard storage class, which means they’re quick to retrieve, but their storage costs are the highest. Unless it doesn’t make sense in a business context, it’s always advisable to apply lifecycle rules to your S3 buckets so that older objects are either moved to cheaper storage classes (where storage costs are much lower, in exchange for retrieval fees and, for the archive classes, longer retrieval times) or removed entirely.
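As a rough illustration of the kind of rule described above, here is a minimal sketch in Python. The `logs/` prefix, the day thresholds, and the function names are assumptions made up for this example, not recommendations. The dictionary follows the shape that boto3’s `put_bucket_lifecycle_configuration` call expects.

```python
def build_lifecycle_config(prefix, ia_days=30, glacier_days=90, expire_days=365):
    """Build an S3 lifecycle configuration for objects under `prefix`.

    The thresholds are illustrative defaults only; pick values that match
    how your data is actually used.
    """
    if not (ia_days < glacier_days < expire_days):
        raise ValueError("expected ia_days < glacier_days < expire_days")
    return {
        "Rules": [
            {
                "ID": "tier-then-expire",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Move older objects to cheaper storage classes...
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                ],
                # ...and eventually delete them.
                "Expiration": {"Days": expire_days},
            }
        ]
    }


def apply_lifecycle_config(bucket, config):
    """Apply the configuration. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above runs without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


config = build_lifecycle_config("logs/")
```

Keep in mind that this call replaces the bucket’s entire lifecycle configuration, so any existing rules need to be included in the `Rules` list you send.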

Sometimes, it’s hard to know what actions to take on your S3 buckets. After all, do you truly know how your stakeholders are using your data? That’s where something like S3 Intelligent-Tiering can come into play. Intelligent-Tiering is a storage class that monitors how often each object is accessed and automatically moves objects that haven’t been touched for a while into cheaper access tiers, then back to the frequent-access tier when they’re read again. You don’t have to write the rules yourself, though there is a small per-object monitoring charge. This is a great safeguard to have.
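One way to opt existing objects into Intelligent-Tiering is a lifecycle transition to the `INTELLIGENT_TIERING` storage class. The sketch below builds such a rule in Python. The rule ID and function name are hypothetical, and the empty default prefix (meaning the whole bucket) is an assumption you may want to narrow.

```python
def intelligent_tiering_rule(prefix=""):
    """Lifecycle rule that moves objects under `prefix` to Intelligent-Tiering.

    An empty prefix matches every object in the bucket.
    """
    return {
        "ID": "move-to-intelligent-tiering",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # Days=0 asks S3 to transition objects as soon as the rule is evaluated.
        "Transitions": [{"Days": 0, "StorageClass": "INTELLIGENT_TIERING"}],
    }


# Shape expected by boto3's put_bucket_lifecycle_configuration.
lifecycle_config = {"Rules": [intelligent_tiering_rule("raw/")]}
```

For new uploads, you can skip the lifecycle rule and set the storage class at write time instead, for example by passing `ExtraArgs={"StorageClass": "INTELLIGENT_TIERING"}` to boto3’s `upload_file`.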

For lower environments, you likely want to have lifecycle rules enabled by default, as that’s usually not data you need to preserve for long periods of time. You’ll also want to make sure you run your lifecycle suggestions by your data governance team. If you’re working with sensitive data, you might need to have certain policies in effect by default to ensure that PII data can’t stay active for long periods of time.

Compute Services

Most compute services within AWS run on EC2 under the hood, so the same cost-saving principles apply across many of them. Here are some tips for success:

  • Spot nodes: Spot Instances are AWS’s spare compute capacity, generally offered at a significant discount compared to on-demand prices. The main caveat with Spot is that AWS can reclaim those instances with only a two-minute warning, so you usually want to avoid using them for your most critical applications or anything that can’t tolerate interruption.
  • Graviton: Graviton is AWS’s family of Arm-based processors, and it keeps being refined with each generation. The price-to-performance ratio is usually a solid improvement over comparable x86 instances, so it’s worth taking advantage of if the conversion process (rebuilding for Arm, checking dependencies) isn’t too difficult. Note that Graviton is also supported in a number of managed services, such as Lambda and RDS.
  • gp3: gp3 is AWS’s latest general-purpose SSD volume type for EBS and is definitely the direction to go (although getting gp3 into EKS can be a bit of a challenge). It is priced lower per GB than gp2 and lets you provision IOPS and throughput independently of volume size. EBS costs can become a problem if you’re not careful, so it’s good to switch to gp3 as a general safeguard.
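To make the gp3 tip concrete, here is a hedged sketch of how you might find gp2 volumes and convert them with boto3. The function names and the plan-first `apply` flag are choices made for this example. The `ec2` argument is assumed to be a boto3 EC2 client, and only the `describe_volumes` and `modify_volume` calls are used.

```python
def find_gp2_volumes(ec2):
    """Return the IDs of all gp2 volumes visible to the given EC2 client."""
    volume_ids = []
    kwargs = {"Filters": [{"Name": "volume-type", "Values": ["gp2"]}]}
    while True:
        page = ec2.describe_volumes(**kwargs)
        volume_ids.extend(v["VolumeId"] for v in page["Volumes"])
        token = page.get("NextToken")
        if not token:
            return volume_ids
        kwargs["NextToken"] = token  # fetch the next page of results


def migrate_to_gp3(ec2, volume_ids, apply=False):
    """Convert volumes to gp3; with apply=False, only report what would change."""
    planned = []
    for volume_id in volume_ids:
        if apply:
            ec2.modify_volume(VolumeId=volume_id, VolumeType="gp3")
        planned.append(volume_id)
    return planned


def main():
    import boto3  # requires boto3 and AWS credentials

    ec2 = boto3.client("ec2")
    for volume_id in migrate_to_gp3(ec2, find_gp2_volumes(ec2)):
        print(f"would convert {volume_id} to gp3")
```

Calling `main()` only prints a plan, since `apply` defaults to `False`. Before applying for real, check large gp2 volumes: their size-based baseline IOPS can be higher than gp3’s default, in which case you may need to provision extra IOPS on the gp3 volume.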

Miscellaneous

In addition to the above, make sure that you have a proper tagging standard set up for all of your AWS resources. Storage and compute for the same application should carry the same tags so that their costs roll up together. Also get familiar with AWS’s tools for drilling into cost data, such as Cost Explorer, so you can break spend down by tag when necessary.
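A tagging standard is easier to keep when something checks it. The sketch below is one minimal way to flag resources that miss required tags. The tag keys in `REQUIRED_TAGS` and the sample values are hypothetical, and the input uses the `[{"Key": ..., "Value": ...}]` list shape that many boto3 calls return for tags.

```python
REQUIRED_TAGS = {"Application", "Environment", "Owner"}  # hypothetical standard


def missing_tags(tags, required=REQUIRED_TAGS):
    """Return the required tag keys that are absent or blank, sorted by name."""
    present = {t["Key"] for t in tags if t.get("Value", "").strip()}
    return sorted(set(required) - present)


bucket_tags = [
    {"Key": "Application", "Value": "clickstream"},
    {"Key": "Environment", "Value": "dev"},
]
print(missing_tags(bucket_tags))  # ['Owner']
```

A check like this could run in CI against your infrastructure definitions, or on a schedule against live resources, so untagged spend gets caught before it shows up in a bill.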

You can also establish budgets in AWS (filtered by tags) and get alerted when those budgets are exceeded or are forecasted to be exceeded. Having these alerts in place gives you peace of mind to keep developing, knowing that you’ll be prompted into action when something actually does come up.
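For a sense of what a tag-scoped budget with both kinds of alert might look like, here is a sketch that builds the request for boto3’s Budgets `create_budget` call. The account ID, budget name, limit, tag, and email address are placeholders. The `user:Key$Value` tag filter string reflects my understanding of the expected format and is worth verifying against the AWS Budgets documentation before use.

```python
def build_budget_request(account_id, name, monthly_limit_usd, tag_key, tag_value, email):
    """Build keyword arguments for the AWS Budgets create_budget call."""

    def alert(kind):
        # kind is "ACTUAL" (already exceeded) or "FORECASTED" (on track to exceed)
        return {
            "Notification": {
                "NotificationType": kind,
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 100.0,
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
        }

    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": name,
            "BudgetLimit": {"Amount": str(monthly_limit_usd), "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
            # Assumed filter format for a user-defined cost allocation tag.
            "CostFilters": {"TagKeyValue": [f"user:{tag_key}${tag_value}"]},
        },
        "NotificationsWithSubscribers": [alert("ACTUAL"), alert("FORECASTED")],
    }


request = build_budget_request(
    "123456789012", "clickstream-monthly", 500, "Application", "clickstream",
    "data-eng@example.com",
)
# With boto3 and credentials available:
#   boto3.client("budgets").create_budget(**request)
```

For the tag filter to match anything, the tag also has to be activated as a cost allocation tag in the billing console.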

Conclusion

AWS is a massive platform and can be hard to understand at first, but there are a lot of different ways you can save money while using it. There is plenty of great documentation out there with tips as well, so make sure to go out and find it when you start developing. Happy saving!

Matt Weingarten

Currently a Data Engineer at Samsara. Previously at Disney, Meta, and Nielsen. Bridge player and sports fan. Thoughts are my own.