FinOps for AWS

Venkatesh Muthusami
Sep 1, 2023

Introduction to AWS Cloud FinOps

As cloud usage has grown, customers have been looking for help to closely monitor and optimize their cloud spend. This article looks at the tools and mechanisms that Amazon Web Services (AWS) offers for cost management. Used together, the AWS cost management tools support a FinOps mechanism and a set of best practices that help businesses manage costs and keep cloud spending aligned with their business objectives.

Cloud FinOps includes four principles:

Extract Infra Consumption details from:
* AWS Budgets
* Cost Anomaly Detection
* AWS Cost Explorer (Daily/Monthly Cost Reports)

Explore Cost Optimization opportunities from:
* AWS Trusted Advisor
* Amazon EC2 Processor Efficiency
* AWS Compute Optimizer
* AWS VPC endpoints

Configure Forecasting and alerting using:
* AWS Budgets

Implement changes for:
* AWS Budget Actions
* Anomaly detection findings
* Trusted Advisor and Compute Optimizer Recommendations
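
The first principle, extracting consumption details, can also be automated. The sketch below is a minimal illustration rather than the setup used in this article: it assumes a boto3 Cost Explorer client is passed in, and the function name `cost_by_service` and the seven-day window are hypothetical choices made for the example.

```python
from collections import defaultdict
from datetime import date, timedelta


def cost_by_service(ce_client, days=7, today=None):
    """Sum daily unblended cost per service over the last `days` days.

    `ce_client` is assumed to be a boto3 Cost Explorer client,
    e.g. boto3.client("ce"). The function name is hypothetical.
    """
    end = today or date.today()
    start = end - timedelta(days=days)
    kwargs = {
        "TimePeriod": {"Start": start.isoformat(), "End": end.isoformat()},
        "Granularity": "DAILY",  # "MONTHLY" gives the monthly report
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": "DIMENSION", "Key": "SERVICE"}],
    }
    totals = defaultdict(float)
    while True:
        response = ce_client.get_cost_and_usage(**kwargs)
        for day in response["ResultsByTime"]:
            for group in day["Groups"]:
                service = group["Keys"][0]
                amount = group["Metrics"]["UnblendedCost"]["Amount"]
                totals[service] += float(amount)  # amounts arrive as strings
        token = response.get("NextPageToken")
        if not token:
            break
        kwargs["NextPageToken"] = token  # fetch the next page of results
    return dict(totals)
```

With boto3 installed and credentials configured, `cost_by_service(boto3.client("ce"))` would return a dictionary of service names and their cost for the past week.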

AWS Budgets and Cost Anomaly Detection for Alerts

AWS Budgets lets you set custom budgets that alert when costs or usage exceed (or are forecasted to exceed) the budgeted amount. Budget alerts can be sent by email, to an Amazon Simple Notification Service (Amazon SNS) topic, or both.
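
As a rough sketch of the budget described above, the function below builds the request for the AWS Budgets `create_budget` API with one alert on actual spend and one on forecasted spend. The function name, the 80% threshold, and the email-only subscribers are assumptions made for illustration, not values taken from this article.

```python
def build_monthly_budget(account_id, name, limit_usd, emails, threshold_pct=80.0):
    """Build keyword arguments for the AWS Budgets create_budget call.

    The helper name and the default 80% threshold are hypothetical.
    """

    def notification(kind):
        # kind is "ACTUAL" (cost exceeded) or "FORECASTED" (predicted to exceed)
        return {
            "Notification": {
                "NotificationType": kind,
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": threshold_pct,
                "ThresholdType": "PERCENTAGE",  # percent of the budget limit
            },
            "Subscribers": [
                {"SubscriptionType": "EMAIL", "Address": address}
                for address in emails
            ],
        }

    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": name,
            "BudgetLimit": {"Amount": str(limit_usd), "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            notification("ACTUAL"),
            notification("FORECASTED"),
        ],
    }
```

The result could be passed as `boto3.client("budgets").create_budget(**request)`. To alert through Amazon SNS instead, a subscriber would use `"SubscriptionType": "SNS"` with the topic ARN as the address.
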
Below is a sample email showing the budgeted amount, the threshold values in dollars, and the forecasted amount at the end of the month.

The Cost Anomaly Detection report summarizes unusual AWS usage patterns for the accounts in the AWS organization.
* It detects rare occurrences that look suspicious because they differ from the established pattern of behavior.
* Cost Anomaly Detection is accessible from AWS Cost Management in the console.
* An alert subscription can be created with the list of email addresses of the people who need to be notified.
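
The alert subscription from the last bullet can also be created through the Cost Explorer API. The following is a minimal sketch, assuming a boto3 Cost Explorer client is passed in. The monitor and subscription names, the per-service monitor type, and the 100 USD impact threshold are illustrative assumptions.

```python
def create_service_anomaly_alerts(ce_client, emails, min_impact_usd=100):
    """Create a per-service anomaly monitor and a daily email subscription.

    `ce_client` is assumed to be boto3.client("ce"). All names and the
    threshold below are hypothetical example values.
    """
    monitor = ce_client.create_anomaly_monitor(
        AnomalyMonitor={
            "MonitorName": "service-monitor",
            "MonitorType": "DIMENSIONAL",
            "MonitorDimension": "SERVICE",  # track each AWS service separately
        }
    )
    subscription = ce_client.create_anomaly_subscription(
        AnomalySubscription={
            "SubscriptionName": "daily-anomaly-summary",
            "MonitorArnList": [monitor["MonitorArn"]],
            "Subscribers": [
                {"Type": "EMAIL", "Address": address} for address in emails
            ],
            "Frequency": "DAILY",  # one summary email per day
            # Only report anomalies whose total cost impact reaches the threshold.
            "ThresholdExpression": {
                "Dimensions": {
                    "Key": "ANOMALY_TOTAL_IMPACT_ABSOLUTE",
                    "MatchOptions": ["GREATER_THAN_OR_EQUAL"],
                    "Values": [str(min_impact_usd)],
                }
            },
        }
    )
    return monitor["MonitorArn"], subscription["SubscriptionArn"]
```
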
Below is a sample email listing the recent anomalies detected on EC2, with the corresponding root cause(s).

AWS Trusted Advisor for Cost Optimization opportunities

AWS Trusted Advisor scans AWS infrastructure, compares it to AWS best practices in five categories (Cost Optimization, Performance, Security, Fault Tolerance, Service Limits), and provides recommended actions. This section covers the Cost Optimization category.

AWS Trusted Advisor provides recommendations that help with cost optimization:
* Trusted Advisor evaluates an AWS account by using checks. These checks identify ways to optimize AWS infrastructure and reduce costs.
* A Lambda function is built to extract Trusted Advisor data for the Cost Optimization category across the AWS Organization.
* The function is executed every month and the recommendations are addressed.
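
The core of such an extraction function might look like the sketch below. It is an illustration under assumptions, not the function used here: it takes a boto3 AWS Support client for a single account, and the function name and output fields are hypothetical. Covering the whole organization would mean repeating the call with credentials for each member account.

```python
def cost_optimization_savings(support_client):
    """List Trusted Advisor cost checks with their estimated monthly savings.

    `support_client` is assumed to be boto3.client("support"). The function
    name and the shape of the returned rows are hypothetical.
    """
    checks = support_client.describe_trusted_advisor_checks(language="en")["checks"]
    rows = []
    for check in checks:
        if check["category"] != "cost_optimizing":
            continue  # skip security, performance, and other categories
        result = support_client.describe_trusted_advisor_check_result(
            checkId=check["id"], language="en"
        )["result"]
        summary = result.get("categorySpecificSummary", {}).get("costOptimizing", {})
        rows.append(
            {
                "check": check["name"],
                "status": result.get("status"),
                "flagged_resources": len(result.get("flaggedResources", [])),
                "monthly_savings": summary.get("estimatedMonthlySavings", 0.0),
            }
        )
    # Largest savings first, so the most valuable checks are reviewed first.
    return sorted(rows, key=lambda row: row["monthly_savings"], reverse=True)
```

The AWS Support API that exposes Trusted Advisor is only available on certain AWS Support plans, so this sketch will not work on every account.
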
Below are the most common Cost Optimization opportunities and the recommendations followed at DIRECTV to optimize the services and resources.

The graph shows Monthly Savings Opportunities:
* $ savings for various checks, month on month
* $ savings across AWS accounts for each check

* The data is extracted from Trusted Advisor on a regular basis and the recommendations are addressed.
* The check with the higher $ value is prioritized to realize the cost benefit immediately.
* A ready-made solution (such as a Lambda function to stop RDS instances during off-peak hours) is customized and implemented.
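
As an illustration of the kind of ready-made solution mentioned above, the sketch below stops running RDS instances that carry an opt-in tag. It assumes a boto3 RDS client is passed in. The tag key `AutoStop` and the function name are hypothetical, and a real Lambda handler would call this function on an off-peak schedule.

```python
def stop_tagged_rds_instances(rds_client, tag_key="AutoStop", tag_value="true"):
    """Stop every available RDS instance carrying the opt-in tag.

    `rds_client` is assumed to be boto3.client("rds"). The tag key and
    value are hypothetical conventions chosen for this example.
    """
    stopped = []
    paginator = rds_client.get_paginator("describe_db_instances")
    for page in paginator.paginate():
        for db in page["DBInstances"]:
            tags = {tag["Key"]: tag["Value"] for tag in db.get("TagList", [])}
            is_running = db["DBInstanceStatus"] == "available"
            if is_running and tags.get(tag_key) == tag_value:
                rds_client.stop_db_instance(
                    DBInstanceIdentifier=db["DBInstanceIdentifier"]
                )
                stopped.append(db["DBInstanceIdentifier"])
    return stopped
```

Using an explicit tag keeps production databases out of scope unless their owners opt in.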

AWS Compute Optimizer for Right Sizing and Cost Optimization

AWS Compute Optimizer provides a savings overview, performance enhancements, and optimization recommendations.
It helps avoid overprovisioning and underprovisioning of four types of AWS resources, based on utilization data: Amazon Elastic Compute Cloud (EC2) instance types, Amazon Elastic Block Store (EBS) volumes, Auto Scaling groups (ASG), and AWS Lambda functions.

Example for the EC2 instances/Auto Scaling Group recommendations:
* CPU/Network Bandwidth Overprovisioned:
* Current instance type: t3.2xlarge ($0.3328 per hour)
* Recommended instance type: r6g.xlarge ($0.2016 per hour)
* Savings Opportunity: 39%
Example for EBS Volumes recommendations:
* Migrate the volumes from gp2 to gp3 type for better performance and cost
* Savings Opportunity: 20%
Example for Lambda recommendations:
* Memory Overprovisioned
* Current configured memory: 160 MB ($0.00015)
* Recommended memory: 128 MB ($0.00014)
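
The savings percentages in these examples follow directly from the quoted prices. The short calculation below reproduces the 39% figure for the EC2 example and applies the same formula to the Lambda prices. The helper name is hypothetical, and the 730 hours per month used for the monthly estimate is an assumption, not a figure from the recommendation.

```python
def savings_percent(current_price, recommended_price):
    """Percentage saved by moving from the current to the recommended price."""
    return (current_price - recommended_price) / current_price * 100


# EC2 example: t3.2xlarge at $0.3328/hour to r6g.xlarge at $0.2016/hour
ec2 = savings_percent(0.3328, 0.2016)
print(f"EC2 savings: {ec2:.0f}%")  # prints "EC2 savings: 39%"

# Lambda example: 160 MB at $0.00015 to 128 MB at $0.00014
lam = savings_percent(0.00015, 0.00014)
print(f"Lambda savings: {lam:.1f}%")  # prints "Lambda savings: 6.7%"

# Rough monthly saving per EC2 instance, assuming about 730 hours in a month
HOURS_PER_MONTH = 730  # assumption
monthly_saving = (0.3328 - 0.2016) * HOURS_PER_MONTH
print(f"Monthly saving per instance: ${monthly_saving:.2f}")
```
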

A Lambda function can be built to extract Compute Optimizer data across the AWS Organization. The function is executed periodically to identify any new opportunities and address the recommendations.
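
A minimal sketch of that extraction for EC2 recommendations is shown below. It assumes a boto3 Compute Optimizer client is passed in. The function name and the output fields are hypothetical, and passing member account IDs assumes Compute Optimizer has been enabled for the organization.

```python
def ec2_rightsizing_candidates(co_client, account_ids=None):
    """Collect the top-ranked Compute Optimizer option for each EC2 instance.

    `co_client` is assumed to be boto3.client("compute-optimizer").
    The function name and row layout are hypothetical.
    """
    kwargs = {"accountIds": list(account_ids)} if account_ids else {}
    rows = []
    while True:
        page = co_client.get_ec2_instance_recommendations(**kwargs)
        for rec in page.get("instanceRecommendations", []):
            options = rec.get("recommendationOptions", [])
            if not options:
                continue
            best = min(options, key=lambda option: option.get("rank", 0))
            savings = best.get("savingsOpportunity") or {}
            rows.append(
                {
                    "instance": rec.get("instanceArn"),
                    "finding": rec.get("finding"),
                    "current_type": rec.get("currentInstanceType"),
                    "recommended_type": best.get("instanceType"),
                    "savings_percent": savings.get("savingsOpportunityPercentage"),
                }
            )
        token = page.get("nextToken")
        if not token:
            break
        kwargs["nextToken"] = token  # continue with the next page
    return rows
```
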

AWS VPC endpoints eliminate Data Transfer Out & Cost Incurrence

A VPC endpoint enables private connections between VPCs and supported AWS services, as well as VPC endpoint services powered by AWS PrivateLink. There are two types of VPC endpoints: interface endpoints and gateway endpoints.

* An interface endpoint (AWS PrivateLink) is an elastic network interface with a private IP address. It serves as an entry point for traffic destined to a supported AWS service or a VPC endpoint service.
* Application logs to CloudWatch were passing through the internet via a NAT Gateway, so the data was transferred out of AWS to reach CloudWatch.
* The approach was changed to stream the application logs to CloudWatch through an interface endpoint instead of over the internet via the NAT Gateway.
* The interface endpoint has been created in all the application accounts to stream the logs to CloudWatch without touching the internet.
* This approach eliminates the data transfer out of AWS and avoids the large costs shown in the snapshot for the NAT Gateway.
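
The interface endpoint described above could be requested through the EC2 API roughly as follows. This is a sketch under assumptions: the function name is hypothetical, the VPC, subnet, and security group IDs are placeholders supplied by the caller, and it targets the CloudWatch Logs service name since the traffic in question is application logs.

```python
def logs_endpoint_request(region, vpc_id, subnet_ids, security_group_ids):
    """Build keyword arguments for the EC2 create_vpc_endpoint call.

    Creates an interface endpoint for CloudWatch Logs in the given VPC.
    The helper name is hypothetical; all IDs come from the caller.
    """
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.logs",  # CloudWatch Logs
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Private DNS lets existing log agents resolve the regular service
        # hostname to the endpoint's private IP addresses.
        "PrivateDnsEnabled": True,
    }
```

The request could then be sent with `boto3.client("ec2").create_vpc_endpoint(**request)`. The attached security group has to allow HTTPS (port 443) from the application subnets for log delivery to work.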

Conclusion

Customers often face higher cloud costs after migrating their applications to the cloud. A comprehensive solution can be offered by adopting FinOps best practices and implementing cost optimization measures, resulting in a significant reduction in overall cloud expenditure. Through the FinOps framework, the solution can provide transparent visibility into cloud spending, empowering engineering and business teams to make well-informed decisions. Periodic cost tracking enables application owners to identify and address cost leaks. Furthermore, with AWS Compute Optimizer, compute resources can be sized appropriately to improve efficiency, and the Amazon EC2 processor efficiency approach can be used to change non-production instance types from Intel to Graviton or AMD.
