Master Your Finances: Seamlessly Transfer AWS Cost and Usage Report to Snowflake for Epic Insights

Priyadharshinee Rajaganapathi
Published in BI3 Technologies
6 min read · Jul 12, 2023

AWS Cost and Usage Reports provide detailed data about the estimated charges linked to your AWS account. Each report contains line items for each unique combination of AWS products, usage types, and operations used in the account.

Are you searching for a hassle-free solution to transform cost and usage reports into actionable insights? Then you are in the right place. This blog will show you how to export your AWS cost and usage reports to Snowflake.

Follow the below steps to load cost and usage reports into Snowflake.

STEPS:

  • Create a cost and usage report in AWS.
  • Create a Lambda function to unzip the compressed report files.
  • Trigger the Lambda function when reports are delivered to S3.
  • Schedule a task to move the file data from S3 into a Snowflake table.

PREREQUISITES:

  1. AWS account
  2. S3 bucket
  3. Snowflake account

STEP 1: WAY TO CREATE A COST AND USAGE REPORT IN AWS

The AWS Cost and Usage Reports (AWS CUR) contain the most comprehensive set of cost and usage data available. You can use the cost data to create reports. The cost and usage reports will be delivered to the specified S3 bucket based on the defined time granularity.

STEPS FOR CREATING A COST AND USAGE REPORT:

  • In the AWS Billing console, open Cost & Usage Reports, choose Create report, and give the report a name.
  • For Additional report details, select Include resource IDs to include the IDs of each individual resource in the report.
  • For Data refresh settings, select whether you want the AWS Cost and Usage Reports to refresh if AWS applies refunds, credits, or support fees to your account after finalizing your bill. When a report refreshes, a new report is uploaded to Amazon S3.
  • The next step is to configure the S3 bucket for storing the cost and usage reports.
  • Click Configure. You can either choose an existing bucket or create a new one.
  • The next step is to configure the delivery options.
  • A report path prefix must be specified. The cost and usage reports will be placed in a folder named after the report path prefix inside the S3 bucket.
  • The time granularity can be set to monthly, daily, or hourly, and the data will be delivered to the S3 bucket at that granularity.
  • Report versioning: if Create new report version is selected, each new version of the report is delivered to a separate subfolder.
  • If Overwrite existing report is selected, the data is overwritten in the same report.
  • Enable report data integration (this is optional).
  • Compression type: the compression type can be set to GZIP, ZIP, or Parquet. Click Next, review the details specified, edit if needed, and create the report. (A scripted alternative using boto3 is sketched after this list.)
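If you prefer to script this setup, the same report definition can also be created with the AWS SDK. The snippet below is a minimal sketch using boto3; the report name, bucket, prefix, and region are placeholders, so adjust them to match the choices made in the console.

import boto3

# The Cost and Usage Reports API is served from us-east-1.
cur = boto3.client('cur', region_name='us-east-1')

cur.put_report_definition(
    ReportDefinition={
        'ReportName': 'cost-usage-report',          # placeholder report name
        'TimeUnit': 'DAILY',                        # monthly, daily, or hourly granularity
        'Format': 'textORcsv',
        'Compression': 'GZIP',                      # matches the .gz handling in the Lambda below
        'AdditionalSchemaElements': ['RESOURCES'],  # same as "Include resource IDs"
        'S3Bucket': 'bucket_name',                  # placeholder bucket
        'S3Prefix': 'report-path-prefix',           # placeholder report path prefix
        'S3Region': 'us-east-1',
        'RefreshClosedReports': True,
        'ReportVersioning': 'OVERWRITE_REPORT',
    }
)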

STEP 2 & 3: RUN A LAMBDA FUNCTION WHEN NEW COST AND USAGE REPORTS ARE ADDED TO S3

The Lambda function is used to unzip the delivered .gz report file and store the extracted CSV data in a separate file in S3.

Creating a Lambda function

  • Choose Author from scratch to build a Lambda function from the beginning.
  • Provide the basic information, such as the function name and runtime.
  • For the runtime, choose Python 3.9 and select Create function.
  • Then open the created Lambda function and add an S3 trigger to it.
  • Choose the S3 bucket you want to configure.
  • For the event type, choose All object create events.
  • In the prefix, include the folder path where the report files are delivered in S3.
  • In the suffix, include the extension of the delivered files (.gz).
  • Select the acknowledgement checkbox and click Add.
  • Once this is done, Lambda will be triggered whenever a file lands in the specified S3 location. Use this code in the code area.
import gzip
import os.path
import shutil
import tempfile
import urllib.parse

import boto3
import botocore

def lambda_handler(event, context):
    s3 = boto3.resource('s3')

    # Read the bucket name and object key from the S3 event that triggered this function.
    key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'], encoding='utf-8')
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    print(key, bucket_name)

    # Build local file names in Lambda's temporary storage.
    # key.split('/')[3] assumes the key layout prefix/report-name/date-range/file.csv.gz.
    data_file_name = tempfile.gettempdir() + '/' + key.split('/')[3]
    file_csv = os.path.splitext(data_file_name)[0]
    print(data_file_name, file_csv)

    # Download the compressed report from S3.
    try:
        s3.Bucket(bucket_name).download_file(key, data_file_name)
    except botocore.exceptions.ClientError as e:
        if e.response['Error']['Code'] == "404":
            print("The object does not exist.")
            return
        raise

    # Unzip the .gz file into a plain CSV file.
    with gzip.open(data_file_name, 'rb') as f_in:
        with open(file_csv, 'wb') as f_out:
            shutil.copyfileobj(f_in, f_out)

    # Upload the unzipped CSV back to S3; replace 'csv file path' with the destination key.
    s3.Object(bucket_name, 'csv file path').put(Body=open(file_csv, 'rb'))

The Lambda code defined above runs whenever new cost and usage reports arrive in the S3 bucket. The reports are delivered as gzip-compressed CSV files; the function unzips each file and stores the plain CSV back in the S3 bucket.
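For a quick sanity check, the handler can be invoked with a minimal test event that mimics the S3 notification it expects. The bucket name and object key below are placeholders; note that the key must be at least four segments deep because the code takes key.split('/')[3] as the file name.

test_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "bucket_name"},
                "object": {"key": "prefix/report-name/20230701-20230801/report-00001.csv.gz"}
            }
        }
    ]
}

# Invoke the handler directly (for example, from a local test or the Lambda console test tab).
lambda_handler(test_event, None)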

STEP 4: TASK SCHEDULING FOR MOVING THE S3 FILE DATA TO SNOWFLAKE:

A Snowflake task is scheduled to call a stored procedure that loads the staged report files into a Snowflake table.

Creating a Task:

CREATE OR REPLACE TASK COST_MONITORING_AUTOMATION_TSK
WAREHOUSE = warehouse_name
SCHEDULE = '120 MINUTE'
AS call aws_cost_usage_data();

aws_cost_usage_data() is a user-defined procedure that needs to be created to pull data from the S3 bucket and load it to the Snowflake table at regular intervals. In the above-mentioned code, the aws_cost_usage_data() procedure will be triggered once every 2 hours.
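Note that Snowflake creates tasks in a suspended state, so the task must be resumed before the schedule takes effect:

ALTER TASK COST_MONITORING_AUTOMATION_TSK RESUME;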

When the procedure executes, the staged CSV files are copied from S3 and inserted into the Snowflake table. A sketch of such a procedure is shown below.
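The original post does not show the body of aws_cost_usage_data(), so the following is a minimal sketch, assuming an external stage over the S3 folder that holds the unzipped CSVs and a target table named AWS_COST_USAGE. The stage, file format, table name, S3 path, and credentials are placeholders; a storage integration is the recommended way to grant Snowflake access to the bucket.

CREATE FILE FORMAT IF NOT EXISTS CUR_CSV_FORMAT
  TYPE = 'CSV'
  SKIP_HEADER = 1
  FIELD_OPTIONALLY_ENCLOSED_BY = '"';

CREATE STAGE IF NOT EXISTS CUR_S3_STAGE
  URL = 's3://bucket_name/csv-output/'   -- placeholder S3 path for the unzipped CSVs
  CREDENTIALS = (AWS_KEY_ID = '<key>' AWS_SECRET_KEY = '<secret>')
  FILE_FORMAT = CUR_CSV_FORMAT;

CREATE OR REPLACE PROCEDURE aws_cost_usage_data()
RETURNS VARCHAR
LANGUAGE SQL
AS
$$
BEGIN
  -- Load any newly staged CSV files into the target table.
  COPY INTO AWS_COST_USAGE
    FROM @CUR_S3_STAGE
    FILE_FORMAT = (FORMAT_NAME = 'CUR_CSV_FORMAT')
    ON_ERROR = 'CONTINUE';
  RETURN 'AWS cost and usage data loaded';
END;
$$;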

CONCLUSION:

This post has shown you how to export your AWS cost and usage data to Snowflake, where it can be used for reporting and for optimizing your cloud spending and performance.

Think how awesome it would be to have your AWS cost and usage data always up to date and ready for analysis.

About Us

Bi3 has been recognized for being one of the fastest-growing companies in Australia. Our team has delivered substantial and complex projects for some of the largest organizations around the globe, and we’re quickly building a brand that is well-known for superior delivery.

Website: https://bi3technologies.com/

Follow us on,
LinkedIn: https://www.linkedin.com/company/bi3technologies
Instagram:
https://www.instagram.com/bi3technologies/
Twitter:
https://twitter.com/Bi3Technologies
