🏦 Analyzing my monthly expenses with S3 triggers and AWS Lambda using AWS CDK
Amazon S3 offers event notifications that can send messages to a destination of your choice when specific events occur. You can use these for things like data hygiene, extra security measures, or low-cost serverless processing.
Supported destinations for notifications include:
- Amazon Simple Notification Service (Amazon SNS) lets you fan notifications out to virtually any part of your system architecture.
- An Amazon Simple Queue Service (Amazon SQS) queue is suitable if you expect a high volume of files and events.
- AWS Lambda is a good option when you don't expect a lot of files and don't need to decouple the input and output processing.
What are we building?
You’ll always want to use some form of Infrastructure as Code (IaC), and for that, my preference is AWS CDK. It’s a straightforward way to deploy your resources, and it’s been growing consistently over the last couple of years. We will use the AWS Lambda destination directly, without attaching SQS or SNS in between. Once finished, our Lambda function will print its results to CloudWatch.
Creating the infrastructure with CDK
First, we will initialise the AWS CDK project and open it in our IDE, Visual Studio Code.
Next, we will define our resources in our main CDK stack. These include:
- The AWS S3 bucket where we will put the bank statements.
- The AWS Lambda function that analyses the bank statements and sorts our spending into categories.
- The event source that notifies the AWS Lambda function of any newly created object in our S3 bucket.
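In CDK for Python, a stack wiring these three resources together could look roughly like this. It is a sketch rather than the exact project source: the construct IDs, the `lambda` asset directory, and the handler name `analyzer.handler` are all illustrative.

```python
from aws_cdk import Stack, aws_s3 as s3, aws_lambda as _lambda
from aws_cdk.aws_lambda_event_sources import S3EventSource
from constructs import Construct


class ExpenseAnalyzerStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # The bucket that will receive the monthly bank statements.
        bucket = s3.Bucket(self, "StatementsBucket")

        # The function that analyses each statement; code lives in ./lambda.
        analyzer = _lambda.Function(
            self,
            "AnalyzerFunction",
            runtime=_lambda.Runtime.PYTHON_3_11,
            handler="analyzer.handler",
            code=_lambda.Code.from_asset("lambda"),
        )

        # Invoke the function whenever a new object lands in the bucket.
        # This also grants the function permission to read the bucket.
        analyzer.add_event_source(
            S3EventSource(bucket, events=[s3.EventType.OBJECT_CREATED])
        )
```

Using `S3EventSource` keeps the notification wiring next to the function definition; the alternative is `bucket.add_event_notification` with a `LambdaDestination`, which reads better when the bucket is the focal point.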
Analysing the statements with Python
If we now drop a file in our new S3 bucket, it will trigger an execution of our AWS Lambda function. Now, it’s time to write our code that takes a look at our monthly spending.
Firstly, we will want to download the file from S3 in our Lambda function to read it.
With access to the file, we can iterate over its contents to fetch each record and categorise it. For the purposes of this post, we are doing nothing complex: we have just created a set of categories with keywords that we match against the description of each record.
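That keyword matching can be sketched as follows, assuming the statement is a CSV with `description` and `amount` columns; the category names and keywords here are made up and would need to match your own bank's descriptions:

```python
import csv
import io
from collections import defaultdict

# Hypothetical categories, each matched by keywords in the description.
CATEGORIES = {
    "groceries": ["supermarket", "grocery"],
    "transport": ["fuel", "train", "parking"],
    "subscriptions": ["netflix", "spotify"],
}


def categorise(description):
    """Return the first category whose keywords appear in the description."""
    desc = description.lower()
    for category, keywords in CATEGORIES.items():
        if any(keyword in desc for keyword in keywords):
            return category
    return "other"


def summarise(csv_text):
    """Sum the amounts per category for a CSV bank statement."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[categorise(row["description"])] += float(row["amount"])
    return dict(totals)
```

For example, `summarise("description,amount\nSUPERMARKET A,12.50\n")` puts the 12.50 under `groceries`; anything no keyword matches lands in `other`.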
The Result
The result gives me a better picture of my monthly spending and, more importantly, shows that my AWS bill will not be a big part of it. Thanks to the serverless components and the fact that I’m not doing any computationally heavy lifting, my bill will remain low.
You can imagine more use cases for this solution; popular ones include:
- Creating automatic thumbnails upon image drop
- Automatically moving high-res assets to cold storage while creating and keeping low-resolution variants
- Checking the metadata and security aspects of a file upon creation, before making it available to the rest of your application
Remember This
- Send events to SNS/SQS if you want to decouple the process and allow for more scale. That way, you can increase the files-per-Lambda ratio and be more cost-effective.
- There is no latency guarantee, which makes this setup unreliable for live data use cases.
- Serverless means you pay per invocation, which makes solutions like these very cost-effective: if you are not using it, you are not paying for it.
You can find the complete project source in my repository here.
Are you looking for more?
We’re launching a newsletter soon that will include weekly posts like these. You can find the newsletter and sign up here.