Become A Detective With AWS CloudTrail
Learn to read the clues and solve the mystery of who murdered the EventBridge rule!
One day on the Payment Service team, we had a problem with the execution of a Step Function. This state machine was triggered by an EventBridge rule every 45 minutes. To our surprise, the rule had simply disappeared.
We did not know whether one of our developers had deleted it manually or whether an infrastructure change made through our code had removed it.
AWS CloudTrail to the rescue!
I remembered reading about AWS CloudTrail a long time ago and how it recorded the activities that happened within an AWS Account. So I went straight to the AWS Console and started my AWS CloudTrail journey.
What Is AWS CloudTrail?
With AWS CloudTrail, we can monitor the actions that happen inside an AWS account, regardless of how they were triggered: via the AWS CLI, the AWS Console, the AWS SDKs, and so on.
AWS CloudTrail helps prove compliance, improve security, enable auditing, and troubleshoot operational issues.
The actions are recorded in AWS CloudTrail as events.
There are currently three types of events: management events, data events, and Insights events.
Management events
These events cover management (control plane) operations performed on the resources in the account. For example,
- List DynamoDB tables
- Delete a log stream from CloudWatch Logs
- Delete a bucket replication configuration in S3
Data events
These events cover the operations performed on or within a resource. For example,
- Delete an object from an S3 bucket
- Put an item in a DynamoDB table
- Invoke a Lambda function
Insights events
These events flag unusual operational activity within your AWS account. For example,
- An unusual number of invocations of a Lambda function
How Does It Work?
You can see the activity from the last 90 days through the AWS Console (the Event history view, which covers management events). If you want to keep events for longer, you can set up a trail that sends them to an S3 bucket.
In the AWS Console, you can only filter on a limited set of event attributes. Hence, AWS recommends setting up your trail to also send the events to CloudWatch Logs.
With CloudWatch Logs Insights, you can then use complex queries to find the events you are looking for.
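As a rough illustration of that setup, here is a minimal Python sketch. It assumes a boto3 CloudTrail client is passed in, and that the S3 bucket, the log group, and the IAM role already exist with the permissions CloudTrail needs. The function name and all example values are hypothetical.

```python
def create_trail_with_logs(cloudtrail, name, bucket, log_group_arn=None, role_arn=None):
    """Create a trail that delivers to S3 (and optionally CloudWatch Logs), then start it.

    `cloudtrail` is expected to behave like boto3.client("cloudtrail").
    Assumption: the bucket policy and the IAM role already allow CloudTrail to write.
    """
    params = {
        "Name": name,
        "S3BucketName": bucket,
        "IsMultiRegionTrail": True,  # record events from every region
    }
    # CloudWatch Logs delivery needs both the log group ARN and a role ARN.
    if log_group_arn and role_arn:
        params["CloudWatchLogsLogGroupArn"] = log_group_arn
        params["CloudWatchLogsRoleArn"] = role_arn

    cloudtrail.create_trail(**params)
    # A trail created through the API does not record anything until logging starts.
    cloudtrail.start_logging(Name=name)
    return params


# Hypothetical usage (requires boto3 and AWS credentials):
# import boto3
# create_trail_with_logs(
#     boto3.client("cloudtrail"),
#     name="payment-service-trail",
#     bucket="my-cloudtrail-bucket",
# )
```

Passing the client in as an argument keeps the sketch easy to try against a stub before pointing it at a real account.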
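To give an idea of what such a query can look like, the sketch below builds a CloudWatch Logs Insights query string in Python. It assumes the trail delivers raw CloudTrail records to the log group, so that fields such as eventSource, eventName, userIdentity.arn, and userAgent can be queried directly. The helper name is hypothetical, and the example values describe an EventBridge DeleteRule call.

```python
def build_insights_query(event_source, event_name, limit=20):
    """Return a Logs Insights query that lists the newest matching CloudTrail events."""
    return "\n".join([
        "fields @timestamp, eventName, userIdentity.arn, userAgent, sourceIPAddress",
        f'| filter eventSource = "{event_source}" and eventName = "{event_name}"',
        "| sort @timestamp desc",
        f"| limit {limit}",
    ])


if __name__ == "__main__":
    # EventBridge API calls are recorded with the event source events.amazonaws.com.
    print(build_insights_query("events.amazonaws.com", "DeleteRule"))
```

The resulting text can be pasted into the Logs Insights query editor for the log group that receives the trail.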
Debugging Events With AWS CloudTrail
When I looked at the CloudTrail dashboard, I was astonished by the number of events recorded within a short period of time. I could see every activity happening in the account: who did what, how, where, and when.
As we had a ton of events recorded, my first approach was to narrow the events based on the time we updated our resources (we use infrastructure as code and GitHub Actions to update them).
This did not help as the list of events was still massive even after narrowing it down to a few minutes.
Going through the results page by page was tedious, so we had to narrow them down further using other filters.
In the CloudTrail console, you can currently filter by AWS access key, Event ID, Event name, Event source, Read-only, Resource name, Resource type, and User name.
By filtering on Event name and searching for DeleteRule (the EventBridge API call that deletes a rule), we were able to see that the rule had been deleted by Developer_1.
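The same search can be run outside the console. The sketch below is one possible way to do it in Python with the CloudTrail LookupEvents API, assuming a boto3 CloudTrail client is passed in. The function name and the shape of the returned dictionaries are hypothetical, and DeleteRule is used as the event name because that is the EventBridge API call for deleting a rule.

```python
from datetime import datetime, timedelta, timezone


def find_events_by_name(cloudtrail, event_name, hours_back=24, now=None):
    """Collect recent events with the given name, following NextToken pagination.

    `cloudtrail` is expected to behave like boto3.client("cloudtrail").
    """
    end = now or datetime.now(timezone.utc)
    kwargs = {
        "LookupAttributes": [
            {"AttributeKey": "EventName", "AttributeValue": event_name}
        ],
        "StartTime": end - timedelta(hours=hours_back),
        "EndTime": end,
    }
    found = []
    while True:
        page = cloudtrail.lookup_events(**kwargs)
        for event in page.get("Events", []):
            found.append({
                "time": event.get("EventTime"),
                "name": event.get("EventName"),
                "user": event.get("Username"),
                "event_id": event.get("EventId"),
            })
        token = page.get("NextToken")
        if not token:
            return found
        kwargs["NextToken"] = token  # ask for the next page


# Hypothetical usage (requires boto3 and AWS credentials):
# import boto3
# for e in find_events_by_name(boto3.client("cloudtrail"), "DeleteRule"):
#     print(e["time"], e["user"], e["event_id"])
```

Each event returned by the API also carries the full record as a JSON string in its CloudTrailEvent field, which is where details such as the user agent and request parameters live.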
I could see every action happening within AWS and know who did what, how, where, and when.
Days later, at the trial, Developer_1 was being judged for the deletion of the EventBridge rule.
You can hear the people scream:
Guilty!! To the jail, the developer murdered the innocent Event Rule intentionally!!!
Can AWS CloudTrail help the defense prove that the developer deleted the EventRule by mistake rather than as a malicious act?
Based on the previous event, the defense presented its case:
- Using the user_agent field, the defense could demonstrate that the action was performed through the AWS Console, which, as a user interface, is more prone to human error than the execution of a script.
- Other rules were deleted just before and after the main rule. From the request_parameters field, we know that those rules belonged to the environment of Developer_1. Developer_1 could have deleted the main rule by mistake, thinking that it belonged to their own environment.
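The two pieces of evidence above can be pulled out of the raw records with a few lines of Python. This is only a sketch over made-up sample records: in the raw CloudTrail JSON the fields are named userAgent and requestParameters, the rule names and timestamps are hypothetical, and the assumptions that console actions carry "console.amazonaws.com" in userAgent and that the rule name sits under requestParameters["name"] should be checked against your own events.

```python
# Hypothetical sample records, trimmed to the fields used below.
SAMPLE_RECORDS = [
    {"eventTime": "2022-01-10T09:15:40Z", "eventName": "DeleteRule",
     "userAgent": "console.amazonaws.com",
     "requestParameters": {"name": "dev1-cleanup-rule"}},
    {"eventTime": "2022-01-10T09:15:02Z", "eventName": "DeleteRule",
     "userAgent": "console.amazonaws.com",
     "requestParameters": {"name": "dev1-report-rule"}},
    {"eventTime": "2022-01-10T09:15:21Z", "eventName": "DeleteRule",
     "userAgent": "console.amazonaws.com",
     "requestParameters": {"name": "payment-main-rule"}},
]


def summarize_deletions(records, console_marker="console.amazonaws.com"):
    """Return the deletions in chronological order with the rule name and origin."""
    summary = []
    # ISO 8601 UTC timestamps in the same format sort correctly as plain strings.
    for record in sorted(records, key=lambda r: r["eventTime"]):
        params = record.get("requestParameters") or {}
        summary.append({
            "time": record["eventTime"],
            "rule": params.get("name"),  # assumed key for the rule name
            "via_console": console_marker in record.get("userAgent", ""),
        })
    return summary


if __name__ == "__main__":
    for row in summarize_deletions(SAMPLE_RECORDS):
        print(row["time"], row["rule"], "console" if row["via_console"] else "other")
```

Sorted this way, the main rule shows up sandwiched between two deletions from the developer's own environment, all made from the console, which is the pattern the defense relied on.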
After a few minutes of presenting the evidence and hearing both sides, the jury determined that Developer_1 was indeed guilty of murdering the EventRule. The sentence was reduced because the deletion was a mistake and not intentional. Developer_1 was sentenced to redeploy the stack, and everything went back to normal on the Payment Service team.
Final Thoughts
AWS CloudTrail can help you troubleshoot operational errors, and becoming familiar with it will help you improve the security of your system. This article covers just a tiny piece of what you can achieve with AWS CloudTrail. You can also respond to specific actions when an event occurs, use it with IAM Access Analyzer to discover the permissions used by your resources, meet compliance requirements, and more.
I wish you Godspeed in your AWS CloudTrail journey!
Thanks to Nicole Yip and Sheen Brisals for their encouragement and support.
Recommended resources:
- AWS CloudTrail user guide
- How AWS CloudTrail works
- AWS CloudTrail tutorial
- Advanced filtering: Analyze data with CloudWatch Logs Insights
- Advanced filtering: Use SQL queries with Amazon Athena
- Using AWS CloudTrail to Enhance Governance and Compliance
- How to Use AWS CloudTrail to Automatically Revert and Receive Notifications About Changes to Your Amazon VPC Security Groups