Reducing RDS Audit Logs Cost by Over 90 Percent

Fabian Stehle
Inside Personio
Apr 29, 2021

Persistent storage can be expensive — that’s no surprise. Today we’ll dive into our cost-saving solution, outlining how we reduced persistent storage costs by over 90 percent.

At Personio, one of our requirements is that all executed database queries be logged for auditing purposes. We use Amazon Relational Database Service (RDS), where audit logs can be enabled with the MariaDB Audit Plugin when using MySQL or MariaDB. These logs are stored directly on the RDS instance and rotated according to the configured file size and number of rotations (SERVER_AUDIT_FILE_ROTATE_SIZE and SERVER_AUDIT_FILE_ROTATIONS).
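For context, here is a minimal boto3 sketch of enabling the plugin and its rotation settings on an existing RDS option group; the option group name and setting values are hypothetical:

import boto3

rds = boto3.client("rds", region_name="eu-central-1")

# Enable the MariaDB Audit Plugin and configure how large a log file may grow
# and how many rotated files RDS keeps on the instance.
rds.add_option_to_option_group(
    OptionGroupName="mydb-options",  # hypothetical option group
    OptionsToInclude=[
        {
            "OptionName": "MARIADB_AUDIT_PLUGIN",
            "OptionSettings": [
                {"Name": "SERVER_AUDIT_FILE_ROTATE_SIZE", "Value": "1000000"},
                {"Name": "SERVER_AUDIT_FILE_ROTATIONS", "Value": "30"},
            ],
        }
    ],
    ApplyImmediately=True,
)

The option group must of course be associated with the RDS instance for the plugin to take effect.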

The log files can quickly fill up with executed queries, at which point they need to be rotated rapidly. This means the log files on the instance are deleted after a short time period. However, we still need access to these files — which means turning to persistent storage.

Using CloudWatch is Expensive

One option for persistent log storage is CloudWatch Logs, popular because it integrates easily with RDS (among other reasons). For example, RDS for MySQL and MariaDB can publish the audit log (along with the error, general, and slow query logs) directly to CloudWatch Logs.
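A minimal boto3 sketch of that integration, assuming a hypothetical instance identifier (RDS then creates the log group /aws/rds/instance/mydb/audit):

import boto3

rds = boto3.client("rds", region_name="eu-central-1")

# Publish the MariaDB Audit Plugin log to CloudWatch Logs.
rds.modify_db_instance(
    DBInstanceIdentifier="mydb",  # hypothetical instance name
    CloudwatchLogsExportConfiguration={"EnableLogTypes": ["audit"]},
    ApplyImmediately=True,
)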

Of course, using CloudWatch Logs comes with a price. For the Frankfurt region (eu-central-1), the pricing structure at the time of publication is:

  • Collect (Data Ingestion): $0.63 per GB
  • Store (Archival): $0.0324 per GB

If you do the math, ingesting one TB of logs costs about $645 per month, and storing it costs about $33 if the retention period is set to one month.

This is a total cost of roughly $678 per month, which obviously gets even more expensive the more log data you have stored on RDS instances.

Let’s Try S3

A cheaper alternative to CloudWatch for persisting log data is Amazon Simple Storage Service (S3). AWS’s object storage service offers industry-leading scalability, data availability, security, and performance — all while maintaining rather low ingestion and storage costs.

S3 lets us set up lifecycle rules on a bucket that delete logs once the retention period is over. To query the stored data we also use Amazon Athena, a serverless, interactive query service that analyzes data in S3 using standard SQL.
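As an illustration, a 30-day expiration rule for the log prefix could be configured with boto3 like this; the bucket name and prefix are hypothetical:

import boto3

s3 = boto3.client("s3")

# Expire audit log objects 30 days after creation (our retention period).
s3.put_bucket_lifecycle_configuration(
    Bucket="my-bucket",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-audit-logs",
                "Status": "Enabled",
                "Filter": {"Prefix": "audit-logs/"},
                "Expiration": {"Days": 30},
            }
        ]
    },
)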

Amazon S3's pricing for the Frankfurt region is currently:

  • Data Transfer from Internet, into S3: $0.00 per GB
  • PUT, COPY, POST, LIST requests: $0.0054 per 1,000 requests
  • Storage: $0.0245 per GB (for the first 50 TB per month)

Because the data transfer is free, we only need to pay for the PUT requests: at roughly 1 MB of data per request, one TB of logs translates into about one million requests, or roughly $5. If we maintain a data retention period of one month, our storage costs are about $25.

In total that is roughly $30 per month, a decrease of about 96 percent from CloudWatch's $678.
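The arithmetic behind these figures is easy to check in a few lines (prices as listed above, 1 TB of logs, one-month retention; the prose rounds the results slightly):

GB_PER_TB = 1024
logs_gb = 1 * GB_PER_TB  # 1 TB of audit logs per month

# CloudWatch Logs (eu-central-1): ingestion plus one month of archival storage
cloudwatch_usd = logs_gb * 0.63 + logs_gb * 0.0324          # ~678 USD

# S3 (eu-central-1): PUT requests (~1 MB of data each) plus one month of storage
put_requests = logs_gb * 1024                               # ~1 million requests
s3_usd = put_requests / 1000 * 0.0054 + logs_gb * 0.0245    # ~31 USD

print(f"CloudWatch: ${cloudwatch_usd:,.0f}  S3: ${s3_usd:,.0f}")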

Finding a Solution

With that kind of cost savings it only made sense to make it work, so we implemented an AWS Lambda function that pulls the logs from RDS and pushes them to S3. The function is called periodically by a rule in CloudWatch Events / EventBridge to process all newly rotated log files. This ensures that the log files are complete and no longer being modified when the function starts processing them.
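Purely as an illustration (the rule name, rate, and function ARN are hypothetical, and the Lambda additionally needs a resource-based permission for events.amazonaws.com), such a scheduled rule could be created with boto3:

import boto3

events = boto3.client("events", region_name="eu-central-1")

# Invoke the log-shipping Lambda on a fixed schedule.
events.put_rule(
    Name="rds-audit-logs-schedule",
    ScheduleExpression="rate(10 minutes)",  # hypothetical rate
    State="ENABLED",
)
events.put_targets(
    Rule="rds-audit-logs-schedule",
    Targets=[{
        "Id": "rds-audit-logs-lambda",
        "Arn": "arn:aws:lambda:eu-central-1:123456789012:function:rds-audit-logs-s3",
    }],
)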

The function then writes all the log files to an S3 bucket, which has a specific structure that enables Athena to easily and efficiently search for entries. As we know, most search queries are based on a time range. Because of this we opted to group individual entries into files organized under a year/month/day/hour path. Now, if we want to search the log archive, we create an Athena table using partitions with the following structure:

CREATE EXTERNAL TABLE `my_table` (
  `timestamp` string,
  `serverhost` string,
  ...
)
PARTITIONED BY (
  `year` int,
  `month` int,
  `day` int,
  `hour` int
)
...
LOCATION 's3://my-bucket/audit-logs';
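To illustrate the year/month/day/hour layout described above (the exact key format is an assumption for this sketch), each processed log file can be written under a key derived from its timestamp, so that the year, month, day, and hour partitions map directly onto the S3 path:

from datetime import datetime, timezone

def s3_key_for(log_timestamp: float, file_name: str) -> str:
    """Build an Athena-partition-friendly S3 key for one processed log file."""
    ts = datetime.fromtimestamp(log_timestamp, tz=timezone.utc)
    return (
        f"audit-logs/year={ts.year}/month={ts.month:02d}/"
        f"day={ts.day:02d}/hour={ts.hour:02d}/{file_name}.json"
    )

# e.g. audit-logs/year=2021/month=04/day=29/hour=10/server_audit.log.1.json

With Hive-style keys like these, newly written partitions can be registered with MSCK REPAIR TABLE my_table; (or with ALTER TABLE ... ADD PARTITION) before querying a time range.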

We store the timestamp of the latest processed log file in a DynamoDB table in order to have a checkpoint for the next invocation of the Lambda function.
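A minimal sketch of that checkpoint with boto3; the table name and key schema are hypothetical:

import boto3

# Hypothetical DynamoDB table holding one checkpoint item per RDS instance.
table = boto3.resource("dynamodb").Table("rds-audit-logs-checkpoint")

def load_checkpoint(instance: str) -> int:
    """Return the LastWritten timestamp of the last processed log file (0 if none)."""
    item = table.get_item(Key={"rds_instance": instance}).get("Item")
    return int(item["last_written"]) if item else 0

def save_checkpoint(instance: str, last_written: int) -> None:
    """Persist the timestamp so the next invocation continues from here."""
    table.put_item(Item={"rds_instance": instance, "last_written": last_written})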

Overall, the flow is straightforward: a scheduled CloudWatch Events rule invokes the Lambda function, which downloads the newly rotated audit log files from the RDS instance, writes them to S3, and records its progress in DynamoDB.

The basic Lambda function algorithm is (a rough Python sketch follows the list):

* Get the timestamp of the latest processed log file from DynamoDB
* While true:
  - Use the timestamp to find the next audit log file
  - Abort if no log file is found or if the log file has not been rotated yet
  - Download the log data of that log file
  - Check whether the log file was rotated again in the meantime, and retry if so
  - Parse all log data of the log file
  - Write the log data to S3, using the timestamp and date as part of the key (an Athena-compatible format)
  - Save the timestamp of the latest processed log file in DynamoDB
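As an illustration of that loop (a condensed sketch, not the actual implementation linked below), a Python version built on the boto3 RDS log APIs could look roughly like this. The instance and bucket names are hypothetical, the parsing helper is omitted, and the checkpoint and key helpers from the earlier sketches are reused:

import json
import boto3

rds = boto3.client("rds")
s3 = boto3.client("s3")

INSTANCE = "mydb"     # hypothetical instance identifier
BUCKET = "my-bucket"  # hypothetical bucket name

def handler(event, context):
    last_written = load_checkpoint(INSTANCE)  # sketched above

    while True:
        # Find rotated audit log files written after the checkpoint.
        files = rds.describe_db_log_files(
            DBInstanceIdentifier=INSTANCE,
            FilenameContains="audit/server_audit.log.",  # rotated files only (assumption)
            FileLastWritten=last_written + 1,
        )["DescribeDBLogFiles"]
        if not files:
            break  # nothing new to process

        log_file = min(files, key=lambda f: f["LastWritten"])

        # Download the complete log file, following pagination markers.
        data, marker = "", "0"
        while True:
            portion = rds.download_db_log_file_portion(
                DBInstanceIdentifier=INSTANCE,
                LogFileName=log_file["LogFileName"],
                Marker=marker,
            )
            data += portion.get("LogFileData") or ""
            if not portion["AdditionalDataPending"]:
                break
            marker = portion["Marker"]

        # Parse the audit entries and write them to S3 under a partitioned key.
        entries = parse_audit_log(data)  # parsing helper omitted
        key = s3_key_for(log_file["LastWritten"] / 1000,
                         log_file["LogFileName"].split("/")[-1])
        s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(entries))

        # Advance the checkpoint so the next invocation continues from here.
        last_written = log_file["LastWritten"]
        save_checkpoint(INSTANCE, last_written)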

With this in place, we implemented the solution with the AWS Serverless Application Model (SAM) and published it to the Serverless Application Repository (SAR). This allowed us to easily distribute the application across multiple AWS accounts, and it lets other AWS users run it against their own RDS instances. The code is now open source on GitHub at https://github.com/personio/rds-audit-logs-s3.

At Personio, we use Terraform to set up infrastructure as code. Unfortunately, Terraform does not allow users to deploy an application from the SAR. As a workaround we use CloudFormation, which does, wrapping our application in an additional CloudFormation stack. Here's a quick look at the stack we use:

# cf_template.yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Parameters:
  Name:
    Type: String
  BucketName:
    Type: String
  RdsInstanceIdentifier:
    Type: String
  SarApplication:
    Type: String
  SarApplicationVersion:
    Type: String
Resources:
  RdsAuditLogsS3Application:
    Type: AWS::Serverless::Application
    Properties:
      Location:
        ApplicationId: !Ref SarApplication
        SemanticVersion: !Ref SarApplicationVersion
      Parameters:
        Name: !Ref Name
        BucketName: !Ref BucketName
        RdsInstanceIdentifier: !Ref RdsInstanceIdentifier
      TimeoutInMinutes: 5

The CloudFormation stack itself is deployed with Terraform:

locals {
  sar_application         = "arn:aws:serverlessrepo:eu-central-1:640663510286:applications/rds-audit-logs-s3"
  sar_application_version = "0.0.2"
  rds_instance_identifier = "mydb"
}

resource "aws_cloudformation_stack" "rds-audit-logs" {
  name          = "rds-audit-logs-${local.rds_instance_identifier}"
  template_body = file("${path.module}/cf_template.yaml")

  parameters = {
    Name                  = "rds-audit-logs-${local.rds_instance_identifier}"
    BucketName            = aws_s3_bucket.rds_audit_logs.id
    RdsInstanceIdentifier = local.rds_instance_identifier
    SarApplication        = local.sar_application
    SarApplicationVersion = local.sar_application_version
  }

  capabilities = ["CAPABILITY_AUTO_EXPAND", "CAPABILITY_IAM"]
}

resource "aws_s3_bucket" "rds_audit_logs" {
  bucket = "rds-audit-logs"
  acl    = "private"
}

This way, we can easily deploy our solution to every AWS account that needs it.

A Stable Result

Success! We implemented this solution at Personio over six months ago, and it has proved to be a stable and effective cost-saving measure.

Are these the types of problems you want to solve? Take a look at our open Engineering positions and find the right fit for you!
