AWS Cloud Security

Insider attack detection with CloudTrail

Active Security Monitoring

Taher Kapasi

Published in

Deloitte UK Engineering Blog

14 min read · Aug 10, 2023

The Scene

Your team has been tasked with building a secure solution using AWS services. The architects have followed engineering best practice, ensured that the design is well-architected, and made a sensible attempt to mitigate security threats. The platform engineers have secured the VPC (Virtual Private Cloud) using Security Groups, Network Access Control Lists and Network Firewalls. They’ve enforced server-side encryption on datastores, applied the principle of least privilege across all the utilised services with dedicated roles and policies, and protected infrastructure using resource policies. Because the solution is web-facing, they have also deployed a Web Application Firewall and Shield, and the APIs are protected with API keys, token authorizers and so on. You have built a secure solution. Congratulations: sit back and relax, your job is done.

One day, a disgruntled employee with elevated privileges logs into the AWS account and begins dumping the data held within a table. They reconfigure the solution to expose it to other attackers and start destroying resources.

Once the organisation realises that a malicious attacker has compromised the solution and exposed the data, it is often too late. At this point, the only course of action is to initiate an inquiry and the CISO may be looking to hold someone accountable.

What went wrong?

Traditionally, security has mainly focused on external threats; however, there is always the possibility of an insider threat. An insider threat is a risk that originates from within the organisation.

An Inside Attacker can be anyone who misuses credentials, consciously or unconsciously, to cause harm or expose data. It could be an employee, an associate, a contractor or anyone who has gained credentials that give privileged access to systems and/or data.

Is it worth the effort?

Organisations can leave themselves vulnerable to operational disruption if they do not have adequate processes and systems in place or are unprepared for specific events. Operational disruptions can have consequences such as reputational damage, financial loss, compliance failure, etc., and these risks should be managed.

Organisations may attempt to manage risks by balancing risk versus reward. Risk management processes aim to identify and assess risks, determine how to measure and mitigate them, and then figure out how to monitor and report occurrences.

The purpose of this article is to explain how to improve the security of an AWS environment by detecting inappropriate insider activity through CloudTrail and CloudWatch. Although this article focuses on AWS, this type of vulnerability applies to any solution, whether it is hosted in the cloud or on-premises. Therefore, it should be considered as part of an organisation’s operational risk management strategy.

What is CloudTrail?

AWS offers a managed service called CloudTrail, which maintains a history of activity across your AWS environment by creating audit logs. These logs provide visibility into account activity, such as when a user logs into the management console or when a security group’s configuration has been changed. The activity information contained in these logs can be useful for security analysis, change tracking, governance assessments, and meeting compliance and auditing requirements.

CloudTrail maintains an activity log that records information about who or what performed an action, when it was done and on what resource. It stores details such as the requested activity, the time of the request, the resource on which the operation is performed, the requesting user/role, and the outcome of the operation (whether it succeeded, was denied, or throttled). In essence, it keeps a history of every AWS API call made through the Management Console, AWS CLI tools, CloudFormation, or any of the AWS SDKs.

The following diagram illustrates the mechanisms that users utilise to interact with AWS services, how those activities are captured by CloudTrail and that events can be delivered to multiple Trails.

Mechanisms that users utilise to interact with AWS services, how those activities are captured by CloudTrail.

Trails

Each AWS account has CloudTrail’s Event history enabled by default, which records only the management events emitted within the past 90 days. However, this option does not persist events into S3 or CloudWatch Logs.

To capture and store events beyond 90 days or to capture other types of events, the explicit creation of a Trail is required. A Trail can be configured in a single AWS account or within multiple accounts using AWS Organizations. Trails can be configured to capture events from multiple regions (as recommended by AWS) or a single region. Audit log files are saved into an S3 bucket, which does not have to reside in the same AWS account, as long as the Trail has permission to write objects.

A maximum of five trails can be created in a single region, and a multi-region trail counts as an individual trail in each region. Importantly, if you enable a multi-region trail, then whenever AWS launches a new region a Trail is also created automatically in that region and you will be charged for the events it audits.

A Trail can be configured to deliver events to a CloudWatch log group and trigger EventBridge in near-real time when the event occurs. If configured, a Trail can also send notifications when it writes an audit file to S3. Trails can be explicitly encrypted using a KMS CMK, implicitly encrypted using S3 SSE, or left unencrypted (not recommended).

Integrating CloudTrail with other AWS services

According to AWS, CloudTrail typically delivers log files within five minutes. If this delay provides too large a window for an attacker, it may be worth configuring EventBridge rules that match specific event patterns. For example, when a resource is reconfigured, the event can trigger an EventBridge rule that calls an endpoint to initiate remedial actions such as terminating all active sessions.
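As a rough illustration of such a rule, the CloudFormation sketch below matches API calls that alter or stop a Trail and forwards them to a remediation function. The resource names `TrailTamperingRule` and `TrailTamperingRulePermission`, and the `RemediationFunction` Lambda it targets, are assumptions made for this example and are not defined elsewhere in this article; the list of event names is only a starting point.

```yaml
  # Sketch only: RemediationFunction is a hypothetical Lambda function
  # that would carry out the remedial action (e.g. revoking sessions).
  TrailTamperingRule:
    Type: AWS::Events::Rule
    Properties:
      Description: "React to attempts to disable or alter CloudTrail"
      State: ENABLED
      EventPattern:
        source:
          - "aws.cloudtrail"
        detail-type:
          - "AWS API Call via CloudTrail"
        detail:
          eventSource:
            - "cloudtrail.amazonaws.com"
          eventName:
            - "StopLogging"
            - "DeleteTrail"
            - "UpdateTrail"
            - "PutEventSelectors"
      Targets:
        - Id: "RemediationTarget"
          Arn: !GetAtt RemediationFunction.Arn

  # EventBridge needs permission to invoke the target function.
  TrailTamperingRulePermission:
    Type: AWS::Lambda::Permission
    Properties:
      FunctionName: !Ref RemediationFunction
      Action: "lambda:InvokeFunction"
      Principal: "events.amazonaws.com"
      SourceArn: !GetAtt TrailTamperingRule.Arn
```

The same pattern shape could be pointed at other services by changing `source`, `eventSource` and `eventName`.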

Management Events, Data Events & Insight Events

There are two types of audit events captured by a Trail: Management events and Data events. These are referred to as the Control Plane and the Data Plane respectively. When configuring a Trail, either or both planes can be enabled.

Management Events occur when an account is accessed or when a resource is created, modified, configured or deleted. Examples of Control Plane events include a user attempting to log in to the management console (CloudTrail records whether the attempt succeeded or failed), creating and configuring an S3 bucket, or deleting a DynamoDB table or a KMS key.

Data Plane events occur when calls are made to create, update, read, or delete data or when invoking an AWS resource. Examples of Data Plane events include putting items into a DynamoDB table, deleting a file from an S3 bucket or invoking a Lambda function.

CloudTrail offers an additional event type, called Insight events, which provides information about unusual activity. CloudTrail emits Insight events when it detects an unexpected volume of API calls or an unusual rate of errors; either trend could indicate that an insider attack is occurring. CloudTrail employs mathematical models to determine deviations from normal usage patterns or an abnormal number of errors. Once Insight Events are enabled, it takes a further 7 days of activity to build a normal usage pattern before Insight events can be emitted.
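Insight events are switched on per Trail. A minimal sketch of how this might look in CloudFormation is shown below; `InsightTrail` is a hypothetical resource name, and `TrailBucket` is assumed to be an S3 bucket that CloudTrail is already permitted to write to.

```yaml
  # Sketch only: a Trail with both Insight types enabled.
  InsightTrail:
    Type: AWS::CloudTrail::Trail
    Properties:
      IsLogging: true
      S3BucketName: !Ref TrailBucket   # assumed to exist with a suitable bucket policy
      InsightSelectors:
        - InsightType: ApiCallRateInsight    # unusual volume of API calls
        - InsightType: ApiErrorRateInsight   # unusual rate of API errors
```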

The cost of CloudTrail is determined by the volume and type of events generated in your AWS account. It is important to keep in mind that enabling data events and insight events will result in more audited events, which could lead to an increase in CloudTrail costs. To optimise costs, one can use Event Selectors to refine the types of events being audited.

Event Selectors

By default, a Trail captures management events only; data events are not logged unless the Trail is configured to capture them. Event Selectors specify which management and data events a Trail records. When an event occurs, CloudTrail checks each trail’s event selectors: if the event meets a selector’s criteria, it is logged to the appropriate trail; otherwise, it is ignored.

There are two types of event selectors available: Standard and Advanced. A Trail configured with a Standard selector can capture data events from only three AWS services, namely S3, DynamoDB, and Lambda. Advanced Event Selectors, on the other hand, can capture data events from a wide range of other services.

A trail can be configured with either a standard event selector or an advanced selector, but not both simultaneously. While trails using standard event selectors can be created using CloudFormation or Terraform, at the time of writing trails using advanced selectors cannot be created using these tools. It is important to note that applying an advanced selector to a Trail will destroy any existing standard event selector configuration, which will need to be redefined in the advanced selector configuration.

Creating an Advanced Selector Trail can be achieved using AWS CLI’s CloudTrail put-event-selectors command.

aws cloudtrail put-event-selectors --trail-name myTrail --advanced-event-selectors <field-selector-json>

Advanced Selectors are a useful tool in reducing your CloudTrail costs by allowing you to define fine-grained event selectors. This means that you can choose to capture only the important events, rather than all events. For instance, you may want to ignore read events, such as a DescribeKey operation on KMS, while being informed of write operations, such as when a KMS key is being deleted. By using Advanced Selectors, you can tailor your CloudTrail monitoring to your specific needs.
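To make the KMS example concrete, the sketch below shows one possible selector configuration, written as a YAML input file for the put-event-selectors command. It assumes AWS CLI version 2, which accepts a `--cli-input-yaml` option; the file name `selectors.yaml` and the selector names are hypothetical. Note that management events are filtered here on `readOnly` rather than on individual operation names, so every read-only management call (DescribeKey included) is dropped while write calls such as a KMS key deletion are kept.

```yaml
# selectors.yaml (hypothetical file name)
TrailName: myTrail
AdvancedEventSelectors:
  # Keep management events that change something; skip read-only calls.
  - Name: "Management write events only"
    FieldSelectors:
      - Field: eventCategory
        Equals:
          - Management
      - Field: readOnly
        Equals:
          - "false"
  # Keep data events for DynamoDB tables.
  - Name: "DynamoDB data events"
    FieldSelectors:
      - Field: eventCategory
        Equals:
          - Data
      - Field: resources.type
        Equals:
          - "AWS::DynamoDB::Table"
```

Under those assumptions the file would be applied with `aws cloudtrail put-event-selectors --cli-input-yaml file://selectors.yaml`.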

How to Begin

The points below are my good practice guidelines for getting started with CloudTrail.

Understand your legislative and compliance responsibilities as well as operational security requirements for the environment that you are deploying into to gauge which services should be audited.
Ensure you enable the appropriate Event Selectors for all the AWS services being utilised in your account, especially services that provide solution-critical resources.
Do not exclude any management events until you have had some time to capture normal activities and have confirmed that those events do not aid in identifying unusual activity. In most situations it is better and cheaper to perform excess auditing than to incur a security breach.
Ensure you have rules to highlight Root User access, unauthorised operations, deletion of KMS keys as well as any CloudTrail modifications because an attacker may try to disable auditing to evade detection.
Use EventBridge rules to trigger emergency remediation if delivering logs to CloudWatch and then triggering a metric alarm takes too long (i.e. ask whether detection within roughly five minutes suffices).
Automate the delivery of audit files and logs to external solutions outside of your account. Segmenting logs and audit files into external systems like Splunk, ELK, DataDog or another SIEM system is good practice.
Enable CloudTrail log file integrity so that audit files are digitally signed to ensure they have not been tampered with. You can verify the integrity of audit files using the AWS CLI.
To save costs, create a single-region Trail if and only if you have a solution that resides within a single region. If you choose to do this, it is advisable to deny actions in other regions via IAM policies and role restrictions.
To further reduce CloudTrail expenditure, if there are multiple trails in the same region, ensure they are configured with mutually exclusive event selectors to reduce duplication of events in audit logs.
Use CloudFormation or Terraform to create a Trail using infrastructure-as-code along with S3 Bucket, Log Group, KMS keys and then have a post-deploy step that adds the Advanced Data selector configurations using AWS CLI.
Ensure access controls on the S3 bucket are restricted and enable versioning in case an attacker attempts to delete audit files.
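One of the guidelines above suggests denying actions in other regions when a single-region Trail is used. A minimal sketch of such a restriction is given below, using the `aws:RequestedRegion` condition key. The policy name `DenyOtherRegionsPolicy` is hypothetical, the list of exempted global services is an assumption that would need tailoring, and the policy still has to be attached to the relevant roles, groups or users.

```yaml
  # Sketch only: deny requests made to any region other than the
  # one this stack is deployed in.
  DenyOtherRegionsPolicy:
    Type: AWS::IAM::ManagedPolicy
    Properties:
      Description: "Deny requests outside the solution's home region"
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: "DenyOutsideHomeRegion"
            Effect: Deny
            # Global services are exempted; adjust this list to suit.
            NotAction:
              - "iam:*"
              - "sts:*"
              - "cloudfront:*"
              - "route53:*"
              - "support:*"
            Resource: "*"
            Condition:
              StringNotEquals:
                "aws:RequestedRegion": !Ref AWS::Region
```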

Insider Attack Detection

The last section of this article explains how CloudTrail can be used to identify insider attacks by expanding its passive auditing capabilities into an active security monitoring tool. Active security monitoring can be achieved by using the CloudTrail Processing Library or through CloudWatch metrics and alarms; this article focuses on the latter.

Active security monitoring using CloudTrail & CloudWatch

Enable Auditing

The following CloudFormation resources create a Trail that is configured with a standard event selector to capture management and data events for Lambda & DynamoDB. The trail is also configured to store events in an S3 bucket and CloudWatch log group. Additionally, it has an SNS topic to emit notifications and a single KMS key to encrypt data in S3 and the logs. These resources provide a baseline CloudTrail deployment and serve as the foundation for auditing in AWS.

---
AWSTemplateFormatVersion: '2010-09-09'
Description: 'Template to create a CloudTrail with CloudWatch logging'
Resources: 
 
  TrailBucket:
    Type: AWS::S3::Bucket 
    Properties:
      BucketName: !Sub "my-trail-bucket-${AWS::AccountId}" # bucket names must be lowercase and globally unique
      PublicAccessBlockConfiguration:
        BlockPublicAcls: true 
        BlockPublicPolicy: true 
        IgnorePublicAcls: true 
        RestrictPublicBuckets: true 
      BucketEncryption: 
        ServerSideEncryptionConfiguration: 
          - ServerSideEncryptionByDefault: 
              SSEAlgorithm: "aws:kms" 
              KMSMasterKeyID: !GetAtt StackKmsKey.Arn

  TrailBucketPolicy: 
    Type: AWS::S3::BucketPolicy 
    Properties: 
      Bucket: !Ref TrailBucket 
      PolicyDocument: 
        Version: "2012-10-17" 
        Statement: 
          - Effect: Allow 
            Action: "s3:GetBucketAcl" 
            Principal: 
              Service: "cloudtrail.amazonaws.com" 
            Resource: !Sub "arn:aws:s3:::${TrailBucket}" 
          - Effect: Allow 
            Action: "s3:PutObject" 
            Principal: 
              Service: "cloudtrail.amazonaws.com" 
            Resource: !Sub "arn:aws:s3:::${TrailBucket}/AWSLogs/${AWS::AccountId}/*" 
            Condition: 
              StringEquals: 
                "s3:x-amz-acl": "bucket-owner-full-control" 
          - Effect: Deny 
            Action: "s3:*" 
            Principal: "*" 
            Resource: 
              - !GetAtt TrailBucket.Arn 
              - !Sub "${TrailBucket.Arn}/*" 
            Condition: 
              Bool: 
                "aws:SecureTransport": false 

  TrailLogGroup: 
    Type: AWS::Logs::LogGroup 
    Properties: 
      LogGroupName: "/cloudtrail" 
      KmsKeyId: !GetAtt StackKmsKey.Arn 

  TrailLogGroupRole: 
    Type: AWS::IAM::Role 
    Properties: 
      AssumeRolePolicyDocument: 
        Version: "2012-10-17" 
        Statement: 
          - Effect: Allow 
            Action: "sts:AssumeRole" 
            Principal: 
              Service: 
                - "cloudtrail.amazonaws.com" 
      Policies: 
        - PolicyName: "myTrailLogGroupPolicy" 
          PolicyDocument: 
            Version: "2012-10-17" 
            Statement: 
              - Effect: Allow 
                Action: 
                  - "logs:CreateLogStream" 
                  - "logs:PutLogEvents" 
                Resource: !GetAtt TrailLogGroup.Arn 
              - Effect: Allow 
                Action: 
                  - "kms:GenerateDataKey" 
                  - "kms:Decrypt" 
                Resource: 
                  - !GetAtt StackKmsKey.Arn 

  Trail: 
    DependsOn: 
      - TrailBucketPolicy 
    Type: AWS::CloudTrail::Trail 
    Properties: 
      TrailName: "myTrail" 
      IncludeGlobalServiceEvents: true 
      IsLogging: true 
      IsMultiRegionTrail: false 
      EventSelectors: 
        - DataResources: 
          - Type: AWS::Lambda::Function 
            Values: 
              - "arn:aws:lambda" 
          - Type: AWS::DynamoDB::Table 
            Values: 
              - "arn:aws:dynamodb" 
          IncludeManagementEvents: true 
          ReadWriteType: All 
      KMSKeyId: !GetAtt StackKmsKey.Arn 
      S3BucketName: !Ref TrailBucket 
      EnableLogFileValidation: true 
      CloudWatchLogsLogGroupArn: !GetAtt TrailLogGroup.Arn 
      CloudWatchLogsRoleArn: !GetAtt TrailLogGroupRole.Arn 

  AlertTopic: 
    Type: AWS::SNS::Topic 
    Properties: 
      TopicName: !Sub "myAlertTopic" 
      KmsMasterKeyId: !Ref StackKmsKey 

  AlertTopicPolicy: 
    Type: AWS::SNS::TopicPolicy 
    Properties: 
      PolicyDocument: 
        Version: "2012-10-17" 
        Statement: 
          - Effect: Allow 
            Principal: 
              Service: 
                - "cloudwatch.amazonaws.com" 
            Resource: !Ref AlertTopic 
            Action: "sns:Publish" 
      Topics: 
        - !Ref AlertTopic

  StackKmsKey: 
    Type: AWS::KMS::Key 
    Properties: 
      EnableKeyRotation: true 
      KeyPolicy: 
        Version: "2012-10-17" 
        Statement: 
          - Effect: Allow 
            Principal: 
              AWS: !Sub "arn:aws:iam::${AWS::AccountId}:root" 
            Action: "kms:*" 
            Resource: "*" 
          - Effect: Allow 
            Principal: 
              Service: 
                - "sns.amazonaws.com" 
                - "cloudtrail.amazonaws.com" 
                - "cloudwatch.amazonaws.com" 
                - "s3.amazonaws.com" 
            Action: 
              - "kms:Encrypt*" 
              - "kms:Decrypt*" 
              - "kms:ReEncrypt*" 
              - "kms:GenerateDataKey*" 
              - "kms:Describe*" 
            Resource: "*" 
          - Effect: Allow 
            Principal: 
              Service: 
                - !Sub "logs.${AWS::Region}.amazonaws.com" 
            Action: 
              - "kms:Encrypt*" 
              - "kms:Decrypt*" 
              - "kms:ReEncrypt*" 
              - "kms:GenerateDataKey*" 
              - "kms:Describe*" 
            Resource: "*" 
            Condition: 
              ArnLike: 
                "kms:EncryptionContext:aws:logs:arn": !Sub "arn:aws:logs:${AWS::Region}:${AWS::AccountId}:*"

Account Activity

The AWS resources listed in the Enable Auditing section only allow audit events to be captured. To transform this into an active monitoring solution, it is necessary to create event filters that can trigger notifications for timely actions. This can be achieved by creating CloudWatch Metric Filters that can activate Alarms. These Alarms can then emit notifications through SNS.

  UnauthorizedOperationMetricFilter: 
    Type: AWS::Logs::MetricFilter 
    Properties: 
      FilterPattern: '{$.errorCode = "*UnauthorizedOperation*" || $.errorCode = "*AccessDenied*"}' 
      LogGroupName: !Ref TrailLogGroup 
      MetricTransformations: 
        - MetricValue: 1 
          MetricNamespace: MyNamespace
          MetricName: UnauthorisedOperation 

  UnauthorizedOperationAlarm: 
    Type: AWS::CloudWatch::Alarm 
    Properties: 
      AlarmDescription: "Unauthorised API calls detected" 
      Namespace: MyNamespace 
      MetricName: UnauthorisedOperation 
      Statistic: Sum 
      Period: 60 
      EvaluationPeriods: 1 
      ComparisonOperator: GreaterThanThreshold 
      Threshold: 0 
      AlarmActions: 
        - !Ref AlertTopic 
      TreatMissingData: notBreaching

The CloudFormation resources above create a metric filter and an alarm that is triggered whenever an AWS service call results in access denial or an unauthorised operation. Receiving notifications of this type may indicate that someone or something has attempted to perform an action without proper permissions or is trying to engage in malicious activity but has not been successful yet.

To facilitate detection of insider threats in any account, it is advisable to have event filters that capture events such as Root User access, unauthorised operations, deletion of KMS keys and modifications to CloudTrail itself.
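As an illustration, the sketch below defines metric filters for three of the event types named in the How to Begin guidelines: Root User access, CloudTrail modifications and KMS key deletion. The filter patterns follow commonly used CloudTrail monitoring patterns; the resource and metric names are hypothetical, and each filter would still need an alarm of the same shape as UnauthorizedOperationAlarm to emit notifications.

```yaml
  # Sketch only: resource and metric names are illustrative.
  RootUserAccessMetricFilter:
    Type: AWS::Logs::MetricFilter
    Properties:
      FilterPattern: '{$.userIdentity.type = "Root" && $.userIdentity.invokedBy NOT EXISTS && $.eventType != "AwsServiceEvent"}'
      LogGroupName: !Ref TrailLogGroup
      MetricTransformations:
        - MetricValue: "1"
          MetricNamespace: MyNamespace
          MetricName: RootUserAccess

  # Someone creating, changing or stopping a Trail may be hiding their tracks.
  TrailChangeMetricFilter:
    Type: AWS::Logs::MetricFilter
    Properties:
      FilterPattern: '{($.eventName = CreateTrail) || ($.eventName = UpdateTrail) || ($.eventName = DeleteTrail) || ($.eventName = StartLogging) || ($.eventName = StopLogging)}'
      LogGroupName: !Ref TrailLogGroup
      MetricTransformations:
        - MetricValue: "1"
          MetricNamespace: MyNamespace
          MetricName: CloudTrailChange

  # Disabling a KMS key or scheduling its deletion.
  KmsKeyDeletionMetricFilter:
    Type: AWS::Logs::MetricFilter
    Properties:
      FilterPattern: '{($.eventSource = kms.amazonaws.com) && (($.eventName = DisableKey) || ($.eventName = ScheduleKeyDeletion))}'
      LogGroupName: !Ref TrailLogGroup
      MetricTransformations:
        - MetricValue: "1"
          MetricNamespace: MyNamespace
          MetricName: KmsKeyDisabledOrDeleted
```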

There are many other filters that can be created to protect your AWS resources and the AWS event reference describes the most useful filter attributes. The OOTB Rules listed by DataDog also provide a valuable catalogue of what can be monitored as well as show the breadth of activities that you could be notified about.

Solution Activity

The Account Activity filters are essential for reporting general misuse and detecting unusual activity. However, in most cases, organisations deploy additional solution stacks that provide unique features. These solutions create additional resources that should only be accessed by known resources or roles. Therefore, it is crucial to ensure that these resources are used appropriately.

Let’s take an example of a scenario where you need to ensure that only specific operations are performed by a particular resource, such as allowing a KMS key’s decrypt operation to be used only by a specific Lambda function. This is crucial to prevent misuse of the cryptographic material. Although granting limited access to the key operation through the Lambda function’s role is required, it may not be sufficient, because other principals with “kms:*” permission, such as the root user, could still use the decrypt operation unless the key’s policy explicitly denies access.
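A key policy along the following lines is one way such an explicit deny might be expressed. This is a sketch under assumptions: `MySecretsKey` is a hypothetical key dedicated to the Lambda function, and `MyLambdaRole` is assumed to be the function's execution role defined in the same template. Because the deny applies to every other principal, including AWS services, it would not suit a key that services also need to decrypt with.

```yaml
  # Sketch only: a dedicated key whose Decrypt operation is reserved
  # for the Lambda function's execution role.
  MySecretsKey:
    Type: AWS::KMS::Key
    Properties:
      EnableKeyRotation: true
      KeyPolicy:
        Version: "2012-10-17"
        Statement:
          - Sid: "AllowAccountAdministration"
            Effect: Allow
            Principal:
              AWS: !Sub "arn:aws:iam::${AWS::AccountId}:root"
            Action: "kms:*"
            Resource: "*"
          # An explicit Deny overrides the Allow above for kms:Decrypt.
          - Sid: "DenyDecryptExceptMyLambdaRole"
            Effect: Deny
            Principal:
              AWS: "*"
            Action: "kms:Decrypt"
            Resource: "*"
            Condition:
              ArnNotEquals:
                "aws:PrincipalArn": !GetAtt MyLambdaRole.Arn
```

Anyone permitted to change the key policy could remove the deny statement, which is one reason monitoring remains necessary even with a restrictive policy in place.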

In some cases, it may not be possible to protect a resource, such as a DynamoDB table, through a resource policy. In such cases, monitoring the usage of the resource becomes critical. For instance, a malicious user with DynamoDB permissions could modify or delete data without detection. By monitoring table usage, you can identify any unauthorised access and take appropriate action.

To enhance visibility and ensure protection, we create event filters that are specific to each resource. This is achieved by utilising the existing CloudTrail log group, which is created during the Enable Auditing process, and adding new metric filters and alarms to it using the ARNs of the resources that are created.

  MyLambdaMisuseMetricFilter: 
    Type: AWS::Logs::MetricFilter 
    Properties: 
      FilterPattern: 
        Fn::Join: 
          - "" 
          - - '{$.eventSource = lambda.amazonaws.com && $.requestParameters.functionName = "' 
            - !GetAtt MyLambda.Arn 
            - '" && (($.eventName = Invoke && $.userIdentity.arn != "arn:aws:sts::' 
            - !Sub "${AWS::AccountId}:assumed-role/${MyStepFunctionRole}/step-functions-express-*" 
            - '") || $.eventName != Invoke)}' 
      LogGroupName: "/cloudtrail" 
      MetricTransformations: 
        - MetricValue: 1 
          MetricNamespace: MyNamespace 
          MetricName: MyLambdaIsBeingMisused 
          Unit: Count

The above example filters events where MyLambda is not invoked by the expected IAM role assigned to MyStepFunction or when any other operation is performed on MyLambda. This event filter could be used to ensure you are aware when a malicious user has invoked a lambda via the console.

The following example refers to a DynamoDB table named SomeImportantDynamoDBTable where the Query operation is expected to be exclusively performed by MyStepReadingFunction’s role and MyLambda. PutItem is only allowed by MyStepWritingFunction, and MyLambda is the only resource allowed to call DeleteItem. Any other operations on this DynamoDB table are unexpected and hence trigger a misuse alarm, which sends out a notification.

  TableMisuseMetricFilter: 
    Type: AWS::Logs::MetricFilter 
    Properties: 
      FilterPattern: 
        Fn::Join: 
          - "" 
          - - '{$.eventSource = dynamodb.amazonaws.com && $.requestParameters.tableName = "' 
            - !Sub "SomeImportantDynamoDBTable" 
            - '" && ((($.eventName = Query && ($.userIdentity.arn != "arn:aws:sts::' 
            - !Sub "${AWS::AccountId}:assumed-role/${MyStepReadingFunction}/step-functions-express-*" 
            - '" && $.userIdentity.arn != "arn:aws:sts::' 
            - !Sub "${AWS::AccountId}:assumed-role/${MyLambdaRole}/${MyLambda}" 
            - '")) || ($.eventName = PutItem && $.userIdentity.arn != "arn:aws:sts::' 
            - !Sub "${AWS::AccountId}:assumed-role/${MyStepWritingFunction}/step-functions-express-*" 
            - '") || ($.eventName = DeleteItem && $.userIdentity.arn != "arn:aws:sts::' 
            - !Sub "${AWS::AccountId}:assumed-role/${MyLambdaRole}/${MyLambda}" 
            - '")) || ($.eventName != Query && $.eventName != PutItem && $.eventName != DeleteItem))}' 
      LogGroupName: "/cloudtrail" 
      MetricTransformations: 
        - MetricValue: 1 
          MetricNamespace: MyNamespace 
          MetricName: MyTableIsBeingMisused 
          Unit: Count
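The metric filter only counts matching events; an alarm is still needed to turn the count into a notification. A minimal sketch is shown below, mirroring the UnauthorizedOperationAlarm from the Account Activity section. `TableMisuseAlarm` is a hypothetical resource name, and `AlertTopicArn` is assumed to be a stack parameter carrying the ARN of the SNS topic created in Enable Auditing.

```yaml
  # Sketch only: AlertTopicArn is a hypothetical stack parameter.
  TableMisuseAlarm:
    Type: AWS::CloudWatch::Alarm
    Properties:
      AlarmDescription: "Unexpected access to SomeImportantDynamoDBTable"
      Namespace: MyNamespace
      MetricName: MyTableIsBeingMisused
      Statistic: Sum
      Period: 60
      EvaluationPeriods: 1
      ComparisonOperator: GreaterThanThreshold
      Threshold: 0
      AlarmActions:
        - !Ref AlertTopicArn
      TreatMissingData: notBreaching
```

An equivalent alarm on the MyLambdaIsBeingMisused metric would cover the Lambda filter shown earlier.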

Summary

This article describes the benefits of CloudTrail and explores how to effectively monitor the security of your AWS account. The key focus is how CloudTrail can be configured to detect users conducting unexpected activities and therefore aid in protecting against insider attacks.

Although CloudTrail has additional features, such as CloudTrail Lake and integration with EventBridge, the key takeaway is that it should be incorporated as a core deterrent layer in your security strategy. Specifically, it should be used to safeguard your solution’s assets and should be utilised alongside SIEM as part of a comprehensive security architecture.