On Amazon EKS and Security

Dirk Michel
15 min read · Jan 2, 2023


Securing Kubernetes is a broad undertaking as the cloud-native ecosystem keeps evolving and new threats emerge. With Amazon EKS, we can offload some of the security work to AWS, as the Amazon EKS service team secures and manages the Kubernetes Control Plane components on our behalf: The AWS shared responsibility model has AWS cover the security of the cloud, including its managed services, and we own the security in the cloud.

Securing Kubernetes in the cloud will generally include having an approach to (a) software supply chain security and vulnerability management; (b) provisioning Amazon EKS clusters and infrastructure that follow security best practices such as the Security Pillar of the AWS Well-Architected Framework, the Container Build Lens, and Amazon EKS Best Practices for Security; (c) selecting admission controller and runtime security add-ons; and (d) making a range of choices around threat detection, investigation, and incident response systems.

Application, Platform and Security teams need to cover many areas… Kubernetes applications don’t run in isolation and often interact with other adjacent AWS resources, such as storage and database services, all of which would also be within the security perimeter.

In addition to Amazon EKS, we can often look to AWS-managed security services to further offload undifferentiated heavy lifting. This blog looks at how AWS-managed security services can help protect Amazon EKS environments and their supporting AWS resources whilst being specific enough for Kubernetes environments.

The portfolio of AWS native security services increasingly incorporates container and Kubernetes-specific features. From the vantage point of Amazon EKS, AWS native options have growing relevance for securing Kubernetes environments.

The following diagram illustrates the scope of reference, highlights the covered security services, and provides an Amazon EKS architecture for context.

Reference view across AWS native security services and Amazon EKS

As a side note: Supply chain security and improving application container security on AWS are covered by a separate post.

For those on a tight time budget: The TL;DR of the following sections is to show how AWS native security services increasingly incorporate relevant Kubernetes features that help secure Amazon EKS clusters and their surrounding services. A range of AWS native security services contribute to this objective, and I’ll group them somewhat loosely into identity and access management, data protection, logging and detective services, infrastructure security, and incident response.

Let’s do it.

1. Identity and access management

Access to Amazon EKS clusters is secured by both AWS Identity and Access Management (IAM) and Kubernetes RBAC.

AWS IAM: AWS IAM for Amazon EKS authenticates IAM identities and grants access to the Amazon EKS API as defined by IAM permission policies. Apart from the IAM user or role that created the Amazon EKS cluster, IAM identities have no permission to interact with Amazon EKS resources by default: any IAM role or user must authenticate with AWS IAM and hold the relevant EKS policy permissions before connecting to the Amazon EKS API or the Amazon EKS-managed Kubernetes API server.

A useful pattern for managing workforce identities is to keep them in a centralised Identity Provider and integrate it with AWS IAM Identity Center (successor to AWS Single Sign-On) for authentication. Workforce identities are best federated through a single Identity Provider. For example, use MS Active Directory with ADFS, integrated with AWS IAM Identity Center via AWS AD Connector and SAML 2.0. In that setup, we manage workforce identities in MS Active Directory, federate them into AWS IAM roles, and assign policies that define access permissions to AWS APIs.

Kubernetes RBAC: The Amazon EKS Kubernetes API server accepts incoming requests from IAM authenticated identities with the eks:AccessKubernetesApi permission. Once a request is received, the Kubernetes RBAC configuration authorises access to Kubernetes API resources, determining what the IAM entity can do within the cluster. To associate AWS entities with Kubernetes RBAC permissions, each IAM role or user is mapped through the aws-auth ConfigMap to a Kubernetes username and groups, which RoleBinding or ClusterRoleBinding resources then reference as subjects. The Roles and ClusterRoles they bind to contain the details of which Kubernetes APIs the entity can access and which actions it can perform. Anonymous access to Amazon EKS API server endpoints is disabled by default.
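
As a rough illustration of the mapping described above, the following Python sketch builds an aws-auth ConfigMap entry and a matching RoleBinding as plain dictionaries. The role ARN, group name, namespace, and Role name are hypothetical placeholders, and the helper functions are illustrative only. The sketch also assumes that the mapRoles string may be written as JSON (JSON being a subset of YAML); in practice the entry would be merged into the existing ConfigMap rather than replacing it, since that ConfigMap typically also carries the node role mappings.

```python
import json

# Hypothetical values, for illustration only.
ROLE_ARN = "arn:aws:iam::111122223333:role/eks-developers"
K8S_GROUP = "dev-team"
NAMESPACE = "team-a"


def aws_auth_configmap(role_arn, username, groups):
    """Build an aws-auth ConfigMap that maps one IAM role to Kubernetes groups."""
    map_roles = [{"rolearn": role_arn, "username": username, "groups": groups}]
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "aws-auth", "namespace": "kube-system"},
        # mapRoles is a string field; assumption: JSON is accepted as valid YAML.
        "data": {"mapRoles": json.dumps(map_roles, indent=2)},
    }


def role_binding(name, namespace, group, role):
    """Bind a Kubernetes group (as mapped in aws-auth) to a namespaced Role."""
    rbac_api = "rbac.authorization.k8s.io"
    return {
        "apiVersion": f"{rbac_api}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": name, "namespace": namespace},
        "subjects": [{"kind": "Group", "name": group, "apiGroup": rbac_api}],
        "roleRef": {"kind": "Role", "name": role, "apiGroup": rbac_api},
    }


if __name__ == "__main__":
    config_map = aws_auth_configmap(ROLE_ARN, "developer:{{SessionName}}", [K8S_GROUP])
    binding = role_binding("dev-read", NAMESPACE, K8S_GROUP, "pod-reader")
    print(json.dumps(config_map, indent=2))
    print(json.dumps(binding, indent=2))
```

The point of the two-step shape is that IAM only establishes who the caller is; what the caller may do is decided entirely by the RBAC objects that reference the mapped group.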

Equally, Kubernetes workloads themselves and their ServiceAccounts can be authorised to access other AWS APIs. Fine-grained AWS API access for Pods can be configured through IAM Roles for Service Accounts (IRSA).
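
To make IRSA a little more concrete, here is a minimal Python sketch of the two pieces that usually go together: an IAM role trust policy scoped to one ServiceAccount, and the ServiceAccount annotated with the role ARN. The account ID, OIDC issuer, namespace, and names are hypothetical placeholders, and the helper functions exist only for this illustration.

```python
import json


def irsa_trust_policy(account_id, oidc_provider, namespace, service_account):
    """Trust policy letting one ServiceAccount assume the role via web identity.

    oidc_provider is the cluster's OIDC issuer without the https:// prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Restrict the role to a single namespace/ServiceAccount pair.
                        f"{oidc_provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


def annotated_service_account(name, namespace, role_arn):
    """ServiceAccount manifest carrying the IRSA role annotation."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {"eks.amazonaws.com/role-arn": role_arn},
        },
    }


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    issuer = "oidc.eks.eu-west-1.amazonaws.com/id/EXAMPLE0123456789"
    policy = irsa_trust_policy("111122223333", issuer, "team-a", "orders-api")
    account = annotated_service_account(
        "orders-api", "team-a", "arn:aws:iam::111122223333:role/orders-api"
    )
    print(json.dumps(policy, indent=2))
    print(json.dumps(account, indent=2))
```

Pinning the sub claim to one ServiceAccount is what keeps the permissions Pod-specific; a wildcard there would let any workload in the cluster assume the role.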

2. Data protection

AWS native security services can protect data handled by Amazon EKS, including data at rest and in transit.

A foundational concern for data protection is encryption key management, as we need to securely generate, store, and manage the use of encryption keys. AWS Key Management Service (KMS) can create and control keys to encrypt or digitally sign data. AWS KMS enables central key management and provides policy options across many AWS services. AWS KMS keys (KMS keys) are the primary resource in AWS KMS, and we distinguish between keys that the customer manages (customer-managed key), keys that AWS manages on our behalf (AWS-managed key), and AWS-owned keys (AWS owned key).

Protecting data at rest: AWS KMS customer-managed keys are a great option to protect AWS native storage services that Kubernetes applications can use. AWS KMS customer-managed keys provide an audit trail via AWS CloudTrail logs and enable sharing of encrypted data volumes across AWS Regions and Accounts. This KMS key type is broadly supported for encrypting data volumes and file systems at rest, including for Amazon EBS, Amazon EFS, Amazon RDS, and AWS Backup Vaults. Kubernetes applications can then transparently mount the protected volumes with CSI plugins such as the Amazon EFS CSI Driver and Amazon EBS CSI Driver. Defining AWS KMS key policies that control access to KMS keys through condition keys such as kms:ViaService can help achieve least-privilege configurations.
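
A key policy statement that uses the kms:ViaService condition key could look roughly like the sketch below, which only builds the JSON document. The principal ARN and Region are hypothetical, the helper function is illustrative, and the list of actions is an assumption to be trimmed to what a given workload needs.

```python
import json


def via_service_statement(principal_arn, service, region):
    """Key policy statement that allows key use only through one AWS service."""
    return {
        "Sid": "AllowUseOfKeyViaService",
        "Effect": "Allow",
        "Principal": {"AWS": principal_arn},
        "Action": [
            "kms:Encrypt",
            "kms:Decrypt",
            "kms:ReEncrypt*",
            "kms:GenerateDataKey*",
            "kms:DescribeKey",
        ],
        # In a key policy, "*" refers to the key the policy is attached to.
        "Resource": "*",
        "Condition": {
            "StringEquals": {"kms:ViaService": f"{service}.{region}.amazonaws.com"}
        },
    }


if __name__ == "__main__":
    # Hypothetical role; Amazon EBS requests reach KMS via the ec2 service endpoint.
    statement = via_service_statement(
        "arn:aws:iam::111122223333:role/ebs-csi-controller", "ec2", "eu-west-1"
    )
    print(json.dumps(statement, indent=2))
```

With this condition in place, the same principal calling AWS KMS directly, rather than through the named service, would not match the statement.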

AWS-managed keys help protect data stored by AWS-managed and serverless services that Kubernetes applications commonly use, such as Amazon S3, Amazon SNS, Amazon SQS, Amazon Kinesis Data Streams, and AWS Lambda.

Protecting secrets: The Amazon EKS service encrypts control plane storage volumes at disk-level with AWS-managed Keys. When declared at cluster creation time, Amazon EKS can also use AWS KMS customer-managed keys to provide envelope encryption of Kubernetes Secrets that are persisted within etcd. Otherwise, Kubernetes Secrets are base64 encoded but not encrypted.
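
The difference between encoding and encryption is easy to demonstrate. The short Python sketch below shows that a base64-encoded Secret value can be read back by anyone who obtains it, and then shows the shape of the encryptionConfig structure that the Amazon EKS CreateCluster API accepts for envelope encryption of Secrets. The secret value and key ARN are made up for illustration.

```python
import base64

# A Kubernetes Secret stores values base64-encoded; this is the stored form.
stored_value = base64.b64encode(b"s3cr3t-password").decode()

# Decoding needs no key at all, which is why base64 offers no confidentiality.
recovered_value = base64.b64decode(stored_value).decode()


def secrets_encryption_config(kms_key_arn):
    """Shape of the encryptionConfig parameter for envelope-encrypting Secrets."""
    return [{"resources": ["secrets"], "provider": {"keyArn": kms_key_arn}}]


if __name__ == "__main__":
    print(stored_value)
    print(recovered_value)
    # Hypothetical customer-managed key ARN.
    print(secrets_encryption_config(
        "arn:aws:kms:eu-west-1:111122223333:key/00000000-0000-0000-0000-000000000000"
    ))
```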

Additionally, Amazon EKS workloads can reference and mount secrets that are held externally. AWS Secrets Manager is the AWS native managed secrets store: it encrypts secrets with KMS keys and supports rotating and auditing secrets such as RDS database credentials and API keys, enabling machine entities and applications to reference and access centralised and protected secrets. The Kubernetes Secrets Store CSI Driver cluster add-on can be used to help achieve that.
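
As a sketch of how that mounting is usually declared, the following Python snippet builds a SecretProviderClass and the matching CSI volume entry for a Pod spec. It assumes the AWS provider for the Secrets Store CSI Driver is installed alongside the driver; the secret name, namespace, and helper functions are hypothetical, and the objects parameter is written as JSON on the assumption that it is parsed as YAML.

```python
import json


def secret_provider_class(name, namespace, secret_names):
    """SecretProviderClass that points at AWS Secrets Manager entries."""
    objects = [
        {"objectName": secret, "objectType": "secretsmanager"}
        for secret in secret_names
    ]
    return {
        "apiVersion": "secrets-store.csi.x-k8s.io/v1",
        "kind": "SecretProviderClass",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "provider": "aws",
            # The provider expects a string; assumption: JSON is accepted as YAML.
            "parameters": {"objects": json.dumps(objects)},
        },
    }


def csi_secret_volume(volume_name, provider_class_name):
    """Pod volume entry that mounts the secrets through the CSI driver."""
    return {
        "name": volume_name,
        "csi": {
            "driver": "secrets-store.csi.k8s.io",
            "readOnly": True,
            "volumeAttributes": {"secretProviderClass": provider_class_name},
        },
    }


if __name__ == "__main__":
    spc = secret_provider_class("orders-db", "team-a", ["prod/orders/db-credentials"])
    volume = csi_secret_volume("db-credentials", "orders-db")
    print(json.dumps(spc, indent=2))
    print(json.dumps(volume, indent=2))
```

The Pod that mounts this volume still needs AWS permissions to read the secret, which is typically where IAM Roles for Service Accounts comes in.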

Protecting data in transit: Protecting data transiting in and out of Kubernetes clusters is mainly achieved through Transport Layer Security (TLS) certificates, which can be generated in or imported into the AWS native AWS Certificate Manager (ACM) service. ACM stores TLS certificates that comply with defined characteristics, together with their corresponding private keys, and uses AWS KMS to help protect the private keys.

The certificates can then be associated with AWS Elastic Load Balancers that direct incoming traffic to their corresponding Kubernetes Service Endpoints or Pods. An excellent option for provisioning and configuring Application Load Balancers (ALB) and Network Load Balancers (NLB) is the AWS Load Balancer Controller for Kubernetes (AWS LBC). The ALB can terminate incoming HTTPS connections through HTTPS Listeners, which helps off-load the encryption handling from the worker nodes. A TCP pass-through configuration with NLB can be used when end-to-end encryption or mutual TLS is required between the requester and the serving application.
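
For illustration, an Ingress that asks the AWS Load Balancer Controller for an ALB with an HTTPS listener backed by an ACM certificate might be assembled as in the sketch below. The certificate ARN, Service name, and port are hypothetical, the helper function is illustrative, and the chosen annotation values (internet-facing scheme, IP targets) are assumptions that will differ per environment.

```python
import json


def https_ingress(name, namespace, service, port, certificate_arn):
    """Ingress manifest for an ALB that terminates TLS with an ACM certificate."""
    prefix = "alb.ingress.kubernetes.io"
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {
                f"{prefix}/scheme": "internet-facing",
                f"{prefix}/target-type": "ip",
                # Annotation values are strings, so the listener list is JSON-encoded.
                f"{prefix}/listen-ports": json.dumps([{"HTTPS": 443}]),
                f"{prefix}/certificate-arn": certificate_arn,
            },
        },
        "spec": {
            "ingressClassName": "alb",
            "rules": [
                {
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {"name": service, "port": {"number": port}}
                                },
                            }
                        ]
                    }
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical ACM certificate ARN.
    cert = "arn:aws:acm:eu-west-1:111122223333:certificate/00000000-0000-0000-0000-000000000000"
    print(json.dumps(https_ingress("orders", "team-a", "orders-api", 8080, cert), indent=2))
```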

The ACM certificates also play a role when Kubernetes applications use database clients that are connecting to TLS-protected database endpoints, with the different AWS-managed database services and engines having their own processes for implementing TLS.

Data transiting within Kubernetes clusters can be secured by ACM certificates as well. Service mesh options such as AWS App Mesh are integrated with ACM and can encrypt “east-west” communications with the help of side-car proxies.

3. Logging and detective services

From a security point of view, log sources capture various types of activity that ultimately produce the data set foundation on which both detective and preventive security systems apply various techniques to extract actionable insights.

Logging Sources: “Cloud plane” log sources include AWS CloudTrail Logs, Amazon VPC Flow Logs, Amazon Route 53 DNS query logs, and access logs from Elastic Load Balancing, such as ALB Access Logs and NLB Access Logs. AWS Systems Manager Session Manager Logs help log connections to Kubernetes worker nodes when opting for Session Manager instead of SSH. “Kubernetes plane” log sources are Kubernetes API server audit logs generated by Amazon EKS.

A central logging account within an AWS Control Tower Landing Zone pattern helps consolidate, persist, and manage the lifecycle transitions of the log files that originate from across an AWS Organisation. Long-term persistence requirements and data ownership are strong motivations for security teams. Accumulating security “raw data” in this way is also relevant when hybrid or non-cloud generated security data is part of the data set.

Options such as the Amazon Security Lake service can help simplify the set-up, integration and ingestion of security-relevant data sources into a data lake structure and help adopt the Open Cybersecurity Schema Framework (OCSF) data schema. The OCSF initiative promises to establish a vendor-agnostic data schema that can be more readily interrogated, correlated, and analysed by security-analytics tools and independent Security Information and Event Management (SIEM) systems. This option aims at organisations that need to work with their own SIEM systems and define their own threat-hunting queries across massive disjointed data sets.

Whilst specific requirements may compel many organisations to create and maintain long-term security log data stores, the ecosystem of AWS native security services doesn’t depend on them.

Inventory and Configuration Management: Keeping an active resource inventory and tracking configuration history as a data source for uncovering security signals is another aspect of securing a Kubernetes environment on AWS. The AWS Config service can record AWS resource configuration state changes and detect deviations by applying AWS-managed or custom rules. AWS Config rules represent desired configuration settings: AWS Config evaluates the rules at configurable intervals or event-based triggers and determines whether resource configurations comply with defined rules, summarises compliance results, and integrates with Amazon EventBridge to open up further possibilities of implementing automated actions.

AWS Config introduces the notion of Conformance Packs, which are templated YAML files containing collections of AWS Config rules and remediation actions that can then be deployed as a single entity across an AWS Organisation. AWS provides Conformance Packs, such as Security Best Practices for Amazon Elastic Kubernetes Service (Amazon EKS), which can be used as-is or further customised and modified. Other Conformance Packs can also help demonstrate “compliance over time” for supporting resources: Operational Best Practices for Amazon S3 and Operational Best Practices for Amazon EFS.

Threat Detection: An AWS native option for threat detection is Amazon GuardDuty, which principally combines intelligent analysis of log data sources with threat intelligence feeds and policy-based definitions to generate findings. Amazon GuardDuty creates activity baselines and employs anomaly detection based on machine learning to further refine its supported finding types, detection accuracy and relevance. Findings are the primary currency unit of threat detection.

As mentioned, Amazon GuardDuty does not depend on us separately activating and writing out log data sets to Amazon S3 - the service directly accesses and handles its supported data sources automatically.

Amazon GuardDuty supports additional data sources such as Kubernetes Audit Logs generated by Amazon EKS clusters, which play a role in detecting threats against the Kubernetes API. The Amazon GuardDuty team works on a growing list of supported Kubernetes audit logs finding types aligned with the MITRE ATT&CK framework and categorises them by Policy, Malicious Access, and Suspicious Behaviour. The EKS Protection finding types are then further decorated with recommended analysis and remediation guidance.
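
GuardDuty finding types follow a ThreatPurpose:ResourceTypeAffected/ThreatFamilyName naming pattern, optionally followed by further qualifiers, which makes it straightforward to pick out the Kubernetes audit log findings. The Python sketch below parses a few finding type strings and groups the Kubernetes ones by threat purpose; the helper names are hypothetical and the sample list is only a small selection for illustration.

```python
from collections import defaultdict


def parse_finding_type(finding_type):
    """Split a GuardDuty finding type into purpose, resource type, and family."""
    purpose, _, rest = finding_type.partition(":")
    resource_type, _, family = rest.partition("/")
    return {
        "threat_purpose": purpose,
        "resource_type": resource_type,
        "threat_family": family,
    }


def kubernetes_findings_by_purpose(finding_types):
    """Group Kubernetes audit log finding types by their threat purpose."""
    grouped = defaultdict(list)
    for finding_type in finding_types:
        parsed = parse_finding_type(finding_type)
        if parsed["resource_type"] == "Kubernetes":
            grouped[parsed["threat_purpose"]].append(parsed["threat_family"])
    return dict(grouped)


if __name__ == "__main__":
    sample = [
        "Policy:Kubernetes/AnonymousAccessGranted",
        "PrivilegeEscalation:Kubernetes/PrivilegedContainer",
        "Discovery:Kubernetes/MaliciousIPCaller",
        "UnauthorizedAccess:EC2/SSHBruteForce",
    ]
    for purpose, families in kubernetes_findings_by_purpose(sample).items():
        print(purpose, families)
```

A grouping like this can feed simple routing decisions, for example sending Policy findings to the platform team and the remaining purposes to security operations.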

Enabling the optional Amazon GuardDuty Malware Protection feature can generate findings for suspicious behaviour indicative of malware on worker nodes or container workloads and initiate Amazon EBS-based scans. Where Kubernetes applications access Amazon RDS database services, extending threat detection with Amazon GuardDuty RDS Protection can help identify potentially suspicious database login behaviour. Container runtime threat detection for Amazon GuardDuty is another option that the Amazon GuardDuty team is working on, which opens up instrumenting the runtime stream of Linux system calls and asserting them against threat detection rules. Detecting anomalous behaviour of workloads that have successfully traversed CI/CD pipeline security scans and admission controllers can be important for identifying supply chain attacks.

Findings generated by Amazon GuardDuty and its optional features can be integrated with an AWS native option for supplementary threat investigation.

Threat Analysis: The AWS native option for aided threat triage and investigation is Amazon Detective. Amazon Detective aims to assemble and intelligently link security-relevant log sources and findings to enable faster and more efficient security investigations.

Amazon Detective builds upon Amazon GuardDuty findings and consumes additional data sources to populate a behaviour graph, one of its key concepts. The behaviour graph represents a linked set of data assembled from supported data sources and is generated through machine learning, statistical analysis, and graph theory. That linked data set or graph is then further decorated with additional contextual information and provided with supplementary visual aids.

One of these sources can be Amazon EKS API audit logs, which introduce new Amazon Detective resource types such as EKS Cluster, Kubernetes Pod, Container Image, and Kubernetes Subject, against which enhanced information is assembled. Productivity-enhancing features such as Amazon Detective Finding Groups are another aspect that accelerates the investigation experience for security engineers and helps examine multiple activities as they relate to a single security event. Amazon Detective investigation tabs containing Amazon EKS-specific activity information - the Kubernetes Activity tab and Kubernetes API calls tab - also contribute to increasing investigation and threat-hunting speeds.

Posture Management: AWS Security Hub offers a central dashboarding and aggregation point for security findings generated by supported AWS services such as Amazon GuardDuty, AWS Config, Amazon Inspector, and Amazon Macie. This helps reduce the need to swivel-chair between individual AWS services and accounts in the hope of assembling an integrated security view and prioritising effectively. Reducing, or at least not increasing, operational overhead and data collection and preparation work is a perennial concern for many.

Acting as a hub for security findings, AWS Security Hub can be effective as a point of distribution, as an alternative to operating on and consuming findings produced by other individual AWS services. AWS Security Hub also automatically emits findings into Amazon EventBridge, which opens up a springboard for further automation options.

From a Kubernetes perspective, AWS Security Hub’s integration with Amazon Inspector and Amazon ECR aggregates container image security scan results. Security Hub can also consume findings from Amazon Macie, which can be a relevant security source for Kubernetes applications that interact with S3 data stores.

Interestingly, AWS Security Hub also provides an option for evaluating compliance against a growing set of security industry standards and best practices. AWS Security Hub Security Standards build upon and leverage AWS Config by defining and inserting their own additional set of AWS Config rules to arrive at a compliance determination against security standards such as the CIS AWS Foundations Benchmark.

Scaling: AWS native security services can help security teams absorb impacts created by a growing number of Amazon EKS clusters and supporting resources.

“Cloud plane” logging sources automatically scale with the AWS footprint. When new VPCs are added, for example, their VPC Flow Logs are consumed automatically. DNS query logs from new Route 53 hosted zones are automatically consumed. Kubernetes API audit logs from newly created clusters are equally used without the need to intervene.

AWS native security services are integrated with AWS Organisations and implement the Delegated Administrator (DA) concept: The Delegated Administrator integrations help simplify the management and configuration of AWS services across multiple accounts within an AWS Organisation. Amazon Inspector DA, Amazon GuardDuty DA, Amazon Detective DA, and AWS Security Hub DA support this, which removes the need to manage these services individually by AWS account.

4. Infrastructure security

The term infrastructure security can cover a range of security aspects that have close proximity to resources such as servers, operating systems, networking, firewalls, and traffic inspection. A range of AWS native and managed services options can help advance infrastructure security objectives for Kubernetes-centric stacks.

Worker nodes: With Amazon VPC, we can launch AWS resources into a virtual network that we define and adopt a set of security features that are directly built into the Amazon VPC service. Foundational security features that help protect Amazon EC2-based Kubernetes worker nodes are VPC security groups (SG) and network access control lists (NACL).

Opting for container-specialised operating systems such as Bottlerocket OS contributes to reducing the attack surface of worker nodes without incurring the penalty of having to harden and secure general-purpose operating systems over time.

Some Kubernetes applications or micro-services can also run on AWS Fargate when they fit into a set of prerequisites and Pod configuration requirements. The AWS Fargate compute option reduces the attack surface by running Pods on dedicated non-shared micro-VMs; it allows neither root containers nor privilege escalation, and permits only a narrow list of Linux Capabilities.

Networking: A primary networking concern for Amazon EKS relates to the Kubernetes API endpoint itself. The Kubernetes API server should be a closely guarded, “inside the fence” service. VPC networking can help secure access to the Kubernetes management APIs: There are several networking options, but unless there is a particular reason, opting for private networking prevents access to the Amazon EKS Kubernetes API server from publicly routed IP addresses. The requester-managed VPC Interface Endpoint Elastic Network Interface (ENI) that Amazon EKS adds to the VPC will only have a private IP assigned. Another option is to use the AWS PrivateLink Interface Endpoint for Amazon EKS.

Kubernetes applications themselves often access other supporting AWS services. The use of VPC Endpoints can help make these communications more secure: VPC Gateway Endpoints and VPC Interface Endpoints enable private access to public AWS services over the AWS backbone in a way that avoids traversing internet infrastructure. VPC Endpoints can be further secured by Endpoint Policies, which help control access to the service behind VPC Endpoints.
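
An Endpoint Policy is an ordinary IAM policy document attached to the VPC Endpoint. As a minimal sketch, the Python snippet below builds a policy for an S3 Gateway Endpoint that only admits requests to a single application bucket. The bucket name, helper function, and action list are hypothetical; a real policy would likely need further statements, for example for the S3 buckets from which worker nodes pull Amazon ECR image layers.

```python
import json


def s3_endpoint_policy(bucket):
    """Endpoint policy that limits an S3 Gateway Endpoint to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowAppBucketOnly",
                "Effect": "Allow",
                # Endpoint policies filter traffic; identity policies still apply on top.
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket}",    # bucket-level actions such as ListBucket
                    f"arn:aws:s3:::{bucket}/*",  # object-level actions
                ],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(s3_endpoint_policy("example-orders-data"), indent=2))
```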

Securing ingress traffic to applications that run on Amazon EKS is another surface area. AWS PrivateLink can be used to implement an “application provider and consumer” model. Creating Amazon VPC Endpoint Services as a secure mechanism to share and enable access to Kubernetes applications can help secure that ingress activity. Equally, consumers are protected from the application they consume, as AWS PrivateLink allows traffic initiation from the consumer side, but not the other way around.

VPC networking options within Kubernetes clusters have an impact on security as well. Kubernetes Network Policies provide a method to isolate Pod communications and control traffic flow at the IP address or port level within a cluster, adding a layer of networking security that helps prevent unexpected communication between Pods that is unnecessary from an application perspective. Amazon EKS supports VPC native networking with the Amazon VPC CNI plugin, which at the time of writing does not itself support Network Policies; hence defining NetworkPolicy resources within Amazon EKS would have no effect. This leads some organisations to adopt alternative CNI plugins and trade off VPC native networking features and AWS Fargate interoperability for Network Policy support. Amazon VPC Lattice may grow into providing options for Kubernetes Pod and Service networking.
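
For reference, the kind of NetworkPolicy resources discussed above could be sketched as follows: a default-deny policy for all ingress in a namespace, plus one policy that re-admits traffic from a specific set of Pods on a single port. The labels, namespace, port, and helper functions are hypothetical, and the manifests only take effect on clusters where a component that enforces Network Policies is installed.

```python
import json

API_VERSION = "networking.k8s.io/v1"


def default_deny_ingress(namespace):
    """Deny all ingress to every Pod in the namespace (empty podSelector)."""
    return {
        "apiVersion": API_VERSION,
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": namespace},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
    }


def allow_ingress(name, namespace, to_labels, from_labels, port):
    """Allow TCP ingress on one port from Pods with from_labels to Pods with to_labels."""
    return {
        "apiVersion": API_VERSION,
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": to_labels},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": from_labels}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    policies = [
        default_deny_ingress("team-a"),
        allow_ingress(
            "frontend-to-backend", "team-a", {"app": "backend"}, {"app": "frontend"}, 8080
        ),
    ]
    print(json.dumps(policies, indent=2))
```

Because Network Policies are additive, the deny-all policy sets the baseline and each further policy opens exactly one path, which keeps the allowed traffic easy to audit.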

Traffic Inspection: The AWS native option for traffic inspection is AWS Network Firewall, which provides a managed service based on the Suricata intrusion detection and prevention project. AWS Network Firewall can intercept, inspect and filter traffic at the perimeter of the VPC before allowing it to proceed towards VPC Subnets and other downstream security constructs such as NACLs and SGs: AWS Network Firewall filters traffic going in and out of VPCs through Internet Gateways, NAT gateways, Virtual Private Gateways and AWS Direct Connect connections.

Adopting a pattern in which traffic to and from Kubernetes applications is centrally inspected by security appliances hosted within a dedicated “Inspection VPC” can be very effective. Alternative inspection architectures with dedicated AWS Network Firewalls, for example, can also be suitable, depending on requirements.

As a sidebar, notice how the “bump in the wire” pattern with AWS Network Firewall does not quite interwork with the AWS PrivateLink pattern. Traffic entering the VPC through PrivateLink is directed towards the provider-side NLB, which load-balances traffic across its target groups.

Scaling: AWS infrastructure-centric security services are equally built for scale and integrate with AWS Organisations and Delegated Administrator features, which help with governance and compliance across multiple firewall options and rules for a growing number of Amazon EKS clusters. AWS Firewall Manager, for example, is a security management service that helps configure, manage, and enforce firewall rules across existing and newly created resources and accounts, including VPC SGs and NACLs, AWS Network Firewalls, Amazon Route 53 Resolver DNS Firewalls, and AWS WAFs.

5. Incident response

Incident response is the process of responding to a reported security finding or incident. In the context of AWS native security services, findings would be reported through AWS Security Hub and a combination of Amazon EventBridge and Amazon SNS notifications.

The unfolding sequence of activity typically falls into two archetypes. The first is triggering an investigation, gathering and analysing evidence, establishing a root cause, and creating and executing a threat response and remediation plan. Employing the AWS native option with Amazon GuardDuty and Amazon Detective helps security analysts achieve that. The second archetype is about authoring automated responses, where some finding types may lend themselves to appropriate automatic remediation actions.

AWS Security Hub with Amazon EventBridge can be the central point for defining and orchestrating automated responses and remediations. Amazon EventBridge rules are pattern-match definitions mapped to targets. A target is typically another AWS resource, such as an AWS Lambda function, and is invoked when Amazon EventBridge receives an event that matches the rule’s pattern. AWS Security Hub also supports custom actions that send selected findings or insight results to Amazon EventBridge. Once a custom action is defined, the events it emits can be matched by Amazon EventBridge rules and routed to targets.
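
A minimal sketch of that rule-and-target pairing is shown below: an EventBridge event pattern that selects high-severity Security Hub findings originating from GuardDuty, and a Lambda-style handler that reduces each matching finding to a short summary. The choice of severity labels and product name is an assumption, the handler and sample event are hypothetical, and a real remediation function would act on the summary rather than merely return it.

```python
import json

# Event pattern for an EventBridge rule; filter values are illustrative choices.
EVENT_PATTERN = {
    "source": ["aws.securityhub"],
    "detail-type": ["Security Hub Findings - Imported"],
    "detail": {
        "findings": {
            "Severity": {"Label": ["HIGH", "CRITICAL"]},
            "ProductName": ["GuardDuty"],
        }
    },
}


def handler(event, context):
    """Lambda-style entry point: summarise each finding carried by the event."""
    summaries = []
    for finding in event.get("detail", {}).get("findings", []):
        summaries.append(
            {
                "id": finding.get("Id"),
                "title": finding.get("Title"),
                "severity": finding.get("Severity", {}).get("Label"),
                "resources": [r.get("Id") for r in finding.get("Resources", [])],
            }
        )
    return summaries


if __name__ == "__main__":
    # Hypothetical, heavily trimmed event for local experimentation.
    sample_event = {
        "source": "aws.securityhub",
        "detail-type": "Security Hub Findings - Imported",
        "detail": {
            "findings": [
                {
                    "Id": "example-finding-1",
                    "Title": "Privileged container launched",
                    "Severity": {"Label": "HIGH"},
                    "Resources": [{"Type": "AwsEksCluster", "Id": "example-cluster"}],
                }
            ]
        },
    }
    print(json.dumps(EVENT_PATTERN, indent=2))
    print(handler(sample_event, None))
```

Keeping the filtering in the event pattern rather than in the function means the target is only invoked for findings it is meant to handle.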

Being deliberate and intentional when building new auto-remediation capabilities will help direct efforts towards low-complexity, deterministic, and repetitive remediations.

Conclusions

Achieving Kubernetes security goals on AWS with AWS native security services can be a very practical approach: The portfolio of AWS native security services is increasing its relevance for securing Kubernetes workloads. Established security services and recent service additions are embedding growing support for Kubernetes audit logs and container security features, spanning across supply chain security, vulnerability management, and runtime security.

Services such as Security Hub now support the notion of conformance or compliance assessments for security standards and best practices, with increasing support for Kubernetes-related considerations. Managed services that scale with the cloud footprint without necessarily requiring an equally growing security team can help build an efficient and effective security practice.


Dirk Michel

SVP SaaS and Digital Technology | AWS Ambassador. Talks Cloud Engineering, Platform Engineering, Release Engineering, and Reliability Engineering.