Problems using Secrets Store CSI driver and securing your Kubernetes real estate.
This is a follow-on from my original post, which discussed methods for exposing internal Kubernetes services to public networks using ingress objects. I'm writing about the journey I've had learning to administer and set up production-grade K8s clusters, and the steep learning curve over the months. Some of it was just the natural process of learning, but a lot of it had root causes in products simply not working as described in real-world applications. The original intro post can be found here.
This second article covers how to manage and protect secrets within a cluster in a modern working environment, with multiple teams and services needing access to them. These requirements were specific to my use cases. Having attempted to implement two methods, I describe some of the challenges around them.
What this blog post isn’t
This isn’t a step-by-step how-to guide; there are plenty of those out there showing how to run deployments, create secrets, and explaining what they are and how to use them. Instead, I’ll leave technical reference links where needed. You’ll need a certain amount of basic Kubernetes maturity in your knowledge.
User specific requirements
I want my pods and applications to be able to use any type of secret for running their workloads. These can range from usernames, passwords and URLs to private keys, and they shouldn’t be saved in plain text in YAML or config files. Pretty standard stuff for application needs.

I want these to be obfuscated from the engineers/teams so the risk of exposing them is minimised. Engineers don’t need to know the content of a secret, only that it is used by the pod or container to perform a workload.

At the same time, I want to be able to restrict access to these secrets with some level of fine-grained control, i.e. limit them to dev/prod, namespace, or individual users. This means I can enforce some basic security postures, like separate passwords for dev, test and prod. See the org architecture.
Looking for a solution and applying it to a Kubernetes environment turned into a long journey (and reminder) about good security postures and some of the trade-offs that are inevitable when securing any secret. It also led me to unravel how some of the sales material from cloud providers can mislead people into thinking their applications are super secure simply by using certain products.
Kubernetes is quite a complex technology and you need to really understand all of the components that make it so versatile and powerful. You need to know where data is stored and written on disk, how networking and routing work, how the APIs are secured and what type of authN and authZ they use. This ultimately helps with identifying where potential security risks could come from and what you want to place importance on.
Ultimately, whatever posture is taken, it boils down to the simple fact that secrets are stored somewhere in memory and are unencrypted at some point during their lifecycle so that processes or applications can make use of them. I.e. a container needs a plain-text version of DB credentials to access the database and write data to it. Even if those DB credentials are encrypted at rest in AWS Secrets Manager and then in transit by TLS certs, the container still has access to the certs and encryption key needed to use the DB credential. Knowing where these vulnerabilities exist, and how large the attack surface is, is more important than blindly using the latest technology and buzzwords while operating under the false presumption that your whole cluster is super secure.
Nothing is truly secure in your Kubernetes cluster
There is a great article from Mac Chaffee about how some of these technologies are often missold or misinterpreted. He does an incredible job of explaining where and how secrets are maintained and stored in places like etcd on the control plane, highlighting where each exploitable vulnerability exists. If you want to know what each security-based component is and how it can be infiltrated, this provides some great insight.
https://www.macchaffee.com/blog/2022/k8s-secrets/
The purpose of this article is to highlight how two of the secrets handling mechanisms advertised by AWS can be used.
You no doubt already know secrets are stored base64-encoded in etcd on the control plane nodes. The Secret object was originally designed with the same ethos K8s was developed for at Google, that is, for developers to configure and declare the state of containerised, distributed application code communicating with each other. So originally all pods and containers operated in a completely trusted environment, hence all pods have network connectivity to each other by default and service discovery is so easy.
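To see why base64 is obfuscation rather than protection, you can round-trip a value locally (the API key value here is just the example used later in this post):

```shell
# Encode a secret value the way Kubernetes stores it in etcd.
# base64 is a reversible encoding, NOT encryption.
encoded=$(printf '%s' 'SeCure@ApIKey' | base64)
echo "$encoded"

# Anyone who can read the Secret object (or etcd) decodes it instantly.
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"
```

This is exactly what `kubectl get secret -o yaml` shows you, which is why access to Secret objects needs to be treated as access to the plaintext.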
As Mac Chaffee suggests in his blog, threat modelling is always a good place to start when considering security.
What are we trying to protect?
Misconfiguration or unintended leaks of any type of secret that our engineers and application code might need to run within a distributed microservice architecture.
What does failure look like?
Secrets being published to a repo and then being leaked somewhere else.
Engineers being able to see them in plain text and be aware of their content on a day-to-day basis which can lead to accidental leaks.
Production secrets used in development environments or vice versa.
How can we prevent those attacks?
Using config files instead of plain text secrets manifests
Applying least privileged access rules
Obfuscating the technology that injects or mounts those secrets away from BAU
The solutions
What you’ll need
- kubectl
- EKS cluster with OIDC activated
- helm(optional)
- VPC with at least 2 private subnets and DNS hostnames and DNS Resolution activated
- (Optional) Terraform to deploy YAML manifests and Helm charts
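On the OIDC point: each IAM role you attach to a service account (IRSA) needs a trust policy federated against the cluster’s OIDC provider. A rough sketch, where the account ID, OIDC provider ID, namespace and SA name are placeholders to substitute with your own:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::1234567:oidc-provider/oidc.eks.eu-west-2.amazonaws.com/id/EXAMPLE"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.eu-west-2.amazonaws.com/id/EXAMPLE:sub": "system:serviceaccount:adtech-tools:external-secrets-test-sa"
        }
      }
    }
  ]
}
```

Without this trust relationship the pod’s SA can never exchange its token for AWS credentials, which is a common root cause of the errors we’ll see later.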
Secrets Store CSI driver and ASCP
Using the steps in the AWS documentation here, I attempted to install the relevant pods using the helm chart.
https://docs.aws.amazon.com/secretsmanager/latest/userguide/integrating_csi_driver.html
AWS Secrets and Configuration Provider (ASCP) works by mounting the secrets as files into a volume mount and allowing the pod to access that volume. The idea is that you manage your secrets lifecycle externally in AWS Secrets Manager or Parameter Store (they are stored and rotated there), and the pods then import them into the mounted volume.
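The moving parts are a SecretProviderClass describing which secrets to fetch, and a pod that mounts a CSI volume referencing it. A minimal sketch, assuming a Secrets Manager secret named secret-api-key; the pod, SA and image names are illustrative:

```yaml
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: aws-secrets
spec:
  provider: aws
  parameters:
    objects: |
      - objectName: "secret-api-key"
        objectType: "secretsmanager"
---
apiVersion: v1
kind: Pod
metadata:
  name: secrets-csi-test
spec:
  serviceAccountName: csi-test-sa   # SA with an IAM role allowed to read the secret
  containers:
    - name: app
      image: public.ecr.aws/docker/library/busybox:stable
      command: ["sleep", "3600"]
      volumeMounts:
        - name: secrets-store-inline
          mountPath: /mnt/secrets
          readOnly: true
  volumes:
    - name: secrets-store-inline
      csi:
        driver: secrets-store.csi.k8s.io
        readOnly: true
        volumeAttributes:
          secretProviderClass: aws-secrets
```

Each requested object then appears as a file under /mnt/secrets inside the container.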
After applying the helm charts I can see the two daemonsets are created.
My first reservation with this already: we need two daemonsets to secure and manage secrets for application code? This seems like wasted resource. Across the plethora of different K8s technologies there always seem to be pods, daemonsets or CRDs needed to run them. That’s understandable, but they all consume resources, i.e. CPU and memory. When these are all added up (PV management, autoscaling, network management) it starts to add up. They all need to run on each node, and pretty soon you need to start thinking about using larger node groups. Keeping node management light with minimal resources is good practice so you don’t rack up those AWS EC2 bills.
As I continued with the implementation, I saw errors when trying to deploy a simple pod to access the secrets. It seems there were problems somewhere setting up and accessing the volume mount. The problem persisted, and despite attempts to debug and search Stack Overflow for answers I was unable to get this working. I even tried using Terraform to create the service accounts and IAM policies in a language and framework more familiar to me, still yielding no better results.
The provider container from the daemonset, and the error from the failing pod:

```yaml
args:
  - --provider-volume=/etc/kubernetes/secrets-store-csi-providers
image: public.ecr.aws/aws-secrets-manager/secrets-store-csi-driver-provider-aws:1.0.r2-50-g5b4aca1-2023.06.09.21.19
imagePullPolicy: Always
name: provider-aws-installer
resources:
  limits:
    cpu: 50m
    memory: 100Mi
  requests:
    cpu: 50m
    memory: 100Mi
```

```
MountVolume.SetUp failed for volume "secrets-store-inline" : rpc error: code = Unknown desc = failed to mount secrets store objects for pod ns/secrets-csi-test-34543q5-345, err: rpc error: code = Unknown desc = Failed to fetch secret from all regions: arn:aws:secretsmanager:eu-west-2:...:secret:secret-api-key-PVnQfE
```

After sifting through more AWS workshops and trying multiple attempts I eventually had to give up. Perhaps I could try something completely different and go with HashiCorp Vault for this?
External Secrets Operator
I came across an alternative from the AWS archives which uses the Kubernetes External Secrets object. This differs from the ASCP because it uses native Kubernetes objects, which means no volumes need to be mounted to the pod.
Main components:
- External Secret object = Native Kubernetes object which allows secrets to be managed externally, i.e. in Vault, Parameter Store, Secrets Manager etc.
- External Secrets Operator (ESO) = A Kubernetes operator, installed with its own Custom Resource Definitions (CRDs), which is in charge of interactions with the external systems listed above. It fetches the secrets from them and syncs them into native Secret objects within the cluster.
An added bonus is that pushing secrets outbound to supported services is also possible, so if you were to migrate from one service to another you could do it with relative ease.
Documentation and git repos on ESO can be found here
Following the guides, launching the helm chart successfully showed the CRDs had been created. From these we’ll be using “SecretStore” and “ExternalSecret”.
A quick check of the CRDs created by the helm chart verifies they exist:

```
kubectl get crd
```

Now you can scope the limitations you want to put on your pods and the secrets they have access to, using Kubernetes Service Accounts and IAM roles for them to use.
As illustrated in the org diagram, I’m using AWS resource tags and IAM roles to scope access down even further, so only certain SAs have access to tagged secrets. I.e. the Adtech team and Admin/DevOps team have access to a secret store that can only retrieve Secrets Manager resources tagged “namespace=adtech-tools”.
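That tag-based scoping can be expressed as an IAM condition on the role’s policy, so the role can only read secrets carrying the matching tag. A sketch, with the tag key/value matching the example namespace and the exact statement being illustrative:

```json
{
  "Effect": "Allow",
  "Action": [
    "secretsmanager:GetSecretValue",
    "secretsmanager:DescribeSecret"
  ],
  "Resource": "*",
  "Condition": {
    "StringEquals": {
      "secretsmanager:ResourceTag/namespace": "adtech-tools"
    }
  }
}
```

The broad Resource is then constrained by the condition, so adding a new secret to a team is just a matter of tagging it correctly.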
Applying the manifest file below creates:
- Namespace
- SA with the pre-provisioned IAM role attached
- SecretStore referencing the SA, which retrieves the actual secret from AWS Secrets Manager
- ExternalSecret target object to create with that secret (locustlogin). Here you can reference the AWS Secrets Manager name and any relevant key-value pairs you want to access. In my case the contents of the AWS secret were {"username":"admin","key":"SeCure@ApIKey"}
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: adtech-tools
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: external-secrets-test-sa
  namespace: adtech-tools
  annotations:
    # role that service account uses created by terraform module
    eks.amazonaws.com/role-arn: arn:aws:iam::1234567:role/chet-secrets-manager
---
apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: aws-secret-store
  namespace: adtech-tools
spec:
  provider:
    aws:
      service: SecretsManager
      region: eu-west-2
      auth:
        jwt:
          serviceAccountRef:
            name: external-secrets-test-sa
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: eks-external-secret
  namespace: adtech-tools
spec:
  refreshInterval: 10m
  secretStoreRef:
    name: aws-secret-store # name of the SecretStore (or kind specified)
    kind: SecretStore
  # what to create in the k8s secret
  target:
    name: locustlogin
  # get remote secrets data from provider
  data:
    - secretKey: username
      remoteRef:
        key: secret-api-key
        property: username
    - secretKey: key
      remoteRef:
        key: secret-api-key
        property: key
```

If all goes well, you can check the external secret has synced correctly using the command below. It should show the secret store created and the external secret.
```
kubectl get externalsecrets.external-secrets.io -n adtech-tools
```

If you get a SyncError it’s likely something to do with either wrong names or the SA and IAM role setup. Go back, debug and check all of this is correctly configured.
Your IAM role for your service account should look similar to the below.
```json
{
  "Action": [
    "secretsmanager:ListSecretVersionIds",
    "secretsmanager:GetSecretValue",
    "secretsmanager:GetResourcePolicy",
    "secretsmanager:DescribeSecret"
  ],
  "Effect": "Allow",
  "Resource": "arn:aws:secretsmanager:eu-west-2:1234567:secret:secret-api-key-PVnQfE",
  "Sid": ""
},
```

Conclusion
This has now created a Kubernetes Secret object whose name you can reference in your helm charts or manifests without disclosing the actual content in plain text or base64. It’s now namespace-scoped, so you can use RBAC to control access to it for each business unit.
Of course the secret still exists in the control plane in etcd, but only cluster admin users and pods/services with the relevant permissions can view it. If you have access to the container and the relevant permissions set by RBAC then you can view and use it as well.
Your pods can now consume K8s Secret objects as they normally would, with a layer of obfuscation to prevent leaks.
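For example, a pod can pull the synced locustlogin Secret in as environment variables without its manifest ever containing the values. A sketch, where the pod name, image and env var names are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: locust-client
  namespace: adtech-tools
spec:
  containers:
    - name: app
      image: public.ecr.aws/docker/library/busybox:stable
      command: ["sleep", "3600"]
      env:
        - name: LOCUST_USERNAME
          valueFrom:
            secretKeyRef:
              name: locustlogin   # the Secret created by the ExternalSecret
              key: username
        - name: LOCUST_API_KEY
          valueFrom:
            secretKeyRef:
              name: locustlogin
              key: key
```

The manifest only ever references the Secret by name and key, so it is safe to commit to a repo.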
ESO also supports key rotation, pushing secrets, and Parameter Store, which is free to use with AWS (though without the automatic key rotation feature).
