How auto rotating certificates reduces toil for our engineering teams

Tom Scott
Published in ASOS Tech Blog
5 min read · Mar 15, 2023

One of the main goals of the Principal Engineering team here at ASOS is to reduce toil and make ASOS a better place for our engineering teams. Toil comes in many forms, one of which, for ASOS, has been certificate renewals. With over 500 microservices in production and lots of teams involved, it's almost inevitable from a communication and co-ordination perspective that something would go wrong.

Hosting technologies at ASOS

Across tech at various times we have used Azure Cloud Services, Service Fabric, App Services, Functions and Kubernetes (AKS) to name a few. After dropping support for Cloud Services and Service Fabric, we wanted to solve this problem and fully automate certificate rotations across our complete estate.

We had 3 distinct technologies where this problem would need to be solved: Kubernetes, Azure Application Gateway and App Services — which includes Functions.

Controlling access to certificates in Azure Key Vault

The first step in solving this problem was to provision a non-production and a production Azure Key Vault that contain only the certificates teams use for TLS purposes. Previously, certificates were replicated across many systems, and we wanted all teams to migrate to one set of Key Vaults to ease the maintenance overhead.

We then needed to secure access to the Key Vaults via Azure RBAC and only allow read access for particular Service Principal Names (SPNs), whilst also granting wider access to the limited number of people responsible for managing the certificate lifecycle.

To allow SPNs to read certificates within a deployment template, we needed to assign them a role that grants the ‘Microsoft.KeyVault/vaults/deploy/action’ permission. For Azure App Services, Azure provides identities with static Application Identifiers and unique object identifiers per tenant. These are named ‘Microsoft Azure App Service’ and ‘Microsoft.Azure.CertificateRegistration’, so we can easily grant access for all of our Azure App Service instances under multiple subscriptions with the same TenantId!

For Azure Application Gateway and Kubernetes, we need to grant access to the Managed Identity of each resource. To solve this, we created an AAD group per Key Vault; our provisioning pipelines add the Managed Identity for each resource to the group, which is assigned the ‘deploy action’ role.
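As a rough sketch of what such a pipeline step could look like with the Azure CLI: the function names, the group name and the role name are hypothetical placeholders, not the names used at ASOS. The first function is the one-off role assignment for the group; the second is the per-resource step that adds a Managed Identity to it.

```shell
# One-time setup: give the AAD group a role on the Key Vault.
# $1 = group object ID
# $2 = role name (hypothetical; it must grant the deploy action)
# $3 = Key Vault resource ID
assign_group_role() {
  az role assignment create \
    --assignee-object-id "$1" \
    --assignee-principal-type Group \
    --role "$2" \
    --scope "$3"
}

# Per resource: add its Managed Identity to the group, skipping existing members
# so that the pipeline step can be re-run safely.
# $1 = group name or object ID, $2 = object ID of the Managed Identity
add_identity_to_group() {
  is_member=$(az ad group member check --group "$1" --member-id "$2" --query value -o tsv)
  case "$is_member" in
    true|True) echo "already a member of $1" ;;
    *) az ad group member add --group "$1" --member-id "$2" ;;
  esac
}
```

A pipeline would call `add_identity_to_group` once per Application Gateway or AKS identity, so granting access becomes a group membership change rather than a new role assignment each time.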

Azure App Service

With the deployment SPN having access to the Key Vault, you will need to set up your App Service to:

  1. Enable System Assigned identity
  2. Include a step in your provisioning pipeline to reference the Key Vault to obtain a certificate reference — this will ‘import’ the certificate into the App Service
  3. Set up the TLS SNI binding as normal by referencing the imported certificate

For production use you would likely want to use a tool such as Bicep or Terraform but you can use the Azure CLI to achieve this:

az account set --subscription subscriptionName
az webapp config ssl import --resource-group rgName \
  --name appServiceName \
  --key-vault /subscriptions/subscriptionId/resourceGroups/rgName/providers/Microsoft.KeyVault/vaults/keyVaultName \
  --key-vault-certificate-name certificateName
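The import command covers step 2. Steps 1 and 3 could be sketched with the Azure CLI along the following lines; the function names are illustrative, and the thumbprint is assumed to come from the output of the import (for example via `--query thumbprint -o tsv`).

```shell
# Step 1: enable the system-assigned identity on the App Service.
# $1 = resource group, $2 = App Service name
enable_app_identity() {
  az webapp identity assign --resource-group "$1" --name "$2"
}

# Step 3: bind an already imported certificate using SNI.
# $1 = resource group, $2 = App Service name, $3 = certificate thumbprint
bind_imported_certificate() {
  az webapp config ssl bind --resource-group "$1" --name "$2" \
    --certificate-thumbprint "$3" --ssl-type SNI
}
```

For example, `bind_imported_certificate rgName appServiceName "$thumbprint"` would run after the import has completed.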

The next time the certificate is updated in Key Vault, your App Service will pick up the change — simple!

Azure Application Gateway

For Application Gateway it is a similar process to App Service:

  1. Create a User Assigned Managed Identity
  2. Assign the identity to the Application Gateway

az network application-gateway identity assign -g rgName --gateway-name gatewayName --identity resourceIdForManagedIdentity

  3. As discussed above, the Managed Identity needs to be assigned permissions for the Key Vault
  4. Create a certificate for the Application Gateway; this will ‘import’ the certificate

az network application-gateway ssl-cert create -n 'sslCertName' -g rgName --gateway-name gatewayName --key-vault-secret-id certificateSecretUrl

  5. Consume your certificate as normal in your Application Gateway listener

Again, the next time the certificate is updated in the Key Vault, your Application Gateway will pick up the change!
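One detail worth checking here: as far as Azure's documentation describes it, Application Gateway only follows new certificate versions when the secret URL passed to `--key-vault-secret-id` has no version segment. A minimal sketch of deriving such a URL, assuming the usual `https://<vault>/secrets/<name>/<version>` shape (the helper name is made up for illustration):

```shell
# Strip the trailing version segment from a Key Vault secret URL.
# https://<vault>.vault.azure.net/secrets/<name>/<version>
#   -> https://<vault>.vault.azure.net/secrets/<name>
versionless_secret_id() {
  printf '%s\n' "$1" | cut -d'/' -f1-5
}

# Possible usage (requires the Azure CLI and access to the vault):
#   secret_id=$(az keyvault certificate show --vault-name keyVaultName \
#     --name certificateName --query sid -o tsv)
#   az network application-gateway ssl-cert create -n sslCertName -g rgName \
#     --gateway-name gatewayName \
#     --key-vault-secret-id "$(versionless_secret_id "$secret_id")"
```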

Kubernetes — Traefik and Nginx

It’s a slightly more complex scenario for Kubernetes when using an ingress such as Traefik or Nginx, as there is no Azure-managed process here.

Instead we used the Secrets Store CSI Driver for Kubernetes to sync a certificate between the Key Vault and a Kubernetes secret. Like the other technologies discussed above, this works using Managed Identity. If you’re using AKS, the Secrets Store CSI Driver add-on is now generally available.

With the correct read permissions assigned to the Managed Identity for the CSI driver we can declare a ‘SecretProviderClass’ that will sync the certificate from Key Vault to a Kubernetes secret:

apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: {{ .Values.ingressCertificateName }}
  namespace: {{ .Values.ingressNamespace }}
spec:
  provider: azure
  secretObjects:
    - secretName: {{ .Values.secretName }}
      type: kubernetes.io/tls
      data:
        - objectName: {{ .Values.ingressCertificateName }}
          key: tls.key
        - objectName: {{ .Values.ingressCertificateName }}
          key: tls.crt
  parameters:
    usePodIdentity: "false"
    useVMManagedIdentity: "true"
    userAssignedIdentityID: {{ .Values.managedIdentityClientId }}
    keyvaultName: {{ .Values.ingressCertificateKeyVaultName }}
    objects: |
      array:
        - |
          objectName: {{ .Values.ingressCertificateName }}
          objectType: secret
    tenantId: {{ .Values.tenantId }}

If you have worked with the CSI architecture before, you will know that the ‘SecretProviderClass’ will not pull the secret until the object is used — in this case it is required that we mount it to a Pod:

volumes:
  - name: tls-volume
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: {{ .Values.ingressCertificateName }}
volumeMounts:
  - name: tls-volume
    mountPath: /etc/ssl/private
    readOnly: true

With a pod successfully created, the ‘SecretProviderClass’ will create the Kubernetes secret that is specified in the ‘secretObjects’ array.
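A quick way to confirm the sync worked is to read the certificate back out of the Kubernetes secret and look at its expiry date. The helper below is only a sketch: its name is made up, and it assumes `kubectl`, `base64` and `openssl` are available on the machine running it.

```shell
# Print the expiry date of the certificate held in a Kubernetes TLS secret.
# $1 = namespace, $2 = secret name
tls_secret_expiry() {
  kubectl get secret "$2" --namespace "$1" \
    -o jsonpath='{.data.tls\.crt}' \
    | base64 -d \
    | openssl x509 -noout -enddate
}
```

Running it before and after a certificate is renewed in Key Vault shows whether the new version has reached the cluster.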

A couple of notes on the behaviour of the CSI driver:

  1. You need to configure how often the CSI driver polls the Key Vault for updates and whether mounts are updated when the secret changes
  2. Binding the volume directly to an ingress pod adds a runtime dependency on the availability of the Key Vault, so we chose not to mount it to the ingress pod and instead mount it to a pod which only exists for this purpose
  3. The setup in point 2 allows the ingress pods to scale or restart even if the Key Vault is unavailable, because they consume the Kubernetes secret, which remains available in that scenario
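Points 1 and 2 could be sketched as follows. This is an illustration under assumptions, not the ASOS pipeline: the function names, the `cert-sync` Deployment name and the use of a pause image are placeholders, and the rotation flags are those documented for the AKS add-on (a self-installed driver would set the equivalent Helm values instead).

```shell
# Point 1: turn on rotation for the AKS add-on and set the poll interval.
# $1 = resource group, $2 = cluster name, $3 = interval such as 5m
enable_secret_rotation() {
  az aks addon update --resource-group "$1" --name "$2" \
    --addon azure-keyvault-secrets-provider \
    --enable-secret-rotation --rotation-poll-interval "$3"
}

# Point 2: a small Deployment whose only job is to mount the CSI volume, so the
# driver keeps the Kubernetes secret in sync without involving the ingress pods.
# $1 = namespace, $2 = SecretProviderClass name
apply_cert_sync_deployment() {
  namespace="$1"
  provider_class="$2"
  kubectl apply --namespace "$namespace" -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cert-sync
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cert-sync
  template:
    metadata:
      labels:
        app: cert-sync
    spec:
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
          volumeMounts:
            - name: tls-volume
              mountPath: /etc/ssl/private
              readOnly: true
      volumes:
        - name: tls-volume
          csi:
            driver: secrets-store.csi.k8s.io
            readOnly: true
            volumeAttributes:
              secretProviderClass: $provider_class
EOF
}
```

The pause container does no work of its own; it exists only so that the volume is mounted and the driver has a reason to create and refresh the secret.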

The final part is to have the ingress pod consume the Kubernetes secret for TLS purposes. For Traefik we create a default TLS store that references the Kubernetes secret the ‘SecretProviderClass’ created:

apiVersion: traefik.containo.us/v1alpha1
kind: TLSStore
metadata:
  name: default
  annotations:
    helm.sh/hook: "post-install,post-upgrade"
spec:
  defaultCertificate:
    secretName: {{ .Values.ingress_certificate_name }}

And for Nginx, we specify the secret name via the ‘extraArgs’ in the Helm chart:

extraArgs:
  default-ssl-certificate: "{{ nginx_namespace }}/{{ secret_name }}"

Conclusions and Shout Outs

This has solved a recurring problem that added unnecessary toil for our engineering teams. Once it's fully rolled out, engineering teams shouldn’t be routinely involved in certificate rotations, and incidents caused by them should be reduced or even eliminated.

Big shout out to my colleague Dylan Morley who set this all in motion as well as our AKS maintainers team for supporting my work on our AKS provisioning pipeline.

Useful Links

Azure CSI Secrets Store Driver

Set up Secrets Store CSI Driver to enable NGINX Ingress Controller with TLS

Application Gateway with Azure Keyvault

About Me

I’m Tom Scott, a Principal Software Engineer at ASOS. I mainly work with teams on back-end APIs and event handling systems at scale that enable much of our core shopping experience.
