Pod Identity

Krishnakumar R
Microsoft Azure
Aug 6, 2019

Aad-pod-identity is a Kubernetes-native way to represent cloud identities, associate those identities with pods, and let the applications inside the pods access cloud resources and services.

In Azure, users can secure access to their application resources with Azure Active Directory (AAD), the cloud-based identity and access management service. AAD can also secure and regulate access to the Azure resources and services that applications use (e.g. Azure Key Vault, Azure Cosmos DB, storage).

Traditionally, in the AAD world, an application is represented by a Service Principal (SP), which is analogous to an operating system service account. An SP meets most functional requirements but falls short on requirements such as application password handling and rotation. The Managed Identities feature in AAD improves this situation by eliminating credential handling by applications altogether. There are two types of managed identity (or MSI, as the feature was known earlier): system assigned and user assigned. The basic premise is that system or user assigned identities are assigned to the underlying VM/VMSS instance in advance. Applications then use these managed identities together with the Azure Instance Metadata Service (IMDS) to access cloud resources and services.

While this works when one VM instance hosts one application instance, it fails to deliver in high-density environments like Kubernetes, where a single VM may host multiple application instances. The questions that arise are: “What happens to the applications running in a pod on a Kubernetes cluster? How can they dynamically use the managed identity paradigm to access Azure cloud services?” This is where aad-pod-identity comes into the picture. Simply put, aad-pod-identity makes managed identities available at the pod level, without any application modification.

When a pod is scheduled to a node, aad-pod-identity ensures that a pre-configured user assigned identity is assigned to the underlying VM/VMSS. Any application traffic to obtain a token from IMDS is intercepted by aad-pod-identity and a token is returned based on the pre-configured identity. Once the basic identity generation and role assignments are performed in Azure, aad-pod-identity provides Kubernetes native ways to do the rest of the configuration and management such as associating the pods to identities, creating exception lists etc.
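To make the intercepted call concrete, here is a minimal sketch of the token request an application sends to IMDS. The endpoint address and the `Metadata: true` header are the documented IMDS convention; the `api-version` value and the helper names `build_token_request` and `fetch_token` are assumptions made for illustration, not part of aad-pod-identity.

```python
import json
import urllib.parse
import urllib.request

# Link-local IMDS endpoint for managed identity tokens.
IMDS_TOKEN_URL = "http://169.254.169.254/metadata/identity/oauth2/token"


def build_token_request(resource, client_id=None, api_version="2018-02-01"):
    """Build (but do not send) the IMDS token request.

    The api_version default is an assumption; use the version your
    environment documents.
    """
    params = {"api-version": api_version, "resource": resource}
    if client_id:
        # Optional: ask for one specific user assigned identity.
        params["client_id"] = client_id
    url = IMDS_TOKEN_URL + "?" + urllib.parse.urlencode(params)
    # IMDS expects the "Metadata: true" header on every request.
    return urllib.request.Request(url, headers={"Metadata": "true"})


def fetch_token(resource, client_id=None, timeout=10.0):
    """Send the request. Only works where IMDS (or NMI) is reachable."""
    request = build_token_request(resource, client_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["access_token"]
```

Because the application code is the same with or without aad-pod-identity, the interception stays invisible to it: the request simply receives a token for the pod's identity.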

How to get started?

The Getting Started steps in the aad-pod-identity README provide a quick way to set up and configure the project in your Azure-hosted Kubernetes cluster.

Here are some details that will help you get started quickly. An identity is represented in Kubernetes by the AzureIdentity Custom Resource Definition (CRD). Labels on the pods indicate their association with cloud identities, and these associations are represented by the AzureIdentityBinding CRD. AzureAssignedIdentity objects are internal CRDs created when a match is found between a binding and the pod labels. The diagram below depicts the relationships.
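The relationship between the three CRDs can be sketched as a small in-memory model. This is a simplified illustration, not the project's code: the Python classes and field names are hypothetical, and the label key `aadpodidbinding` is assumed here as the key that bindings are compared against.

```python
from dataclasses import dataclass

# Assumption: the pod label key whose value is compared with a binding's selector.
BINDING_LABEL = "aadpodidbinding"


@dataclass
class AzureIdentity:
    name: str
    client_id: str


@dataclass
class AzureIdentityBinding:
    name: str
    azure_identity: str  # name of the AzureIdentity this binding points to
    selector: str        # value expected in the pod's binding label


@dataclass
class Pod:
    name: str
    node: str
    labels: dict


def assigned_identities(pods, bindings, identities):
    """Return one (pod, identity, node) triple per pod/binding match.

    Each triple stands in for an AzureAssignedIdentity object.
    """
    known = {identity.name for identity in identities}
    result = []
    for pod in pods:
        label = pod.labels.get(BINDING_LABEL)
        if label is None:
            continue  # pod does not ask for any identity
        for binding in bindings:
            if binding.selector == label and binding.azure_identity in known:
                result.append((pod.name, binding.azure_identity, pod.node))
    return result
```

A pod without the binding label produces no triple, which mirrors the idea that only labelled pods are associated with identities.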

How does it work?

Aad-pod-identity has two main components: MIC (Managed Identity Controller) and NMI (Node Managed Identity).

MIC keeps track of the pods that are created, deleted and updated via the Kubernetes Go client (client-go) cache, which client-go keeps in sync with the Kubernetes API server. When a pod gets scheduled to a node and an identity match is found via pod labels, MIC contacts Azure Resource Manager (ARM) to assign the user assigned identity to the VM/VMSS. When these pods are removed from the node, MIC removes the user assigned identity from the underlying VM/VMSS.
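The assign/remove decision can be thought of as a set difference per node. The following sketch assumes a hypothetical function `plan_node_changes` and plain dictionaries in place of the real cache and ARM calls; it only illustrates the idea.

```python
def plan_node_changes(desired, current):
    """Compare wanted and present identities for each node.

    desired: dict of node -> set of identities needed by pods on that node.
    current: dict of node -> set of identities currently on the VM/VMSS.
    Returns dict of node -> (to_assign, to_remove), omitting unchanged nodes.
    """
    plan = {}
    for node in sorted(set(desired) | set(current)):
        want = desired.get(node, set())
        have = current.get(node, set())
        to_assign = want - have   # needed by a pod but not yet on the node
        to_remove = have - want   # on the node but no longer needed
        if to_assign or to_remove:
            plan[node] = (to_assign, to_remove)
    return plan
```

For example, if the last pod using an identity leaves a node, that identity appears in `to_remove` for the node on the next pass.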

NMI is responsible for redirecting all application traffic destined for the Azure Instance Metadata Service (IMDS). On Linux, NMI uses iptables to achieve this.

AAD pod identity architecture

When a token request reaches NMI, it looks up its cached list of AzureAssignedIdentity objects to determine whether there is a matching identity for the pod making the request. If there is one, NMI gets a token based on this identity and returns it to the application. With this token, the application can access the cloud resource.
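The lookup can be sketched as follows. This is a simplified model under stated assumptions: the pod is identified by the source address of the request, the caches are plain dictionaries, and the function name, status codes and messages are all hypothetical.

```python
def handle_token_request(source_ip, pod_by_ip, identity_by_pod, fetch_token):
    """Resolve the calling pod, find its identity, and return (status, body).

    pod_by_ip:       dict of pod IP -> pod name (stand-in for the pod cache).
    identity_by_pod: dict of pod name -> identity (stand-in for the
                     AzureAssignedIdentity cache).
    fetch_token:     callable that obtains a token for an identity.
    """
    pod = pod_by_ip.get(source_ip)
    if pod is None:
        return 404, "no pod found for " + source_ip
    identity = identity_by_pod.get(pod)
    if identity is None:
        return 403, "no identity assigned to pod " + pod
    return 200, fetch_token(identity)
```

Keeping both lookups in local caches means a token request does not need a round trip to the Kubernetes API server.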

What’s in the latest release?

We recently released version 1.5 of the aad-pod-identity project. The following is an overview of the features and improvements included in this release:

Improved error handling

In this release we introduced states for AzureAssignedIdentity. CREATED is the initial state, ASSIGNED is when the user assigned identity is assigned to the underlying VM/VMSS, and UNASSIGNED is the state where the identity is removed from the underlying VM/VMSS but AzureAssignedIdentity itself is not deleted.

State-based error handling improves the scenarios where an error from an Azure API call would otherwise leave identities behind on the VM/VMSS. The introduction of states also improves the reliability with which NMI returns the token to the application. This point is detailed in the section “Reliability improvements in token fetching”.

Authentication features and improvements

By default, aad-pod-identity uses cluster credentials to access services in the cloud for actions such as assigning and removing identities. With this release we introduced support for system assigned MSI clusters.

We recommend giving aad-pod-identity its own credentials for accessing cloud services, to keep a ‘separation of concerns’ from the cluster credentials. This release introduces the ability for aad-pod-identity to use a different set of credentials than the cluster credentials. More details can be found here.

Identity assignment performance improvements

In a single cycle, MIC can pick up multiple identities to be assigned to the underlying VM/VMSS. We enhanced this process by consolidating the operations per node and performing them in parallel.
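One way to picture "consolidate per node, then run in parallel" is the sketch below. It is not MIC's implementation: the function names are hypothetical, `update_node` stands in for the call that updates a VM/VMSS, and a thread pool is simply one convenient way to express the parallelism.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_by_node(assignments):
    """Collapse (node, identity) pairs into one identity list per node."""
    per_node = defaultdict(list)
    for node, identity in assignments:
        per_node[node].append(identity)
    return dict(per_node)


def apply_in_parallel(assignments, update_node, max_workers=8):
    """Issue one update per node, running the per-node updates concurrently.

    update_node(node, identities) is a placeholder for the cloud API call.
    Returns dict of node -> whatever update_node returned.
    """
    per_node = group_by_node(assignments)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {
            node: pool.submit(update_node, node, identities)
            for node, identities in per_node.items()
        }
        # result() re-raises any exception raised inside update_node.
        return {node: future.result() for node, future in futures.items()}
```

Grouping first means a node that needs three identities gets one update instead of three, and independent nodes no longer wait for each other.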

Multiple replicas for MIC

We introduced Kubernetes-based leader election in MIC. MIC now runs with two replicas in an active/passive model. If the active instance of MIC fails, the passive one attempts to win the leader election and take over the MIC activities, which makes the aad-pod-identity infrastructure more resilient.

Health probe and state reporting

We introduced a health probe into both the MIC and NMI components. The probe is available on the /healthz endpoint exposed on the pod network. Kubernetes uses it to determine the health of the component and to restart the pods if they are unhealthy. Additionally, we introduced state reporting via /healthz. For MIC, /healthz reports the ‘Active’ state when the instance has been elected leader and is ready. For NMI, the ‘Active’ state is reported once the iptables rules for redirecting IMDS traffic have been added.
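The state reporting rule can be summarized in a few lines. This is only a sketch: the function name and parameters are hypothetical, and the string returned when a component is not yet active is a placeholder, since the text above only names the ‘Active’ state.

```python
def healthz_state(component, is_leader=False, iptables_ready=False):
    """Return the state string a /healthz response would carry.

    "Not Active" is a placeholder chosen for this sketch.
    """
    if component == "mic":
        active = is_leader        # MIC: elected leader and ready
    elif component == "nmi":
        active = iptables_ready   # NMI: redirection rules are in place
    else:
        raise ValueError("unknown component: " + component)
    return "Active" if active else "Not Active"
```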

Reliability improvements in token fetching

Assigning the user assigned identity to the underlying VM/VMSS is a time-consuming task. If a pod requests a token as soon as it is deployed, the assignment may not have completed yet, which leads to errors and forces retries at the application level. In this release we added internal retries in NMI, which waits until the assignment has completed before responding to the token request. This feature uses the state of the AzureAssignedIdentity to decide whether the underlying operation is still in progress. The retry also honors the context provided by the user application: if the application needs a quick reply without waiting for the entire retry period, it can set a context with an appropriate deadline or timeout on the call. Overall, these changes make token fetching calls more reliable.
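The retry behavior can be sketched as a polling loop bounded by the caller's deadline. This is an illustration under stated assumptions, not NMI's code: `wait_until_assigned`, the polling interval, and the injectable `clock` and `sleep` parameters are all hypothetical.

```python
import time


def wait_until_assigned(get_state, deadline, poll_interval=0.05,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll the identity state until it is ASSIGNED or the deadline passes.

    get_state: callable returning the current AzureAssignedIdentity state.
    deadline:  absolute time, on the same scale as clock(), after which the
               caller no longer wants to wait.
    Returns True if the identity became ASSIGNED in time, otherwise False.
    """
    while True:
        if get_state() == "ASSIGNED":
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False  # the caller's deadline wins over further retries
        sleep(min(poll_interval, remaining))
```

A caller that wants a quick answer passes a near deadline and gets `False` promptly; a caller that can wait gets the token as soon as the assignment completes.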

Application exception capability

When aad-pod-identity is enabled in a Kubernetes cluster, every attempt by an application to fetch a token from IMDS is intercepted by NMI. However, some applications deployed in the cluster need to talk to IMDS directly, without being intercepted by aad-pod-identity. To support this, we introduced a new CRD, ‘AzurePodIdentityException’, with which users can configure the applications to be excluded.

For any intercepted traffic, NMI looks up the ‘AzurePodIdentityException’ list to see whether the labels on the source (application) pod match the labels defined in the CRD. If they match, NMI fetches the token on behalf of the application without involving any identities configured through aad-pod-identity. More details can be found here.
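The label comparison can be sketched as below. The matching rule used here (every label listed in an exception must be present on the pod with the same value) is an assumption made for the sketch, and `is_excepted` is a hypothetical name; the CRD documentation is the authority on the exact semantics.

```python
def is_excepted(pod_labels, exceptions):
    """Return True if the pod matches at least one exception.

    pod_labels: dict of label key -> value on the source pod.
    exceptions: list of dicts, each holding the labels of one
                AzurePodIdentityException (simplified).
    An empty label set never matches, so it cannot exempt every pod.
    """
    for wanted in exceptions:
        if wanted and all(pod_labels.get(k) == v for k, v in wanted.items()):
            return True
    return False
```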

Enable caching

Both MIC and NMI make calls to the Kubernetes API server to list various CRDs. To reduce the load on the Kubernetes API server, we switched to using the client-go cache.

Other bug fixes and improvements

This release includes several other bug fixes and improvements. Here are a few of them: iptables cleanup when NMI exits, newer versions of the Azure SDK and the Alpine base images, and init container support. More details about other improvements and fixes can be found in the change log.

Where do we go from here?

We are currently preparing to start work on the 1.6 release, details of which can be found in the GitHub project. Availability of aad-pod-identity on Windows Kubernetes clusters and aad-pod-identity as a built-in plugin in Azure Kubernetes Service (AKS) are other exciting features we plan to bring to our users in the future.

How to participate?

Aad-pod-identity is run as an open source project. Head to GitHub to start participating. Many items in the 1.5 release were contributed by our enthusiastic users, for example: support for init containers, improvements in service principal usage, and fixes to token response marshalling for use across multiple Azure SDKs. We are thankful for and very proud of those contributions, and we look forward to more involvement from the community.

Acknowledgments

This article has been co-authored by Anish Ramasekar.

Thank you Craig Peters and Kal - UNBREAKABLE for reviewing drafts of this article and providing valuable comments.

Thank you to all who contributed code for this release as well as older releases. Thank you to all users who have been using aad-pod-identity and providing valuable feedback.

Looking forward to your feedback & support!

Krishnakumar is a Senior Software engineer in the Azure Kubernetes upstream team. Follow Krishnakumar on Twitter at https://twitter.com/kkwriting . Anish is a Software engineer in the Azure Kubernetes upstream team. Follow Anish on Twitter at https://twitter.com/AnishRamasekar .
