Using Azure Managed Identities for Grafana on Azure Kubernetes Service

Antonela Cukurin
7 min read · Nov 21, 2023


Observability is a fundamental aspect of modern, cloud-native architectures. It provides critical insights into the behavior and performance of applications and infrastructure. The Grafana Stack, comprising Mimir, Loki, and Tempo, provides a powerful suite of tools to achieve comprehensive observability. When used in conjunction with Azure Managed Identities for authentication and Azure Storage Accounts for data storage, these tools offer a secure, integrated, and efficient observability solution. This article explores how these components harmoniously interact within an Azure environment.

Why use Azure Managed Identities with Grafana Stack?

Grafana is an open-source platform for data visualization, monitoring, and analysis. In order to store and retrieve metrics, logs, and traces, Grafana needs to be connected to a backend storage system.

Azure Storage is a popular choice for backend storage because it is a scalable and reliable cloud-based storage solution. However, to access Azure Storage, you normally need to put a storage account key in your Grafana configuration files. This is a security risk because the key can easily be exposed or compromised.

Using Azure Managed Identities with Grafana Stack provides a more secure and easier way to configure your Grafana OSS stack with backend Azure storage accounts. With Managed Identities, you don’t need to store keys in your Grafana configuration files. Instead, you can grant an identity to your Grafana instance, which can then be used to access your Azure Storage account.

What is Azure Managed Identity?

Azure Managed Identities are a secure and easy way to authenticate your resources in Azure without storing keys in your configuration files. Managed Identities are a feature of Azure Active Directory (now Microsoft Entra ID) that lets you grant an identity to an Azure resource, such as a virtual machine or an Azure function, which the resource can then use to access other Azure resources.

How to use Azure Managed Identities with Grafana Stack?

The Grafana Stack is an open-source suite of observability tools, including:

  • Mimir: A horizontally scalable, long-term storage backend for Prometheus metrics that handles their ingestion, storage, and querying.
  • Loki: A horizontally-scalable, highly-available, multi-tenant log aggregation system.
  • Tempo: A high-volume, high-cardinality distributed tracing system.

Client applications can request managed identity application-only tokens in several ways. You can check all the options in the Microsoft documentation.

Mimir, Loki, and Tempo request tokens over HTTP: they make REST calls to the Azure Instance Metadata Service (IMDS) endpoint. IMDS is a REST service provided by Azure that lets applications running on virtual machines (VMs) access metadata about the VM itself, such as its network configuration, and request tokens for the managed identities assigned to the VM.

GET 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https://management.azure.com/' HTTP/1.1
Metadata: true

If your node pool doesn’t have access to a managed identity, the application’s call to the IMDS endpoint fails with a 400 Bad Request error.
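
The token request described above can be sketched as a small shell helper. This is a minimal sketch, not the code the Grafana components run: the function names are hypothetical, the storage resource URL (https://storage.azure.com/) and the optional client_id query parameter are assumptions not taken from the original request, and the curl call only works from a VM, node, or pod that can reach IMDS.

```shell
# Build the IMDS token URL for a resource and an optional user-assigned client ID.
# Both function names are illustrative, not part of any Azure tooling.
imds_token_url() {
  local resource="$1"
  local client_id="${2:-}"
  local url="http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=${resource}"
  if [ -n "$client_id" ]; then
    # Select a specific user-assigned identity when several are attached.
    url="${url}&client_id=${client_id}"
  fi
  printf '%s\n' "$url"
}

# Request a token. IMDS rejects requests without the "Metadata: true" header.
imds_get_token() {
  curl -sS --max-time 5 -H "Metadata: true" "$(imds_token_url "$@")"
}

# Example (run on an AKS node or pod):
#   imds_get_token "https://storage.azure.com/" "$CLUSTER_MI_KUBELET_CLIENT_ID"
```

If the call returns a JSON document with an access_token field, the identity is reachable from that workload; a 400 response usually points at the missing identity described above.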

To use Azure Managed Identities with Grafana Stack, you need to follow these steps:

Step 1: Enable Managed Identity for your AKS cluster

Create the control plane and kubelet managed identities, then enable and assign them to your AKS cluster:

A custom user-assigned managed identity for the control plane lets you grant access to the existing identity before the cluster is created.

# Create a user-assigned MI for the control plane
az identity create \
--name <managed_identity_name> \
--resource-group <resource_group_name> \
--location <region>

# Enable managed identity in your AKS cluster
az aks update \
--name <cluster_name> \
--resource-group <resource_group_name> \
--enable-managed-identity \
--assign-identity <managed_identity_id>

AKS creates a user-assigned kubelet identity in the node resource group if you don’t specify your own kubelet managed identity.

# Fetch kubelet Cluster identity
CLUSTER_MI_KUBELET_ID=$(az aks show --name <cluster_name> --resource-group <resource_group_name> --query "identityProfile.kubeletidentity.objectId" -o tsv)
CLUSTER_MI_KUBELET_CLIENT_ID=$(az aks show --name <cluster_name> --resource-group <resource_group_name> --query "identityProfile.kubeletidentity.clientId" -o tsv)

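The az aks show command returns both the client ID and the object ID, which are used for different purposes: in this article the object ID is used for role assignments, and the client ID is used in the Loki, Tempo, and Mimir configuration.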

"identityProfile": {
  "kubeletidentity": {
    "resourceId": "/subscriptions/<subscription_id>/resourcegroups/<resource_group_name>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<cluster_agent_pool>",
    "clientId": "CLUSTER_MI_KUBELET_CLIENT_ID",
    "objectId": "CLUSTER_MI_KUBELET_ID"
  }
}

# Fetch Control Plane cluster identity
CLUSTER_MI_CONTROL_PLANE_ID=$(az aks show --name <cluster_name> --resource-group <resource_group_name> --query "identity.userAssignedIdentities.*.principalId" -o tsv)
# Fetch AKS node pool resource group
CLUSTER_RG_AKS_NODES=$(az aks show --resource-group <resource_group_name> --name <cluster_name> --query "nodeResourceGroup" -o tsv)
# AKS node pool and control plane role assignments
az role assignment create --role "Managed Identity Operator" \
--assignee-object-id $CLUSTER_MI_KUBELET_ID \
--assignee-principal-type ServicePrincipal \
--scope "/subscriptions/<subscription_id>/resourceGroups/$CLUSTER_RG_AKS_NODES"

az role assignment create --role "Virtual Machine Contributor" \
--assignee-object-id $CLUSTER_MI_KUBELET_ID \
--assignee-principal-type ServicePrincipal \
--scope "/subscriptions/<subscription_id>/resourceGroups/$CLUSTER_RG_AKS_NODES"

az role assignment create --role "Contributor" \
--assignee-object-id $CLUSTER_MI_CONTROL_PLANE_ID \
--assignee-principal-type ServicePrincipal \
--scope "/subscriptions/<subscription_id>/resourcegroups/<cluster_resource_group>"

The managed identity used by the Grafana stack, which is the kubelet identity attached to the AKS node pool, needs access to your storage accounts. To achieve this, grant it the “Storage Blob Data Contributor” RBAC role and allow the node pool subnet to reach each storage account.

# Add network-rule to allow pods subnet to storage accounts
SUBNET_ID=$(az network vnet list --resource-group $CLUSTER_RG_AKS_NODES --query "[].subnets[?name=='aks-subnet'].id" -o tsv)

az storage account network-rule add --account-name <loki_storage_account> --resource-group <loki_storage_account_rg> --subnet $SUBNET_ID
az storage account network-rule add --account-name <mimir_storage_account> --resource-group <mimir_storage_account_rg> --subnet $SUBNET_ID
az storage account network-rule add --account-name <tempo_storage_account> --resource-group <tempo_storage_account_rg> --subnet $SUBNET_ID
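
The “Storage Blob Data Contributor” role assignment mentioned above is not shown in the commands, so here is a minimal sketch of what it could look like. The function names are hypothetical, and scoping the role to a single storage account (rather than a resource group or container) is an assumption; the az role assignment flags mirror the ones already used in this article.

```shell
# Build the ARM resource ID of a storage account, used as the role assignment scope.
storage_account_scope() {
  local subscription_id="$1"
  local resource_group="$2"
  local account_name="$3"
  printf '/subscriptions/%s/resourceGroups/%s/providers/Microsoft.Storage/storageAccounts/%s\n' \
    "$subscription_id" "$resource_group" "$account_name"
}

# Grant an identity blob read/write access on one storage account.
# Needs the Azure CLI and a logged-in session; nothing runs until the function is called.
grant_blob_access() {
  local principal_object_id="$1"
  local scope="$2"
  az role assignment create \
    --role "Storage Blob Data Contributor" \
    --assignee-object-id "$principal_object_id" \
    --assignee-principal-type ServicePrincipal \
    --scope "$scope"
}

# Example, with placeholder values replaced by your own:
#   grant_blob_access "$CLUSTER_MI_KUBELET_ID" \
#     "$(storage_account_scope "$SUBSCRIPTION_ID" "$LOKI_RG" "$LOKI_ACCOUNT")"
```

Repeating the call for the Mimir and Tempo storage accounts keeps each grant limited to the account that component actually uses.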

Step 2: Grant access to your Azure Storage account

Premium storage accounts come in three types: block blobs, file shares, and page blobs. Mimir, Loki, and Tempo should use block blob storage.

A premium block blob storage account is a type of Azure storage account that provides high-performance, low-latency storage for big data analytics, high-performance computing workloads, and other data-intensive applications. It is designed to deliver consistent and predictable performance for workloads that require high input/output operations per second (IOPS) and low latency. Premium block blob storage accounts provide faster transaction rates and lower latency than standard storage accounts.

With a Premium Block Blob storage account, you can store large amounts of unstructured data.
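
As an illustration, a premium block blob account could be created with the Azure CLI roughly as follows. This is a sketch under assumptions: the function names are hypothetical, Premium_LRS is one possible premium SKU rather than a recommendation from this article, and the name check reflects the Azure rule that storage account names are 3 to 24 lowercase letters and digits.

```shell
# Check a candidate storage account name: 3-24 characters, lowercase letters and digits only.
is_valid_storage_account_name() {
  printf '%s' "$1" | grep -Eq '^[a-z0-9]{3,24}$'
}

# Create a premium block blob storage account.
# Needs the Azure CLI and a logged-in session; az is only called for valid names.
create_premium_block_blob_account() {
  local name="$1"
  local resource_group="$2"
  local location="$3"
  if ! is_valid_storage_account_name "$name"; then
    echo "invalid storage account name: $name" >&2
    return 1
  fi
  az storage account create \
    --name "$name" \
    --resource-group "$resource_group" \
    --location "$location" \
    --sku Premium_LRS \
    --kind BlockBlobStorage
}

# Example:
#   create_premium_block_blob_account lokistorage01 my-observability-rg westeurope
```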

Step 4: Configure Grafana: Mimir, Loki and Tempo to use Azure Storage with Managed Identity

Loki and Tempo document managed identity as an authentication type. Mimir, on the other hand, does not appear to support it according to its documentation.

First, we will focus on the Loki and Tempo configuration files. The examples make it obvious that the Grafana stack components don’t follow consistent rules for creating storage account containers. For Loki and Tempo, all containers must exist beforehand or you will get an error. For additional parameters, check the configuration documentation.

Loki configuration:

storage:
  bucketNames:
    chunks: loki-chunks # Needs to pre-exist
    ruler: loki-rulers # Needs to pre-exist
    admin: loki-admin # Needs to pre-exist
  type: azure
  azure:
    accountName: <loki_storage_account>
    userAssignedId: "$CLUSTER_MI_KUBELET_CLIENT_ID"
    endpointSuffix: blob.core.windows.net
    useManagedIdentity: true

Tempo configuration:

storage:
  trace:
    backend: azure
    azure:
      container_name: tempo-traces # Needs to pre-exist
      storage_account_name: <tempo_storage_account>
      user_assigned_id: "$CLUSTER_MI_KUBELET_CLIENT_ID"
      use_managed_identity: true
      endpoint_suffix: blob.core.windows.net
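
Because the Loki and Tempo containers must exist before the components start, they can be created up front. The following is a minimal sketch: the function names are hypothetical, the container names are the ones used in the Loki and Tempo examples in this article, and --auth-mode login assumes the user running the commands already has a blob data role on the storage account.

```shell
# Print the containers a component expects, one per line.
# Names come from the Loki and Tempo configuration examples in this article.
containers_for() {
  case "$1" in
    loki)  printf '%s\n' loki-chunks loki-rulers loki-admin ;;
    tempo) printf '%s\n' tempo-traces ;;
    *)     echo "unknown component: $1" >&2; return 1 ;;
  esac
}

# Create every container for a component in the given storage account.
# Needs the Azure CLI and a logged-in session; nothing runs until the function is called.
create_containers() {
  local component="$1"
  local account_name="$2"
  local container
  containers_for "$component" | while read -r container; do
    az storage container create \
      --name "$container" \
      --account-name "$account_name" \
      --auth-mode login
  done
}

# Example:
#   create_containers loki "$LOKI_ACCOUNT"
#   create_containers tempo "$TEMPO_ACCOUNT"
```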

Mimir is a little trickier, because it is harder to tell what the configuration supports. For Loki and Tempo you can explicitly set “use_managed_identity”. For Mimir, that parameter doesn’t exist, and containers don’t need to pre-exist. Looking at the Mimir helm chart, we can see that the “config” structure has a “UserAssignedID” field, which is used in the “newBucketClient” function. This shows that managed identity can be used for Mimir backend storage on Azure.

Mimir configuration:

structuredConfig:
  common:
    storage:
      backend: azure
      azure:
        account_name: <mimir_storage_account>
        endpoint_suffix: blob.core.windows.net
        user_assigned_id: "$CLUSTER_MI_KUBELET_CLIENT_ID"
  blocks_storage:
    backend: azure
    azure:
      container_name: mimir-blocks

Step 5: Verify if the new containers are created in Storage Accounts

You should be able to see the “*seed.json” file and directories where chunks are stored in your containers for Mimir, Tempo and Loki.
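
This check can also be done from the command line instead of the portal. The sketch below is one possible approach, with hypothetical function names: it lists blob names with the Azure CLI and looks for any name ending in seed.json. It assumes the user running it has read access to blob data in the account.

```shell
# Succeed if any blob name read from stdin ends in "seed.json".
has_seed_file() {
  grep -q 'seed\.json$'
}

# List blob names in a container, one per line.
# Needs the Azure CLI and a logged-in session; nothing runs until the function is called.
list_blob_names() {
  local account_name="$1"
  local container="$2"
  az storage blob list \
    --account-name "$account_name" \
    --container-name "$container" \
    --auth-mode login \
    --query "[].name" \
    --output tsv
}

# Example:
#   list_blob_names "$MIMIR_ACCOUNT" mimir-blocks | has_seed_file && echo "seed file found"
```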

Conclusion

The Grafana Stack, consisting of Mimir, Loki, and Tempo, offers a robust solution for achieving comprehensive observability. When used in Azure with Managed Identities and Azure Storage Accounts, these tools can securely access and interact with Azure resources without the need to manually manage credentials. This enhances security, simplifies deployment, and streamlines operations. Open-source tools and projects can be difficult to configure to fit your needs, but with a little digging and contributing they can be the best solution for your projects. As a result, you gain a powerful, secure, and easy-to-manage observability solution that helps you monitor, troubleshoot, and optimize your systems.
