Mastering Azure Deployment for Streamlit Apps: Your Definitive Guide

Henkel Data & Analytics
Henkel Data & Analytics Blog
12 min read · May 28, 2024

By Marina Gateva and Roberto Alonso.

Introduction

In this article, we explore how we empowered data scientists at Henkel and simplified their journey to quickly deploy compliant Streamlit apps on Azure. We streamlined most of the deployment steps, from authenticating the app and creating containers to linting and formatting the code. This allows them to spend more time on the core of the problem and get user feedback sooner.

Motivation

In data science, Streamlit apps play a critical role within organizations to quickly harness the power of data. As a data scientist, having your data science code, your frontend, and backend in a single programming language makes the entry curve for developing user-facing apps smoother. Unlike more complex applications where frontend and backend communicate with each other via REST APIs, the simple, Python-based development workflow of Streamlit apps provides a greatly accelerated path to generating value from data.

However, the journey to creating a production-ready Streamlit app in an enterprise environment is challenging. Even as a Proof of Concept (PoC), these apps must be secure, especially in industries that deal with proprietary data. For example, the app must be behind an authentication layer that ensures that only authorized people can access it. Because cybersecurity topics are often outside the area of expertise of the data scientist, Streamlit apps may not be deployed in many cases, or worse, be deployed without proper access control measures, putting corporate data at risk.

In this article, we provide insights into how to deploy Streamlit applications in a compliant manner in a production environment on Azure, regardless of the maturity of the project. We start by deploying the infrastructure itself, which provides an authentication layer based on MS Entra ID to protect the data and insights. Next, an app service is created and configured to ensure that no secrets are exposed during the deployment process. The Streamlit application is then deployed as a container running Python packages that have no reported vulnerabilities.

Understanding the infrastructure

To deploy a Streamlit-based web application, we decided to use Azure App Service, which is a fully managed platform-as-a-service. This means that Microsoft handles all the infrastructure management, patching, and maintenance tasks, allowing us to focus only on building and deploying the application in an automated way. By simply connecting Application Insights and Log Analytics to Azure App Service, we get out-of-the-box monitoring and diagnostics for our web application.

Deployment options

There are several deployment options on Azure App Service depending on the deployment source. Even though there is an option to connect the App Service directly to your source code repository in GitHub or Azure DevOps, we decided to use Docker containers for more control over the build process. This means our deployment source is an Azure Container Registry.

You have two options for authentication when establishing a connection to the Azure Container Registry:

  • Admin credentials
  • Managed identity

From a security perspective we prefer using a system-assigned managed identity. A managed identity eliminates the need to manage and store credentials manually, as authentication is handled securely by Microsoft Entra ID and allows us to follow the principle of least privilege.

The App Service’s managed identity must have AcrPull permissions on the container registry. In case you are unfamiliar with this, you can follow a tutorial here: How to use user-assigned Managed Identities with App Service and Azure Container Registry through the Azure Portal

Authentication

Azure App Service offers a built-in authentication feature, which is also known as “Easy Auth”. Easy Auth handles the authentication process, including user sign-in and token issuance, without requiring additional code in your app. We use this option with Microsoft Entra ID as the authentication provider.

Easy Auth leverages Microsoft Entra ID App Registrations to authenticate users and authorize access to resources in applications deployed on Azure App Service. When you enable Easy Auth with Microsoft Entra ID, Easy Auth prompts you to specify the app registration to use for authentication. You can specify an existing one or have the Easy Auth process create and configure a new app registration for you.
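
Although Easy Auth needs no code in the app, the app can still read who is signed in, because App Service forwards the identity in request headers such as X-MS-CLIENT-PRINCIPAL-NAME and X-MS-CLIENT-PRINCIPAL (a Base64-encoded JSON document). The following is a minimal sketch of decoding them; the function names are hypothetical, and how you obtain the headers inside Streamlit depends on your Streamlit version.

```python
import base64
import json
from typing import Mapping, Optional


def _normalize(headers: Mapping[str, str]) -> dict:
    # HTTP header names are case-insensitive, so compare them in lower case.
    return {key.lower(): value for key, value in headers.items()}


def parse_client_principal(headers: Mapping[str, str]) -> Optional[dict]:
    """Decode the X-MS-CLIENT-PRINCIPAL header forwarded by Easy Auth.

    Returns None when the header is absent, e.g. when running locally.
    """
    encoded = _normalize(headers).get("x-ms-client-principal")
    if not encoded:
        return None
    # The header value is a Base64-encoded JSON document.
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


def signed_in_user(headers: Mapping[str, str]) -> Optional[str]:
    """Return the user name forwarded by Easy Auth, if any."""
    return _normalize(headers).get("x-ms-client-principal-name")
```

In recent Streamlit versions the request headers should be available through st.context.headers; treat that as an assumption to verify against the version you deploy.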

Common pitfall:

When you want to fully automate the provisioning of the App Service instead of using the Azure portal, e.g. using Terraform, you need to create the app registration separately. In this case, you also need to take care of its proper configuration. There are two main configurations on the app registration to consider:

Add a redirect URI to the app registration

Redirect URIs (also known as reply URLs) are URLs to which the identity provider redirects users after they have authenticated. These URIs are critical in the OAuth 2.0 and OpenID Connect authentication flows, as they specify where the authentication responses should be sent. The redirect URIs must be precise and match the URIs used by your application. Any discrepancy can cause authentication to fail.

To add a redirect URI for your app, you can open the app registration in Microsoft Entra ID, navigate to Authentication -> Add a platform and select Web.

In the context of Azure App Service’s Easy Auth, the callback URI typically follows this pattern:

https://NAME_OF_YOUR_APP_SERVICE.azurewebsites.net/.auth/login/PROVIDER_NAME/callback
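
Because any mismatch in the redirect URI makes authentication fail, it can help to build the URI in one place instead of typing it by hand. A minimal sketch of the pattern above follows; the function name is hypothetical, and the default provider segment "aad" is the one used for Microsoft Entra ID in the pipeline later in this article.

```python
def easy_auth_callback_uri(app_service_name: str, provider: str = "aad") -> str:
    """Build the Easy Auth callback URI for an App Service.

    Assumes the default *.azurewebsites.net host name (no custom domain).
    """
    if not app_service_name or not provider:
        raise ValueError("app_service_name and provider must be non-empty")
    return (
        f"https://{app_service_name}.azurewebsites.net"
        f"/.auth/login/{provider}/callback"
    )
```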

Add the Microsoft Graph User.Read API permission to the app registration

The User.Read permission with a delegated scope type is used by apps that sign users in: the application accesses the user’s data on behalf of the signed-in user. Depending on your setup, you may need to request a consent grant from your Active Directory administrator.

App settings

Roughly speaking, the App Settings in App Services are environment variables that are injected into the container during deployment. Any secrets and configurations that depend on your Streamlit app’s environment variables must be specified here. Our recommendation is to specify two main settings: The port that will be exposed in the container and secrets such as API keys.

Ports

By default, your App Service exposes port 80. If a different port is required, the WEBSITES_PORT app setting must be set (Configure a custom container - Azure App Service | Microsoft Learn). Typically, a Streamlit app is exposed on port 8501. While this port can be changed, we prefer to keep it because, in our experience, it makes the development experience smoother for new joiners.

Common pitfall:

If you don’t specify WEBSITES_PORT, your container will keep crashing because App Service tries to reach the app on port 80 while Streamlit listens on port 8501. Sometimes the App Service detects the exposed port automatically after about 10 minutes, but we recommend setting it explicitly.
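
To keep the port consistent between the app setting and the Streamlit process, the start command can be derived from the environment. The sketch below assumes that WEBSITES_PORT is visible inside the container like any other app setting; the helper names and the fallback to 8501 are our own illustration.

```python
import os
from typing import List, Mapping

DEFAULT_STREAMLIT_PORT = 8501


def resolve_port(env: Mapping[str, str] = os.environ) -> int:
    """Return the port from WEBSITES_PORT, falling back to Streamlit's default."""
    raw = env.get("WEBSITES_PORT", "").strip()
    if not raw:
        return DEFAULT_STREAMLIT_PORT
    port = int(raw)  # raises ValueError for non-numeric values
    if not 1 <= port <= 65535:
        raise ValueError(f"WEBSITES_PORT out of range: {port}")
    return port


def streamlit_command(app_path: str, env: Mapping[str, str] = os.environ) -> List[str]:
    """Build the command line that starts Streamlit on the resolved port."""
    return ["streamlit", "run", app_path, f"--server.port={resolve_port(env)}"]
```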

Secrets

You need to be especially careful with secrets because all application settings are visible in plain text in the configuration section of an App Service on the Azure Portal. Even though only people with assigned access can see them, from a security perspective, it is better to store all kinds of secrets in Azure Key Vault.

App Service provides a convenient way to connect to your Key Vault. This way, during deployment, it will read the secrets from the Key Vault and inject them into the container. This gives you better control over your secrets. There is a good reference in Microsoft’s documentation on how to do this (Use Key Vault references - Azure App Service | Microsoft Learn).

In addition to the security benefit, there is another benefit we take advantage of. We have a central and secure place to rotate secrets across multiple applications in the same project. For example, imagine that for the same project in your organization, you have multiple Streamlit apps using the same LLM-as-a-service API. If the API token expires, you would have to change the app settings in multiple locations. By keeping the secrets in a project-specific Key Vault, you only need to change them in one place.
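
A Key Vault reference is an app setting whose value has the form @Microsoft.KeyVault(...); App Service resolves it and hands the app the secret value as a normal environment variable. The sketch below builds such a reference and reads the secret defensively. The helper names and the setting name LLM_API_TOKEN are hypothetical, and the check relies on App Service leaving the literal reference string in place when resolution fails.

```python
import os
from typing import Mapping

KEY_VAULT_REFERENCE_PREFIX = "@Microsoft.KeyVault("


def key_vault_reference(vault_name: str, secret_name: str) -> str:
    """Build the app setting value that points at a Key Vault secret."""
    return f"@Microsoft.KeyVault(VaultName={vault_name};SecretName={secret_name})"


def read_secret(name: str, env: Mapping[str, str] = os.environ) -> str:
    """Read a secret that App Service injected as an environment variable."""
    value = env.get(name)
    if value is None:
        raise KeyError(f"App setting {name!r} is not set")
    # An unresolved reference usually indicates missing Key Vault permissions.
    if value.startswith(KEY_VAULT_REFERENCE_PREFIX):
        raise RuntimeError(f"Key Vault reference for {name!r} was not resolved")
    return value
```

In the Streamlit app, the token would then be read with read_secret("LLM_API_TOKEN") rather than being hard-coded or stored as a plain-text app setting.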

CI/CD Automation

In this section, we show how to integrate the Streamlit deployment with CI/CD automation.

Dockerfile

To increase the security of your Streamlit apps, you can build your own base image based on the official Python image. We see two main benefits to this approach:

1. You can add corporate certificates so that all Streamlit apps can authenticate to on-premises servers as necessary.

2. You can rebuild the base image regularly (e.g., daily) to install the latest OS and Python security patches.

Example of a base image:

# Use a lightweight base image with Python 3.10
FROM python:3.10.12-slim-bullseye
# Upgrade pip
RUN pip install --upgrade pip
# Update all the OS packages and remove unnecessary downloaded packages
RUN apt-get update && apt-get upgrade -y && apt-get autoremove -y && apt-get clean
# Update all the default Python packages: list the outdated ones,
# extract their names, and upgrade them one by one
RUN pip --disable-pip-version-check list --outdated --format=json | \
    python -c "import json, sys; print('\n'.join([x['name'] for x in json.load(sys.stdin)]))" | \
    xargs -r -n1 pip install --upgrade
# Install any other libraries/drivers required. E.g., the JDBC MSSQL driver
# Install all your corporate certificates required for all your corporate apps

All Streamlit apps can then be based on this base image. This way, each time the project-specific Streamlit container is rebuilt, the latest security patches are installed.
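
The inline Python in the base image extracts package names from the JSON that pip list --outdated --format=json prints. The same logic as a standalone, testable script could look like the sketch below; the function names are hypothetical, and turning the names into a pip install --upgrade command is our assumption about how the list is meant to be used.

```python
import json
import sys
from typing import List


def outdated_package_names(pip_json: str) -> List[str]:
    """Extract package names from `pip list --outdated --format=json` output."""
    return [package["name"] for package in json.loads(pip_json)]


def upgrade_command(names: List[str]) -> List[str]:
    """Build a pip command that upgrades the given packages.

    Returns an empty list when nothing is outdated, so callers can skip the call.
    """
    if not names:
        return []
    return [sys.executable, "-m", "pip", "install", "--upgrade", *names]
```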

Example of a Streamlit app based on the custom Python base image:

# Use your custom Python base image in your ACR
FROM yourazurecontainerregistry.azurecr.io/base/py310:latest
# Set environment variables
ENV PIP_NO_CACHE_DIR=1 \
    PYTHONUNBUFFERED=1
# Create a non-root user
RUN groupadd -r streamlit && useradd --no-log-init -r -g streamlit streamlit
# Set the working directory
WORKDIR /usr/app
# Copy the requirements file first to leverage Docker cache
COPY requirements.txt .
# Install dependencies
RUN pip install -r requirements.txt
# Copy the application code into /usr/app/src so that it matches the CMD path
COPY src ./src
# Switch to the non-root user
USER streamlit
# Expose port 8501
EXPOSE 8501
# Command to run the Streamlit app
CMD ["streamlit", "run", "src/app.py", "--server.port=8501"]
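
Before pushing the image, it is worth checking that the container actually serves the app on the expected port. The sketch below probes Streamlit's health endpoint, which is /_stcore/health in current Streamlit 1.x releases (older releases used a different path, so verify this for your version). The function names are hypothetical.

```python
import urllib.request
from typing import Callable

HEALTH_PATH = "/_stcore/health"


def health_url(base_url: str) -> str:
    """Return the health endpoint for a Streamlit app served at base_url."""
    return base_url.rstrip("/") + HEALTH_PATH


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Perform a GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def is_healthy(base_url: str, fetch: Callable[[str], int] = fetch_status) -> bool:
    """Return True when the health endpoint answers with HTTP 200."""
    try:
        return fetch(health_url(base_url)) == 200
    except OSError:
        # Covers connection errors and urllib's URLError/HTTPError.
        return False
```

With the container running locally (for example with port 8501 published), is_healthy("http://localhost:8501") should return True once Streamlit has started.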

Deployment Automation

The deployment automation is done in an Azure DevOps pipeline. For full end-to-end automation, we use Terraform alongside the Azure CLI. This combination of infrastructure-as-code and application deployment allows us to establish a continuous deployment flow, enabling the creation and recreation of the entire setup from scratch within minutes.

Create Infrastructure as Code with Terraform

To provision resources with Terraform, we start by creating a resource group, as well as Azure Application Insights and a Log Analytics workspace.

resource "azurerm_resource_group" "web" {
  name     = local.resource_group_name
  location = var.location
  tags     = var.tags
}

resource "azurerm_log_analytics_workspace" "web" {
  name     = var.project_name
  location = var.location
  # Reference the resource group so that Terraform creates it first
  resource_group_name = azurerm_resource_group.web.name
  sku                 = "PerGB2018"
  retention_in_days   = 30
  tags                = var.tags
}

resource "azurerm_application_insights" "web" {
  name                = var.project_name
  location            = var.location
  resource_group_name = azurerm_resource_group.web.name
  workspace_id        = azurerm_log_analytics_workspace.web.id
  application_type    = "web"
  tags                = var.tags
}

Before we proceed with creating the App Service, it is essential to first establish an App Service Plan, which defines the underlying compute resources and pricing tier for your web app.

resource "azurerm_service_plan" "web" {
  name                = var.project_name
  resource_group_name = azurerm_resource_group.web.name
  location            = var.location
  os_type             = "Linux"
  sku_name            = var.sku_name
  tags                = var.tags
}

To provision the app service, we utilize Terraform’s azurerm_linux_web_app resource. In the configuration shown below, we link it to the service plan using its ID. The connection to Application Insights is made using the instrumentation_key and connection_string specified as environment variables in the application settings.

The Easy Auth configuration is handled within the auth_settings block, where we specify the identity provider and the client ID of the app registration. Note that the client ID is provided as a variable; the app registration itself is not configured in Terraform. Its redirect URI is set later in the Azure DevOps pipeline using the Azure CLI.

# Current subscription, used to look up the tenant ID for the issuer URL
data "azurerm_subscription" "primary" {}

resource "azurerm_linux_web_app" "web" {
  name                = var.project_name
  location            = var.location
  resource_group_name = azurerm_resource_group.web.name
  service_plan_id     = azurerm_service_plan.web.id
  https_only          = true
  app_settings = {
    "APPINSIGHTS_INSTRUMENTATIONKEY"        = azurerm_application_insights.web.instrumentation_key
    "APPLICATIONINSIGHTS_CONNECTION_STRING" = azurerm_application_insights.web.connection_string
  }
  site_config {
    always_on  = true
    ftps_state = "FtpsOnly"
  }
  auth_settings {
    enabled          = true
    default_provider = "AzureActiveDirectory"
    issuer           = "https://sts.windows.net/${data.azurerm_subscription.primary.tenant_id}/v2.0"
    active_directory {
      client_id = var.client_id
    }
  }
  tags = var.tags
}

Implement CI/CD pipeline on Azure DevOps

A prerequisite for the deployment pipeline is that you have an Azure Resource Manager type service connection in Azure DevOps, with Contributor permissions on your Azure Subscription. You also need an existing Azure Storage Account that will serve as the backend for Terraform.

In addition, we assume that the app registration we will use for Easy Auth already exists, and we pass its client ID as a parameter to the pipeline. We are also using an existing Azure Container Registry and not creating it as part of this automation. To establish the connection to the container registry from your Azure DevOps project, you will need to configure a service connection of type Docker registry. To learn more about creating service connections, follow the documentation from Microsoft: Service connections in Azure Pipelines — Azure Pipelines | Microsoft Learn

The pipeline itself has three main stages. In the first stage we initialize and execute Terraform to create the necessary infrastructure.

- task: ms-devlabs.custom-terraform-tasks.custom-terraform-installer-task.TerraformInstaller@0
  displayName: Install Terraform
  inputs:
    terraformVersion: 'latest'

- task: TerraformCLI@0
  displayName: Terraform init
  inputs:
    command: 'init'
    backendType: 'azurerm'
    workingDirectory: '$(deployTerraformWorkingDirectory)'
    backendServiceArm: '$(serviceConnection)'
    backendAzureRmSubscriptionId: $(tfBackendSubscriptionId)
    allowTelemetryCollection: false
  env:
    TF_DATA_DIR: $(deployTerraformWorkingDirectory)/.terraform
    ARM_CLIENT_ID: $(ARM_CLIENT_ID)
    ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
    ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
    ARM_TENANT_ID: $(ARM_TENANT_ID)

- task: TerraformCLI@0
  displayName: Terraform plan
  inputs:
    command: 'plan'
    backendType: 'azurerm'
    workingDirectory: '$(deployTerraformWorkingDirectory)'
    environmentServiceName: '$(serviceConnection)'
    providerAzureRmSubscriptionId: $(tfBackendSubscriptionId)
    commandOptions: '-var-file=$(environmentName).tfvars'
    publishPlanResults: $(appName)_$(environmentName)
    allowTelemetryCollection: false
  env:
    TF_VAR_tags: '{dsManagedBy="Terraform", dsBuildNumber="$(Build.BuildNumber)"}'
    TF_DATA_DIR: $(deployTerraformWorkingDirectory)/.terraform
    ARM_CLIENT_ID: $(ARM_CLIENT_ID)
    ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
    ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
    ARM_TENANT_ID: $(ARM_TENANT_ID)

- task: TerraformCLI@0
  displayName: Terraform apply
  name: terraform
  inputs:
    command: 'apply'
    backendType: 'azurerm'
    workingDirectory: '$(deployTerraformWorkingDirectory)'
    environmentServiceName: '$(serviceConnection)'
    providerAzureRmSubscriptionId: $(tfBackendSubscriptionId)
    commandOptions: '-var-file=$(environmentName).tfvars'
    allowTelemetryCollection: false
  env:
    TF_VAR_tags: '{dsManagedBy="Terraform", dsBuildNumber="$(Build.BuildNumber)"}'
    TF_DATA_DIR: $(deployTerraformWorkingDirectory)/.terraform
    ARM_CLIENT_ID: $(ARM_CLIENT_ID)
    ARM_CLIENT_SECRET: $(ARM_CLIENT_SECRET)
    ARM_SUBSCRIPTION_ID: $(ARM_SUBSCRIPTION_ID)
    ARM_TENANT_ID: $(ARM_TENANT_ID)
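
The TF_VAR_tags variable carries an HCL map literal, and writing it by hand makes it easy to leave braces or quotes unbalanced. As an illustration, such a value could be rendered from a dictionary with the sketch below. The function name is hypothetical, and the key pattern is deliberately restrictive so that keys can be written without quotes.

```python
import re
from typing import Mapping

_KEY_PATTERN = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def render_hcl_map(values: Mapping[str, str]) -> str:
    """Render a flat string map as an HCL map literal for a TF_VAR_* variable."""
    pairs = []
    for key, value in values.items():
        if not _KEY_PATTERN.match(key):
            raise ValueError(f"Unsupported map key: {key!r}")
        # Escape backslashes and double quotes inside the quoted value.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        pairs.append(f'{key}="{escaped}"')
    return "{" + ", ".join(pairs) + "}"
```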

In the second stage, we use the Docker pipeline task to build and push the container image with the Streamlit application code to Azure Container Registry.

- task: Docker@2
  displayName: Build container
  inputs:
    containerRegistry: '$(registryName)'
    repository: '$(appName)'
    command: 'buildAndPush'
    Dockerfile: '$(System.DefaultWorkingDirectory)/**/Dockerfile'
    # Also tag the image with the build ID so that a specific build can be deployed
    tags: |
      latest
      $(Build.BuildId)

Before we deploy the container image to the App Service in the third stage, we need to configure the redirect URIs on the app registration. For this we use the Azure CLI to add the callback URI of the new web application as previously described.

- task: AzureCLI@2
  name: update_sp_redirect_uri
  displayName: Update redirect URIs
  inputs:
    azureSubscription: $(serviceConnection)
    scriptType: bash
    scriptLocation: inlineScript
    inlineScript: |
      az ad app update --id $(authClientID) \
        --web-redirect-uris https://$(appName).azurewebsites.net/.auth/login/aad/callback
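
One caveat when an app registration is shared by several apps: the --web-redirect-uris option appears to set the complete list of redirect URIs rather than append to it, so it is worth verifying this behavior before reusing the step. Under that assumption, the existing URIs would need to be read first and merged with the new one. The merge itself can be sketched as follows (hypothetical helper, independent of the Azure CLI):

```python
from typing import Iterable, List


def merge_redirect_uris(existing: Iterable[str], new_uri: str) -> List[str]:
    """Return the existing URIs plus new_uri, without duplicates, order preserved."""
    merged: List[str] = []
    for uri in [*existing, new_uri]:
        if uri not in merged:
            merged.append(uri)
    return merged
```

The resulting list would then be passed to the update command as space-separated values.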

For the actual deployment, we again use the Azure CLI to configure the Azure App Service to use the newly pushed Docker image.

- task: AzureCLI@2
  name: deploy_app_service
  displayName: Deploy App Service
  inputs:
    azureSubscription: $(serviceConnection)
    scriptType: bash
    scriptLocation: inlineScript
    # Assumes the image was pushed with the $(Build.BuildId) tag in the build stage
    inlineScript: |
      az webapp config container set \
        -g $(resourceGroupName) \
        -n $(appName) \
        -i $(containerRegistryURL)/$(appName):$(Build.BuildId)
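
The image reference passed to the App Service has the form registry/repository:tag. A small sketch of assembling it safely is shown below; the function name is hypothetical, and stripping a leading scheme is our assumption about how the registry URL variable might be defined.

```python
def image_reference(registry_url: str, app_name: str, tag: str) -> str:
    """Build a container image reference such as myregistry.azurecr.io/app:123."""
    # Image references must not contain a scheme or a trailing slash.
    host = registry_url.split("://", 1)[-1].rstrip("/")
    if not (host and app_name and tag):
        raise ValueError("registry, app name and tag must all be non-empty")
    return f"{host}/{app_name}:{tag}"
```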

Conclusion

In this article, we provided guidelines on deploying a Streamlit app on Azure using Azure App Service and demonstrated how to automate the provisioning and deployment process to easily create apps from scratch. This level of automation is particularly useful for creating a self-service model for launching new applications. Such a self-service approach scales well when data scientists are conducting multiple PoCs that each require a custom web interface for the user. In addition, this automated approach ensures that tasks are performed consistently every time, reducing the risk of human error.

While a managed serverless platform like Azure App Service greatly simplifies the deployment of web applications, there are still challenges, especially when striving for full automation of the setup. Here, we shared various tips and insights based on our experiences with common pitfalls.

In the second part of this article, we would like to share some best practices around DevOps for Streamlit apps. For example, we suggest using Bandit for security scanning of your Python code, TruffleHog to detect leaked credentials, and Ruff for linting and formatting.

Whether shampoo, detergent, or industrial adhesive — Henkel stands for strong brands, innovations, and technologies. In our data science, engineering, and analytics teams we solve modern data challenges for the benefit of our customers.
Learn more at henkel.com/digitalization.
