Going Serverless with Databricks: part 2

Vechtomova Maria
Published in Marvelous MLOps
May 2, 2023

In part 1, we explained how serverless API deployment works using a Databricks notebook. In this article, we explain how to bring an API to production using Azure Databricks and GitHub Actions. The code can be found in our repository: https://github.com/marvelousmlops/amazon-reviews-databricks.

Getting started: setup

Step 1: get SPN with Databricks admin permissions

First of all, you need a service principal with admin permissions on Databricks and a token generated for it. This service principal is used to run the Databricks jobs and deploy the API.

This documentation explains how to create a service principal with admin permissions: https://docs.databricks.com/dev-tools/service-principals.html#language-curl.
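
The linked documentation uses curl; below is a minimal Python sketch of the same SCIM call, assuming the Azure AD application already exists and the token used here belongs to a workspace admin. The environment variable names and the display name are illustrative, and granting admin permissions to the service principal is a separate step described in the docs.

import os
import requests

# Sketch: add an existing Azure AD service principal to the Databricks workspace via the SCIM API
host = os.environ['DATABRICKS_HOST'].rstrip('/')
admin_token = os.environ['DATABRICKS_ADMIN_TOKEN']  # token of a workspace admin (illustrative name)

payload = {
    "schemas": ["urn:ietf:params:scim:schemas:core:2.0:ServicePrincipal"],
    "applicationId": "<azure-ad-application-client-id>",
    "displayName": "amazon-reviews-deployment",  # illustrative name
    "active": True,
}

response = requests.post(f"{host}/api/2.0/preview/scim/v2/ServicePrincipals",
                         headers={'Authorization': f"Bearer {admin_token}"},
                         json=payload)
response.raise_for_status()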

Step 2: generate Databricks token for SPN

Note: /api/2.0/token-management/on-behalf-of/tokens is not enabled for Azure Databricks. A workaround (given that you have an SPN with Databricks admin permissions and its client id, client secret, and tenant id):

# Code for retrieve_databricks_token can be found in the repository
from amazon_reviews.utils.databricks_token import retrieve_databricks_token
import os

# Replace the placeholder with your workspace URL (see the expected format below)
os.environ['DATABRICKS_HOST'] = '<DATABRICKS_HOST>'
token = retrieve_databricks_token(client_id='<client-id>', client_secret='<client-secret>', tenant='<tenant-id>')

The DATABRICKS_HOST value should have the format https://<your workspace>/, for example: https://adb-123456781234.2.azuredatabricks.net/.
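
The implementation of retrieve_databricks_token lives in the repository; the sketch below only illustrates the idea behind the workaround and is not the repository's exact code. It assumes the standard Azure AD client-credentials flow: request an AAD access token for the Azure Databricks resource (application id 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d) and use it as a bearer token to create a Databricks PAT via /api/2.0/token/create.

import os
import requests

def retrieve_databricks_token_sketch(client_id: str, client_secret: str, tenant: str) -> str:
    """Hypothetical sketch of the workaround, not the repository's implementation."""
    # 1. Client-credentials flow: AAD access token for the Azure Databricks resource
    aad_response = requests.post(
        f"https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token",
        data={
            'grant_type': 'client_credentials',
            'client_id': client_id,
            'client_secret': client_secret,
            # Well-known application id of the Azure Databricks resource
            'scope': '2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default',
        },
    )
    aad_response.raise_for_status()
    aad_token = aad_response.json()['access_token']

    # 2. Use the AAD token to create a Databricks PAT for the service principal
    host = os.environ['DATABRICKS_HOST'].rstrip('/')
    pat_response = requests.post(
        f"{host}/api/2.0/token/create",
        headers={'Authorization': f"Bearer {aad_token}"},
        json={'comment': 'token for SPN'},
    )
    pat_response.raise_for_status()
    return pat_response.json()['token_value']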

Step 3: create KeyVault backed secret scope and add Databricks token to KeyVault

DatabricksToken is referenced on line 66 of both category_model/category_model.json.j2 and recommender/recommender.json.j2 as {{secrets/keyvault/DatabricksToken}}.

Create a KeyVault-backed secret scope called "keyvault" or, if such a scope already exists under a different name, update line 66 in both category_model/category_model.json.j2 and recommender/recommender.json.j2 accordingly.

This documentation explains how to create a secret scope: https://learn.microsoft.com/en-us/azure/databricks/security/secrets/secret-scopes.

Add a secret called DatabricksToken to the KeyVault as explained here: https://learn.microsoft.com/en-us/azure/key-vault/secrets/quick-create-cli.
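
Once the scope and the secret exist, the job definition can reference the token as {{secrets/keyvault/DatabricksToken}}, and code running on the cluster can also read it directly. A minimal sketch, assuming the scope is called "keyvault" (dbutils is only available on Databricks):

# Read the token from the KeyVault-backed secret scope inside a notebook or job
databricks_token = dbutils.secrets.get(scope="keyvault", key="DatabricksToken")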

Step 4: set up the repository

Fork https://github.com/marvelousmlops/amazon-reviews-databricks. In the forked repository that you now own, create repository secrets called DATABRICKS_HOST and DATABRICKS_TOKEN.

GitHub Actions workflow

The GitHub Actions workflow train_and_deploy_models.yml consists of 4 main steps:

  • build a wheel for the amazon-reviews package
  • copy the wheel and the necessary Python files to DBFS
  • replace the GIT_SHA and DATABRICKS_HOST placeholders in the Jinja templates category_model/category_model.json.j2 and recommender/recommender.json.j2 with actual values
  • deploy/update the Databricks jobs for the category model and the recommender, and trigger a job run (a rough sketch of the last two steps follows this list)
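
The workflow itself is written in YAML; purely as an illustration, here is a minimal Python sketch of the last two steps, assuming the rendered Jinja template is a valid Jobs API 2.1 payload (the actual workflow in the repository may use the Databricks CLI or curl instead):

import json
import os
import requests
from jinja2 import Template

host = os.environ['DATABRICKS_HOST'].rstrip('/')   # e.g. https://adb-123456781234.2.azuredatabricks.net
headers = {'Authorization': f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

# Step 3: render the Jinja template with the actual values
with open('recommender/recommender.json.j2') as f:
    job_settings = json.loads(Template(f.read()).render(
        GIT_SHA=os.environ['GIT_SHA'],
        DATABRICKS_HOST=os.environ['DATABRICKS_HOST'],
    ))

# Step 4: create the job and trigger a run. A production workflow would typically
# look up an existing job and call /api/2.1/jobs/reset to update it instead.
create = requests.post(f"{host}/api/2.1/jobs/create", headers=headers, json=job_settings)
create.raise_for_status()
job_id = create.json()['job_id']

run = requests.post(f"{host}/api/2.1/jobs/run-now", headers=headers, json={'job_id': job_id})
run.raise_for_status()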

This is a minimal workflow intended to show how deployment can work in production, but it is neither scalable nor reusable. We encourage you to build reusable workflows for machine learning model deployments and plan to write about our experience with reusable workflows in the near future.

It is good to note that train_and_deploy_models.yml triggers a job run in our example; in an actual production setup, it is more likely that the GitHub Actions workflow is only used to update the Databricks job definitions, and the job definitions contain a schedule to retrain and deploy the models on a regular basis.

Also, in a real-world situation, the code for the category model and the recommender model would likely be stored in different GitHub repositories, or there would at least be 2 different workflows for the deployment of these models.

Databricks job definition

Let's take recommender/recommender.json.j2 as an example. This is a multitask Databricks job that first trains the model using the recommender/train_recommender.py script and then deploys the trained model using recommender/deploy_recommender.py. Both of these tasks use the same job cluster, called recommender_cluster.

To ensure traceability, we pass run_id and job_id as Python parameters to both Python scripts. You can find more information on this in our traceability and reproducibility article.
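
As a rough illustration only (the actual template in the repository is more complete, and the exact dynamic value references it uses may differ), the rendered job settings could look roughly like this, written here as a Python dict:

# Hypothetical sketch of multitask job settings: two tasks sharing one job cluster,
# each receiving run_id and job_id as Python script parameters.
job_settings = {
    "name": "amazon-recommender",
    "job_clusters": [{"job_cluster_key": "recommender_cluster",
                      "new_cluster": {"spark_version": "12.2.x-cpu-ml-scala2.12",  # illustrative values
                                      "node_type_id": "Standard_DS3_v2",
                                      "num_workers": 1}}],
    "tasks": [
        {"task_key": "train_recommender",
         "job_cluster_key": "recommender_cluster",
         "spark_python_task": {"python_file": "dbfs:/<path>/train_recommender.py",
                               "parameters": ["--run_id", "{{run_id}}", "--job_id", "{{job_id}}"]}},
        {"task_key": "deploy_recommender",
         "depends_on": [{"task_key": "train_recommender"}],
         "job_cluster_key": "recommender_cluster",
         "spark_python_task": {"python_file": "dbfs:/<path>/deploy_recommender.py",
                               "parameters": ["--run_id", "{{run_id}}", "--job_id", "{{job_id}}"]}},
    ],
}

The training script, recommender/train_recommender.py, reads those parameters and tags the MLflow run: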

import os
import argparse
import mlflow


def get_arguments():
    # Parse the Databricks job run_id and job_id passed as Python parameters
    parser = argparse.ArgumentParser(description='reads default arguments')
    parser.add_argument('--run_id', metavar='run_id', type=str)
    parser.add_argument('--job_id', metavar='job_id', type=str)
    args = parser.parse_args()
    return args.run_id, args.job_id


git_sha = os.environ['GIT_SHA']
run_id, job_id = get_arguments()

mlflow.set_experiment(experiment_name='/Shared/Amazon_recommender')
with mlflow.start_run(run_name="amazon-recommender") as run:
    mlflow_run_id = run.info.run_id
    tags = {
        "GIT_SHA": git_sha,
        "MLFLOW_RUN_ID": mlflow_run_id,
        "DBR_JOB_ID": job_id,
        "DBR_RUN_ID": run_id,
    }

After the model is trained and logged, we register and tag it using the tags defined above:

mlflow.register_model(model_uri=f"runs:/{mlflow_run_id}/model",
                      name='amazon-recommender',
                      tags=tags)

Using tags when registering models makes it easy to search for the model version that belongs to the parent run id of our Databricks job:

from amazon_reviews.api_deployment.databricks_api import serve_ml_model_endpoint
from mlflow.tracking.client import MlflowClient
import argparse


def get_arguments():
    # Parse the Databricks job run_id and job_id passed as Python parameters
    parser = argparse.ArgumentParser(description='reads default arguments')
    parser.add_argument('--run_id', metavar='run_id', type=str)
    parser.add_argument('--job_id', metavar='job_id', type=str)
    args = parser.parse_args()
    return args.run_id, args.job_id


run_id, job_id = get_arguments()

# Find the model version tagged with the parent run id of this Databricks job
client = MlflowClient()
model_version = client.search_model_versions(f"name='amazon-recommender' and tag.DBR_RUN_ID = '{run_id}'")[0].version

# Promote that version to Production and serve it on a serverless endpoint
client.transition_model_version_stage(name='amazon-recommender',
                                      version=model_version,
                                      stage='Production')

config = {
    "served_models": [{
        "model_name": "amazon-recommender",
        "model_version": f"{model_version}",
        "workload_size": "Small",
        "scale_to_zero_enabled": False,
    }]
}

serve_ml_model_endpoint(endpoint_name='amazon-recommender',
                        endpoint_config=config)

How to call the endpoint?

To call the endpoint, you will need a Databricks token again. However, for security reasons, that token needs to have very limited permissions: it should only be able to call the API.

For that purpose, we recommend creating another service principal without admin permissions (see step 1 of Getting started), creating a dedicated group on Databricks without any entitlements (in our case, called "api"), and adding the service principal to the group.

Important! Grant the group permission to use tokens: https://learn.microsoft.com/en-us/azure/databricks/administration-guide/access-control/tokens#manage-token-permissions-using-the-permissions-api

In the serving endpoint permissions, grant query permission to the group "api", and generate a token as described in step 2. Store the token as an environment variable called DATABRICKS_TOKEN.
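
Both grants can also be done programmatically via the Permissions API. A rough sketch, assuming an admin token and a known serving endpoint id; the serving-endpoints permissions path and the CAN_QUERY level are assumptions to verify against the Permissions API documentation (the grant can equally be done in the serving endpoint UI):

import os
import requests

host = os.environ['DATABRICKS_HOST'].rstrip('/')
admin_headers = {'Authorization': f"Bearer {os.environ['DATABRICKS_ADMIN_TOKEN']}"}  # admin SPN token from step 1

# Allow the "api" group to use personal access tokens (see the linked documentation)
requests.patch(
    f"{host}/api/2.0/permissions/authorization/tokens",
    headers=admin_headers,
    json={'access_control_list': [{'group_name': 'api', 'permission_level': 'CAN_USE'}]},
).raise_for_status()

# Allow the "api" group to query the serving endpoint
endpoint_id = '<serving-endpoint-id>'
requests.patch(
    f"{host}/api/2.0/permissions/serving-endpoints/{endpoint_id}",
    headers=admin_headers,
    json={'access_control_list': [{'group_name': 'api', 'permission_level': 'CAN_QUERY'}]},
).raise_for_status()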

Now you can successfully call the endpoint!

import requests
import os

model_input = {
    'customer_id': 'abcdefg12345678abcdefg',
    'basket': ['B00000J0FW']  # Sassy Who Loves Baby Photo Book
}

# DATABRICKS_HOST already contains https:// and a trailing slash (see step 2)
url = f"{os.environ['DATABRICKS_HOST']}serving-endpoints/amazon-recommender/invocations"

headers = {'Authorization': f"Bearer {os.environ['DATABRICKS_TOKEN']}",
           'Content-Type': 'application/json'}
data_json = {'inputs': model_input}
response = requests.request(method='POST', headers=headers, url=url, json=data_json)

print(response.text)
# {"predictions": ["B00005MKYI", "B000056HMY", "B000056JHM", "B000046S2U", "B000056JEG", "B000099Z9K", "B00032G1S0", "B00005V6C8", "B0000936M4"]}

# B00005MKYI: Deluxe Music In Motion Developmental Mobile; B000056HMY: The First Years Nature Sensations Lullaby Player; B000056JHM: Bebe Sounds prenatal gift set
