Start a Databricks cluster from Terraform

Peter Herbel
Published in GetTech Blog
Sep 11, 2024

In data engineering, managing infrastructure efficiently is crucial. Terraform helps automate this process.

In this blog, we’ll discuss the benefits of starting a Databricks cluster during Terraform deployment, especially for integration tests. This automation saves time, reduces manual work, and ensures consistent testing environments.

1. Shell script with the Databricks CLI command

Prerequisites:

  1. jq must be installed on your CI/CD agent -> jq (jqlang.github.io)
  2. The Databricks CLI must be installed on your CI/CD agent -> What is the Databricks CLI?
  3. Service principal authentication must be configured.

Set these environment variables for the service principal; you are probably already familiar with them.

ARM_CLIENT_ID
ARM_TENANT_ID
ARM_CLIENT_SECRET

You can reuse the same environment variables that you already set for Terraform.
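Since a missing variable usually only shows up as an authentication error from the CLI, it can help to fail fast. The following is a minimal sketch, not part of the original script: `require_env` is a hypothetical helper name, and it relies on `printenv` so that only exported variables (the ones the CLI can actually see) count as set.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not part of the original script):
# report every required variable that is missing from the environment
# and return non-zero if at least one is missing.
require_env() {
  local missing=0 name
  for name in "$@"; do
    # printenv prints nothing for variables that are unset or not exported.
    if [ -z "$(printenv "$name")" ]; then
      echo "Missing required environment variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Possible usage at the top of the start script:
# require_env ARM_CLIENT_ID ARM_TENANT_ID ARM_CLIENT_SECRET || exit 1
```

Checking all variables before returning means a single pipeline run lists everything that is missing instead of stopping at the first one.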

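#!/bin/bash

# Stop on errors, unset variables, and failures inside a pipeline.
set -euo pipefail

databricks_host=$1
databricks_cluster_id=$2
# The Databricks CLI reads the workspace URL from DATABRICKS_HOST.
export DATABRICKS_HOST="$databricks_host"

# Read the current cluster state from the JSON returned by the CLI.
clusterState=$(databricks clusters get "$databricks_cluster_id" | jq -r '.state')
if [[ $clusterState == "TERMINATED" ]]; then
  echo "Starting Databricks cluster $databricks_cluster_id..."
  databricks clusters start "$databricks_cluster_id"
elif [[ $clusterState == "RUNNING" ]]; then
  echo "Databricks cluster $databricks_cluster_id is already running"
else
  # Any other state (for example, a cluster in transition) is treated as an error.
  echo "Databricks cluster $databricks_cluster_id is in state $clusterState. Aborting..."
  exit 1
fi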

The script takes the Databricks host and the cluster ID as arguments. Both values come from the Terraform code.
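The script above aborts for every state other than TERMINATED and RUNNING. If your pipeline might catch the cluster in a transitional state, one possible refinement is to map each state to an action first. This is only a sketch: `cluster_action` is a hypothetical helper, the state names are assumed from the Databricks Clusters API, and which states count as "usable" is a judgment call you may want to change.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not part of the original script):
# map a cluster state to the action the start script should take.
# State names are assumed from the Databricks Clusters API.
cluster_action() {
  case "$1" in
    TERMINATED)         echo "start" ;;  # cluster is stopped: start it
    RUNNING|RESIZING)   echo "skip"  ;;  # cluster is already usable
    PENDING|RESTARTING) echo "wait"  ;;  # cluster is coming up: check again later
    *)                  echo "abort" ;;  # TERMINATING, ERROR, UNKNOWN, or anything unexpected
  esac
}
```

In the script, `action=$(cluster_action "$clusterState")` could then replace the if/elif chain, with a `case "$action"` deciding what to do next.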

2. Terraform code

We call the shell script from Terraform with the necessary parameters.

resource "terraform_data" "databricks_cluster_start" {
  # timestamp() changes on every run, so the resource is replaced
  # and the script is executed on each apply.
  triggers_replace = [
    timestamp()
  ]

  provisioner "local-exec" {
    command = "bash databricks-cluster-start.sh ${azurerm_databricks_workspace.your_databricks_workspace.workspace_url} ${databricks_cluster.your_databricks_cluster.id}"
  }
}

So, once your Databricks infrastructure is deployed, your cluster is started and you can run your tests.
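Depending on the CLI version and flags, the start command may return before the cluster is actually usable, so a test stage may want to confirm the RUNNING state first. The following polling loop is a sketch under that assumption: `wait_for_running` and `get_cluster_state` are hypothetical names, and the attempt count and sleep interval are arbitrary defaults.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not part of the original script).
# wait_for_running GET_STATE_CMD [MAX_ATTEMPTS] [SLEEP_SECONDS]
# GET_STATE_CMD is the name of a command or function that prints the
# current cluster state. Returns 0 once it prints RUNNING, 1 on timeout.
wait_for_running() {
  local get_state=$1
  local max_attempts=${2:-30}
  local sleep_seconds=${3:-10}
  local attempt=1 state
  while [ "$attempt" -le "$max_attempts" ]; do
    state=$("$get_state")
    if [ "$state" = "RUNNING" ]; then
      return 0
    fi
    echo "Attempt $attempt/$max_attempts: cluster state is $state" >&2
    sleep "$sleep_seconds"
    attempt=$((attempt + 1))
  done
  return 1
}

# Possible usage, reusing the state lookup from the start script:
# get_cluster_state() { databricks clusters get "$databricks_cluster_id" | jq -r '.state'; }
# wait_for_running get_cluster_state 30 10 || exit 1
```

Passing the state lookup in as a command name keeps the loop independent of the CLI, which also makes it easy to try out with a stub function before wiring it into the pipeline.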
