Start a Databricks cluster from Terraform

Peter Herbel
Published in GetTech Blog
1 min read · Sep 11, 2024

In data engineering, managing infrastructure efficiently is crucial. Terraform helps automate this process.

In this blog, we’ll discuss the benefits of starting a Databricks cluster during Terraform deployment, especially for integration tests. This automation saves time, reduces manual work, and ensures consistent testing environments.

1. Shell script with Databricks CLI commands


  1. You need jq on your CI/CD agent.
  2. You need the Databricks CLI on your CI/CD agent (see "What is the Databricks CLI?" in the Databricks documentation).
  3. You need to configure service principal authentication.
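
The first two prerequisites can be checked at the start of a pipeline step. The following is a minimal sketch, not part of the original script: the helper name `missing_tools` is made up for illustration, and it only relies on the shell built-in `command -v`.

```shell
# Preflight sketch: report required command-line tools that are not on PATH.
# `missing_tools` is a hypothetical helper name, not part of any CLI.
missing_tools() {
  missing=""
  for tool in "$@"; do
    # `command -v` succeeds only if the shell can resolve the command
    if ! command -v "$tool" >/dev/null 2>&1; then
      missing="${missing:+$missing }$tool"
    fi
  done
  # Print the missing tools (possibly none) on one line
  printf '%s\n' "$missing"
  # Succeed only when nothing is missing
  [ -z "$missing" ]
}
```

A pipeline step could call it as `missing_tools jq databricks || exit 1`, so the agent fails early with the names of the missing tools instead of failing halfway through the script.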

Set the environment variables for service principal authentication. You are probably already familiar with them, because the same variables can be used for Terraform.
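
Which variables you need depends on your authentication setup, so the names below are an assumption: with an Azure service principal, the azurerm provider reads `ARM_CLIENT_ID`, `ARM_CLIENT_SECRET` and `ARM_TENANT_ID`, and Databricks unified authentication documents the same names. A small guard like this sketch (the helper `require_env` is hypothetical) can confirm that they are exported before the script runs.

```shell
# Sketch: fail fast when an expected environment variable is unset or empty.
# `require_env` is a hypothetical helper; the variable names you pass to it
# are an assumption and depend on your authentication setup.
require_env() {
  status=0
  for name in "$@"; do
    # printenv only sees exported variables, which is what child processes
    # such as Terraform and the Databricks CLI actually receive
    if [ -z "$(printenv "$name")" ]; then
      echo "Missing environment variable: $name" >&2
      status=1
    fi
  done
  return "$status"
}
```

For example, `require_env ARM_CLIENT_ID ARM_CLIENT_SECRET ARM_TENANT_ID || exit 1` stops the step and lists every missing variable in one run.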


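#!/usr/bin/env bash
set -e

# Assumption: Terraform passes the host and the cluster ID as positional arguments
databricks_host=$1
databricks_cluster_id=$2

export DATABRICKS_HOST=$databricks_host

# Read the current cluster state with the Databricks CLI
clusterState=$(databricks clusters get "$databricks_cluster_id" | jq -r '.state')
if [[ $clusterState == "TERMINATED" ]]; then
  echo "Starting Databricks cluster $databricks_cluster_id..."
  databricks clusters start "$databricks_cluster_id"
elif [[ $clusterState == "RUNNING" ]]; then
  echo "Databricks cluster $databricks_cluster_id is already running"
else
  # Any other state (for example PENDING or TERMINATING) is treated as an error
  echo "Databricks cluster $databricks_cluster_id is in state $clusterState. Aborting..."
  exit 1
fi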
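
If you want to try the branching without a real cluster, the decision can be pulled out into a small function. This is only a sketch of the same three outcomes as the script above; the function name `action_for_state` and the action names are made up for illustration.

```shell
# Sketch: map a cluster state to the action the script takes.
# `action_for_state` and the returned action names are hypothetical.
action_for_state() {
  case "$1" in
    TERMINATED) echo "start" ;;  # stopped cluster: start it
    RUNNING)    echo "skip" ;;   # already up: nothing to do
    *)          echo "abort" ;;  # PENDING, TERMINATING, ERROR, ...: give up
  esac
}
```

For example, `action_for_state TERMINATED` prints `start`, while a transitional state such as `PENDING` prints `abort`, matching the script's final branch.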

The script needs the Databricks host and the cluster ID. Both values come from the Terraform code.

2. Terraform code

Terraform calls the shell script with the necessary parameters.

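resource "terraform_data" "databricks_cluster_start" {
  # Values whose change should re-run the provisioner
  triggers_replace = [
  ]

  provisioner "local-exec" {
    # Script arguments: Databricks host (workspace URL), then the cluster ID
    command = "bash ${azurerm_databricks_workspace.your_databricks_workspace.workspace_url} ${}"
  }
}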
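
Because the parameters arrive through string interpolation, an empty value can slip through unnoticed. A guard on the script side is one way to catch that. The sketch below assumes the two values are passed as positional arguments, and `check_args` is a hypothetical helper name.

```shell
# Sketch: validate the two positional parameters before using them.
# `check_args` is a hypothetical helper name.
check_args() {
  # Expect exactly two non-empty arguments: host, then cluster ID
  if [ "$#" -ne 2 ] || [ -z "$1" ] || [ -z "$2" ]; then
    echo "Usage: <script> <databricks_host> <databricks_cluster_id>" >&2
    return 1
  fi
}
```

Calling `check_args "$@" || exit 1` at the top of the script makes a missing host or cluster ID fail the `local-exec` provisioner with a readable message.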

Once your Databricks infrastructure is deployed, the cluster is started and you can run your tests.
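
Depending on the CLI version and flags, the start command may return before the cluster is actually usable, so integration tests may want to wait for the RUNNING state first. Here is a minimal polling sketch. The function `wait_for_running`, its defaults and the choice to stop early on ERROR are assumptions, not part of the original script.

```shell
# Sketch: poll until the cluster reports RUNNING, or give up.
# `wait_for_running` is a hypothetical helper. Its first argument is the name
# of a command or function that prints the current cluster state, so the loop
# can be tried without a real cluster. With the script's variables, that could be:
#   get_cluster_state() { databricks clusters get "$databricks_cluster_id" | jq -r '.state'; }
wait_for_running() {
  get_state=$1
  max_attempts=${2:-30}    # assumed default: 30 polls
  delay_seconds=${3:-20}   # assumed default: 20 seconds between polls
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    state=$("$get_state")
    case "$state" in
      RUNNING) return 0 ;;
      ERROR)
        echo "Cluster entered state ERROR" >&2
        return 1 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$delay_seconds"
  done
  echo "Timed out waiting for the cluster to reach RUNNING" >&2
  return 1
}
```

A test stage could then run `wait_for_running get_cluster_state || exit 1` before the first integration test.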



Peter Herbel is an architect, leader, and coach who helps teams understand technologies, DevOps, and Agile software principles and practices, with a focus on cloud systems.