Saving 75% with a weekday Cloud Composer development environment
Google Cloud Platform provides the tools we use to manage our data. We mainly use Cloud Composer — GCP’s managed Airflow service — to schedule our data pipelines, with all of our data ending up in BigQuery, GCP’s data warehouse, for analysis.
We previously kept our development DAGs alongside our production DAGs in a single Composer instance. This made environment separation messy: we were checking filenames for _dev when setting environment variables. At the beginning of this year we decided to streamline our DAG development by running separate development and live Composer instances.
Currently (April 2019) you can’t turn a Composer environment off and on again; you can only create or destroy it. To minimise costs and overhead, we have our production Composer instance spin a development Composer instance up and down every weekday, between 9:45am and 6:15pm.
Here’s what our spin up DAG looks like:
import datetime as dt
from airflow import DAG
from airflow.operators.bash_operator import BashOperator
OWNER = 'jlaird'
DAG_NAME = 'composer-dev-environment-setup'
default_args = {
    'owner': OWNER,
    'depends_on_past': False,
    'start_date': dt.datetime(2019, 1, 10),
    'retries': 3,
    'retry_delay': dt.timedelta(minutes=5),
}
dag = DAG(DAG_NAME,
          default_args=default_args,
          description='Spin up the development Composer environment',
          schedule_interval='45 9 * * 1-5',
          catchup=False)
# BashOperators are run from a temporary location so a cd is necessary to ensure relative paths in the Bash scripts aren't broken
bash_command = """
cd ~/gcs/dags/scripts/composer
chmod u+x composer_setup.sh
./composer_setup.sh dev
"""
# Note that there is a space at the end of the above command after dev due to: https://airflow.apache.org/howto/operator.html#jinja-template-not-found
BashOperator(
    task_id='composer_setup',
    dag=dag,
    bash_command=bash_command,
)
Installing the PyPI packages takes a while, so we spin up at 9:45 to have the environment ready shortly after 10.
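If you need a separate process to block until the environment is actually usable, a polling sketch along these lines could work (wait_for_composer is a hypothetical helper, not part of our scripts):

```shell
# Hypothetical helper: poll the Composer API once a minute until the
# environment reports a RUNNING state.
wait_for_composer() {
    local name=$1 location=$2 state=""
    until [ "${state}" = "RUNNING" ]; do
        sleep 60
        state=$(gcloud composer environments describe "${name}" \
            --location "${location}" --format="value(state)")
    done
}
# Usage: wait_for_composer dev europe-west1
```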
The bash file composer_setup.sh referenced looks like:
#!/usr/bin/env bash
set -exo pipefail
source composer_settings.sh ${1}
set -u
# beta must be used to provide --airflow-version
gcloud beta composer environments create ${COMPOSER_NAME} \
    --location=${COMPOSER_LOCATION} \
    --airflow-configs=core-dags_are_paused_at_creation=True \
    --image-version=composer-1.5.0-airflow-1.10.1 \
    --disk-size=20GB \
    --python-version=2 \
    --node-count=3 \
    --labels env=${ENVIRONMENT}
COMPOSER_GCS_BUCKET_DATA_FOLDER=$(gcloud composer environments describe ${COMPOSER_NAME} --location ${COMPOSER_LOCATION} | grep 'dagGcsPrefix' | grep -Eo "\S+/")data
echo "Data folder is ${COMPOSER_GCS_BUCKET_DATA_FOLDER}"
# Copy environment's variables file and service account credentials from our analytics GCS bucket
gsutil cp ${ENV_VARIABLES_JSON_GCS_LOCATION} ${COMPOSER_GCS_BUCKET_DATA_FOLDER}
gsutil cp ${CREDENTIALS_JSON_LOCATION} ${COMPOSER_GCS_BUCKET_DATA_FOLDER}
echo "Importing environment variables from ${COMPOSER_INSTANCE_DATA_FOLDER}/${ENV_VARIABLES_JSON_NAME}..."
# Import environment's variables file
gcloud composer environments run ${COMPOSER_NAME} \
    --location ${COMPOSER_LOCATION} variables -- \
    -i ${COMPOSER_INSTANCE_DATA_FOLDER}/${ENV_VARIABLES_JSON_NAME}
echo "Importing pypi packages from ./pypi_packages..."
# Install PyPi packages from file
gcloud composer environments update ${COMPOSER_NAME} \
    --location ${COMPOSER_LOCATION} \
    --update-pypi-packages-from-file=pypi_packages
echo "Setting up bigquery_gdrive connection"
gcloud composer environments run ${COMPOSER_NAME} \
    --location ${COMPOSER_LOCATION} connections -- --add \
    --conn_id=bigquery_gdrive --conn_type=google_cloud_platform \
    --conn_extra='{"extra__google_cloud_platform__project": "our-project",
"extra__google_cloud_platform__key_path": "/home/airflow/gcs/data/service-account.json",
"extra__google_cloud_platform__scope": "https://www.googleapis.com/auth/bigquery,https://www.googleapis.com/auth/drive,https://www.googleapis.com/auth/cloud-platform"}'
echo "Done"
This imports the PyPI packages and copies the environment variables over from our GCS bucket. We also modify the Airflow config with core-dags_are_paused_at_creation=True so that DAGs must be turned on explicitly, which stops them from backfilling automatically.
We also create a new connection, bigquery_gdrive, because we want Composer to be able to interact with CSVs hosted on Google Drive.
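As a sanity check (not part of our setup script), you could list the environment's Airflow connections afterwards to confirm bigquery_gdrive exists; list_connections here is a hypothetical wrapper:

```shell
# Hypothetical wrapper: run `airflow connections -- --list` inside the
# given Composer environment to confirm the new connection is present.
list_connections() {
    local name=$1 location=$2
    gcloud composer environments run "${name}" \
        --location "${location}" connections -- --list
}
# Usage: list_connections dev europe-west1
```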
The composer_settings.sh referenced:
#!/usr/bin/env bash
ENVIRONMENT=$1
case ${ENVIRONMENT} in
    dev|live1)
        COMPOSER_NAME=${ENVIRONMENT}
        ;;
    *)
        echo "usage: ./composer_setup.sh {dev|live1}" 1>&2
        exit 99
        ;;
esac
COMPOSER_INSTANCE_DATA_FOLDER=/home/airflow/gcs/data
COMPOSER_LOCATION=europe-west1
echo "Operating on environment ${COMPOSER_NAME}" 1>&2
ENV_VARIABLES_JSON_NAME=airflow_env_variables_${ENVIRONMENT}.json
ENV_VARIABLES_JSON_GCS_LOCATION=gs://analytics/credentials/${ENV_VARIABLES_JSON_NAME}
CREDENTIALS_JSON_NAME=service-account.json
CREDENTIALS_JSON_LOCATION=gs://analytics/credentials/${CREDENTIALS_JSON_NAME}
When we want to keep the development environment on overnight for scheduling purposes, we can simply switch the spin up/tear down DAGs off. When we switch them back on, we don’t want them to run once for each day missed; setting catchup=False in both DAGs prevents this.
Note: catchup=False started working again in Composer version composer-1.8.2-airflow-1.10.3
Our teardown DAG:
import datetime as dt
from airflow import DAG
from airflow.operators.bash_operator import BashOperator
OWNER = 'jlaird'
DAG_NAME = 'composer-dev-environment-teardown'
default_args = {
    'owner': OWNER,
    'depends_on_past': False,
    'start_date': dt.datetime(2019, 1, 10),
    'retries': 3,
    'retry_delay': dt.timedelta(minutes=5),
}
dag = DAG(DAG_NAME,
          default_args=default_args,
          description='Tear down the development Composer environment',
          schedule_interval='15 18 * * 1-5',
          catchup=False)
# BashOperators are run from a temporary location so a cd is necessary to ensure relative paths in the Bash scripts aren't broken
bash_command = """
cd ~/gcs/dags/scripts/composer
chmod u+x composer_teardown.sh
./composer_teardown.sh dev
"""
# Note that there is a space at the end of the above command due to: https://airflow.apache.org/howto/operator.html#jinja-template-not-found
BashOperator(
    task_id='composer_teardown',
    dag=dag,
    bash_command=bash_command,
)
composer_teardown.sh:
#!/usr/bin/env bash
set -exo pipefail
source composer_settings.sh ${1}
set -u
COMPOSER_GCS_BUCKET=$(gcloud composer environments describe ${COMPOSER_NAME} --location ${COMPOSER_LOCATION} | grep 'dagGcsPrefix' | grep -Eo "\S+/")
COMPOSER_GCS_BUCKET_DATA_FOLDER=${COMPOSER_GCS_BUCKET}data
# Export variables file from the Composer instance
gcloud composer environments run ${COMPOSER_NAME} \
    --location ${COMPOSER_LOCATION} variables -- \
    -e ${COMPOSER_INSTANCE_DATA_FOLDER}/${ENV_VARIABLES_JSON_NAME}
# Overwrite saved environment's variables file in analytics GCS bucket
gsutil cp ${COMPOSER_GCS_BUCKET_DATA_FOLDER}/${ENV_VARIABLES_JSON_NAME} ${ENV_VARIABLES_JSON_GCS_LOCATION}
gcloud composer environments delete ${COMPOSER_NAME} \
    --location=${COMPOSER_LOCATION} \
    --quiet
# Remove GCS bucket as it doesn't get cleaned up when the Composer instance gets deleted
gsutil -m rm -r ${COMPOSER_GCS_BUCKET}
This exports our variables to the same GCS bucket as above. We decided to export the variables from the dev environment when spinning it down, but some may prefer the dev variables to be ephemeral.
Bear in mind that because the Composer instance is destroyed, you won’t be able to recover the logs via the Airflow UI. However, Composer has built-in Stackdriver integration, so you will have logs stored for up to 30 days afterwards!
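If you do need logs after a teardown, something like this sketch could pull them out of Stackdriver (the filter fields are standard Stackdriver labels for Composer, but treat the exact query as an assumption):

```shell
# Hypothetical helper: read recent log entries for a (possibly deleted)
# Composer environment from Stackdriver Logging.
read_composer_logs() {
    local env_name=$1
    gcloud logging read \
        "resource.type=cloud_composer_environment AND resource.labels.environment_name=${env_name}" \
        --limit=20 --format="value(textPayload)"
}
# Usage: read_composer_logs dev
```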
Before we had this development environment, our Cloud Composer costs were ~£250/month. With this approach, our monthly Cloud Composer costs only increase by ~£60. That makes sense: a development environment that is up for ~25% of the week adds ~25% to the monthly cost.
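The arithmetic checks out as a back-of-the-envelope calculation (numbers taken from this post):

```shell
# Dev environment uptime: 9:45am-6:15pm is 8.5 hours, 5 weekdays a week,
# out of the 168 hours a full-time environment would run.
uptime_fraction=$(awk 'BEGIN { printf "%.2f", (8.5 * 5) / 168 }')
echo "Up for ${uptime_fraction} of the week"
# Expected extra cost at ~£250/month for a full-time environment:
extra_cost=$(awk 'BEGIN { printf "%.0f", 250 * (8.5 * 5) / 168 }')
echo "Expected extra cost: ~£${extra_cost}/month"
```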
By spinning the environment up and down, that’s ~£190/month saved that the business can invest in Bitcoin instead! Bargain.