Execute Cloud Build and Cloud Run with GitHub Actions

Sasakky
May 8, 2022

What’s this

I set up GitHub Actions to automatically deploy to GCP the data pipeline I built in "Building a Robust Data Pipeline with Great Expectations, dbt and airflow". This post summarizes what I researched and the code I wrote along the way.

Assumption

Normally you would deploy Airflow with Cloud Composer or on GKE, but both keep servers running all the time, which is costly. Since this is only for my own practice, I deliberately deploy everything in a single container.

Requirements

  • A push to the main branch triggers GitHub Actions, which automatically deploys the container to GCP.
  • Once deployed, Airflow schedules tasks that periodically validate and load data.
  • The container is built and deployed with Cloud Build and Cloud Run from the Dockerfile in the repository.
  • The service account keys needed for deployment, and for Airflow to access GCP resources, are stored in GitHub Actions secrets and passed in from there.

Directory Structure

.
├── prod # files for deploy
│   ├── Dockerfile.prod
│   ├── airflow.conf
│   └── cloudbuild.yaml
└── scr # source code
    ├── dag
    ├── dbt
    └── great_expectations

Develop

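Add the following to Dockerfile.prod:

# Base64-encoded service account key, passed in with --build-arg
ARG GCP_AIRFLOW_SA_KEY
# Decode the key into a JSON file for the Airflow connection's Key JSON Path
RUN echo -n "${GCP_AIRFLOW_SA_KEY}" | base64 --decode > /tmp/gcp_secret.json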

cloudbuild.yaml:

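steps:
  # Build the image, passing the base64-encoded key as a build arg
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/***/prod_data_pipeline', '--build-arg', 'GCP_AIRFLOW_SA_KEY=${_GCP_AIRFLOW_SA_KEY}', '-f', './prod/Dockerfile.prod', '.']
# Push the built image so that Cloud Run can pull it
images: ['gcr.io/***/prod_data_pipeline']
substitutions:
  _GCP_AIRFLOW_SA_KEY: foobar # placeholder, overridden by --substitutions at submit time
options:
  substitution_option: 'ALLOW_LOOSE'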

ci.yaml:

on:
  push:
    branches:
      - main
name: Build and Deploy a Container
env:
  PROJECT_ID: ${{ secrets.GCP_PROJECT }}
  GCP_SA_KEY: ${{ secrets.GCP_SA_KEY }}
  GCP_AIRFLOW_SA_KEY: ${{ secrets.GCP_AIRFLOW_SA_KEY }}
  SERVICE: datapipeline
  REGION: us-central1
  IMAGE: gcr.io/${{ secrets.GCP_PROJECT }}/prod_data_pipeline
  PORT: 8080
  MEMORY: 2G
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2
      - name: Setup Cloud SDK
        uses: google-github-actions/setup-gcloud@v0.2.0
        with:
          project_id: ${{ env.PROJECT_ID }}
          service_account_key: ${{ secrets.GCP_SA_KEY }}
          export_default_credentials: true

      - name: Configure Docker
        run: |
          gcloud auth configure-docker
      - name: Deploy to Cloud Build
        run: |
          gcloud builds submit --config ./prod/cloudbuild.yaml --substitutions=_GCP_AIRFLOW_SA_KEY=${{ env.GCP_AIRFLOW_SA_KEY }}
      - name: Deploy to Cloud Run
        run: |
          gcloud run deploy ${{ env.SERVICE }} --project ${{ env.PROJECT_ID }} --region ${{ env.REGION }} --image ${{ env.IMAGE }} --port ${{ env.PORT }} --memory ${{ env.MEMORY }}

The tricky part of the configuration was the service account key. The Airflow connection needs a Key JSON Path, and I was not sure how to get the contents of the key JSON stored in GitHub Actions secrets into the Dockerfile, so I used the following flow:
(1) Register the base64-encoded key string in the secrets.
(2) Inject (1) as a substitution when ci.yaml submits cloudbuild.yaml.
(3) Pass it to the Dockerfile as --build-arg in cloudbuild.yaml.
(4) Decode the received arg in the Dockerfile and write it out as a JSON file in the image.
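
The encode and decode ends of this flow can be checked locally before registering anything. The following is only a minimal sketch of steps (1) and (4): the key content is a dummy stand-in, and the file names are placeholders I made up, not the real key or paths from my repository.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a throwaway directory; all file names here are hypothetical.
workdir="$(mktemp -d)"
trap 'rm -rf "$workdir"' EXIT

# Stand-in for the real service account key JSON (dummy content).
printf '%s\n' '{"type": "service_account", "project_id": "dummy-project"}' > "$workdir/key.json"

# (1) Encode to a single line; this string is what would go into the secret.
encoded="$(base64 < "$workdir/key.json" | tr -d '\n')"

# (4) The same kind of decode that the Dockerfile runs on the build arg.
printf '%s' "$encoded" | base64 --decode > "$workdir/gcp_secret.json"

# The decoded file should be byte-for-byte identical to the original key.
if cmp -s "$workdir/key.json" "$workdir/gcp_secret.json"; then
  echo "round trip OK"
fi
```

Stripping the newlines matters because some base64 implementations wrap long output, and a multi-line value is awkward to pass through a substitution and a build arg.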

Execution

The deployment succeeded. One thing I got stuck on along the way: if IAM for the gcloud service account is not set up properly, as described in the article here, the GitHub Actions job fails even though the build itself succeeds on Cloud Build.
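
For reference, granting roles to the deploy service account might look like the sketch below. The role list is my assumption, not taken from the article mentioned above, and the project ID and service account email are hypothetical placeholders. Some of these roles are broad, so narrow them for anything beyond practice. The script only prints the commands unless APPLY=1 is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical values; replace with your own project and service account.
PROJECT_ID="${PROJECT_ID:-my-project}"
SA_EMAIL="${SA_EMAIL:-github-deployer@my-project.iam.gserviceaccount.com}"

# Assumed roles for submitting builds, reading build logs,
# and deploying to Cloud Run. Adjust to your own setup.
roles=(
  roles/cloudbuild.builds.editor
  roles/storage.admin
  roles/viewer
  roles/run.admin
  roles/iam.serviceAccountUser
)

# Print the command by default; execute it only when APPLY=1.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    echo "$@"
  fi
}

for role in "${roles[@]}"; do
  run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member="serviceAccount:$SA_EMAIL" --role="$role"
done
```

Running it without APPLY set is a safe way to review the bindings before applying them.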

I checked how Airflow behaves on Cloud Run: the webserver started up and the job ran successfully.

Sasakky

Data Engineer, Data Architect and Data Analyst in D2C Startup in Tokyo