Calling Cloud Composer to Cloud Functions and back again, securely

salmaan rashid
Jun 10 · 5 min read

Sample Cloud Composer (Apache Airflow) configuration to securely invoke Cloud Functions or Cloud Run.

In addition, this sample shows the inverse: how Cloud Functions can invoke a Composer DAG securely. While GCF->Composer is documented here, the configuration detailed in this article is minimal and (to me) easier to read.

Setup

1. Create Composer Environment

export GOOGLE_PROJECT_ID=`gcloud config get-value core/project`
export PROJECT_NUMBER=`gcloud projects describe $GOOGLE_PROJECT_ID --format='value(projectNumber)'`
gcloud composer environments create composer1 --location us-central1
gcloud composer environments list --locations us-central1
┌───────────┬─────────────┬─────────┬──────────────────────────┐
│ NAME │ LOCATION │ STATE │ CREATE_TIME │
├───────────┼─────────────┼─────────┼──────────────────────────┤
│ composer1 │ us-central1 │ RUNNING │ 2019-05-21T20:35:21.960Z │
└───────────┴─────────────┴─────────┴──────────────────────────┘

2. Add Python Packages and GCF Connection URL

The following steps set up the Airflow connection we will use internally. The commands below define the base URL of a GCF function we will deploy later.

  • Configure requirements.txt
gcloud composer environments update composer1  \
--update-pypi-packages-from-file requirements.txt --location us-central1
  • Configure connection
gcloud composer environments update composer1  \
--update-env-variables=AIRFLOW_CONN_MY_GCF_CONN=https://us-central1-$GOOGLE_PROJECT_ID.cloudfunctions.net --location us-central1

Note: each of these commands takes ~10mins; go grab a coffee.

  • Verify configurations via cli and on the Cloud Console for Composer
gcloud composer environments describe composer1 --location us-central1

The following lists the default GCS bucket the environment uses for its configuration and DAG storage:

gcloud composer environments describe composer1 --location us-central1 --format="get(config.dagGcsPrefix)"

and also see the GCP Console:

  • Config:
  • Env
  • Python Packages
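As a small sketch of what the AIRFLOW_CONN_MY_GCF_CONN variable from this step provides: Airflow resolves a connection ID by looking for an environment variable named AIRFLOW_CONN_ followed by the upper-cased ID, and reads its value as a connection URI. The helper names (conn_env_var, split_conn_uri) and the project ID my-project are illustrative assumptions; Airflow does its own parsing internally.

```python
import os
from urllib.parse import urlparse


def conn_env_var(conn_id):
    """Name of the environment variable Airflow checks for a connection ID."""
    return "AIRFLOW_CONN_" + conn_id.upper()


def split_conn_uri(uri):
    """Split a connection URI into its scheme and host."""
    parsed = urlparse(uri)
    return {"scheme": parsed.scheme, "host": parsed.hostname}


# 'my-project' is a placeholder project ID, not a value from this article.
os.environ[conn_env_var("my_gcf_conn")] = (
    "https://us-central1-my-project.cloudfunctions.net"
)
print(split_conn_uri(os.environ["AIRFLOW_CONN_MY_GCF_CONN"]))
```

In a DAG, the connection would then be referenced by its lower-case ID, my_gcf_conn.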

3. Identify the client_id used by IAP

Cloud Composer is shielded by Cloud Identity-Aware Proxy (IAP). The following commands identify the OAuth2 client_id it uses, which we will need later to trigger DAGs externally from GCF. For reference, see triggering with gcf.

  • Get the Airflow URL:

If you are an Editor on the project running Airflow, you should already have the rights needed to invoke the endpoint:

(the following command uses the jq CLI to parse JSON)

$ curl -s -H "Authorization: Bearer `gcloud auth print-access-token`" https://composer.googleapis.com/v1beta1/projects/$GOOGLE_PROJECT_ID/locations/us-central1/environments/composer1 | jq '[.config.airflowUri]'

In my case, the URL for Airflow is:

[
"https://r1d366b885bb81b73-tp.appspot.com"
]
  • Use the URL to extract the client ID

Attempt to make an unauthenticated call to the URL. You should see an error but within the curl output you will find the elusive client_id:

curl -v https://r1d366b885bb81b73-tp.appspot.com

e.g., in my case the command above showed:

location: https://accounts.google.com/o/oauth2/v2/auth?client_id=491562778408-sj8hb4035bp7ui918ra0i9qbhbqnejk1.apps.googleusercontent.com&...

which means the client_id is 491562778408-sj8hb4035bp7ui918ra0i9qbhbqnejk1.apps.googleusercontent.com
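If you would rather not pick the value out of the curl output by eye, the same extraction can be sketched in a few lines of Python. This assumes you have already copied the location header into a string; the function name extract_client_id and the trailing response_type parameter are illustrative placeholders, since the real redirect is elided in the output above.

```python
from urllib.parse import urlparse, parse_qs


def extract_client_id(location):
    """Return the client_id query parameter from an IAP redirect URL, or None."""
    query = parse_qs(urlparse(location).query)
    values = query.get("client_id")
    return values[0] if values else None


# Shortened form of the redirect shown above; the trailing parameter is a placeholder.
location = (
    "https://accounts.google.com/o/oauth2/v2/auth"
    "?client_id=491562778408-sj8hb4035bp7ui918ra0i9qbhbqnejk1.apps.googleusercontent.com"
    "&response_type=code"
)
print(extract_client_id(location))
```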

Note the client_id and composer_url:

target_audience = `491562778408-sj8hb4035bp7ui918ra0i9qbhbqnejk1.apps.googleusercontent.com`
url = `https://r1d366b885bb81b73-tp.appspot.com`

4. Deploy DAGs

  • Deploy the DAG that sends authenticated calls to GCF:

Edit to_gcf.py and replace $GOOGLE_PROJECT_ID in the following line with your project ID:

target_audience = 'https://us-central1-$GOOGLE_PROJECT_ID.cloudfunctions.net/echo_app_python'

then

gcloud composer environments storage dags import \
--environment composer1 --location us-central1 \
--source to_gcf.py
  • Deploy the DAG that receives authenticated calls from GCF:
gcloud composer environments storage dags import \
--environment composer1 --location us-central1 \
--source from_gcf.py
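The contents of to_gcf.py are not reproduced in this article. As a rough, stdlib-only sketch of the idea (not the author's DAG code), a task running on a Composer worker could ask the metadata server for an ID token whose audience is the function URL, then send that token as a bearer token. The function names are hypothetical, and the sketch assumes the worker can reach the metadata server.

```python
import urllib.request
from urllib.parse import urlencode

# Identity endpoint of the Compute Engine metadata server.
METADATA_IDENTITY_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/identity"
)


def identity_token_request(target_audience):
    """Build the metadata-server request for an ID token with this audience."""
    url = METADATA_IDENTITY_URL + "?" + urlencode({"audience": target_audience})
    return urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})


def call_function(target_audience, timeout=10):
    """Fetch an ID token and call the function with it (only works on GCP)."""
    token_req = identity_token_request(target_audience)
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode("utf-8")
    # Here the audience is also the URL being called, as in to_gcf.py's setting.
    req = urllib.request.Request(
        target_audience, headers={"Authorization": "Bearer " + token}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Inside a DAG, something like call_function could be wrapped in a PythonOperator, with target_audience set to the function URL edited above.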

5. Deploy GCF

Edit main.py and update target_audience and url with the values from step 3:

in my case:

target_audience = '491562778408-sj8hb4035bp7ui918ra0i9qbhbqnejk1.apps.googleusercontent.com'

url = 'https://r1d366b885bb81b73-tp.appspot.com'

then deploy

gcloud functions deploy  echo_app_python --region=us-central1

6. Set IAM Permissions

Now set IAM permissions in each direction.

Allow Composer to call GCF

When we set up Composer, we did not specify the service account it should run as. By default, it uses the Compute Engine default service account, which has the form:

$PROJECT_NUMBER-compute@developer.gserviceaccount.com

then apply:

gcloud alpha functions add-iam-policy-binding echo_app_python  \
--member serviceAccount:$PROJECT_NUMBER-compute@developer.gserviceaccount.com \
--role roles/cloudfunctions.invoker

Allow GCF to call Composer

During our setup of Cloud Functions, we did not specify a service account. By default GCF will use an account in the form:

$GOOGLE_PROJECT_ID@appspot.gserviceaccount.com

so using that, go to the Cloud Console's IAM page and, for that account, add the Composer User IAM role.

GCF invokes a DAG directly using Airflow's experimental REST endpoint.
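To make the shape of that call concrete, here is a minimal sketch of the request GCF would send, assuming the Airflow 1.x experimental API path /api/experimental/dags/<dag_id>/dag_runs. It is not the author's main.py. The function name dag_trigger_request is hypothetical, and id_token is assumed to be an OIDC token whose audience is the IAP client_id from step 3.

```python
import json
import urllib.request


def dag_trigger_request(airflow_url, dag_id, id_token, conf=""):
    """Build a POST to the experimental endpoint that creates a DAG run."""
    endpoint = "{}/api/experimental/dags/{}/dag_runs".format(
        airflow_url.rstrip("/"), dag_id
    )
    # Same payload shape as the curl example in the appendix: {"conf": ""}.
    body = json.dumps({"conf": conf}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + id_token,
        },
    )
```

Passing the result to urllib.request.urlopen sends the request; with a valid token and the Composer User role in place, Airflow should answer with a "Created <DagRun ...>" message like the one shown in the appendix.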

7. Invoke DAG directly

The callgcf DAG is set to run every 30 minutes by default. However, you can also invoke it directly via the UI or CLI.

On the console, you should see invocation back and forth:

  • callgcf:
  • fromgcf:

Appendix

The following snippets detail how to invoke the DAG directly using a service account JSON file.

Note: you must first grant that service account the Composer User IAM role.

  • service_account_dag.py
  • curl
$ curl -X POST -d '{"conf":""}' -H "content-type: application/json" -H "Authorization: Bearer $ID_TOKEN"  
{
"message": "Created <DagRun callgcf @ 2019-05-22 09:11:46: manual__2019-05-22T09:11:46, externally triggered: True>"
}

Conclusion

So, what have we done? Well, a rudimentary integration that uses authentication to call Cloud Functions. Not only can you call Cloud Functions and back again, but you can do so with authentication! You can also invoke other GCP services that support OAuth2 or OIDC tokens. For more information, see:

Oauth2:

OIDC:
