GCS to BigQuery via Workflows
5 min read · Nov 27, 2023
Hi 👋
As part of the GCS to BigQuery Pipeline via Different GCP Services project, we will use Workflows as the orchestration tool to load data from an Excel file dropped in a GCS bucket into BigQuery.
The pipeline is very similar to the Airflow pipeline we developed earlier, except that Workflows replaces Airflow as the orchestrator. So, let's get started.
ETL Flow
Folder Structure
Below is how my repo is structured:
📦gcs-to-bigquery-via-workflows
┣ 📂.github
┃ ┗ 📂workflows
┃ ┃ ┗ 📜deploy_workflow.yml
┣ 📂infra
┃ ┣ 📜main.tf
┃ ┣ 📜providers.tf
┃ ┗ 📜variables.tf
┣ 📂src
┃ ┗ 📜gcs_to_bigquery.yaml
┗ 📜README.md
Source Code
A workflow is a series of steps described using the Workflows syntax, which can be written in either YAML or JSON. For this post, we will use YAML. So, let's walk through it.
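For orientation, here is a minimal sketch of the general shape of a workflow definition in YAML. It is not part of this project's workflow; the step names `say_hello` and `done` and the log message are made up for illustration.

```yaml
# Hypothetical minimal workflow, for illustration only.
main:
  steps:
    # Each step has a unique name and does one thing, here a call to
    # the standard-library function sys.log.
    - say_hello:
        call: sys.log
        args:
          text: "Hello from Workflows"
          severity: INFO
    # The final step returns a value as the result of the execution.
    - done:
        return: "ok"
```

Steps run top to bottom unless a step redirects the flow, so the order in the file is the order of execution.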
Step 1: Read Log from Cloud Logging
main:
  params: [event]
  steps:
    - log_event:
        call: sys.log
        args:
          text: $${event}
          severity: INFO
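If the event needs to be searchable by field in Cloud Logging, one possible variant is to pass it through the `json` argument of `sys.log` instead of `text`. This is only a sketch, and it assumes `event` is a map (as it would be for an event delivered by a trigger); it is not the step used in this project. The expression is written with a single `$` here; if the file is rendered through a Terraform template, as the `$${event}` in the listing above suggests, it would need the same `$$` escape.

```yaml
# Hypothetical variant of the logging step, for illustration only.
main:
  params: [event]
  steps:
    - log_event_structured:
        call: sys.log
        args:
          # Assumes `event` is a map; it is written as a structured payload
          # rather than as a text string.
          json: ${event}
          severity: INFO
```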