GCS to BigQuery via Cloud Function
Nov 9, 2023
Namaste 🙏
As part of our GCS to BigQuery Pipeline via Different GCP Services project, we will use a Cloud Function to process an Excel file and export the result to BigQuery.
Technologies Used
- GCP Services
  - BigQuery
  - Cloud Functions
  - Cloud Storage
  - Workload Identity Federation - GitHub Actions
- Python
- Terraform
Folder Structure
Below is how my repo is structured:
📦gcs-to-bigquery-via-cloud-function
┣ 📂.github
┃ ┗ 📂workflows
┃ ┃ ┗ 📜deploy-cloud-function.yml
┣ 📂infra
┃ ┣ 📜main.tf
┃ ┣ 📜providers.tf
┃ ┗ 📜variables.tf
┣ 📂src
┃ ┣ 📜config.yaml
┃ ┣ 📜helpers.py
┃ ┣ 📜main.py
┃ ┗ 📜requirements.txt
┗ 📜README.md
Source Code
In this section, we write the Python code for the Cloud Function.
requirements.txt
The function's dependencies, pinned to exact versions:
functions-framework==3.2.0
google-cloud-storage==2.12.0
openpyxl==3.1.2
pandas==2.1.2
pandas-gbq==0.19.2
PyYAML==6.0.1
config.yaml
# Project Config
PROJECT_ID: "gcp-practice-project-aman"
REGION: "us-central1"
# BigQuery Config
BQ_DATASET: "raw_layer"
BQ_TABLE: "xlxs_to_csv_pipeline"
# Other Config
JOB_SOURCE: "Cloud Function"
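As a rough illustration of how a config like this might be consumed, here is a minimal sketch that parses the YAML with PyYAML (already in requirements.txt), checks that the expected keys are present, and builds a `dataset.table` string of the kind pandas-gbq accepts as a destination table. The names `parse_config`, `load_config`, `destination_table`, `REQUIRED_KEYS` and `SAMPLE` are hypothetical and are not taken from the repo's helpers.py or main.py.

```python
import yaml

# Keys the sketch expects, taken from the config.yaml shown above.
REQUIRED_KEYS = ("PROJECT_ID", "REGION", "BQ_DATASET", "BQ_TABLE", "JOB_SOURCE")


def parse_config(text: str) -> dict:
    """Parse YAML text and fail fast if an expected key is missing."""
    config = yaml.safe_load(text) or {}
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"config is missing keys: {missing}")
    return config


def load_config(path: str = "config.yaml") -> dict:
    """Read the config file from disk (hypothetical helper; path is an assumption)."""
    with open(path, encoding="utf-8") as handle:
        return parse_config(handle.read())


def destination_table(config: dict) -> str:
    """Build the 'dataset.table' string from the BigQuery settings."""
    return f"{config['BQ_DATASET']}.{config['BQ_TABLE']}"


# Same values as the config.yaml above, inlined so the sketch runs on its own.
SAMPLE = """
# Project Config
PROJECT_ID: "gcp-practice-project-aman"
REGION: "us-central1"
# BigQuery Config
BQ_DATASET: "raw_layer"
BQ_TABLE: "xlxs_to_csv_pipeline"
# Other Config
JOB_SOURCE: "Cloud Function"
"""

if __name__ == "__main__":
    cfg = parse_config(SAMPLE)
    print(cfg["PROJECT_ID"], destination_table(cfg))
```

Inside the function, `load_config()` would read the file deployed alongside main.py. Validating keys up front means a missing setting surfaces at startup instead of halfway through a load.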
- PROJECT_ID: GCP…