Using Cloud Workflows to load Cloud Storage files into BigQuery

Márton Kodok
Google Cloud - Community
6 min read · Nov 30, 2020


Loading Data Into BigQuery From Cloud Storage by using Cloud Workflows.

In this article, we will orchestrate and automate Google Cloud with serverless workflows.

We will create a Cloud Workflow to load data from Cloud Storage into BigQuery. This is a complete guide on how to work with workflows: connecting to any Google Cloud API, working with subworkflows and arrays, extracting segments, and calling BigQuery load jobs.

There are various ways to load Cloud Storage files into BigQuery: a Cloud Function, Eventarc triggers to Cloud Run services, the relatively new BigQuery CREATE EXTERNAL TABLE statement, or the good old way via the bq CLI tool.

These approaches require you to keep a function, a container, a library, or an SDK up to date, which means ongoing maintenance.

We are going to use Cloud Workflows to connect the Cloud Storage API with the BigQuery Jobs API to load files into tables. With the techniques we’ll cover here, you will have a foundation to build any kind of serverless automation in Cloud Workflows, in YAML syntax, without that maintenance burden.
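To make the idea concrete, here is a minimal sketch of what such a workflow could look like. It is not the final workflow from this guide: the runtime arguments (`bucket`, `prefix`, `dataset`, `table`) are hypothetical names chosen for illustration, and it assumes the listing is non-empty, the file is a CSV, and the workflow's service account can read the bucket and write to the dataset.

```yaml
# Sketch only: argument names (bucket, prefix, dataset, table) are hypothetical.
main:
  params: [args]
  steps:
    # Step 1: list objects in the bucket through the Cloud Storage JSON API.
    - listObjects:
        call: http.get
        args:
          url: '${"https://storage.googleapis.com/storage/v1/b/" + args.bucket + "/o"}'
          query:
            prefix: ${args.prefix}
          auth:
            type: OAuth2
        result: listing
    # Step 2: submit a BigQuery load job for the first listed object.
    # Assumes the listing is non-empty and the file is CSV.
    - submitLoadJob:
        call: http.post
        args:
          url: '${"https://bigquery.googleapis.com/bigquery/v2/projects/" + sys.get_env("GOOGLE_CLOUD_PROJECT_ID") + "/jobs"}'
          auth:
            type: OAuth2
          body:
            configuration:
              load:
                sourceUris:
                  - '${"gs://" + args.bucket + "/" + listing.body.items[0].name}'
                sourceFormat: CSV
                autodetect: true
                writeDisposition: WRITE_APPEND
                destinationTable:
                  projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
                  datasetId: ${args.dataset}
                  tableId: ${args.table}
        result: loadJob
    # Step 3: return the job ID; the load job itself runs asynchronously.
    - returnJobId:
        return: ${loadJob.body.jobReference.jobId}
```

Note that the sketch only submits the load job and returns its ID. It does not wait for the job to finish, and it loads a single file rather than iterating over the whole listing.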

Note: To get started with Cloud Workflows, check out my introductory presentation: Serverless orchestration with Cloud Workflows.

Cloud Workflows


Márton Kodok

Speaker at conferences, Google Developer Expert, top user on Stack Overflow, software architect at REEA.net, co-founder of IT Mures, lifelong learner, mentor