How-to: Dataform in Google Cloud — part 1

Adrian Trzeciak 🇳🇴
Google Cloud - Community
5 min read · Oct 20, 2022

I had a chance to use Dataform (the open source CLI version) about a year ago while building an analytics solution for a retailer. At the time, I ended up wrapping the most important functionality in a REST API written in Go and used Cloud Scheduler to invoke the necessary transformations based on tags. Every part of that solution was deployed on GCP with Terraform. Well, Google has chosen almost the same approach, with Cloud Workflows as the wrapper, invoked by Cloud Scheduler.
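
To make that kind of setup a bit more concrete, here is a minimal sketch of a wrapper around the CLI. It is not the code from that project (which was written in Go); it is a Python illustration only. The project path, the /run endpoint and the tag query parameter are all made up for the example, and it assumes the open source dataform CLI is installed and on PATH.

```python
import re
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Hypothetical location of a checked-out Dataform project.
PROJECT_DIR = "/srv/dataform-project"

# Conservative tag format so request input never reaches the CLI unchecked.
VALID_TAG = re.compile(r"[A-Za-z0-9_-]+")


def build_run_command(project_dir, tag):
    """Return the argv for `dataform run <project_dir> --tags <tag>`."""
    if not VALID_TAG.fullmatch(tag):
        raise ValueError(f"invalid tag: {tag!r}")
    return ["dataform", "run", project_dir, "--tags", tag]


class RunHandler(BaseHTTPRequestHandler):
    """Handles POST /run?tag=<tag>, e.g. sent by a Cloud Scheduler HTTP job."""

    def do_POST(self):
        url = urlparse(self.path)
        if url.path != "/run":
            self._reply(404, b"not found\n")
            return
        tag = parse_qs(url.query).get("tag", [""])[0]
        try:
            cmd = build_run_command(PROJECT_DIR, tag)
        except ValueError as err:
            self._reply(400, f"{err}\n".encode())
            return
        # Assumption: the `dataform` binary is installed and on PATH.
        result = subprocess.run(cmd, capture_output=True)
        status = 200 if result.returncode == 0 else 500
        self._reply(status, result.stdout + result.stderr)

    def _reply(self, status, body):
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Start the wrapper; blocks until interrupted."""
    HTTPServer(("", port), RunHandler).serve_forever()
```

Calling serve() starts the wrapper, and a scheduled POST to /run?tag=daily would then run only the actions tagged daily. Validating the tag before building the command keeps arbitrary request input away from the command line.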

It seems like Google quietly released a preview version of Dataform during Next ’22. Prior to that, Dataform was available only as an open source CLI or, if you were early to the party, as a web application with beautiful dependency graphs and such.

Why Dataform?

Why should you consider using Dataform? The price of cloud storage has dropped significantly over the years, making raw data affordable to keep. That has driven a shift from good old ETL (where the transformation happens before the load) to the newer ELT (where transformations are applied after the data has been loaded). Data transformations play a critical role in many data-driven organizations. Dataform provides you with the following functionality, which I suggest you take a look at:

  • Everything from table definitions, views to…
