Decoupling Dataflow with Cloud Tasks and Cloud Functions

Charles Verleyen
Google Cloud - Community
5 min read · Jun 17, 2020


Are you developing data pipelines on Google Cloud and sometimes struggling to choose the right product? Do you feel that a combination of several products could suit your solution, but don’t know which one to use in which scenario? If your answer to these questions is yes, read on. In this post, I’ll explain how you can decouple Google Cloud Dataflow with Cloud Tasks and Cloud Functions.

Be daring, be different, be impractical, be anything that will assert integrity of purpose and imaginative vision against the play-it-safers, the creatures of the commonplace, the slaves of the ordinary. - Cecil Beaton

We (a team of developers at Fourcast) recently had to develop a pipeline for a customer that fetches data from BigQuery, applies transformations, saves the transformed data to a bucket on Google Cloud Storage, and then sends the results as JSON to an endpoint.

That all sounded like a perfect use case for Google Cloud Dataflow, and that’s how we started.

Leveraging Google Cloud Dataflow

As the amount of data was quite large, we wanted to leverage Dataflow, Google Cloud’s highly scalable product, which is simple to get started with thanks to the Apache Beam library.
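To make the shape of such a pipeline concrete, here is a minimal Apache Beam sketch in Python covering the first three steps: read from BigQuery, transform, and write JSON lines to Cloud Storage. It is not our actual code. The `transform` logic, the field names `id` and `name`, and the `run` signature are all hypothetical, and the final step (sending the results to the endpoint) is left out.

```python
import json


def transform(row):
    """Hypothetical transformation: keep two fields and normalise the name."""
    return {"id": row["id"], "name": row["name"].strip().lower()}


def to_json_line(record):
    """Serialise one record as a single JSON line."""
    return json.dumps(record, sort_keys=True)


def run(query, output_prefix, pipeline_args):
    """Build and run the pipeline; pipeline_args is a list of Beam flags."""
    # Imported lazily so the pure functions above can be tested without Beam.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(pipeline_args)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            # Each BigQuery row arrives as a Python dict.
            | "ReadFromBigQuery" >> beam.io.ReadFromBigQuery(
                query=query, use_standard_sql=True
            )
            | "Transform" >> beam.Map(transform)
            | "ToJson" >> beam.Map(to_json_line)
            # Writes sharded text files under the given gs:// prefix.
            | "WriteToGCS" >> beam.io.WriteToText(
                output_prefix, file_name_suffix=".json"
            )
        )
```

Calling `run` with a query, a `gs://` output prefix, and flags such as `--runner=DataflowRunner` plus your project, region, and temp location would submit the job to Dataflow. With no runner flag, Beam falls back to its local runner, which is handy for trying the transformations on a small query first.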
