Read/Write From Any Google API To/From BigQuery In Minutes Using BQ Flow

Paul Kenjora at Google · Published in StarThinker · 4 min read · Jan 11, 2021

The first step in any solution, dashboard, automation, or analysis is getting the data into one place, and, in the case of automation, getting it from that place back into a product such as Campaign Manager, the AdWords API, or Display & Video 360.

At Google, that place is quickly emerging as BigQuery. It scales, it has a menagerie of functions, and it supports JSON, the format used in every Google API call. Plus, it's SQL, which is far faster, simpler, and easier to create and maintain than Python code. Making the data in/out steps simple opens the door for Data Scientists who excel at SQL.

Yes, there are connectors in BigQuery and Data Studio that facilitate data importing, but for now they have to be set up manually, which hinders building scripted workflows. There are also currently no write connectors in BigQuery, so there's still tremendous value in a read/write connector.

Introducing BQ Flow

That connector is BQ Flow, built solely for the purpose of transferring data between any Google API and BigQuery as quickly, reliably, and easily as possible. It could be adapted to work with AWS or Azure as well, but that's another article.

Universal

There is no need to write any more Google API to BigQuery connectors; this code is universal and will even download or send your email if you use Gmail. At gTech, leveraging BQ Flow has removed the need to write hundreds of custom API connectors, saving us months of work on projects.
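To make the "universal" claim concrete, here is a rough sketch of the idea, not BQ Flow's actual code: any Discovery-based Google API can be called by resolving a dotted function path at runtime. The call_google_api helper name and the credentials object are assumptions for this example.

from googleapiclient.discovery import build

# credentials: an authorized google-auth credentials object, assumed to be
# loaded elsewhere (e.g. a user OAuth token with the right scopes).

def call_google_api(api, version, function, credentials, **kwargs):
    # Build a client for any Discovery-based Google API, then resolve the
    # dotted function path, e.g. "partners.list" -> service.partners().list(**kwargs).
    service = build(api, version, credentials=credentials)
    *parents, method = function.split('.')
    resource = service
    for part in parents:
        resource = getattr(resource, part)()
    return getattr(resource, method)(**kwargs).execute()

# The exact same helper works for very different APIs:
partners = call_google_api(
    'displayvideo', 'v1', 'partners.list', credentials,
    fields='partners.displayName,partners.partnerId,nextPageToken')

messages = call_google_api(
    'gmail', 'v1', 'users.messages.list', credentials, userId='me')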

Easy

Fill in a JSON recipe and run it. The underlying StarThinker framework provides several other deployments, ranging from Composer and Colab to an App Engine UI. See the recipe documentation for details on creating more complex recipes with several API calls chained together. For example, the Display Video User Audit uses several BQ Flow calls chained together to construct a full solution.

{
  "script": {
    "license": "Licensed under the Apache License, Version 2.0",
    "copyright": "Copyright 2020 Google LLC"
  },
  "tasks": [
    { "google_api": {
      "auth": "user",
      "api": "displayvideo",
      "version": "v1",
      "function": "partners.list",
      "kwargs": {
        "fields": "partners.displayName,partners.partnerId,nextPageToken"
      },
      "results": {
        "bigquery": {
          "dataset": "DV_SomeThing",
          "table": "DV_Partners"
        }
      }
    }}
  ]
}
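Conceptually, that recipe boils down to one API read followed by one BigQuery write. The sketch below reuses the hypothetical call_google_api helper from the earlier snippet and omits the paging, retries, and Discovery-based schema handling that BQ Flow itself takes care of.

from google.cloud import bigquery

def partners_to_bigquery(credentials, project):
    # Read: call partners.list exactly as the recipe's kwargs describe.
    response = call_google_api(
        'displayvideo', 'v1', 'partners.list', credentials,
        fields='partners.displayName,partners.partnerId,nextPageToken')
    rows = response.get('partners', [])  # paging via nextPageToken omitted here

    # Write: land the JSON records in the dataset/table named by the recipe.
    client = bigquery.Client(project=project, credentials=credentials)
    job = client.load_table_from_json(
        rows,
        f'{project}.DV_SomeThing.DV_Partners',
        job_config=bigquery.LoadJobConfig(
            autodetect=True,  # BQ Flow derives the schema from the Discovery document instead
            write_disposition='WRITE_TRUNCATE'))
    job.result()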

Fast

There is even a command line helper that can perform the same tasks as the workflow recipe. This helper makes it easy to explore API calls for reference or do a quick data load into BigQuery for that one-off dashboard.

python3 -m pip install starthinker

st_google_api -api doubleclickbidmanager -version v1 -function advertisers.list -kwargs '{ "partnerId": 132865172 }' -u [credentials path]

Reliable

The BQ Flow / Google API task is built on top of Google’s Discovery Documents, ensuring all data moved to and from BigQuery adheres to the official product schema definition. It even adds ENUM values to the descriptions of fields in BigQuery. API versions can be specified for complete compatibility.

Schema example for the Display Video Partners List API call via BQ Flow, including ENUMs.
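As an illustration of that mechanism (not BQ Flow's implementation), a BigQuery schema can be derived from a Discovery document roughly as below. The discovery URL follows the standard pattern for newer Google APIs, and nested objects, arrays, and $ref resolution are left out to keep the sketch short.

import requests
from google.cloud import bigquery

# Standard Discovery document location for newer Google APIs (assumption for this sketch).
DISCOVERY_URL = 'https://displayvideo.googleapis.com/$discovery/rest?version=v1'

TYPE_MAP = {'string': 'STRING', 'integer': 'INT64', 'number': 'FLOAT64', 'boolean': 'BOOL'}

def discovery_schema_to_bigquery(schema_name):
    # Pull the official schema definition, e.g. 'Partner' for partners.list rows.
    document = requests.get(DISCOVERY_URL).json()
    properties = document['schemas'][schema_name]['properties']

    fields = []
    for name, prop in properties.items():
        if '$ref' in prop or prop.get('type') in ('object', 'array'):
            continue  # nested records and repeated fields omitted in this sketch
        description = prop.get('description', '')
        if 'enum' in prop:
            # Fold the allowed ENUM values into the BigQuery field description.
            description += ' ENUM: ' + ', '.join(prop['enum'])
        fields.append(bigquery.SchemaField(
            name, TYPE_MAP.get(prop.get('type'), 'STRING'), description=description))
    return fields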

Flexible

Perhaps the coolest feature of the BQ Flow / Google API task is the ability to pass in parameters from Sheets or BigQuery using kwargs_remote, for example loading a list of Line Items based on a list of Advertisers in another table. For a full description of this schema, see the get_rows function documentation, which can be extended to accommodate AWS or Azure.

{ "google_api": {
"auth": "user",
"api": "displayvideo",
"version": "v1",
"function": "advertisers.lineItems.list",
"kwargs_remote":{
"bigquery":{
"dataset":"DV_Targeting_Audit",
"query":"SELECT CAST(advertiserId AS STRING) AS advertiserId FROM `DV_Targeting_Audit.DV_Advertisers`;",
"legacy":false
}
},
"iterate":true,
"results": {
"bigquery": {
"dataset": "DV_Targeting_Audit",
"table": "DV_LineItems"
}
}
}}
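In plain Python, the kwargs_remote plus iterate pattern above amounts to the loop below, again using the hypothetical call_google_api helper; the final write back to DV_LineItems would follow the same load step shown earlier.

from google.cloud import bigquery

def line_items_for_advertisers(credentials, project):
    client = bigquery.Client(project=project, credentials=credentials)
    query = """
        SELECT CAST(advertiserId AS STRING) AS advertiserId
        FROM `DV_Targeting_Audit.DV_Advertisers`
    """

    line_items = []
    # Each row returned by the query becomes the kwargs for one API call.
    for row in client.query(query).result():
        response = call_google_api(
            'displayvideo', 'v1', 'advertisers.lineItems.list', credentials,
            advertiserId=row['advertiserId'])
        line_items.extend(response.get('lineItems', []))
    return line_items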

Use Cases

A universal Google API / BigQuery connector unlocks the best of BigQuery: the ability to easily manipulate, join, and construct combined data sets, which is the core of innovative ads solutions. With BQ Flow, the following use cases take hours, not weeks.

Join 1P And Google Data For Automation

First-party data is the secret sauce for any client / partner. Pull down a DV360 list of Line Items, join it with first-party in-store performance data, and write Frequency Caps or Budget Segments back into Display & Video 360.
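As an illustration only (the first-party table and column names below are hypothetical placeholders), the join side of that use case is a single BigQuery query:

from google.cloud import bigquery

query = """
    SELECT
      li.lineItemId,
      li.displayName,
      store.weekly_revenue
    FROM `DV_Targeting_Audit.DV_LineItems` AS li
    JOIN `first_party.store_performance` AS store  -- hypothetical 1P table
      ON li.advertiserId = store.advertiser_id
"""
rows = bigquery.Client().query(query).result()
# From here, new Frequency Caps or Budget Segments can be written back through
# the Display & Video 360 API (e.g. advertisers.lineItems.patch).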

Create Global Audit Dashboards

Load all the targeting settings under a partner and perform a complete audit of all targeting down to the Line Item level. Identify outliers and quickly plug any brand safety issues.

Development Roadmap

We ran an experiment with BQ Flow that leveraged BigQuery naming conventions to completely control the read / write behavior of the workflow: naming a table after an API endpoint would trigger a read or a write. We’ll continue to explore this at gTech, but so far the process is proving applicable only to fully automated tasks like conversion uploads. Where user input or control is required, the process will need to evolve.

In the meantime, the core BQ Flow ability to universally read from and write to BigQuery is proving transformative in reducing the amount of code we need to write.
