How to Use Python & SQL to Append New Rows to a BigQuery Table Without Overwriting Your Data

How to dynamically update rows in BigQuery using Python and SQL without losing your historical data.

Zach Quinn
Pipeline: Your Data Engineering Resource

A python, coiled.
Photo by David Clode on Unsplash

Although Google’s BigQuery is generally treated as an append-only database, there are use cases in which data engineers need to update only a single row based on new data. Getting to this point requires some Python and SQL hacking, since a BigQuery load job either truncates a table or appends to it, never both at once. While you could ingest results into one table and load them into another, there is a way to append and truncate simultaneously using the BigQuery API with Python.

The Data & BigQuery Schema

For this example, I’ll generate a few rows of fake data in a CSV. The data will land in the ‘totals’ table of a ‘sales’ dataset.

This is the initial schema as it appears in BigQuery. It underlies the ‘totals’ table, which lives in the ‘sales’ dataset (I created both in my GCP project).

BigQuery table schema.
Screenshot by the author.
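
If you want to follow along, here is a minimal sketch of how a few rows of fake data could be written to a CSV using only the standard library. The column names (date, region, total_sales), the region values, and the helper names make_fake_rows and write_csv are placeholders I made up for illustration, not the fields from the screenshot above, so swap in whatever your own schema uses.

```python
import csv
import random
from datetime import date, timedelta

# Hypothetical column names; replace them with the fields in your own schema.
FIELDNAMES = ["date", "region", "total_sales"]


def make_fake_rows(n_rows, start=date(2023, 1, 1), seed=42):
    """Build n_rows of fake sales totals, one row per consecutive day."""
    rng = random.Random(seed)  # fixed seed so reruns produce the same rows
    regions = ["north", "south", "east", "west"]  # made-up values
    rows = []
    for i in range(n_rows):
        rows.append(
            {
                "date": (start + timedelta(days=i)).isoformat(),
                "region": rng.choice(regions),
                "total_sales": round(rng.uniform(100, 1000), 2),
            }
        )
    return rows


def write_csv(rows, path):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        writer.writeheader()
        writer.writerows(rows)
```

Calling write_csv(make_fake_rows(5), "totals.csv") would produce a five-row file with a header. The fixed seed is a deliberate choice: it keeps the fake data identical between runs, which makes it easier to tell which rows are genuinely new when you test the append logic later.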
