How to instantly recover a table in Google BigQuery

If you need to Ctrl-Z (undo) after something went wrong, here are a couple of ways to get your data back.

Davor Hafnar
Oct 19 · 4 min read

How much time do you have?

Google gives you 7 days of history for existing tables and 2 days for deleted tables (source). Sow how to quickly restore the data?

Important: Get familiar with the code by running it on dummy data before relying on it. The article is written to help and the author takes no responsibility for the cost incurred or lost data as a result of using the instructions from this article. Also, mind that this article is current as of October 2019 (things will likely work differently in the future.)

2. Restore using the console (no query cost)

If the cost you incur by querying while restoring is an issue, you can use the Cloud Shell. Opening it is simple:

Select the right project, open the console, issue the command.

You will need to convert the time you want to go back to into a Unix timestamp — you can find a converter here. I am using 1577833205000 as a placeholder.

So, to restore data from a specific time in the past and save it to a new table:

bq cp dataset.table@1577833205000 dataset.new_table

This command works also on deleted tables, but only for 2 days since deletion. It also works for partitioned tables.

For overwriting an existing table (when the table wasn’t deleted) just enter the same table name twice:

bq cp dataset.table@1577833205000 dataset.table

You will be asked if you want to overwrite the table. Just enter “y” for yes.

Note that if you are undeleting a table, the second command won’t work: you need to specify a new name for the table, and then copy that table in the place of the old one.

2. Restore data by using SQL (incurs query cost)

2.1 Run a query on a table as it was one hour ago

You may want to simply query a table as it was one hour ago and decide what to do with the result later:

SELECT
*
FROM
`project.dataset.table`FOR SYSTEM_TIME AS OF
TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)

All you need to do is add FOR SYSTEM_TIME AS OF to the end of the basic SELECT query, followed by the specific time you want to go to. The provided code offers an out-of-the-box way (no Unix timestamp needed) to jump back an hour from the time the query is run.

You can find more about FOR SYSTEM_TIME AS OF in documentation.

2.2 Create/replace a table with data from one hour ago

Do you perhaps want to immediately restore the table to a state as it was an hour ago? Here is how to do that:

CREATE OR REPLACE TABLE
`project.dataset.table` AS
SELECT
*
FROM
`project.dataset.table` FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)

The result of this query will be saved in a table you specify in CREATE OR REPLACE; if you use the same name as in the FROM part, you will overwrite the table with the data as it was one hour ago.

2.3 Create/replace a PARTITIONED table with past data

You can do the same with a partitioned table. Below is an example where you have partitioned a table by day using a timestamp-type column in BigQuery and your table requires a partition filter:

CREATE OR REPLACE TABLE
`project.dataset.table`
PARTITION BY
DATE(timestamp_value)
OPTIONS (require_partition_filter=TRUE) AS
SELECT
*
FROM
`project.dataset.table` FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
WHERE
DATE(timestamp_value) <= CURRENT_DATE()

The WHERE condition as written just makes sure that you are allowed to query across all of the data (as every date is equal or smaller than today’s).

2.4 Go back DAYS or MINUTES, not hours

The examples that are shown above will retrieve the data from 1 hour ago. You can, of course, use any other number of hours, and if you need days or minutes, just replace the phrase HOUR with DAY or MINUTE.

2.5 Code in legacy SQL

If you are using legacy SQL, you can go back in time using the query below. You will need to convert the time you want to go to into Unix timestamp. You can find a page to help you do that quickly by clicking here, for example. You then enter the resulting number after the project:dataset.table@ as below ( I used 1577833205000 as a placeholder):

SELECT
*
FROM
[project:dataset.table@1577833205000]

Mind that legacy SQL query doesn’t support partitioned tables.

3. Can I restore a deleted dataset?

At the time of writing, unfortunately, no. To quote Google:

After you delete a dataset, it cannot be recovered, restored, or undeleted. Deleting a dataset is permanent. (source)


That’s it! I suggest you get familiar with how the functionality to go back in time works so you will know what to do if/when you actually need it.

Davor Hafnar

Written by

Data product guy. All opinions are my own. http://hafnar.si

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade