Unlocking the Power of Time Travel with BigQuery: Use Cases and Applications

Katarina Nimas Kusumawati
Blibli.com Tech Blog
8 min readMar 4, 2024

Introduction

Imagine losing a critical piece of information, like a deleted document or an accidentally changed report. It’s frustrating, right? Well, with BigQuery, a cloud-based data warehouse, that frustration might become a thing of the past. Think of it as a time machine for your data, allowing you to travel back in time and recover lost information or undo those accidental changes.

Before we jump into the how-to, let’s talk about this “time travel” concept in simpler terms. Here’s what BigQuery time travel is:

  • Recover deleted tables: Accidentally deleted a table full of important data? BigQuery keeps a history of your tables, allowing you to rewind and restore them to a specific point in time before deletion.
  • Uncover changes in tables: Did someone accidentally modify a crucial data point? BigQuery tracks all changes made to your tables, just like a historical record. You can travel back in time and see exactly what got changed. No more confusion or lost information!

In this article, we’ll simplify the concept into easy steps, perfect for anyone new to data analysis. We’ll walk you through:

  • How to recover deleted tables and get your lost data back.
  • Discovering and understanding changes made to your tables.
Photo by Ilya Pavlov on Unsplash

The Advantages of Time Travel

Data Recovery

  • Accidentally deleted data: Easily recover data that was accidentally deleted or overwritten within the time travel window (typically 7 days), preventing costly data loss.
  • Undo mistaken changes: If you made an incorrect update, you can rewind to a previous state of the data to undo the mistake.
  • Restore expired tables: Retrieve data from expired tables that haven’t been completely purged yet.

Analysis and Troubleshooting

  • Trend analysis: Analyze historical trends in your data by looking at different snapshots over time. This helps identify patterns and understand how your data has evolved.
  • Debugging errors: Investigate issues in your data by examining how it looked at different points in time, pinpointing the source of the problem.
  • Compare data versions: Easily compare different versions of the same data to understand changes and their impact.

Limitations of Time Travel

  • Time travel is not a substitute for proper data backup and retention strategies.
  • There are limits to time travel, such as the maximum window size and the need for specific permissions for restoring deleted tables (BigQuery Admin).

Configure Time Travel

  • Time travel is on the dataset level. Open the dataset that you want to set the time travel.
  • Click Edit Details then go to Advanced Options
  • At Time Travel Window, set the maximum days of time travel. The default is 7 days.
  • Click Save

Recover Hard Delete Tables

When a table is delete in BigQuery, it’s like it vanishes into thin air.

Example:

Table A was deleted on December 7, 2023, 10:00 AM. The maximum time travel time is December 14, 2023, 10:00 AM. If you time travel on December 14 2023, 10:05 AM, the table cannot be recovered.

How to:

  • Activate Cloud Shell BigQuery
  • Make sure you will see the picture below
  • To time travel the table use this code below (customize the bold characters)
    Timestamp use millisecond
    1 hour = 3600000 milliseconds
    1 day = 86400000 milliseconds
    7 days = 604800000 milliseconds
    Code:
bq - project_id=project-id cp dataset_id.table_id@-timestamp dataset_destination.table_destination
  • First Example:
    I want to time travel table myapp-326303.dataapp.temp_katarina_test that I just deleted around yesterday.
    Code:
bq - project_id=myapp-326303 cp dataapp.temp_katarina_test@-86400000 dataapp.temp_katarina_test
  • Second Example (Save the table with different path and different name):
    I want to time travel table myapp-326303.dataapp.temp_katarina_test that I just deleted around yesterday. But I want to place it in dataapp_2 with name temp_katarina_test2.
bq - project_id= myapp-326303 cp dataapp.temp_katarina_test@-86400000 dataapp_2.temp_katarina_test2
  • Click Authorize if you see this picture below
  • If time travel success you will see success message like this picture below

First example

Second example

  • Check if the table success to recover.

Note for Storage Optimization Cost

If you time-travel on a dataset that has an expiration date, returning the table is the same as creating a new table which makes the expiration date longer. Please consider the urgency of returning the table for cost optimization.

Recover Data Changing Tables

When the data in a table changes and you want to return to the previous version of the data.

  • With the “FOR SYSTEM_TIME AS OF” clause, you can pinpoint any point in time within the past (within the time travel window) and analyze your data exactly as it was then. This is useful for various tasks like tracking changes, analyzing trends, and recovering deleted data.
  • Note: To restore data from a deleted table, you need special admin access. Even having owner rights isn’t enough. Make sure you have the “bigquery.admin” role for that table.

Example:

I have table that contains number of names in USA. I want to change the first name of the table.

After I change the data, the data looks like this.

Oops, I want to change the data to the previous version. This how to do it.

How to:

Check the data

  • To get correct timestamp, you can check the data first by using this query.
SELECT *
FROM project_name.dataset_name.table_name -- customize project_name, dataset_name, and table_name
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL XX TIMEUNIT); -- customize the time

The query to execute it:

SELECT *
FROM myapp-326303.dataapp.temp_katarina_test
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 2 MINUTE)
ORDER BY 1

This is the result after checking the data:

We can see the data back to the version 2 minutes ago.

  • Save the table to the same table or different table. For this example, I want to save it into the same table.
CREATE OR REPLACE TABLE project_name.dataset_name.table_name AS -- customize project_name, dataset_name, and table_name
SELECT *
FROM project_name.dataset_name.table_name -- customize project_name, dataset_name, and table_name
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL XX TIMEUNIT); -- customize the time

The query to execute it:

CREATE OR REPLACE TABLE ` myapp-326303.dataapp.temp_katarina_test` AS
SELECT *
FROM `myapp-326303.dataapp.temp_katarina_test`
FOR SYSTEM_TIME AS OF TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 MINUTE)
ORDER BY 1;
  • Check if the table successfully recovered.

Impact of BigQuery Time Travel on Our Organization

Let’s deep down into the ways time travel deletion in BigQuery can empower our office organization. We’ve explored its potential, now let’s see it in action:

Boosting Data Integrity: Accidentally deleting a crucial table. With time travel, simply rewind to the moment before deletion and recover it instantly. This eliminates data loss, ensuring accurate records and informed decision-making.

Enhanced Collaboration: Time travel make seamless collaboration. Team members can access historical versions of table data, understand changes, and revert to previous versions if needed. This promotes transparency, accountability, and efficient teamwork.

Experimentation Without Fear: Want to test new data analysis methods without compromise existing data? Time travel allows us to experiment freely, knowing you can always revert to the previous state if needed.

Cost-Effective Data Management: BigQuery’s time travel feature eliminates the need for separate backups and version control systems. This reduces storage costs and simplifies data management, freeing up resources for other priorities.

Here are some examples of how BigQuery Time Travel has helped us:

  1. Restoring a table deleted due to expiration rules
    We have a dataset with tables that have expiration rules. One day, a table was automatically deleted due to this rule. However, the table was still being used by the team. Fortunately, we were able to use BigQuery Time Travel to restore the table.
  2. Restoring a table accidentally deleted
    Once, a table write-append was accidentally deleted. Fortunately, the table had not been deleted for long, so we were able to use BigQuery Time Travel to restore it.
  3. Restoring a table to a previous version
    A table was modified and the data did not meet expectations. We wanted to revert the table to the version before the change was made. Fortunately, we were able to use BigQuery Time Travel to do this.

Important Notes

Define the retention policy: Determine how long you need to retain historical data for compliance or analysis.

Educate your team: Train your team on using time travel responsibly to ensure data integrity.

FAQ About Time Travel

If a table has multiple versions, which version will be returned?
Example: if we have several versions of a table

  • If we time travel at 2024–01–20 03:00 AM, interval 4 hours, we got error
  • If we time travel at 2024–01–21 11:00 AM, interval 3 hours, we got v2
  • If we time travel at 2024–01–25 01:00 PM, interval 2 days, we got v3
  • If we time travel at 2024–01–28 01:00 PM, interval 30 minutes, we got v4

Does 7 days (or less) retention apply to hard delete tables and data changing?

  • Yes

If there is a table X that has 2 versions of data. v0 at 10:00 AM and v1 at 11:00 AM. Then do a hard delete at 01:00 PM.

Then we want to restore the table.

Can I restore v0 from the table by recovering hard delete table then recovering data changing in tables?

  • No you can’t. The table we return is only the latest version, v1, it cannot be returned to the previous version again.

So, there you have it! BigQuery’s time travel features give you ability to recover lost data and track changes, ensuring your data remains protected and accurate. Remember, even beginners can use these features — don’t hesitate to explore and experiment! Let’s start exploring BigQuery today!

--

--

Katarina Nimas Kusumawati
Blibli.com Tech Blog

Sometimes I struggle with data, sometimes I just wanna be a Pikachu