Estimate Your BigQuery Storage Cost

Vincent Junitio Ungu
Blibli.com Tech Blog
6 min read · Jun 13, 2023

Have you been using BigQuery as a data warehouse to retrieve data using Structured Query Language (SQL)?

BigQuery's flexibility has led me to use it both professionally and for personal projects for about a year now. In this article, I am going to talk about BigQuery cost, specifically the storage cost.

Photo by Alexander Mils on Unsplash

In BigQuery, there is a cost for running a query. Google has announced that the BigQuery pricing model will change in July 2023. As of 6 June 2023, there are two pricing models:
1. On-demand pricing: You pay for the data your queries process. If you process 5 TB of data, you pay for 5 TB.
2. Slot commitment pricing: You reserve a specific amount of compute capacity (slots) and pay for those slots. If you reserve a certain number of slots and process 5 TB of data, you pay for the reserved slots rather than for the 5 TB processed.

Note: Processing the first 1 TB of data each month is free of charge.
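To make the on-demand model and the free allowance concrete, here is a minimal Python sketch. The function name and its arguments are my own illustration. This article does not quote a per-TB price, so the value used below is only a placeholder. Replace it with the current figure from the Google documentation.

```python
def on_demand_query_cost(processed_tb, price_per_tb, free_tb=1.0):
    """Estimate one month's on-demand query cost (hypothetical helper)."""
    # Only the data processed beyond the monthly free allowance is billed.
    billable_tb = max(processed_tb - free_tb, 0.0)
    return billable_tb * price_per_tb


# price_per_tb=5.0 is a placeholder value, not a price quoted in this article.
cost = on_demand_query_cost(processed_tb=5.0, price_per_tb=5.0)
print(f"Estimated on-demand cost: ${cost:.2f} / month")
```

With 5 TB processed, only 4 TB are billable in this sketch, because the first 1 TB falls under the free allowance.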

As data analysts, we are expected to write efficient SQL queries. An efficient query processes less data than an inefficient one to produce the same output. That means a lower bill under on-demand pricing, or lower slot consumption under slot commitment pricing.

However, BigQuery charges another cost: the storage cost, which is what you pay to keep your data stored in BigQuery.

There are two types of storage:
1. Logical storage: The storage size is calculated from the data type of each column in the table. This is the default storage type used in BigQuery.
2. Physical storage: The storage size is measured after the data is compressed and stored on disk.

Each storage type has two pricing tiers:
1. Active: Your data has been modified in the last 90 days.
2. Long-term: Your data hasn’t been modified for 90 consecutive days.
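The 90-day rule can be sketched in a few lines of Python. This is only an illustration: the helper name is hypothetical, and treating exactly 90 days as long-term is my assumption about the boundary.

```python
from datetime import date, timedelta

LONG_TERM_AFTER = timedelta(days=90)  # threshold described in the list above


def storage_tier(last_modified, today):
    """Return 'active' or 'long-term' for a table (hypothetical helper)."""
    # Assumption: a table untouched for exactly 90 days already counts
    # as long-term.
    if today - last_modified >= LONG_TERM_AFTER:
        return "long-term"
    return "active"


today = date(2023, 6, 6)
print(storage_tier(date(2023, 5, 1), today))   # modified 36 days ago
print(storage_tier(date(2022, 12, 1), today))  # modified 187 days ago
```

The first table would be billed at the active rate and the second at the long-term rate.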

Note: The first 10 GB of storage each month is free of charge.

As of 6 June 2023, the storage prices in the ‘us’ region are as follows.

Storage cost in the ‘us’ region (Source: Google Documentation)

The storage price varies based on the region. For example, in ‘asia-southeast2’, the active logical storage price is $0.023 per GB per month.

Storage cost in the ‘asia-southeast2’ region (Source: Google Documentation)

Calculate your storage cost

Sample table from Google bigquery-public-data project in BigQuery

In the scenario above, the table has long term logical bytes and long term physical bytes. There are also time travel physical bytes, shown at the bottom of the Storage info section. Time travel physical bytes are billed as physical bytes. Let’s calculate both costs below.

Long term logical bytes = 345.73 GB * $0.01 (per GB / month) = $3.4573 / month

Long term + Time travel physical bytes = 54.57 GB * $0.02 (per GB / month) + 0 = $1.0914 / month

The next question might be: Do you need to pay the sum of logical and physical bytes? The answer is no. BigQuery only charges the cost of either logical or physical bytes.
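The first scenario can be reproduced with a short Python sketch. The helper name is my own. The sizes and the long-term rates for the ‘us’ region are the ones used in the calculation above.

```python
# Long-term rates (USD per GB per month) used in the calculation above.
LONG_TERM_LOGICAL_PER_GB = 0.01
LONG_TERM_PHYSICAL_PER_GB = 0.02


def monthly_cost(size_gb, price_per_gb):
    """Monthly storage cost in USD for a given size and rate."""
    return size_gb * price_per_gb


logical_cost = monthly_cost(345.73, LONG_TERM_LOGICAL_PER_GB)
# Time travel physical bytes are 0 in this scenario, so they add nothing.
physical_cost = monthly_cost(54.57 + 0.0, LONG_TERM_PHYSICAL_PER_GB)

print(f"logical:  ${logical_cost:.4f} / month")
print(f"physical: ${physical_cost:.4f} / month")
```

Only one of the two amounts is charged, depending on which storage type is billed.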

Sample table from Google bigquery-public-data project in BigQuery

Here’s another scenario. Let’s calculate the cost below.

Long term logical bytes = 18.38 MB / 1024 (MB per GB) * $0.01 (per GB / month) = $0.0001794921875 / month

Long term + Time travel physical bytes = 17.27 MB / 1024 (MB per GB) * $0.02 (per GB / month) + 0 = $0.0003373046875 / month

Because the table in the second scenario is small, its cost is negligible compared to the first one. The conclusion is that the cost depends on your data size and storage type (logical or physical). Neither storage type is always cheaper: physical storage costs less in the first scenario, while logical storage costs less in the second.
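To compare the two scenarios side by side, here is a small Python sketch. The function name is hypothetical. The default rates are the long-term rates used in the calculations above, and the MB-to-GB conversion uses the same factor of 1024.

```python
MB_PER_GB = 1024  # same conversion factor as in the calculation above


def compare_models(logical_gb, physical_gb, logical_rate=0.01, physical_rate=0.02):
    """Return (logical_cost, physical_cost, cheaper_model) per month."""
    logical_cost = logical_gb * logical_rate
    physical_cost = physical_gb * physical_rate
    cheaper = "logical" if logical_cost <= physical_cost else "physical"
    return logical_cost, physical_cost, cheaper


first = compare_models(345.73, 54.57)
second = compare_models(18.38 / MB_PER_GB, 17.27 / MB_PER_GB)
print(first[2])   # cheaper model in the first scenario
print(second[2])  # cheaper model in the second scenario
```

Which model comes out cheaper depends on how well the data compresses: in the first scenario the physical size is much smaller than the logical size, while in the second the two sizes are nearly equal.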

Calculate your storage cost with Google Cloud Pricing Calculator

You can also calculate the BigQuery storage cost using the Google Cloud Pricing Calculator.

Google Cloud Pricing Calculator Interface (Source: Google Cloud Pricing Calculator)

In the calculator above, I entered the storage size from the first scenario. The calculator estimates the cost to be $3.36 / month, but the price calculated above is $3.4573 / month. This discrepancy occurs because I didn’t account for the free 10 GB. Taking it into account, the calculation is as follows.

Long term logical bytes = (345.73 GB - 10 GB) * $0.01 (per GB / month) = $3.3573 / month (rounded to $3.36)
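The same adjustment as a minimal Python sketch. The helper name is my own. The 10 GB allowance and the $0.01 rate come from the text above.

```python
FREE_STORAGE_GB = 10  # monthly free allowance mentioned earlier


def billable_storage_cost(size_gb, price_per_gb, free_gb=FREE_STORAGE_GB):
    """Monthly storage cost after subtracting the free allowance."""
    # Tables smaller than the free allowance cost nothing.
    billable_gb = max(size_gb - free_gb, 0.0)
    return billable_gb * price_per_gb


cost = billable_storage_cost(345.73, 0.01)
print(f"${cost:.2f} / month")
```

Rounded to two decimals, this matches the calculator's estimate of $3.36 / month.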

Calculate your total storage size

To view your total storage size (logical and physical), you can use the following query.

-- Total storage size per category, converted from bytes to GiB
SELECT
  SUM(active_logical_bytes) / POWER(1024, 3) AS total_active_logical_GiB,
  SUM(long_term_logical_bytes) / POWER(1024, 3) AS total_long_term_logical_GiB,
  SUM(active_physical_bytes) / POWER(1024, 3) AS total_active_physical_GiB,
  SUM(long_term_physical_bytes) / POWER(1024, 3) AS total_long_term_physical_GiB
FROM
  `region-[INSERT_REGION_HERE]`.INFORMATION_SCHEMA.TABLE_STORAGE
;

If you want to measure the storage size for each dataset, you can use the following query.

-- Storage size per dataset, converted from bytes to GiB
SELECT
  table_schema AS dataset_name,
  SUM(active_logical_bytes) / POWER(1024, 3) AS total_active_logical_GiB,
  SUM(long_term_logical_bytes) / POWER(1024, 3) AS total_long_term_logical_GiB,
  SUM(active_physical_bytes) / POWER(1024, 3) AS total_active_physical_GiB,
  SUM(long_term_physical_bytes) / POWER(1024, 3) AS total_long_term_physical_GiB
FROM
  `region-[INSERT_REGION_HERE]`.INFORMATION_SCHEMA.TABLE_STORAGE
GROUP BY
  1
;

Remember to replace [INSERT_REGION_HERE] with your table's region (for example, `region-us`).

After retrieving your total logical bytes or physical bytes, you can estimate your storage cost using the calculations mentioned above.
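As a rough sketch of that last step, the Python snippet below turns byte totals (shaped like the columns of the queries above) into monthly cost estimates. The function name and the sample numbers are hypothetical. The long-term rates are the ones used earlier in this article. The active rates are my assumption for the ‘us’ region and should be checked against the Google documentation for your region. The sketch also ignores the free 10 GB and treats GB and GiB as interchangeable, as the calculations above do.

```python
GIB = 1024 ** 3  # bytes per GiB, matching power(1024, 3) in the queries

# Monthly rates in USD. The long-term rates are the ones used earlier in
# this article; the active rates are assumptions, check them for your region.
RATES = {
    "active_logical_bytes": 0.02,
    "long_term_logical_bytes": 0.01,
    "active_physical_bytes": 0.04,
    "long_term_physical_bytes": 0.02,
}


def estimate_costs(row):
    """Return (logical_cost, physical_cost) for one row of byte totals."""
    def cost(column):
        return row.get(column, 0) / GIB * RATES[column]

    logical = cost("active_logical_bytes") + cost("long_term_logical_bytes")
    physical = cost("active_physical_bytes") + cost("long_term_physical_bytes")
    return logical, physical


# Hypothetical byte totals, not taken from a real project.
sample_row = {
    "active_logical_bytes": 50 * GIB,
    "long_term_logical_bytes": 200 * GIB,
    "active_physical_bytes": 10 * GIB,
    "long_term_physical_bytes": 40 * GIB,
}
logical, physical = estimate_costs(sample_row)
print(f"logical: ${logical:.2f}, physical: ${physical:.2f}")
```

Only one of the two estimates applies to a given dataset, since BigQuery charges for either logical or physical bytes, not both.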

You can also use the code prepared by the Google Team to calculate your storage size along with its price here.

Why is this important?

When we use a service, we want to make the best use of it while spending as little as possible. Since this article is about BigQuery, it is important to understand which costs BigQuery charges. Compared to the processing cost, the storage cost is usually not significant. However, understanding how to reduce it still contributes to cost savings.

  1. Understand how active storage and long term storage work: Do you regularly update your data, especially using a scheduler? Have you made sure that you only update the data that needs to be updated? Imagine that you only need to insert the latest data into a table, but every day you instead reload all the data from the earliest date to the latest one. This approach keeps the whole table in active storage, so you pay more than you need to.
  2. Explore other options that cost less: For long term data that is unlikely to be used right now but is still too important to delete or drop, you might consider storage options that cost less than BigQuery's long term storage.

I hope this article helps you to estimate your BigQuery cost so that you are aware of what you are paying for.

Acknowledgments

Special shoutout to Yeremia Marcellius Toemali for helping me to explore this topic and Dennis Kastoni, Jessica Banna, and Anelyse Permana for reviewing this article.

Vincent Junitio Ungu
Blibli.com Tech Blog

An ambitious, passionate, and determined young learner interested in data analysis, data science, and artificial intelligence.