Photo by Marek Studzinski on Unsplash

BigQuery Costs Debunked: Why Your Budget Is Safe (Mostly) 💸

Thosan Girisona
Data Engineering Indonesia
4 min read · Jun 5, 2024


Starting your data warehouse journey can be a bit daunting, especially when you see the price tag. “Big data comes with a big cost,” they said. “It’ll eat your infrastructure budget,” they said (okay, that one might be true). But a lot of that fear comes from not knowing how the bill is actually calculated. So, let’s dive into the real costs of BigQuery, and I promise it won’t break the bank (too much).

Storage

When it comes to storage, BigQuery costs are divided into physical storage and logical storage. Both types come with two cost categories: active and long-term storage.

Active storage is for tables (or partitions) modified in the last 90 days, while long-term storage is for tables that haven’t been changed in over 90 days.
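To make the active vs. long-term split concrete, here is a minimal sketch that prices a table partition by partition. The per-GiB rates are illustrative assumptions (roughly the US logical storage list prices at the time of writing), not values from this article, so check the current pricing page before relying on them.

```python
# Assumed, illustrative rates in USD per GiB per month (logical storage).
ACTIVE_RATE_PER_GIB = 0.02
LONG_TERM_RATE_PER_GIB = 0.01
LONG_TERM_AFTER_DAYS = 90  # unmodified for this long => long-term pricing


def monthly_storage_cost(partitions):
    """Estimate the monthly storage cost of a table.

    partitions: iterable of (size_gib, days_since_last_modified) pairs.
    """
    total = 0.0
    for size_gib, days_since_modified in partitions:
        if days_since_modified >= LONG_TERM_AFTER_DAYS:
            rate = LONG_TERM_RATE_PER_GIB
        else:
            rate = ACTIVE_RATE_PER_GIB
        total += size_gib * rate
    return total


# Hypothetical table: two recently modified partitions, one untouched for 120 days.
example_partitions = [(100, 5), (100, 40), (300, 120)]
print(round(monthly_storage_cost(example_partitions), 2))
```

In this made-up example the old 300 GiB partition costs less than the two fresh 100 GiB partitions combined, which is the whole point of long-term pricing.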

By default, BigQuery bills storage using the logical model. Per GiB, logical storage costs half as much as physical storage. “Why double?” you ask. Two reasons: physical storage is measured on compressed bytes, so the same table usually counts as fewer GiB, and under logical billing the bytes kept for time travel and fail-safe are included at no extra charge. 🕰️

But that doesn’t mean you can’t use time travel and fail-safe with physical storage. You can, but you’ll be billed separately based on your usage. So, while logical storage is cheaper upfront, physical storage offers more control — if you can handle the complexity.
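A rough way to compare the two billing models is sketched below. It assumes physical storage is billed on compressed bytes plus the bytes retained for time travel and fail-safe, and it uses illustrative active-storage rates. The rates, the compression ratio, and the retained-bytes figures are all assumptions for the sake of the example, not measurements.

```python
# Assumed, illustrative active-storage rates in USD per GiB per month.
LOGICAL_RATE_PER_GIB = 0.02
PHYSICAL_RATE_PER_GIB = 0.04


def logical_cost(logical_gib):
    """Logical billing: uncompressed size, time travel and fail-safe not charged."""
    return logical_gib * LOGICAL_RATE_PER_GIB


def physical_cost(logical_gib, compression_ratio, time_travel_gib=0.0, fail_safe_gib=0.0):
    """Physical billing: compressed size plus retained time-travel/fail-safe bytes."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    compressed_gib = logical_gib / compression_ratio
    billed_gib = compressed_gib + time_travel_gib + fail_safe_gib
    return billed_gib * PHYSICAL_RATE_PER_GIB


# Hypothetical dataset: 1000 GiB logical, 4x compression, 20 GiB each retained.
print(round(logical_cost(1000), 2))
print(round(physical_cost(1000, 4, time_travel_gib=20, fail_safe_gib=20), 2))
```

With these made-up numbers physical billing wins, but a table that compresses poorly or churns a lot (so that time travel holds many deleted bytes) can easily tip the other way.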

Time travel lets you query data from the past. Say you deleted 2 GB of rows in table_x yesterday but used the wrong deletion logic. No big deal! You can read the table as it was at any point within the time-travel window, which is 7 days by default. Phew! And if it's beyond that window, the fail-safe feature kicks in: Google Cloud Customer Care can restore your data for up to 7 more days after the time-travel window ends. Crisis averted! 🚀
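As a sketch of what the rescue looks like, the helper below builds a time-travel query using BigQuery's FOR SYSTEM_TIME AS OF clause. The table name is hypothetical, and the helper only produces the SQL text; you would still run it through the console or a client of your choice.

```python
MAX_TIME_TRAVEL_HOURS = 7 * 24  # default window; a dataset may be configured shorter


def time_travel_query(table, hours_ago):
    """Return SQL that reads `table` as it was `hours_ago` hours in the past."""
    if not 0 < hours_ago <= MAX_TIME_TRAVEL_HOURS:
        raise ValueError("hours_ago must be within the time-travel window")
    return (
        f"SELECT * FROM `{table}` "
        f"FOR SYSTEM_TIME AS OF "
        f"TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL {hours_ago} HOUR)"
    )


# Hypothetical table name; read the state from 24 hours ago, before the bad delete.
print(time_travel_query("my_project.my_dataset.table_x", 24))
```

From there, one option is to write the query result into a new table and compare it with the current one before restoring anything.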

Compute

Now, let’s talk compute. Imagine using BigQuery like taking a trip. If you use a taxi, you pay by the distance — the longer the trip, the higher the cost. But if you rent a car, the range doesn’t matter. Whether you drive around the block or across town, you pay the same daily rate. 🚗

Using a taxi is like the default compute pricing: on-demand compute, because you are billed by the bytes of data you process. The bigger the data, the more you pay.

The keyword here is processed. If you have 1 TiB of data but only select a single column from a small partition, you won’t be charged for the whole 1 TiB — just the bytes selected. Conversely, if you have 1 GiB of data but perform heavy operations like self-joins and window functions, you might be charged for processing more than 1 GiB.
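The arithmetic behind an on-demand bill is simple enough to sketch. The per-TiB rate below is an assumption (roughly the US list price at the time of writing), and the byte count is a made-up figure of the kind a query dry run would report.

```python
BYTES_PER_TIB = 1024 ** 4
BYTES_PER_GIB = 1024 ** 3

# Assumed, illustrative on-demand rate in USD per TiB processed.
ON_DEMAND_USD_PER_TIB = 6.25


def on_demand_cost(bytes_processed, usd_per_tib=ON_DEMAND_USD_PER_TIB):
    """Estimate the on-demand cost of a query from the bytes it processes."""
    if bytes_processed < 0:
        raise ValueError("bytes_processed cannot be negative")
    return bytes_processed / BYTES_PER_TIB * usd_per_tib


# Hypothetical: the table holds 1 TiB, but the query reads one column
# from one partition, about 2 GiB in total.
full_scan = on_demand_cost(1 * BYTES_PER_TIB)
narrow_scan = on_demand_cost(2 * BYTES_PER_GIB)
print(round(full_scan, 2), round(narrow_scan, 4))
```

Under these assumptions the narrow query costs about a cent while a full scan costs the full per-TiB rate, which is why selecting only the columns and partitions you need matters so much.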

Back to the analogy: while on-demand compute is like taking a taxi, BigQuery also lets you rent the compute itself for a set time. So, no matter how much data gets processed, you’re charged based on the number of compute slots and the time you rent them. The longer your commitment, the cheaper the per-unit time cost. As of this writing, it’s $0.048 per slot hour for a 1-year commitment and $0.036 per slot hour for a 3-year commitment (Enterprise Edition).
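To see when renting the car beats taking the taxi, here is a minimal break-even sketch. The slot-hour rates are the ones quoted above; the 730 hours per month and the on-demand rate per TiB are assumptions added for illustration.

```python
# Slot-hour rates quoted in the article (Enterprise Edition commitments).
RATE_1_YEAR = 0.048
RATE_3_YEAR = 0.036

HOURS_PER_MONTH = 730          # assumption: average month length in hours
ON_DEMAND_USD_PER_TIB = 6.25   # assumption: illustrative on-demand rate


def commitment_monthly_cost(slots, rate_per_slot_hour):
    """Monthly cost of a commitment: you pay for every hour, used or not."""
    return slots * rate_per_slot_hour * HOURS_PER_MONTH


def break_even_tib(slots, rate_per_slot_hour, on_demand_per_tib=ON_DEMAND_USD_PER_TIB):
    """TiB processed per month at which on-demand costs as much as the commitment."""
    return commitment_monthly_cost(slots, rate_per_slot_hour) / on_demand_per_tib


# Hypothetical 100-slot reservation.
for label, rate in (("1-year", RATE_1_YEAR), ("3-year", RATE_3_YEAR)):
    cost = commitment_monthly_cost(100, rate)
    print(label, round(cost, 2), round(break_even_tib(100, rate), 2))
```

Under these assumptions, a 100-slot 1-year commitment runs to roughly $3,504 a month, so it only pays off if you would otherwise process more than about 560 TiB a month on demand (and if 100 slots are actually enough to run that workload).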

Free Tier

Still worried about the cost? What if I told you that you can start using BigQuery for free? Amazing, right? BigQuery offers a free tier that’s perfect for beginners. 🎉

Sure, there are limits. But if you’re just starting with a small amount of data, you might not have to pay a dime. As of this writing, storage is free for the first 10 GiB per month, and compute is free for the first 1 TiB of data processed per month. Nice!
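Here is a small sketch of how the free tier changes a monthly bill. The free allowances are the ones quoted above; the storage and on-demand rates are illustrative assumptions.

```python
# Free tier allowances quoted in the article (per month).
FREE_STORAGE_GIB = 10
FREE_QUERY_TIB = 1

# Assumed, illustrative rates.
STORAGE_USD_PER_GIB = 0.02
QUERY_USD_PER_TIB = 6.25


def monthly_bill(storage_gib, query_tib):
    """Estimate a monthly bill after subtracting the free tier allowances."""
    billable_storage = max(0.0, storage_gib - FREE_STORAGE_GIB)
    billable_queries = max(0.0, query_tib - FREE_QUERY_TIB)
    return billable_storage * STORAGE_USD_PER_GIB + billable_queries * QUERY_USD_PER_TIB


# Hypothetical usage: a hobby project vs. a small team.
print(monthly_bill(storage_gib=8, query_tib=0.5))
print(round(monthly_bill(storage_gib=50, query_tib=3), 2))
```

The hobby project stays entirely inside the free tier, while the small team pays only for the 40 GiB and 2 TiB above it.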

Best Practices

  • Time travel and fail-safe are lifesavers, especially in emergencies. The time-travel window defaults to 7 days and is configurable, so you can set it as short as 2 days. The fail-safe period is fixed at 7 days.
  • For daily batch jobs, compare the cost of on-demand vs. flex slots to find the cheaper option.
  • For daily analytics, use flex or annual slots since usage is predictable.
  • Using on-demand compute without strong data governance is madness. Trust me on this!
  • Partition your data so long-term storage pricing applies per partition: any partition that goes 90 days without modification will cost half as much, even while the rest of the table is still being updated.
  • If you’re just getting started, you can use BigQuery for free. Just stay within the free tier limits.
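For the first tip, shortening the time-travel window is a dataset-level setting. The sketch below builds the DDL statement using the max_time_travel_hours dataset option, which is expressed in hours in multiples of 24. The dataset name is hypothetical, and the helper only returns the SQL text rather than running it.

```python
MIN_DAYS = 2
MAX_DAYS = 7


def set_time_travel_window_sql(dataset, days):
    """Return DDL that sets a dataset's time-travel window to `days` days."""
    if not isinstance(days, int) or not MIN_DAYS <= days <= MAX_DAYS:
        raise ValueError("days must be an integer between 2 and 7")
    hours = days * 24  # the option takes hours, in multiples of 24
    return (
        f"ALTER SCHEMA `{dataset}` "
        f"SET OPTIONS (max_time_travel_hours = {hours})"
    )


# Hypothetical dataset name; shrink the window to the 2-day minimum.
print(set_time_travel_window_sql("my_project.my_dataset", 2))
```

A shorter window mostly matters under physical storage billing, where the bytes retained for time travel show up on the bill. The trade-off is less time to notice and undo a mistake.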

Other Costs

Of course, BigQuery has other costs, like BigQuery Omni, BigQuery ML, and BI Engine. Heck, even the slot compute is divided into multiple “classes.” But covering all of that in one article would be too much. Follow me on Medium and LinkedIn for part two (yes, I’m shamelessly promoting myself here). 😉

Credit

I would like to express my gratitude to the Data Engineering Indonesia community for their invaluable insights and reviews, which have enabled me to share this knowledge with a wider audience. Your contributions have played a pivotal role in ensuring the accessibility of this information. Thank you all!
