Reduce your BigQuery bills with BI Engine capacity orchestration

Learn how we at REEA orchestrate BI Engine reservations with Cloud Workflows, dynamically changing the allocation size based on usage. We do this on a 5TB dataset, small by BigQuery standards, yet it brings big cost savings and accelerated queries.

This article draws on my experience at REEA and, as a Google Developer Expert (GDE), on Stack Overflow, where I see many questions about how BI Engine works in conjunction with Data Studio and BigQuery.

Introduction

By turning on BI Engine you expect accelerated queries and lower bills. In Google Cloud, everything is measurable and accessible via APIs or CLI commands, and the audit logs are especially helpful here. So we can say that, with access to these metrics, we could make more optimal use of our BI Engine reservation. The question we ask:

Is there a way to resize BigQuery BI Engine to match on-demand query usage, applying daily usage patterns using some automation?

This article covers defining a Cloud Workflow that adjusts the BI Engine size parameter up or down to lower the cost of your BigQuery on-demand or Data Studio queries. Next, we will see how to use cloud orchestration to make this an automatic process; combined with insight into your BigQuery and Data Studio query complexity, this can save precious development time, lower your bills, and speed up your queries.

Note: To get started with Cloud Workflows, check out my introductory presentation about Cloud Workflows and previous articles.

Problem definition

The aim is to dynamically adjust the size of the BI Engine to get the lowest combined cost of BigQuery and BI Engine.

In other words, by turning on BI Engine you incur a new cost on your cloud bill and you get a benefit: lower BigQuery bills. Applied wisely, the combination can produce huge savings on your overall cloud bill.

By monitoring the resulting BigQuery savings together with the cost of running a BI Engine reservation of known size, you can do the math on whether increasing or lowering the BI Engine capacity will result in a lower bill.

So the biggest challenges that we need to resolve are:

  • Obtain the cost of your on-demand BigQuery usage during the experiment
  • Set the BI Engine capacity in steps in an orchestrated way
  • Have a real-time sense of the savings to drive capacity automation up/down
  • Monitor the applied settings for optimal savings

Prerequisite:

  • Audit logs exported to BigQuery (the more history, the better)

To turn this into a working solution we will use Cloud Workflows.

We at REEA are open to consulting engagements to bring specialized knowledge to your BigQuery usage and to see whether your bills can be reduced using this technique or by other means.

Get the cost of your BigQuery usage

We assume you are familiar with BigQuery and with the fact that Google Data Studio executes its BigQuery integrations as regular queries. To get started with Cloud Workflows together with BigQuery, I recommend reading my introductory article on the topic: Automate the execution of BigQuery queries with Cloud Workflows.

The easiest, non-technical way to get the cost is to open the Billing Reports, filter for the BigQuery service only, and group by Service.

BigQuery usage over several months

In this chart, you see the standard BigQuery usage many small to medium companies have. As data grows, your queries cost a bit more. Each month there is 1TB of free quota that does not count toward your bill, which results in the clear dips on the first day of each month.

Going technical

The technical way to get the cost of your BigQuery usage is to run a query over the audit logs that are exported to BigQuery.

Organizations often already have this set up; if you don't, make sure you first set up audit log export to BigQuery and wait a few days to accumulate data.

We won't deep dive into how to set up audit log export to BigQuery; here are some references to read: the BigQuery audit logs overview, Defining a BigQuery log sink using gcloud, the gcloud logging sinks create CLI command, or the Console page.
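
If you don't have a sink yet, here is a minimal sketch of creating one with gcloud; the sink name and dataset are placeholders, and you still need to grant the sink's writer identity write access to the destination dataset:

$ gcloud logging sinks create bigquery_sink_name \
    bigquery.googleapis.com/projects/<prj>/datasets/<dataset_auditlog> \
    --use-partitioned-tables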

Our organization exports ALL logs, without filters, to BigQuery. To verify what sinks you have, run the gcloud command below:

$ gcloud logging sinks list --format=json
{
  "destination": ".../projects/<prj>/datasets/<dataset_auditlog>",
  "filter": "(empty filter)",
  "name": "bigquery_sink_name"
}

The query to get the recent costs for each BigQuery query:

The query uses the flat on-demand rate of 5 USD per TB from the GCP pricing table to calculate the cost of each query.
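
A minimal sketch of such a query, assuming the sink writes the legacy AuditData format into the default cloudaudit_googleapis_com_data_access table (adjust the placeholders to your own sink destination):

-- cost per query from the audit log sink, last 7 days
SELECT
  timestamp,
  protopayload_auditlog.authenticationInfo.principalEmail AS user_email,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalProcessedBytes AS totalProcessedBytes,
  protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalBilledBytes AS totalBilledBytes,
  -- 5 USD per TB of billed bytes
  ROUND(protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics.totalBilledBytes / POW(1024, 4) * 5, 4) AS estimated_cost_usd
FROM `<prj>.<dataset_auditlog>.cloudaudit_googleapis_com_data_access`
WHERE protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.eventName = 'query_job_completed'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
ORDER BY timestamp DESC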

The output is this:

Query cost obtained from AuditLogs

What you need to observe here is that the two numbers, total processed bytes and total billed bytes, are exactly the same.

Next, by turning on BI Engine, the billed bytes will drop significantly and yield savings.

Introducing BigQuery BI Engine

BI Engine integrates out of the box with many services

BI Engine is a fast, in-memory analysis service that integrates out of the box with BigQuery, and Google Data Studio, or popular BI tools such as Looker, Tableau, Power BI, and your custom application to accelerate data exploration and analyses.

By using BI Engine you can analyze data stored in BigQuery with sub-second query response times and with high concurrency, served from an in-memory cache.

In a nutshell, when you have BI Engine turned on, all your BigQuery queries benefit from it as if they were served from a memory cache such as Memcache.

It's a managed Google Cloud service, so it integrates nicely: you don't need to enable anything in code or deploy a modification to your queries, and you don't need to provision infrastructure.

It's a true hidden gem in the cloud for all those organizations that rely heavily on BigQuery. One of the great benefits is speed, the other is reduced cost, all without development time, as it integrates out of the box with BigQuery; you only need to set the right capacity.

It has very few configuration settings: location and capacity. It has some limitations, documented here; in other words, not all types of queries are accelerated yet.

Set the BI Engine capacity in steps in an orchestrated way

You can experiment with turning on BI Engine for your project from the BigQuery console and see how it performs for your queries over time, using the methods described here.

Today’s goal: automate setting the right size of the BI Engine with Cloud Workflows.

What is Cloud Workflows?

In a nutshell, Workflows allows you to connect services together, anything that has a public API.

  • Workflow orchestration (step engine as a service)
  • Integrate any Google Cloud API, SaaS API, or private APIs
  • Out of the box authentication support for Google Cloud products
  • Fully managed service — requires no infrastructure or capacity planning
  • Serverless with Pay-per-use pricing model

By checking the BI Engine API documentation, we need to locate the right call for setting the capacity.

The REST API is a bit confusing: it doesn't have a create BI Engine reservation endpoint, which literally stops a lot of developers from looking further, but the update endpoint lets you size the reservation up, or down to zero.

We are going to use the projects.locations.updateBiReservation call, documented here.

The following note on that page explains how to reset the reservation as well:

A singleton BI reservation always exists with default size 0. In order to reserve BI capacity it needs to be updated to an amount greater than 0. In order to release BI capacity reservation size must be set to 0.

And there is not much left to do: we can build out the Workflow that issues this PATCH call.

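The workflow below is a minimal sketch of that idea; the step names, the hard-coded US location, and the flat 2GB size reflect our experiment and should be adapted to your project:

main:
  steps:
    - init:
        assign:
          - project: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
          - location: "US"
          - size_bytes: 2147483648   # 2 GB; set to 0 to release the reservation
    - setBiEngineCapacity:
        call: http.patch
        args:
          url: ${"https://bigqueryreservation.googleapis.com/v1/projects/" + project + "/locations/" + location + "/biReservation"}
          query:
            updateMask: size
          auth:
            type: OAuth2
          body:
            size: ${size_bytes}
        result: reservation
    - returnResult:
        return: ${reservation.body}
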
What you see here is that we read the project from the environment variables; the location in our experiment is US. In the setBiEngineCapacity step, we issue a PATCH call to the API URL described above. Authentication is handled by Cloud Workflows, and you need to make sure the Workflow invoker service account has the BigQuery Admin role, which carries the bigquery.bireservations.update permission. The size parameter is in bytes. We use a flat 2GB in our experiment, but this can be parameterized.

If we want to turn off BI Engine, we need to set the size to zero.

Deploy the Workflow, invoke it, and check the updated size in the console.

Based on our discussions with BigQuery engineers, a capacity update takes up to 5 minutes to propagate through production systems.

Also, if you want to automate this in shell scripts, use:

bq update --reservation --bi_reservation_size=<Size_in_bytes> --location=US --project_id=<Query_project>

Life after BI Engine turned on

If you inspect your individual SELECT statements with these methods, you should see a specific block that tells whether BI Engine acceleration was disabled, partial, or full for your queries. We need to mention here that you might find only some of your queries are accelerated; if you find a common reason for that (for example, queries on more than 500 partitions are not accelerated), you might consider rewriting them.
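
One way to check this in bulk, sketched here under the assumption that your jobs run in the US region and that you can read the INFORMATION_SCHEMA.JOBS_BY_PROJECT view, is:

-- BI Engine acceleration mode per recent query job
SELECT
  creation_time,
  job_id,
  bi_engine_statistics.bi_engine_mode AS bi_engine_mode,   -- FULL, PARTIAL or DISABLED
  total_bytes_billed
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE job_type = 'QUERY'
  AND creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY creation_time DESC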

Give it a while so BI Engine kicks in, experiment with the size, then run this query to see how it performs:
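
Here is a sketch of such a savings query over the same audit log table as before (the table name and the 5 USD per TB rate are the same assumptions as earlier):

-- hourly rollup of processed vs. billed bytes and the resulting savings percentage
SELECT
  TIMESTAMP_TRUNC(timestamp, HOUR) AS hour_slot,
  ROUND(SUM(jobStats.totalProcessedBytes) / POW(1024, 4) * 5, 2) AS cost_without_bi_engine_usd,
  ROUND(SUM(jobStats.totalBilledBytes) / POW(1024, 4) * 5, 2) AS cost_billed_usd,
  ROUND(SAFE_DIVIDE(SUM(jobStats.totalProcessedBytes) - SUM(jobStats.totalBilledBytes),
                    SUM(jobStats.totalProcessedBytes)) * 100, 2) AS savings_pct
FROM `<prj>.<dataset_auditlog>.cloudaudit_googleapis_com_data_access`,
  UNNEST([protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.job.jobStatistics]) AS jobStats
WHERE protopayload_auditlog.servicedata_v1_bigquery.jobCompletedEvent.eventName = 'query_job_completed'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY hour_slot
ORDER BY hour_slot DESC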

BigQuery savings when BI Engine is properly sized

What you see here is a report of savings rolled up to the hour, with the percentage calculated. In this screenshot that percentage is huge; on your first runs you will likely see 5–20% savings, and it depends a lot on the complexity of your SELECTs and the volume of jobs that you run.

Reduced Cloud Billing Costs

Once you get this right, the Billing Reports will show an updated chart that will make your team super happy as well. It might not be as dramatic as in our example, but it should be visible.

Cloud Billing costs of BigQuery and BI Engine after capacity optimizations.

There is not much to say here: we finally got our daily costs down from $35 to $8, plus the cost of running BI Engine, which stays in the $2-$3 range.

Monitor the applied settings for optimal savings

Now that we have a query that tells us how cost-efficient BI Engine actually is, the possibilities to make this smart have just opened up.

You could put together a Cloud Workflow that periodically reads the output of the above query and, based on some mapping, increases or downsizes the BI Engine reservation.

For example, we could consider four mappings (a Cloud Workflows sketch of this sizing logic follows below):

  • 0–20% savings: increase the BI Engine capacity incrementally by 5GB
  • 20–60% savings: increase the BI Engine capacity by 1GB
  • >60% savings: either slow down, increasing by 0.5GB, or accept that the nature of your queries cannot be accelerated further
  • Don't forget a mapping to downsize when BI Engine didn't yield the projected savings.
  1. On top of the above, you could develop a little business logic as well, by factoring in the cost of BI Engine, which is billed per hour and at the time of writing is $0.0416 per GB per hour for the US location (our flat 2GB reservation works out to roughly $2 per day). Include this in the evaluation math to see how far you can stretch the capacity increase while it still pays for itself.
  2. Also, you could build a capacity mapping over the duration of a workday: office hours demand more capacity, while less is needed during the night.
  3. You could also leverage BigQuery ML to write a time-series forecast based on historical data to drive the best BI Engine capacity for each "hour slot".
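
As a sketch, such a mapping could be expressed as a Cloud Workflows switch step; savings_pct and current_size_gb are hypothetical variables you would assign earlier from the savings query, and the thresholds mirror the list above:

    - decideNewCapacity:
        switch:
          - condition: ${savings_pct < 20.0}
            next: increaseBy5GB
          - condition: ${savings_pct < 60.0}
            next: increaseBy1GB
          - condition: ${true}
            next: keepCurrentSize
    - increaseBy5GB:
        assign:
          - new_size_gb: ${current_size_gb + 5}
        next: setBiEngineCapacity
    - increaseBy1GB:
        assign:
          - new_size_gb: ${current_size_gb + 1}
        next: setBiEngineCapacity
    - keepCurrentSize:
        assign:
          - new_size_gb: ${current_size_gb}   # or a smaller value when savings stay low
        next: setBiEngineCapacity

The setBiEngineCapacity step is the PATCH call shown earlier, now taking ${new_size_gb * 1073741824} as the size in bytes.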

When an upscaled BI Engine costs more than the savings it produces, stop increasing the capacity.

To kickstart your thinking about ML-driven forecasts and optimizations, check out this blog post: Optimize your applications using Google Vertex AI Vizier.

Monitoring

You could create a chart in Cloud Monitoring plotting the bigquerybiengine.googleapis.com/reservation/used_bytes metric over the bigquerybiengine.googleapis.com/reservation/total_bytes metric.

This way you can monitor whether your BI Engine capacity is filled and used to the maximum, and feed this into your business logic as well when you put together the automation.
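
If you want the same numbers inside the automation itself, here is a sketch of reading the used_bytes metric through the Cloud Monitoring API from a Cloud Workflows step; the step name and the one-hour window are arbitrary choices, and project is assumed to be assigned earlier:

    - readBiEngineUsage:
        call: http.get
        args:
          url: ${"https://monitoring.googleapis.com/v3/projects/" + project + "/timeSeries"}
          auth:
            type: OAuth2
          query:
            filter: 'metric.type = "bigquerybiengine.googleapis.com/reservation/used_bytes"'
            "interval.startTime": ${time.format(sys.now() - 3600)}
            "interval.endTime": ${time.format(sys.now())}
        result: usedBytesSeries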

BI Engine reservation size and used bytes together on Cloud Monitoring chart

Conclusions

In this article, we explored how to ensure the right BI Engine capacity for your project and how to orchestrate the allocated capacity based on usage, by running BigQuery queries over the audit logs that project the costs and savings.

In the end, we would build out a Cloud Workflow that reads the BigQuery analysis, evaluates the results, does the math to increase or downscale, and then calls the BI Engine step to apply the changes.

The workflow can be triggered hourly by Cloud Scheduler.
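
As a sketch, assuming the workflow is deployed as bi-engine-capacity in us-central1 and a service account with the Workflows Invoker role exists, the Cloud Scheduler job could be created like this:

$ gcloud scheduler jobs create http bi-engine-capacity-hourly \
    --schedule="0 * * * *" \
    --http-method=POST \
    --uri="https://workflowexecutions.googleapis.com/v1/projects/<prj>/locations/us-central1/workflows/bi-engine-capacity/executions" \
    --oauth-service-account-email=<service-account>@<prj>.iam.gserviceaccount.com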

Wrap Up

In the meantime, if you want to check it out, here are some links:

Feel free to reach out to me on Twitter @martonkodok or read my previous posts on medium/@martonkodok
