How to create and track BigQuery data alerts

Alex Feldman
4 min readJan 15, 2023

--

Alert
Photo by Sigmund on Unsplash

For tracking specific data events and errors in BigQuery tables, we can use simple and efficient native GCP tools.

This article considers how to create the BigQuery alert using the debugging statement, deploy it by the scheduled queries service, set email notifications, and collect messages in the BigQuery table using Pub/Sub subscription.

Alert creation

We can use the ASSERT debugging statement to create the alert — a query that checks a table by a specific condition. For example, let's write a query that checks the table for duplicate IDs.

To illustrate this, let's create a table of days of the week with two Saturdays and their IDs.

CREATE TABLE `my_dataset.dow_table` AS
SELECT 1 as id, 'Monday' as dow
UNION ALL SELECT 2, 'Tuesday'
UNION ALL SELECT 3, 'Wednesday'
UNION ALL SELECT 4, 'Thursday'
UNION ALL SELECT 5, 'Friday'
UNION ALL SELECT 6, 'Saturday'
UNION ALL SELECT 7, 'Sunday'
-- and extra Saturday
UNION ALL SELECT 6, 'Saturday'

The ASSERT statement that checks duplicate IDs looks like this:

ASSERT NOT EXISTS (
SELECT *
FROM `my_dataset.dow_table`
QUALIFY ROW_NUMBER() OVER (PARTITION BY id) > 1
) AS "The DOW table has the duplicate IDs"

If a table contains duplicate IDs, the query returns the text we defined in the description part. If it is not, the query returns nothing. We use the NOT EXISTS argument because if the query's result does not exist, it's ok for us, but if not, something is wrong.

The result of the query contains the message:

The alert is ready.

Deploying and notifications

Let's deploy the alert by scheduled queries service for daily execution and send a notification about the message. Menu SCHEDULE — Create a new scheduled query.

Scheduled query creation

Setting the email notification checkbox lets us receive the error message to the query owner's email. Another option is to put a Pub/Sub subscription name (we should create it before) that will collect data from all our alerts and store it in the BigQuery table for the following analysis.

Scheduled query creation

Although the email notification option doesn't allow us to set up notifications, not only for the query owner, we can do it on the email box side, setting email forwarding. This way, we can send messages to colleagues.

Pub/Sub subscription creation

Let's create the Pub/Sub subscription for collecting all messages in one place. The first step is creating a table to store our messages. The table should have a specific schema.

CREATE TABLE `my_dataset.assert_messages` (
subscription_name STRING,
message_id STRING,
publish_time TIMESTAMP,
data STRING,
attributes JSON
)

Then let's go to the Pub/Sub service page:

Pub/Sub launch

And create a new topic — bigquery_asserts, by clicking on the CREATE TOPIC button.

Topic creation

We should define the subscription name. No need to select any checkbox. After creating the new subscription, click on it, and choose the CREATE SUBSCRIPTION button on the opened window.

The topic screen

Next, we should create a topic subscription with the Write to BigQuery delivery type and specify the assert_messages table as the destination.

Topic subscription creation

If you see such a warning message, you must grant permission for the service account. DevOps guys can help with it.

Warning message

After the alert runs, we can see the result in the assert_messages table:

The assert_messages table

The data field contains a JSON string with message details.

So we created the alert that every morning checks duplicate IDs in the dow_table every morning sends notifications by email, and saves messages to the BigQuery table.

We can use alerts to control huge data parameters, such as unique values, check that data for specific periods were uploaded in BigQuery, monitor particular business metrics, detect unusual user behavior, etc.

--

--