Use Log Analytics for BigQuery Usage Analysis on Google Cloud

Xiang Shen
Google Cloud - Community
Apr 15, 2024

On Google Cloud you can use Log Analytics to query and analyze your log data, and then you can view or chart the query results.

BigQuery is Google Cloud’s fully managed enterprise data warehouse that helps you manage and analyze your data with built-in features like machine learning, geospatial analysis, and business intelligence.

While BigQuery offers built-in observability capabilities like the INFORMATION_SCHEMA views, detailed logging remains crucial for in-depth usage analysis, auditing, and troubleshooting potential issues.

This article walks you through how to analyze BigQuery logs using Log Analytics.

Upgrade Log Buckets

First, if you haven't already, you need to upgrade your existing log buckets in Cloud Logging so that Log Analytics is enabled on them.

To upgrade an existing bucket to use Log Analytics, do the following:

  1. In the navigation panel of the Google Cloud console, select Logging, and then select Logs Storage.
  2. Locate the bucket that you want to upgrade.
  3. When the Log Analytics available column displays Upgrade, you can upgrade the log bucket to use Log Analytics. Click Upgrade.
    A dialog opens. Click Confirm.
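
If you prefer the command line, you can also upgrade a bucket with gcloud. A minimal sketch, assuming the _Default bucket in the global location; adjust the bucket ID and location to match your setup:

# Upgrade an existing log bucket to use Log Analytics
gcloud logging buckets update _Default \
--location=global \
--enable-analytics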

Perform BigQuery Activities

Complete the following tasks to generate some BigQuery logs. These tasks use the BigQuery command-line tool, bq.

Task 1. Create datasets

Use the bq mk command to create new datasets named bq_logs and bq_logs_test in your project:

bq mk bq_logs
bq mk bq_logs_test

Task 2. List the datasets

Use the bq ls command to list the datasets:

bq ls

Task 3. Delete a dataset

Use the bq rm command to delete a dataset (enter y when prompted):

bq rm bq_logs_test

Task 4. Create a new table

Use the bq mk command to create a table named test_table in the bq_logs dataset:

bq mk \
--table \
--expiration 3600 \
--description "This is a test table" \
bq_logs.test_table \
id:STRING,name:STRING,address:STRING

You should now have a new, empty table named test_table in the bq_logs dataset.
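
You can confirm the table and its schema with the bq show command:

bq show bq_logs.test_table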

Task 5. Run some example queries

You can run a simple query like the following to generate a log entry:

bq query --use_legacy_sql=false 'SELECT current_date'

The following query uses weather data from the National Oceanic and Atmospheric Administration (NOAA). Run the command:

bq query --use_legacy_sql=false \
'SELECT
gsod2021.date,
stations.usaf,
stations.wban,
stations.name,
stations.country,
stations.state,
stations.lat,
stations.lon,
stations.elev,
gsod2021.temp,
gsod2021.max,
gsod2021.min,
gsod2021.mxpsd,
gsod2021.gust,
gsod2021.fog,
gsod2021.hail
FROM
`bigquery-public-data.noaa_gsod.gsod2021` gsod2021
INNER JOIN
`bigquery-public-data.noaa_gsod.stations` stations
ON
gsod2021.stn = stations.usaf
AND gsod2021.wban = stations.wban
WHERE
stations.country = "US"
AND gsod2021.date = "2021-12-15"
AND stations.state IS NOT NULL
AND gsod2021.max != 9999.9
ORDER BY
gsod2021.min;'

Perform Log Analysis

Now that there are some log entries for BigQuery, you can run queries against them using Log Analytics.

Task 1. Open Log Analytics

In the navigation panel of the Google Cloud console, under Logging, click Log Analytics to open the feature.

If the query field is empty or you forget which table you want to use, you can click the Query button to restore the sample query.

Now you can run your own queries in the query field. Remember to replace [Your Project Id] with the project ID you are using.
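
For example, here is a minimal sanity-check query that lists the most recent log entries in the _Required bucket (a sketch; replace the placeholder as described above):

SELECT
-- newest entries first, to confirm the view is reachable and populated
timestamp,
log_id,
severity
FROM
`[Your Project Id].global._Required._AllLogs`
ORDER BY
timestamp DESC
LIMIT
10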

Task 2. Find activities for BigQuery datasets

You can query for activities such as dataset creation and deletion:

SELECT
timestamp,
severity,
resource.type,
proto_payload.audit_log.authentication_info.principal_email,
proto_payload.audit_log.method_name,
proto_payload.audit_log.resource_name,
FROM
`[Your Project Id].global._Required._AllLogs`
WHERE
log_id = 'cloudaudit.googleapis.com/activity'
AND proto_payload.audit_log.method_name LIKE 'datasetservice%'
LIMIT
100

After running the query, you should see output listing the dataset activities.
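
To see which principals performed which dataset operations, you can aggregate the same log entries. A sketch along the same lines:

SELECT
-- count dataset operations per user and method
proto_payload.audit_log.authentication_info.principal_email AS principal_email,
proto_payload.audit_log.method_name AS method_name,
COUNT(*) AS operations
FROM
`[Your Project Id].global._Required._AllLogs`
WHERE
log_id = 'cloudaudit.googleapis.com/activity'
AND proto_payload.audit_log.method_name LIKE 'datasetservice%'
GROUP BY
principal_email, method_name
ORDER BY
operations DESC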

Task 3. Find activities for BigQuery tables

You can query for activities such as table creation and deletion:

SELECT
timestamp,
severity,
resource.type,
proto_payload.audit_log.authentication_info.principal_email,
proto_payload.audit_log.method_name,
proto_payload.audit_log.resource_name,
FROM
`[Your Project Id].global._Required._AllLogs`
WHERE
log_id = 'cloudaudit.googleapis.com/activity'
AND proto_payload.audit_log.method_name LIKE '%TableService%'
LIMIT
100

After running the query, you should see output listing the table activities.
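
You can also narrow the filter to a single operation. For example, to list only table deletions, assuming the method name contains DeleteTable as in the BigQueryAuditMetadata format:

SELECT
timestamp,
proto_payload.audit_log.authentication_info.principal_email,
proto_payload.audit_log.resource_name
FROM
`[Your Project Id].global._Required._AllLogs`
WHERE
log_id = 'cloudaudit.googleapis.com/activity'
-- assumes DeleteTable appears in the audit method name
AND proto_payload.audit_log.method_name LIKE '%DeleteTable%'
ORDER BY
timestamp DESC
LIMIT
100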

Task 4. View the queries completed in BigQuery

Run the following query:

SELECT
timestamp,
resource.labels.project_id,
proto_payload.audit_log.authentication_info.principal_email,
JSON_VALUE(proto_payload.audit_log.service_data.jobCompletedEvent.job.jobConfiguration.query.query) AS query,
JSON_VALUE(proto_payload.audit_log.service_data.jobCompletedEvent.job.jobConfiguration.query.statementType) AS statementType,
JSON_VALUE(proto_payload.audit_log.service_data.jobCompletedEvent.job.jobStatus.error.message) AS message,
JSON_VALUE(proto_payload.audit_log.service_data.jobCompletedEvent.job.jobStatistics.startTime) AS startTime,
JSON_VALUE(proto_payload.audit_log.service_data.jobCompletedEvent.job.jobStatistics.endTime) AS endTime,
CAST(TIMESTAMP_DIFF( CAST(JSON_VALUE(proto_payload.audit_log.service_data.jobCompletedEvent.job.jobStatistics.endTime) AS TIMESTAMP), CAST(JSON_VALUE(proto_payload.audit_log.service_data.jobCompletedEvent.job.jobStatistics.startTime) AS TIMESTAMP), MILLISECOND)/1000 AS INT64) AS run_seconds,
CAST(JSON_VALUE(proto_payload.audit_log.service_data.jobCompletedEvent.job.jobStatistics.totalProcessedBytes) AS INT64) AS totalProcessedBytes,
CAST(JSON_VALUE(proto_payload.audit_log.service_data.jobCompletedEvent.job.jobStatistics.totalSlotMs) AS INT64) AS totalSlotMs,
JSON_VALUE(proto_payload.audit_log.service_data.jobCompletedEvent.job.jobStatistics.referencedTables) AS tables_ref,
CAST(JSON_VALUE(proto_payload.audit_log.service_data.jobCompletedEvent.job.jobStatistics.totalTablesProcessed) AS INT64) AS totalTablesProcessed,
CAST(JSON_VALUE(proto_payload.audit_log.service_data.jobCompletedEvent.job.jobStatistics.queryOutputRowCount) AS INT64) AS queryOutputRowCount,
severity
FROM
`[Your Project Id].global._Default._Default`
WHERE
log_id = "cloudaudit.googleapis.com/data_access"
AND proto_payload.audit_log.service_data.jobCompletedEvent IS NOT NULL
ORDER BY
startTime

After the query completes, you should see the completed BigQuery queries in the results.

Scroll through the results of the executed queries.
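
Building on the same jobCompletedEvent entries, you can aggregate usage per user, for example to find who processed the most bytes. A sketch:

SELECT
proto_payload.audit_log.authentication_info.principal_email AS principal_email,
COUNT(*) AS query_count,
-- total bytes and slot-milliseconds consumed by each user's completed jobs
SUM(CAST(JSON_VALUE(proto_payload.audit_log.service_data.jobCompletedEvent.job.jobStatistics.totalProcessedBytes) AS INT64)) AS total_bytes_processed,
SUM(CAST(JSON_VALUE(proto_payload.audit_log.service_data.jobCompletedEvent.job.jobStatistics.totalSlotMs) AS INT64)) AS total_slot_ms
FROM
`[Your Project Id].global._Default._Default`
WHERE
log_id = "cloudaudit.googleapis.com/data_access"
AND proto_payload.audit_log.service_data.jobCompletedEvent IS NOT NULL
GROUP BY
principal_email
ORDER BY
total_bytes_processed DESC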

Task 5. Chart the query results

Instead of viewing results in a table, you can use Log Analytics to create charts for visualization. For example, to view a pie chart of the queries that have run, click the Chart button in the results view, select Pie chart as the chart type, and select query as the column. You should see a pie chart of the executed queries.
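
Charts work best on aggregated results. For example, the following sketch counts completed BigQuery jobs per hour, which suits a bar or line chart:

SELECT
-- bucket completed jobs into one-hour intervals
TIMESTAMP_TRUNC(timestamp, HOUR) AS hour,
COUNT(*) AS completed_queries
FROM
`[Your Project Id].global._Default._Default`
WHERE
log_id = "cloudaudit.googleapis.com/data_access"
AND proto_payload.audit_log.service_data.jobCompletedEvent IS NOT NULL
GROUP BY
hour
ORDER BY
hour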

We’ve only scratched the surface of BigQuery log analysis; you can explore many other queries and charts to enhance your understanding of BigQuery. Feel free to contribute and create samples in GCP’s sample GitHub repository.
