How to Retrieve BigQuery Job Details and Interpret Execution Metrics

Khushmeet Rekhi
Google Cloud - Community
7 min read · Jan 29, 2023

BigQuery is a fully managed enterprise data warehouse that helps you manage and analyze your data. BigQuery’s serverless architecture lets you use SQL queries to answer your organization’s biggest questions with zero infrastructure management. BigQuery is optimized to run analytic queries on large datasets, including terabytes of data in seconds and petabytes in minutes. Understanding its capabilities and how it processes queries can help you maximize your data analysis investments.

The primary unit of analysis in BigQuery is the SQL query. After a query completes, its job history can be obtained in any of the following ways:

  • Using the Google Cloud Console
  • Using the CLI command
  • Using the API method
  • Querying the INFORMATION_SCHEMA.JOBS* views

Note: BigQuery saves a six-month job history for all the jobs of a project.

To view job details, you need the bigquery.jobs.get IAM permission. The following predefined roles include the permissions that you need in order to view job details:

  • roles/bigquery.admin (lets you view details of all the jobs in the project)
  • roles/bigquery.user (lets you view details of your jobs)
  • roles/bigquery.jobUser (lets you view details of your jobs)
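
If you are missing these permissions, a project administrator can grant one of the roles above. A minimal sketch with gcloud, where the project ID and principal are placeholders:

# Grant the BigQuery User role on a placeholder project to a placeholder principal
gcloud projects add-iam-policy-binding my-project \
    --member="user:analyst@example.com" \
    --role="roles/bigquery.user"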

Method 1: Using the BigQuery Console

The execution plan of a query submitted via the SQL workspace in the BigQuery console can be viewed by clicking “Execution Details” in the Query results pane.

A holistic view of the jobs you have submitted over time, or of the jobs executed in a particular project by a service account or a different user, can be obtained by navigating to the PERSONAL HISTORY and PROJECT HISTORY tabs at the bottom of the BigQuery console, as indicated below.

To locate a job executed at a certain point in time by a service account, for example when jobs are orchestrated through Cloud Composer or a third-party tool, filter by job_id under PROJECT HISTORY.

Now, to retrieve the query plan of the job, click the three dots under Actions and select “Open query in editor”.

Once the query is open in the editor, navigate to “Execution Details” to view the job execution details.

Note: The job history includes all load, export, copy, and query jobs that you submitted in the past six months (up to 1,000 entries). The limit of 1,000 jobs is cumulative across all job types.

Method 2: Using CLI Commands

Open Cloud Shell and issue the bq show command with the --job=true flag and a job ID to obtain summary information about the job.

The job ID can be a fully qualified ID or the short form. For instance, job IDs listed in the console are fully qualified, that is, they include the project and location.

To see full job details, run the command with the --format=prettyjson flag.
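
For instance, assuming a placeholder project and job ID in the US location, the two commands might look like this:

# Summary information about a job, using a fully qualified job ID
bq show --job=true myproject:US.bquxjob_123abc_456def

# Full job details, including statistics and the query plan, as JSON
bq show --format=prettyjson --job=true myproject:US.bquxjob_123abc_456def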

Similarly, to list all the jobs using the CLI, use the bq ls command. Running this command requires the bigquery.jobs.list permission.
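
A sketch of the listing commands; the --jobs, --all, and --max_results flags shown here are standard bq ls options:

# List recent jobs that you submitted in the current project
bq ls --jobs=true --max_results=10

# Include jobs submitted by all users in the project
bq ls --jobs=true --all=true --max_results=10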

Method 3: Using the API

To view job details, call jobs.get and provide the jobId and projectId parameters. Optionally, supply the location parameter and set it to the location where the job runs.
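
As a sketch, the equivalent REST call with curl looks roughly like the following, where the project ID, job ID, and location are placeholders and gcloud supplies the access token:

# Retrieve details of a single job via the jobs.get REST method
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://bigquery.googleapis.com/bigquery/v2/projects/my-project/jobs/bquxjob_123abc_456def?location=US"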

Response of the call:

To list jobs in a project, call jobs.list and provide the projectId parameter. To list jobs for all users, set the allUsers parameter to true; this requires the bigquery.jobs.listAll permission.
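
A similar sketch for jobs.list, again with a placeholder project ID:

# List jobs in a project via the jobs.list REST method, including jobs from all users
curl -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://bigquery.googleapis.com/bigquery/v2/projects/my-project/jobs?allUsers=true&maxResults=10"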

Response of the call:

Method 4: Using the INFORMATION_SCHEMA.JOBS* Views

The INFORMATION_SCHEMA.JOBS view contains real-time metadata for all BigQuery jobs in the current project. You need the BigQuery Resource Viewer (roles/bigquery.resourceViewer) IAM role to query this view. The view contains currently running jobs and the job history of the past 180 days.

The INFORMATION_SCHEMA.JOBS view has the following schema; the underlying data is partitioned by the creation_time column and clustered by project_id and user_email. Write custom SQL queries against this view to obtain the job execution metrics you need, as in the examples below.

For example, to list all the jobs with their start_time, end_time, and total_slot_ms:

SELECT
  job_id, query, start_time, end_time, total_slot_ms
FROM
  `region-us`.INFORMATION_SCHEMA.JOBS;
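
Because the underlying data is partitioned by creation_time, filtering on that column prunes the partitions that are scanned. A sketch that restricts the query to the last day of jobs and surfaces the heaviest ones first (adjust the interval to your needs):

SELECT
  job_id, user_email, start_time, end_time, total_slot_ms
FROM
  `region-us`.INFORMATION_SCHEMA.JOBS
WHERE
  creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
ORDER BY
  total_slot_ms DESC;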

Understanding the BigQuery Execution Plan

To understand the lifecycle of a query in BigQuery, we first need to learn the various phases of the BigQuery execution plan.

When BigQuery executes a query job, it converts the declarative SQL statements into a graph of execution, broken up into a series of query stages, which are composed of more granular execution steps.

In BigQuery, Dremel provides compute resources based on fair scheduling. When a query is submitted, clusters of slots/workers (a unit of CPU, RAM, and I/O) extract data from storage (leaf nodes) and perform computation (mixers). Mixers perform aggregation and return the query response. Hence, the execution plan is divided into stages corresponding to the leaf nodes (stage 0) and the mixer nodes (stages 1 to n).

Each stage has four main phases:

  1. Wait: The stage waits either for slots to become available or for the previous stage to complete
  2. Read: Slots read data either from Colossus or from the previous stage
  3. Compute: The phase where the actual processing happens
  4. Write: Data is written either to the next stage or returned to the user as the final result

Every stage has overall per-stage overview statistics, detailed step information and stage timing classifications.

Stage Overview Statistics:

Open the execution graph and click on an individual stage to view the execution statistics in the overview pane on the right:

Stage ID: The unique ID for the stage.

Status: Execution status of the stage. Possible states include PENDING, RUNNING, COMPLETE, FAILED, and CANCELLED.

Start time: Timestamp, in epoch milliseconds, that represents when the first worker within the stage began execution.

End time: Timestamp, in epoch milliseconds, that represents when the last worker completed execution.

Duration: Milliseconds elapsed since the start of query execution.

Slot ms: A cumulative representation of the slot milliseconds used by the query.

Records read: Input size of the stage as number of records, across all stage workers.

Records written: Output size of the stage as number of records, across all stage workers.
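
These per-stage statistics also appear under statistics.query.queryPlan in the JSON that bq show --format=prettyjson returns, so they can be inspected outside the console. A sketch using jq (assumed to be installed), with a placeholder job ID:

# Summarize each stage's status, slot time, and record counts from the query plan
bq show --format=prettyjson --job=true myproject:US.bquxjob_123abc_456def \
  | jq '.statistics.query.queryPlan[] | {name, status, slotMs, recordsRead, recordsWritten}'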

Step Details:

Steps represent the more granular operations that each worker within a stage executes, presented as an ordered list of operations.

Read: A read of one or more columns from an input table or intermediate shuffle.

Aggregate: An aggregation operation, such as GROUP BY or COUNT.

Write: A write of one or more columns to an output table or intermediate result. For HASH partitioned outputs from a stage, this also includes the columns used as the partition key.

Compute: Operations such as expression evaluation and SQL functions.

Filter: Operator implementing the WHERE, OMIT IF and HAVING clauses.

Sort: Sort or Order-By operation, includes the column keys and the sort direction.

Limit: Operator implementing the LIMIT clause.

Join: A JOIN operation, which includes the join type and the columns used.

Analytic_function: An invocation of an analytic function (also known as a “window function”).

User_defined_function: A call to a user-defined function.

Per-Stage Timing:

As each stage of execution represents work undertaken by one or more independent workers, information is provided in both average and worst-case times.

waitMs: Time the worker spent waiting to be scheduled.

readMs: Time the worker spent reading input data.

computeMs: Time the worker spent CPU bound.

writeMs: Time the worker spent writing output data.
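
In the query plan JSON, these timings are exposed per stage as average and maximum values (for example waitMsAvg and waitMsMax). A sketch that ranks stages by worst-case compute time to spot compute-bound stages, again with a placeholder job ID and jq assumed to be installed:

# Rank stages by their worst-case compute time
bq show --format=prettyjson --job=true myproject:US.bquxjob_123abc_456def \
  | jq '[.statistics.query.queryPlan[] | {name, waitMsMax, readMsMax, computeMsMax, writeMsMax}]
        | sort_by(.computeMsMax | tonumber) | reverse'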

Interpreting the key metrics from the execution plan helps identify the bottleneck that degrades query performance. The analysis has to be done at each stage level to determine whether the cause is underlying resource contention (too few slots) or whether the SQL needs to be optimized further to avoid anti-patterns.
