How to use Python Notebook in BigQuery?

Parth Chaudhary
Techsalo Infotech
Dec 6, 2023

In this article we will look in detail at the Python Notebook feature in BigQuery. When you enter the BigQuery console, BigQuery Studio gives you the option to write SQL queries in the query section; we'll walk through it step by step.

  1. Select a dataset, either one of your own or a public dataset. Once your data is added in the Explorer pane, you can see the Query option.
  2. Selecting Query opens a drop-down with three options: “In new tab”, “In split tab”, and the new addition, “In Python notebook”.

We are adding the public dataset covid19_open_data to try the Python Notebook in BigQuery.

When you open the Python notebook, it asks which region you want to use; you can choose the notebook's region (for example, I selected the us region). Note that you can't change the region for a code asset after it is created.

IAM Roles Required to run Python Notebook:

To edit and run notebooks you need to follow these IAM Roles:

  • BigQuery Job User ( roles/bigquery.jobUser ): provides permissions to run jobs, including queries, within the project.

  • BigQuery Read Session User ( roles/bigquery.readSessionUser ): provides the ability to create and use read sessions.

  • Notebook Runtime User ( roles/aiplatform.notebookRuntimeUser ): grants permissions to create runtime resources from a runtime template and to manage the runtimes the user created.

  • Code Editor ( roles/dataform.codeEditor ): provides edit access to code resources.
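As a sketch, the four roles above could be granted from Cloud Shell with gcloud. PROJECT_ID and USER_EMAIL below are placeholders, and the commands are echoed rather than executed so you can review them first:

```shell
# Grant the IAM roles needed to edit and run BigQuery notebooks.
# PROJECT_ID and USER_EMAIL are placeholders -- substitute your own values,
# and remove the leading "echo" to actually apply the bindings.
PROJECT_ID="my-project"
USER_EMAIL="analyst@example.com"

for ROLE in roles/bigquery.jobUser \
            roles/bigquery.readSessionUser \
            roles/aiplatform.notebookRuntimeUser \
            roles/dataform.codeEditor; do
  echo gcloud projects add-iam-policy-binding "$PROJECT_ID" \
      --member="user:$USER_EMAIL" --role="$ROLE"
done
```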

After the required IAM roles are granted, we can create a notebook. There are two options:

  1. Create a notebook from a table
  2. Create a notebook from an existing notebook

In the Explorer pane, expand the Notebooks folder and if necessary the Shared Notebooks folder. Select a notebook.

Select the Activity pane.

Click View actions (more_vert) next to a notebook version, and then click Open as new Python notebook.

A copy of the notebook is opened as a new notebook.

You can also upload a notebook by using Vertex AI in the Google Cloud console.

“In 3 steps we can query data stored in BQ from Jupyter notebook.”

Step 1:

Before we read or query data, we need to make sure to meet a few prerequisites.

  1. You should have an active Google Cloud account (the BigQuery sandbox works too). You can use an existing project or create a new one.
  2. You should have Jupyter Notebook installed on your computer.
  3. You should enable the BigQuery API, either from the Cloud console or from Cloud Shell.
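For the third prerequisite, the BigQuery API can be enabled from Cloud Shell with a single gcloud command (shown here with a leading echo as a dry run):

```shell
# Enable the BigQuery API for the current project.
# Remove the leading "echo" to actually execute the command.
API="bigquery.googleapis.com"
echo gcloud services enable "$API"
```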

Step 2

To use the API enabled in Step 1, you need to create a service account and get an authentication file for your Jupyter Notebook. To do so, navigate to the Credentials tab under the APIs & Services console, click Create Credentials, and then choose Service account from the drop-down.

Credentials console under APIs & Services

Type the name and description of the service account. Under “Grant this service account access”, you can assign the BigQuery Admin role on the project. Press Done. The new service account is now visible on the Credentials screen. To add an auth key, click the pencil icon next to the newly created service account and select Add Key. Choose JSON and press Create. A JSON file containing the auth key will be downloaded; note the download path, as you will need it for authentication.
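Once the key file is downloaded, one common way to wire it into a notebook session is to point the GOOGLE_APPLICATION_CREDENTIALS environment variable at it, which the Google client libraries read automatically. A minimal sketch (the key contents and file name below are fabricated placeholders, not a real key):

```python
import json
import os
import tempfile

# Fabricated stand-in for the downloaded key file; in practice you would
# use the real path where your browser saved the JSON key.
key_info = {
    "type": "service_account",
    "project_id": "my-project",
    "client_email": "bq-notebook@my-project.iam.gserviceaccount.com",
}

key_path = os.path.join(tempfile.gettempdir(), "my-project-key.json")
with open(key_path, "w") as fh:
    json.dump(key_info, fh)

# Point the Google client libraries at the key file; bigquery.Client()
# will pick this variable up without any extra arguments.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = key_path
```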

Step 3

You are now all set to use BigQuery from within Jupyter Notebook. Before you start querying data, install the following from the command prompt:

pip install google-cloud-bigquery[pandas]
pip install google-cloud-storage
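With the packages installed, the whole flow can be sketched as below. This assumes google-cloud-bigquery is installed and GOOGLE_APPLICATION_CREDENTIALS points at the key from Step 2; the import is done lazily so the sketch (and the query text) can be inspected without the library:

```python
# Query the covid19_open_data public dataset from a Jupyter notebook.
TABLE = "bigquery-public-data.covid19_open_data.covid19_open_data"
SQL = f"""
SELECT country_name, SUM(new_confirmed) AS total_confirmed
FROM `{TABLE}`
GROUP BY country_name
ORDER BY total_confirmed DESC
LIMIT 5
"""

def run_query(sql: str):
    # Imported lazily so this sketch is readable without the package.
    from google.cloud import bigquery
    client = bigquery.Client()  # credentials come from the environment
    return client.query(sql).to_dataframe()  # needs the [pandas] extra

# df = run_query(SQL)  # uncomment inside a configured notebook
```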

“Runtime is a compute resource that runs the code in your notebook.”

Connect to the Default Runtime

The default runtime is a pre-configured runtime that requires a minimal setup. It might take several minutes to connect to the default runtime if you don’t already have an active runtime.

Connect to a non-default runtime

If there is a need to use a runtime other than the default runtime, it is mandatory to create an additional runtime in Vertex AI beforehand.

You can also create a new runtime.

Take an example of a query :
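The original example query did not survive the export, so here is a hypothetical stand-in against the covid19_open_data dataset added earlier, written as a BigQuery Studio notebook cell (the client library is typically preinstalled in the Studio runtime):

```python
# Hypothetical example cell for a BigQuery Studio notebook: daily new
# confirmed COVID-19 cases in the US from the covid19_open_data dataset.
QUERY = """
SELECT date, SUM(new_confirmed) AS new_confirmed
FROM `bigquery-public-data.covid19_open_data.covid19_open_data`
WHERE country_code = 'US'
GROUP BY date
ORDER BY date
LIMIT 10
"""

def run(query: str):
    # google-cloud-bigquery is typically preinstalled in Studio runtimes.
    from google.cloud import bigquery
    return bigquery.Client().query(query).to_dataframe()

# df = run(QUERY)  # runs once the notebook is connected to a runtime
```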

Grant access to notebooks

To grant other users access to a notebook, add those users to an appropriate IAM role.

  • Code Owner: Can perform any action on the notebook, including deleting or sharing it.
  • Code Editor: Can edit the notebook.
  • Code Viewer: Can view the notebook.

Share Notebooks

To share a notebook with other users, you can generate a link to the notebook and distribute it. For other users to see the notebook you share, you must first grant them access to it.

In conclusion, this article has covered the Python Notebook feature, including how to connect Jupyter Notebook to BigQuery. Python notebooks make work easier for BigQuery users and for anyone who wants to connect an external notebook to BigQuery. Additionally, you can grant access to notebooks and share them with other users.

Note: We at Techsalo Infotech are a team of engineers solving complex Data engineering and Machine learning problems. Please reach out to us at sales@techsalo.com for any query on How to build these systems at scale and in the cloud.
