HOW TO LOAD DATA INTO GOOGLE BIG QUERY FROM PYTHON PANDAS

Hemant Anand Khandare
3 min read · Mar 9, 2022


We will explore Python pandas and interact with Google BigQuery to ingest and query data.

As usual, to interact with a GCP service we first need to set up authentication using a service account JSON key.

For more details on how to create a service account and download its JSON key, please refer to the link below.

1. Open your Python IDE; I prefer the Jupyter Notebook.

2. Install the pandas-gbq library.
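If the library is not already present, it can be installed with pip (pandas-gbq is the actual PyPI package name):

```shell
pip install pandas-gbq
```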

3. Set up authentication with GCP using the service account.
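One way to do this is to attach the service-account credentials to the pandas-gbq context. A minimal sketch, assuming your downloaded key is saved as "key.json" (a placeholder path):

```python
def set_gbq_credentials(key_path):
    """Attach a service-account key to the pandas-gbq context so that
    subsequent read_gbq/to_gbq calls authenticate with it."""
    # Imports kept local so the sketch can be defined before the key exists.
    from google.oauth2 import service_account
    import pandas_gbq

    credentials = service_account.Credentials.from_service_account_file(key_path)
    pandas_gbq.context.credentials = credentials

# Usage ("key.json" is a placeholder for your own key file):
# set_gbq_credentials("key.json")
```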

4. Import the gbq library.

import pandas_gbq

5. Set the project id.

project_id = "rising-cable-340114"

6. Store the SQL statement inside the variable.

sql = """
SELECT * FROM `rising-cable-340114.mydataset1.emp1`
"""

7. Execute the query.

df = pandas_gbq.read_gbq(sql, project_id=project_id)

8. Print the DataFrame to see the output.
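With a small stand-in DataFrame (the real df comes from read_gbq above; these column names are made up for illustration), printing looks like:

```python
import pandas as pd

# Stand-in for the result of read_gbq.
df = pd.DataFrame({"emp_id": [1, 2], "emp_name": ["Alice", "Bob"]})

print(df)         # the full table
print(df.head())  # or just the first few rows
```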

9. Load the DataFrame into a BigQuery table.
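A sketch of the load call, assuming a hypothetical destination table mydataset1.emp2 (wrapped in a function here so the sketch stands alone; in a notebook you would call pandas_gbq.to_gbq directly):

```python
def load_to_bigquery(df, project_id):
    """Write a DataFrame to a BigQuery table via pandas-gbq."""
    import pandas_gbq

    # "mydataset1.emp2" is a hypothetical dataset.table name;
    # replace it with your own.
    pandas_gbq.to_gbq(df, "mydataset1.emp2",
                      project_id=project_id,
                      if_exists="replace")

# Usage:
# load_to_bigquery(df, project_id)
```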

Please note the if_exists="replace" parameter: it drops and re-creates the table every time the statement is executed. The other option, "append", adds the new rows to the existing table, while the default, "fail", raises an error if the table already exists.

10. Check the table in BigQuery.

Additional:

You can also read a CSV file with pandas, convert it to a DataFrame, and load that DataFrame into BigQuery.

e.g.: read a CSV file and convert it to a DataFrame.
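A self-contained sketch of that flow; the sample data, the file name employees.csv, and the table name mydataset1.emp_csv are all hypothetical:

```python
import pandas as pd

# Write a tiny sample CSV so the example stands alone.
pd.DataFrame({"emp_id": [1, 2], "emp_name": ["Alice", "Bob"]}).to_csv(
    "employees.csv", index=False)

# Read the CSV back into a DataFrame.
df = pd.read_csv("employees.csv")

def dump_csv_to_bigquery(frame, project_id):
    """Append the CSV-derived DataFrame to a hypothetical BigQuery table."""
    import pandas_gbq
    pandas_gbq.to_gbq(frame, "mydataset1.emp_csv",
                      project_id=project_id,
                      if_exists="append")
```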

Github link for code reference:

Thanks for your time!!!

Enjoy your day!!!
