Azure Databricks: Query Google BigQuery

Balamurugan Balakreshnan · Published in Analytics Vidhya · Mar 23, 2021 · 2 min read

Use Azure Databricks to access Google BigQuery.

Use Case

  • Access BigQuery data from Azure Databricks for analytics
  • Used for data engineering
  • Used for machine learning

Prerequisites

  • Azure account
  • Azure Storage account
  • Azure Databricks workspace
  • GCP account
  • GCP project
  • GCP BigQuery
  • Create a sample dataset
  • Provide permission to access and query it
  • Create the permission JSON file (service account key) and download it

Steps

  • First, create a storage account
  • Create a container called gcp
  • Use Storage Explorer to create a conf folder
  • Upload the permission JSON file for GCP access
  • Save the file as service-access.json
  • Now let's go to Databricks to start coding
  • Configure the cluster with the GCP credentials (a sketch follows this list)
  • Create a notebook
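The original post doesn't show the cluster configuration itself. One common approach, an assumption on my part rather than something shown above, is to point the standard Google Application Default Credentials variable at the key file via the cluster's environment variables (recent Databricks runtimes bundle the BigQuery connector; on older ones it must be attached as a library):

# Cluster > Advanced options > Environment variables (assumed setup, not shown in the post)
GOOGLE_APPLICATION_CREDENTIALS=/dbfs/mnt/gcp/service-access.json

With the cluster running, the first notebook cell fetches the storage account key from the secret scope: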
// Fetch the storage account key from the Databricks secret scope
val accbbstorekey = dbutils.secrets.get(scope = "allsecrects", key = "accbbstore")

// Make the key available to Spark for wasbs:// access
spark.conf.set(
  "fs.azure.account.key.storagename.blob.core.windows.net",
  accbbstorekey)
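As a quick sanity check, an addition of mine rather than a step in the post, the same key lets Spark read the uploaded file straight over wasbs:// before any mount exists; the container and storage account names are the post's placeholders:

// Hypothetical check: confirm the key file is reachable over wasbs://
val keyfile = spark.read.text("wasbs://containername@storagename.blob.core.windows.net/conf/service-access.json")
println(s"service-access.json has ${keyfile.count()} lines")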

Mount the container to DBFS

dbutils.fs.mount(
  source = "wasbs://containername@storagename.blob.core.windows.net/conf",
  mountPoint = "/mnt/gcp",
  extraConfigs = Map(
    // the account-key config must name the storage account, matching the source URI
    "fs.azure.account.key.storagename.blob.core.windows.net" ->
      dbutils.secrets.get(scope = "allsecrects", key = "accbbstore")))
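If the cell is re-run, dbutils.fs.mount fails because /mnt/gcp already exists; running a small guard first (my addition, not in the post) keeps the notebook re-runnable:

// Unmount first if the mount point is already taken, so re-runs don't fail
if (dbutils.fs.mounts().exists(_.mountPoint == "/mnt/gcp"))
  dbutils.fs.unmount("/mnt/gcp")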

List the mount to verify the key file is there

dbutils.fs.ls("dbfs:/mnt/gcp")

Print the environment variables and check that the authorization JSON is configured

%sh printenv
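The thing to look for, assuming the connector picks up credentials through standard Google Application Default Credentials, is GOOGLE_APPLICATION_CREDENTIALS pointing at the key file, so a filtered check is quicker:

%sh printenv | grep GOOGLE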

Configure the BigQuery table name and load it

// Fully qualified BigQuery table name: project.dataset.table
val table = "projectname.dataset.tablename"

// load data from BigQuery
val df = spark.read.format("bigquery").option("table", table).load()
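If you'd rather not depend on a cluster-level environment variable, the spark-bigquery connector also accepts credentials per read via its credentialsFile and parentProject options; the path and project name below are this post's placeholders:

// Pass the service-account key explicitly instead of relying on GOOGLE_APPLICATION_CREDENTIALS
val df2 = spark.read.format("bigquery")
  .option("table", table)
  .option("credentialsFile", "/dbfs/mnt/gcp/service-access.json")
  .option("parentProject", "projectname") // the GCP project billed for the query
  .load()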

Display the data

display(df)
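From here the DataFrame behaves like any other Spark source. For the data engineering and machine learning use cases above, registering a temp view makes it queryable with Spark SQL (the view name is mine, chosen for illustration):

// Register the BigQuery data for Spark SQL
df.createOrReplaceTempView("bqdata")
display(spark.sql("SELECT COUNT(*) AS cnt FROM bqdata"))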

Finally, unmount the DBFS mount

dbutils.fs.unmount("/mnt/gcp")

Samples2021/gcpbigqueryadb.md at main · balakreshnan/Samples2021 (github.com)
