Query Google BigQuery from Azure Databricks
2 min read · Mar 23, 2021
Using Azure Databricks to access Google BigQuery
Use Cases
- Access BigQuery data from Azure Databricks for analytics
- Use it for data engineering
- Use it for machine learning
Prerequisites
- Azure account
- Azure Storage account
- Azure Databricks workspace
- GCP account
- GCP project
- GCP BigQuery
- Create a sample dataset
- Grant a service account permission to access and query the dataset (for example, the BigQuery Data Viewer and BigQuery Job User roles)
- Create the service account's JSON key file and download it
Steps
- First, create a storage account
- Create a container called gcp
- Use Storage Explorer to create a conf folder
- Upload the permission JSON file for GCP access
- Save the file as service-access.json
- Now let's go to Databricks to start coding
- Configure the cluster (typically: install the spark-bigquery connector library and set GOOGLE_APPLICATION_CREDENTIALS to the key file path in the cluster's environment variables)
- Create a notebook
First, read the storage account key from the secret scope and configure Spark access to the storage account:
// Read the storage account key from the allsecrects secret scope
// (storagename is a placeholder for your storage account name).
val accbbstorekey = dbutils.secrets.get(scope = "allsecrects", key = "accbbstore")

spark.conf.set(
  "fs.azure.account.key.storagename.blob.core.windows.net",
  accbbstorekey)
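Note that dbutils.fs.mount fails if the path is already mounted, so for a rerunnable notebook you can unmount first; an optional guard, as a sketch:

// Unmount /mnt/gcp if a previous run left it mounted,
// so the mount call below can be rerun safely.
if (dbutils.fs.mounts().exists(_.mountPoint == "/mnt/gcp"))
  dbutils.fs.unmount("/mnt/gcp")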
Mount the container's conf folder to DBFS:
dbutils.fs.mount(
  source = "wasbs://containername@storagename.blob.core.windows.net/conf",
  mountPoint = "/mnt/gcp",
  extraConfigs = Map(
    // The account key setting must reference the same storage account as `source`.
    "fs.azure.account.key.storagename.blob.core.windows.net" ->
      dbutils.secrets.get(scope = "allsecrects", key = "accbbstore")))
List the mount point and verify the JSON file is there:
dbutils.fs.ls("dbfs:/mnt/gcp")
Print the environment variables and check that the authorization JSON is configured (the connector looks for GOOGLE_APPLICATION_CREDENTIALS):
%sh printenv
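Rather than scanning the whole printenv output, you can check just that variable and the key file; a minimal sketch, assuming the mount path from above and that GOOGLE_APPLICATION_CREDENTIALS was set in the cluster configuration:

// Print only the credentials variable and confirm the key file is
// reachable through the local /dbfs FUSE path (assumed paths).
println(sys.env.getOrElse("GOOGLE_APPLICATION_CREDENTIALS", "<not set>"))
println(new java.io.File("/dbfs/mnt/gcp/service-access.json").exists())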
Configure the BigQuery table name, fully qualified as project.dataset.table:
val table = "projectname.dataset.tablename"
// load data from BigQuery
val df = spark.read.format("bigquery").option("table", table).load()
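Alternatively, the spark-bigquery connector can take the service account key directly as a read option instead of relying on the environment variable; a sketch, where "projectname" is a placeholder for the GCP project to bill:

// Pass credentials explicitly via connector options (placeholder names).
val df2 = spark.read.format("bigquery")
  .option("credentialsFile", "/dbfs/mnt/gcp/service-access.json")
  .option("parentProject", "projectname")
  .option("table", table)
  .load()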
Display the data
display(df)
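The connector pushes column pruning and filters down to BigQuery, so aggregations only read the columns and rows they need. A quick analytics sketch, using hypothetical column names category and amount:

import org.apache.spark.sql.functions._

// Hypothetical columns; replace with real fields from your table.
val summary = df
  .select("category", "amount")
  .filter(col("amount") > 0)
  .groupBy("category")
  .agg(sum("amount").as("total_amount"))
display(summary)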
Finally, unmount the DBFS mount:
dbutils.fs.unmount("/mnt/gcp")
Samples2021/gcpbigqueryadb.md at main · balakreshnan/Samples2021 (github.com)