How to Call a Web API from an Azure Databricks Notebook and Write the Data to a Delta Lake Table

Mayur Panchal
2 min read · Dec 26, 2019


BACKGROUND

My Databricks notebook does three things:

· Reads data from a web API

· Wrangles it using the Apache Spark Python API

· Writes the final form of the data back to a Delta Lake table

THE APPROACH THAT WORKED

The approach that worked is to fetch the data from the API's URL and write it directly to a Delta Lake table.

Below is the code for writing API data directly to a Delta Lake table from an Azure Databricks notebook.

Code

Step 1: Set the Spark configuration options that enable Delta Lake.

spark.sql("set spark.databricks.delta.preview.enabled=true")
spark.sql("set spark.databricks.delta.retentionDurationCheck.enabled=false")
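The same session settings can also be applied through `spark.conf.set` instead of SQL `set` statements. The sketch below is only an illustration: the helper name `apply_delta_settings` and the `DELTA_SETTINGS` dictionary are hypothetical, and the code assumes the `spark` session object that a Databricks notebook provides.

```python
# Hypothetical helper: applies the Step 1 settings through spark.conf
# instead of spark.sql("set ..."). Assumes `spark` is the SparkSession
# that a Databricks notebook provides.
DELTA_SETTINGS = {
    "spark.databricks.delta.preview.enabled": "true",
    "spark.databricks.delta.retentionDurationCheck.enabled": "false",
}


def apply_delta_settings(spark, settings=DELTA_SETTINGS):
    """Set each option on the session and return the values read back."""
    for key, value in settings.items():
        spark.conf.set(key, value)
    # Read the values back so a mistyped key is easier to spot
    return {key: spark.conf.get(key) for key in settings}
```

In a notebook cell this would be called as `apply_delta_settings(spark)`.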

Step 2: Import the required libraries.

import json
import requests
from requests.auth import HTTPDigestAuth
import pandas as pd

Step 3: Assign the API URL to a variable and send the request.

# A free, open API for testing purposes; anyone can use it.
url = "https://jsonplaceholder.typicode.com/todos"
myResponse = requests.get(url)
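A request without a timeout can hang the notebook if the API stops responding. As a minimal sketch, the request can be wrapped in a small helper that sets a timeout, raises on HTTP errors, and checks that the body is a JSON array. The names `fetch_records` and `http_get` are hypothetical; the getter is passed in as an argument so the helper can be tried without network access.

```python
import json


def fetch_records(url, http_get, timeout=10):
    """Fetch `url` and return the parsed JSON body as a list of records.

    `http_get` is any callable with the same shape as requests.get,
    i.e. it accepts a URL and a `timeout` keyword and returns a response
    with `raise_for_status()` and `content`.
    """
    response = http_get(url, timeout=timeout)
    # Raises an exception for 4xx/5xx responses
    response.raise_for_status()
    records = json.loads(response.content)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    return records
```

In the notebook this would be called as `fetch_records(url, requests.get)`.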

Step 4: If the API call succeeds, perform the operations below. The JSON response is loaded with Python and the pandas library, then Spark creates a DataFrame from it and stores it in the Delta Lake table. Note that the notebook creates the table automatically.

if myResponse.ok:
    # Parse the JSON response body into a list of records
    jData = json.loads(myResponse.content)
    # Load the records into a pandas DataFrame
    data = pd.DataFrame(jData)
    # Drop any previous copy of the table
    spark.sql("DROP TABLE IF EXISTS TestTable")
    # Create a Spark DataFrame from the pandas DataFrame
    spark_df = spark.createDataFrame(data)
    spark_df.show()
    # Save the DataFrame as a Delta table
    spark_df.write.format("delta").mode("overwrite").saveAsTable("TestTable")
else:
    # Raise an exception describing the failed request
    myResponse.raise_for_status()
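The code above lets pandas and Spark infer the column types. One possible variation, sketched here, skips pandas and hands Spark an explicit schema string so the table's column types do not depend on inference. This is an alternative, not the approach used in this post. The helper name `write_todos` and the constant `TODO_SCHEMA` are hypothetical, and the column list assumes the four fields that the todos endpoint returns (userId, id, title, completed).

```python
# Assumed column types for the jsonplaceholder "todos" records
TODO_SCHEMA = "userId BIGINT, id BIGINT, title STRING, completed BOOLEAN"


def write_todos(spark, records, table_name="TestTable"):
    """Write a list of todo dicts to a Delta table with an explicit schema."""
    # Order the values of each record to match the schema columns
    rows = [
        (r["userId"], r["id"], r["title"], r["completed"])
        for r in records
    ]
    spark_df = spark.createDataFrame(rows, schema=TODO_SCHEMA)
    # Overwrite the table contents on every run
    spark_df.write.format("delta").mode("overwrite").saveAsTable(table_name)
    return spark_df
```

In the notebook this would be called as `write_todos(spark, jData)` inside the success branch.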

Output:

[Screenshot of the spark_df.show() output]

Mayur Panchal

Azure Developer, Microservices, MVC, .NET Core, Web API. 1.3+ years of experience as a software developer.