Streamlining Data Flow: Save Web Activity to Blob Storage in Azure Data Factory

Mohsin Mukhtiar
3 min read · Feb 15, 2024

With Azure Data Factory (ADF) and Azure Synapse, pipelines often need to hand their outputs on to later steps. One effective approach is to store those outputs as blob files in an Azure Storage Account, where other pipelines and data flows can pick them up. This method is especially handy for data sourced from REST APIs, offering a lightweight alternative to database storage.

To begin, make sure you have a Storage Account set up. If not, refer to Microsoft’s guide for creating one. Placing the Storage Account in the same resource group as your Data Factory simplifies cost tracking, but adapt this to your needs.

Once your Storage Account is in place, create a Blob Container within it to store your blob file(s). Set the access level to private, and note down the container name for later use.

To grant Azure Data Factory write permission to your container, generate a shared access signature (SAS) token within your Storage Account settings. Grant only the permissions you need (writing a blob requires at least Create or Write) and set an appropriate expiry date. Copy the Blob SAS URL provided.
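Before pasting the SAS URL into a pipeline, it can help to sanity-check it. The sketch below uses only the Python standard library to read the `sp` (permissions) and `se` (expiry) query parameters from a SAS URL. The function name `check_sas_url` is hypothetical, and the code assumes the expiry uses the full `YYYY-MM-DDTHH:MM:SSZ` form; a token with a date-only expiry would need a different format string.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Placeholder SAS URL in the same shape the portal produces (the signature is fake).
EXAMPLE_SAS_URL = (
    "https://mystorageaccount.blob.core.windows.net/mycontainer"
    "?sp=racwdli&st=2021-11-28T23:08:09Z&se=2099-11-29T07:08:09Z"
    "&spr=https&sv=2020-08-04&sr=c&sig=xyz"
)


def check_sas_url(sas_url: str, now: Optional[datetime] = None) -> dict:
    """Report whether a SAS URL looks usable for writing a blob (hypothetical helper)."""
    query = parse_qs(urlsplit(sas_url).query)
    permissions = query.get("sp", [""])[0]
    expiry_text = query.get("se", [""])[0]
    # Assumption: the expiry is a UTC timestamp such as 2099-11-29T07:08:09Z.
    expiry = datetime.strptime(expiry_text, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    now = now or datetime.now(timezone.utc)
    return {
        # "c" (Create) or "w" (Write) is needed to upload a blob.
        "can_write": "c" in permissions or "w" in permissions,
        "expired": expiry <= now,
        "expiry": expiry,
    }
```

This only inspects the URL text. It cannot tell whether the signature itself is valid, which only the storage service can confirm.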

Now, let’s delve into saving data to a Blob. Suppose you have a Web Activity performing a REST API call, and you wish to save its output to a Blob.

Create a new Web activity, naming it appropriately (e.g., “Save Output to Blob”), and link it to your source activity. In its settings, set the URL to the previously copied Blob SAS URL, appending the desired blob file name after the container name.

For instance, if your URL is:

https://mystorageaccount.blob.core.windows.net/mycontainer?sp=racwdli&st=2021-11-28T23:08:09Z&se=2099-11-29T07:08:09Z&spr=https&sv=2020-08-04&sr=c&sig=xyz

Amend it to:

https://mystorageaccount.blob.core.windows.net/mycontainer/blobfilename.json?sp=racwdli&st=2021-11-28T23:08:09Z&se=2099-11-29T07:08:09Z&spr=https&sv=2020-08-04&sr=c&sig=xyz
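If you build these URLs often, the edit above can be scripted. Here is a minimal Python sketch, using only the standard library, that inserts a blob name between the container path and the SAS query string. The function name `blob_url_from_container_sas` is hypothetical, and it assumes the input is a container-level SAS URL like the one shown.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def blob_url_from_container_sas(container_sas_url: str, blob_name: str) -> str:
    """Append a blob name to a container SAS URL (hypothetical helper)."""
    parts = urlsplit(container_sas_url)
    # Percent-encode the blob name; "/" is kept so virtual folders still work.
    path = parts.path.rstrip("/") + "/" + quote(blob_name)
    # The query string (the SAS token) is passed through unchanged.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


container_url = (
    "https://mystorageaccount.blob.core.windows.net/mycontainer"
    "?sp=racwdli&st=2021-11-28T23:08:09Z&se=2099-11-29T07:08:09Z"
    "&spr=https&sv=2020-08-04&sr=c&sig=xyz"
)
blob_url = blob_url_from_container_sas(container_url, "blobfilename.json")
```

The SAS token is deliberately left untouched so the signature is passed along exactly as it was generated.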

Set the Method to PUT and add a header named x-ms-blob-type with the value BlockBlob. Set the Body dynamically with the expression @activity('REST API Call').output, replacing “REST API Call” with the name of your source activity. Note that the expression needs straight single quotes, not curly ones.
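For reference, here is a rough Python sketch that assembles what the resulting activity definition might look like in the pipeline's JSON code view. The field layout (`typeProperties`, `dependsOn`, and the `Expression` wrapper around the body) reflects my understanding of how ADF serializes a Web activity, so treat it as an assumption and compare it against your own pipeline's JSON. The function name `build_save_to_blob_activity` is hypothetical.

```python
import json


def build_save_to_blob_activity(blob_sas_url: str, source_activity: str) -> dict:
    """Build a Web activity definition that PUTs another activity's output to a blob.

    Assumes source_activity contains no single quotes, since the name is
    embedded directly in the ADF expression string.
    """
    return {
        "name": "Save Output to Blob",
        "type": "WebActivity",
        # Run only after the source activity has succeeded.
        "dependsOn": [
            {"activity": source_activity, "dependencyConditions": ["Succeeded"]}
        ],
        "typeProperties": {
            "url": blob_sas_url,
            "method": "PUT",
            # Header value from the steps above.
            "headers": {"x-ms-blob-type": "BlockBlob"},
            # Dynamic content: the full output of the source activity.
            "body": {
                "value": f"@activity('{source_activity}').output",
                "type": "Expression",
            },
        },
    }


activity = build_save_to_blob_activity(
    "https://mystorageaccount.blob.core.windows.net/mycontainer/blobfilename.json?sig=xyz",
    "REST API Call",
)
activity_json = json.dumps(activity, indent=2)
```

Printing `activity_json` gives a fragment you can compare against the code view of your own pipeline.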

By following these steps, you can store a Web activity’s output as a blob that other pipelines and data flows in your Azure environment can read later, without standing up a database.
