Using Snowflake on Azure for Querying Azure Event Hubs Capture Avro Files

Arsen Vladimirskiy
Dec 27, 2018 · 4 min read

In this video, we look at how to use Snowflake on Azure to query Avro files generated by Azure Event Hubs Capture feature. In our example, we’ll create Azure Blob Storage account and configure Azure Event Grid to send blob creation and deletion events to an Azure Event Hub and Azure Storage Queue simultaneously, and then use Snowflake on Azure to parse and query the Avro files generated by the capture.

Video Walkthrough

Tip: Play the video full screen.

Table of Contents

00:00 Beginning of video
01:58 Create resource group, storage account for files, and storage queue
03:15 View storage account in Microsoft Storage Explorer
04:15 Create Azure Event Hubs Namespace and Event Hub
05:50 Create Event Grid subscription blob2queue using Azure CLI
06:45 Create Event Grid subscription blob2eventhub Azure Portal
08:40 Uploading files to Azure Blob Storage using Azure CLI upload-batch
10:40 Viewing messages in the queue
12:10 Using Snowflake on Azure worksheet
13:28 Create Snowflake stage pointing to the Azure Blob Storage container
17:20 Querying number of records in Avro files
20:40 Decoding message body
23:00 Inserting decoded data into Snowflake table
24:50 Querying captured events using Snowflake JSON parsing capability
27:45 Comparing events captured in queue with ones captured in Event Hub
29:30 Grouping query to view events by minute
30:25 Deleting files and querying deletion events
34:20 Comparing total number of events in queue vs Event Hub after deleting all files

Create Storage Account For Files and Queue

# Create resource group
az group create -n avehc1 -l eastus2

Create Azure Event Hub

# Create event hub namespace
az eventhubs namespace create -g avehc1 -n avehc1ns -l eastus2 --sku Standard

Create Event Grid Subscription

az eventgrid event-subscription create --resource-id /subscriptions/SUBSCRIPTION_ID/resourceGroups/avehc1/providers/Microsoft.Storage/storageAccounts/avmyfiles1 --name blob2queue --endpoint-type storagequeue --endpoint /subscriptions/SUSBSCRIPTION_ID/resourceGroups/avehc1/providers/Microsoft.Storage/storageAccounts/avmyfiles1/queueservices/default/queues/queue1

Upload Batch of Files to Generate Events

az storage blob upload-batch --account-name avmyfiles1 --destination myfiles --source /mnt/c/Python36 --pattern "*.*"

Snowflake Queries

use database TEST_DB;

Thank you!

Please leave feedback and questions below or on Twitter https://twitter.com/ArsenVlad


Originally published at blogs.msdn.microsoft.com on December 27, 2018.

Arsen Vladimirskiy

Written by

Azure Cloud Architect & Software Engineer at Microsoft, Commercial Software Engineering (CSE) Team