Azure Synapse Analytics — Virtual Networks and Private Endpoints

Mariusz Kujawski
Apr 26, 2023

In this post, I’ll describe how to secure Azure Synapse using network isolation. There are a few ways to protect an Azure Synapse environment against unauthorized access: we can use Azure Active Directory to control who can access the Synapse workspace, and we can apply network isolation to block access from outside our virtual network.

Before we start, I need to explain what a virtual network is and introduce the services we will use to secure network access to the Synapse Workspace and to other resources such as a storage account and a key vault.

A virtual network is a logical representation of your own network in a cloud environment. It allows you to isolate and secure your cloud resources. Azure Virtual Network (VNet) lets many types of Azure resources, such as Azure Virtual Machines (VMs), Storage Accounts, and Azure SQL, communicate with each other inside the VNet without being exposed publicly. When you create a virtual network in Azure, you define its IP address space and divide it into subnets. You can also connect your virtual network to other Azure virtual networks, or to your on-premises network using Azure ExpressRoute or a point-to-site or site-to-site VPN connection.
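To make the idea of an address space and subnets concrete, here is a minimal sketch that uses Python’s standard ipaddress module. The 10.0.0.0/16 range and the subnet names are assumptions chosen for illustration; they are not values from this setup.

```python
import ipaddress

# Hypothetical VNet address space; any private (RFC 1918) range would do.
vnet = ipaddress.ip_network("10.0.0.0/16")

# Split the address space into /24 subnets and name the first two.
subnets = list(vnet.subnets(new_prefix=24))
plan = {
    "snet-vm": subnets[0],                 # subnet for the VM
    "snet-private-endpoints": subnets[1],  # subnet for private endpoints
}

for name, net in plan.items():
    # Azure reserves 5 addresses in every subnet (the first four and the last one).
    usable = net.num_addresses - 5
    print(f"{name}: {net} ({usable} usable addresses)")
```

Planning the ranges up front helps avoid overlaps later, for example when the VNet is peered with other networks or linked to an on-premises network.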

Using Azure Synapse, you can process and store data. Right now it offers Azure Integration, dedicated SQL pools, Spark pools, and a serverless SQL pool. These services have different capabilities. A dedicated pool is meant for building large data warehouses. A Spark pool processes data with Apache Spark, for example to ingest and transform data into your data lake. Azure Integration is the equivalent of Azure Data Factory, but integrated into the Synapse workspace; you can use it for orchestration, data ingestion, and data processing. The serverless pool queries files in your data lake using SQL.

Azure Storage Account is a serverless storage service in Azure that provides blob containers, Data Lake Storage Gen2, file storage, queue storage, table storage, and more. In our case, we will need a container in an account with ADLS Gen2.

Azure Key Vault is a service for storing and managing keys and secrets. It lets applications and services retrieve credentials securely without exposing sensitive information.

Azure Virtual Machine (VM) is a cloud-based virtualization service offered by Microsoft Azure that allows you to create and run virtual machines in the cloud. It enables you to create and deploy a range of computing solutions, including Windows and Linux virtual machines.

Azure Private Endpoint is a network interface that connects your VNet privately to services such as Azure Storage Accounts, Azure SQL, or Key Vault. When you create a private endpoint, it is assigned a private IP address from your VNet, and traffic to the service is routed over private connections, bypassing the public internet. In a nutshell, a private endpoint lets resources in your network reach an Azure service through a private IP address.
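One practical consequence is visible in DNS: from inside the VNet, the service hostname should resolve to an address from your own address space. The sketch below illustrates that check. The helper names, the 10.0.0.0/16 range, and the sample IP addresses are assumptions for illustration. The "privatelink" naming rule shown here matches the storage and Synapse hostnames used in this post; other services may use different zone names.

```python
import ipaddress
import socket


def privatelink_alias(public_fqdn: str) -> str:
    """Insert the 'privatelink' label after the resource name."""
    resource, _, zone = public_fqdn.partition(".")
    return f"{resource}.privatelink.{zone}"


def resolves_inside_vnet(hostname: str, vnet_cidr: str,
                         resolve=socket.gethostbyname) -> bool:
    """Return True if the hostname resolves to an address inside the VNet range.

    The resolver is injectable so the check can be tried without a network.
    """
    address = ipaddress.ip_address(resolve(hostname))
    return address in ipaddress.ip_network(vnet_cidr)


print(privatelink_alias("testmk1234.dfs.core.windows.net"))
# testmk1234.privatelink.dfs.core.windows.net
```

Run from a machine inside the VNet, resolves_inside_vnet("testmk1234.dfs.core.windows.net", "10.0.0.0/16") would be expected to return True once the private endpoint and its DNS records are in place, and False from a machine that still resolves the public address.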

To set up the Azure Synapse environment, we will need a Storage Account, a VNet, a VM, a Synapse Workspace, and a Key Vault. You can create Azure Synapse without a VNet, but the workspace will then be reachable from the public internet, which makes it less secure.

Synapse Workspace

To create a Synapse Workspace, you can use the Azure Portal or a Terraform script that creates and configures the entire Synapse environment, including its networking. I won’t cover Terraform in this post, but you can find plenty of material online. Using the Azure Portal, create a resource group and a Synapse Workspace inside it. The network security settings are on the Networking tab. To apply network isolation, we need to enable the “Managed virtual network” option and disable public network access to the workspace. It’s important to note that Azure creates and manages this virtual network for the Synapse Workspace. On the screen below, you can see a question about creating a managed private endpoint. If you select “Yes,” Azure will create a managed private endpoint in this managed network for the storage account we chose. The same has to be done later, in the Synapse Workspace configuration, for every other resource that Synapse needs to access.
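If you script the deployment instead of clicking through the portal, the same three choices end up as workspace properties. The sketch below builds such a fragment as a plain Python dictionary. The property names follow the ARM schema for Microsoft.Synapse/workspaces as I remember it, so treat them as assumptions and verify them against the current template reference. The function name is hypothetical.

```python
import json


def workspace_network_properties(storage_account: str, filesystem: str) -> dict:
    """Build the network-related part of a workspace definition (illustrative)."""
    return {
        # Assumed ARM property: puts the workspace into a managed virtual network.
        "managedVirtualNetwork": "default",
        # Assumed ARM property: blocks access to the workspace from public networks.
        "publicNetworkAccess": "Disabled",
        "defaultDataLakeStorage": {
            "accountUrl": f"https://{storage_account}.dfs.core.windows.net",
            "filesystem": filesystem,
            # Assumed ARM property: corresponds to answering "Yes" in the portal.
            "createManagedPrivateEndpoint": True,
        },
    }


properties = workspace_network_properties("testmk1234", "test")
print(json.dumps(properties, indent=2))
```

Keeping these settings in code makes it easier to review that every environment has public access disabled and the managed network enabled.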

When you finish the Azure Synapse configuration, you can open Azure Synapse Studio.

If your network configuration disables public network access and you try to connect from outside your network, you should see an error like the one on the screen below.

Are you wondering how to access Synapse Studio now? You need a device that is on the same network. You can create a VM in the VNet, use a VNet that is linked to your on-premises network, or configure a VPN connection from your computer. For this exercise, we build a VNet and a VM and use the VM to connect to Synapse Studio. Then we need to create a Synapse Private Link hub, which lets us load Synapse Studio over a private connection, and private endpoints for the Synapse endpoints.

In the Private endpoint connections tab of the Synapse Workspace, you can create private endpoints in the VNet. We will use them to connect to the following Synapse endpoints:

  1. Dedicated SQL Pool
  2. Serverless SQL Pool
  3. Dev
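The three endpoints above can be sketched as a small lookup table. Each one is a target sub-resource of the workspace with its own hostname and private DNS zone. The workspace name "myworkspace" and the helper function are hypothetical; the sub-resource names, hostname patterns, and zone names are the ones I know from the Azure documentation, so double-check them for your environment.

```python
# Target sub-resource -> (hostname pattern, private DNS zone)
SUB_RESOURCES = {
    "Sql": ("{ws}.sql.azuresynapse.net", "privatelink.sql.azuresynapse.net"),
    "SqlOnDemand": ("{ws}-ondemand.sql.azuresynapse.net", "privatelink.sql.azuresynapse.net"),
    "Dev": ("{ws}.dev.azuresynapse.net", "privatelink.dev.azuresynapse.net"),
}


def endpoint_plan(workspace: str) -> list:
    """List the private endpoints to create for one workspace."""
    plan = []
    for sub_resource, (host_pattern, dns_zone) in SUB_RESOURCES.items():
        plan.append({
            "sub_resource": sub_resource,
            "hostname": host_pattern.format(ws=workspace),
            "private_dns_zone": dns_zone,
        })
    return plan


for entry in endpoint_plan("myworkspace"):  # hypothetical workspace name
    print(f'{entry["sub_resource"]:12} {entry["hostname"]:45} {entry["private_dns_zone"]}')
```

Note that the dedicated and serverless SQL endpoints share one private DNS zone, so two private endpoints write their records into the same zone.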

In Synapse Studio, as you can see below, you can create the required managed private endpoints in the configuration tab.

We will need to create a managed private endpoint for the storage account we want to query files from.

You will need to assign your user the Storage Blob Data Contributor role on the storage account. When you finish all these steps, you can go to Synapse Studio scripts and execute a query to test your configuration. You should see a query result like the one on the screen below.

SELECT
    TOP 100 *
FROM
    OPENROWSET(
        BULK 'https://testmk1234.dfs.core.windows.net/test/test.csv',
        FORMAT = 'CSV',
        PARSER_VERSION = '2.0',
        FIELDTERMINATOR = ',',
        HEADER_ROW = TRUE
    ) AS [result]
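The BULK path in the query follows the pattern https://<account>.dfs.core.windows.net/<container>/<path>. If you want to test several files, a small helper can generate the same query text for each of them. This is a minimal sketch: the function names are hypothetical, and it only builds the SQL string; it does not connect to Synapse.

```python
def adls_url(account: str, container: str, path: str) -> str:
    """Build the ADLS Gen2 URL of a file from its parts."""
    return f"https://{account}.dfs.core.windows.net/{container}/{path.lstrip('/')}"


def openrowset_csv_query(url: str, top: int = 100) -> str:
    """Return a serverless SQL pool query that previews a CSV file."""
    safe_url = url.replace("'", "''")  # escape single quotes for the T-SQL literal
    return (
        f"SELECT TOP {int(top)} *\n"
        "FROM OPENROWSET(\n"
        f"    BULK '{safe_url}',\n"
        "    FORMAT = 'CSV',\n"
        "    PARSER_VERSION = '2.0',\n"
        "    FIELDTERMINATOR = ',',\n"
        "    HEADER_ROW = TRUE\n"
        ") AS [result];"
    )


query = openrowset_csv_query(adls_url("testmk1234", "test", "test.csv"))
print(query)
```

The printed text can be pasted into a Synapse Studio script and run against the serverless pool.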

All components together should look like the screenshot below. Here we can see the network we created and Synapse with its managed and private endpoints. You can also see the storage account where we store our data, which we query using the Azure Synapse serverless pool.

Summary

It’s easy to create an Azure Synapse environment. Securing it, as you can see, is more difficult, but it is critical for an enterprise that keeps its data in the cloud. You need to configure the network resources that allow your services to connect with each other, and then test the entire configuration. Another challenge is replicating the same configuration in test, QA, and production environments, which is where Terraform scripts help.
