Data Protection on Azure Services

Nuzhi Meyen
Published in Geek Culture
Mar 12, 2021 · 3 min read

If you are a data engineer or data scientist who builds analytics pipelines or services on Azure, you will frequently face the problem of controlling who gets access to the resources in your pipeline and what they can do once they have access, particularly in a production deployment.

The Data Protection mechanisms in Azure are rooted in the main ideas listed below:

Identity: The users, applications or groups that will access the service.

Authentication: Verification that a user or application is who it claims to be.

Authorization: Privileges or access levels which allow the user or application to perform certain tasks once authenticated.
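
To make the difference between these three ideas concrete, here is a minimal Python sketch. It is not how Azure implements any of this: the in-memory `USERS` store, the role names and the `can_perform` helper are all hypothetical and exist only for illustration.

```python
import hashlib
import hmac

# Illustrative only: a real system would use a random salt per user.
SALT = b"demo-salt"


def _hash_password(password: str, salt: bytes = SALT) -> bytes:
    """Derive a password hash so that plain-text secrets are never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 10_000)


# Identity: hypothetical users, each with a stored credential and a role.
USERS = {
    "alice": {"pw_hash": _hash_password("s3cret"), "role": "reader"},
    "bob": {"pw_hash": _hash_password("hunter2"), "role": "contributor"},
}

# Authorization data: hypothetical mapping of role -> permitted actions.
ROLE_PERMISSIONS = {
    "reader": {"read"},
    "contributor": {"read", "write"},
}


def authenticate(identity: str, password: str) -> bool:
    """Authentication: does the caller prove that they are this identity?"""
    user = USERS.get(identity)
    if user is None:
        return False
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(user["pw_hash"], _hash_password(password))


def authorize(identity: str, action: str) -> bool:
    """Authorization: is this (already authenticated) identity allowed to act?"""
    role = USERS[identity]["role"]
    return action in ROLE_PERMISSIONS.get(role, set())


def can_perform(identity: str, password: str, action: str) -> bool:
    """Authorization is only evaluated once authentication has succeeded."""
    return authenticate(identity, password) and authorize(identity, action)
```

In this sketch `alice` can authenticate and read but cannot write, while an unknown identity is rejected before authorization is ever consulted.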

While the actual mechanisms for managing the above vary by Azure service, they can be narrowed down to the following access management mechanisms:

Azure Active Directory (AAD) Identities: Use of Azure Active Directory to manage user, application or group identities.

Shared Access Signatures: Cryptographically secured URIs that define the permissions granted on a resource.

Shared Keys: A key name and secret value pair or a username and password as credentials.

Role-Based Access Control (RBAC): Permissions which are applied to a user or application based on their membership in a role or group.

Firewalls: IP-address-based restriction of the clients that may access a service.

Policies: The actions a user is permitted to take, which range from POSIX-style permissions (e.g. read, write, execute) to more service-specific permissions.
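
Two of the mechanisms above, signed URIs and IP-based firewalls, are easy to illustrate with the Python standard library. The sketch below shows the general idea only and does not reproduce Azure's actual Shared Access Signature format: the query parameter names (`perm`, `exp`, `sig`), the `SECRET_KEY` value, the example URL and the allowed network are all assumptions made for this example.

```python
import hashlib
import hmac
import ipaddress
import time
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical shared key known only to the service that issues the URLs.
SECRET_KEY = b"hypothetical-account-key"

# Hypothetical firewall allow-list (a documentation-only address range).
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def _signature(resource_url: str, permissions: str, expires_at: int, key: bytes) -> str:
    """HMAC over the resource, the permissions and the expiry time."""
    payload = f"{resource_url}\n{permissions}\n{expires_at}".encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def sign_url(resource_url: str, permissions: str, expires_at: int,
             key: bytes = SECRET_KEY) -> str:
    """Return a URL that carries its own permissions, expiry and signature."""
    query = urlencode({
        "perm": permissions,
        "exp": expires_at,
        "sig": _signature(resource_url, permissions, expires_at, key),
    })
    return f"{resource_url}?{query}"


def verify_url(signed_url: str, required_permission: str, now: float = None,
               key: bytes = SECRET_KEY) -> bool:
    """Accept the URL only if it is untampered, unexpired and permits the action."""
    now = time.time() if now is None else now
    parsed = urlparse(signed_url)
    params = parse_qs(parsed.query)
    try:
        permissions = params["perm"][0]
        expires_at = int(params["exp"][0])
        signature = params["sig"][0]
    except (KeyError, ValueError):
        return False
    resource_url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
    expected = _signature(resource_url, permissions, expires_at, key)
    return (
        hmac.compare_digest(expected.encode(), signature.encode())
        and now < expires_at
        and required_permission in permissions
    )


def ip_allowed(client_ip: str, networks=ALLOWED_NETWORKS) -> bool:
    """Firewall-style check: is the client address inside an allowed range?"""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)
```

Because the permissions and expiry are part of the signed payload, a client who edits them in the URL invalidates the signature, which is what makes this kind of URI safe to hand out without sharing the key itself.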

The available access management mechanisms for a few of the commonly used analytics-based services on Azure are as follows:

Figure 1: Access Management Mechanisms for Commonly Used Analytics-Based Services on Azure

Apart from controlling who can access your company’s data and what they are entitled to do with it, it is also necessary to consider the security of data at rest and data in transit.

The following are a few options available for data protection at rest:

Transparent Data Encryption: For databases like Azure SQL Database and Azure Synapse Analytics, data is encrypted before it is written to disk and decrypted when it is read, and both operations are transparent to the client application.

Disk Encryption: Encryption of the disk volume or virtual hard disk.

Storage Encryption: For file stores like Azure Blob Storage and Azure Data Lake, data is encrypted before it is written and decrypted when it is read.

For data in transit, the most commonly available option is the following:

TLS: Use of the industry-standard Transport Layer Security (TLS) protocol between the client and the data store to provide secure communication across public channels like the Internet.
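
On the client side, enforcing TLS mostly comes down to using a properly configured context. The snippet below is a small sketch using Python's standard `ssl` module; the minimum version of TLS 1.2 is an assumption chosen for this example, not a requirement stated by any particular Azure service.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client-side TLS context with certificate verification enabled."""
    # create_default_context() turns on certificate and hostname verification.
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse anything older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


# Usage idea (not executed here, and the host name is hypothetical):
#   import socket
#   with socket.create_connection(("datastore.example.invalid", 443)) as sock:
#       with make_tls_context().wrap_socket(
#           sock, server_hostname="datastore.example.invalid"
#       ) as tls_sock:
#           print(tls_sock.version())
```

Most Azure client libraries handle this for you, so a context like this is mainly relevant when you open connections yourself.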

The available data protection mechanisms for a few of the commonly used analytics-based services on Azure are as follows:

Figure 2: Data Protection Mechanisms for Commonly Used Analytics-Based Services on Azure

Auditing and monitoring are also important for keeping track of who is doing what with your data. The main audit mechanisms in Azure are as follows:

Diagnostic Logs: Diagnostic logs capture Azure Storage account operations that interact with the data managed by the service.

Activity Logs: Available on almost all services, these capture any changes that affect the configuration of an Azure service.

Auditing and Threat Detection: This feature is specific to Azure SQL Database and Azure SQL Data Warehouse (now part of Azure Synapse Analytics).

Storage Analytics Logging: Azure Blob Storage provides a specialized set of audit logs under this category.
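
Whichever mechanism produces the logs, the question you usually want to answer is "who did what?". The sketch below shows one way to summarize that from JSON log lines. The records and field names (`category`, `identity`, `operation`, `resource`) are hypothetical and do not follow any real Azure log schema; the `data` and `config` categories loosely mirror the split between operations on data and configuration changes described above.

```python
import json
from collections import Counter

# Hypothetical audit records, one JSON object per line.
RAW_LOG_LINES = [
    '{"category": "data", "identity": "alice", "operation": "ReadBlob", "resource": "sales.csv"}',
    '{"category": "data", "identity": "alice", "operation": "ReadBlob", "resource": "costs.csv"}',
    '{"category": "data", "identity": "bob", "operation": "WriteBlob", "resource": "sales.csv"}',
    '{"category": "config", "identity": "bob", "operation": "UpdateFirewallRule", "resource": "storage-account"}',
]


def parse_records(lines):
    """Parse each non-empty log line into a dictionary."""
    return [json.loads(line) for line in lines if line.strip()]


def operations_by_identity(records, category):
    """Count (identity, operation) pairs within one log category."""
    counts = Counter()
    for record in records:
        if record["category"] == category:
            counts[(record["identity"], record["operation"])] += 1
    return counts


records = parse_records(RAW_LOG_LINES)
data_operations = operations_by_identity(records, "data")
config_changes = operations_by_identity(records, "config")
```

Keeping the two categories separate makes it easy to review configuration changes, which tend to be rare and sensitive, apart from the much noisier stream of data reads and writes.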

The available audit mechanisms for a few of the above-mentioned analytics-based services on Azure are as follows:

Figure 3: Audit Mechanisms for Commonly Used Analytics-Based Services on Azure

Nuzhi Meyen

Co-founder of Helios P2P. Sri Lankan. Interested in Finance, Advanced Analytics, BI, Data Visualization, Computer Science, Statistics, and Design Thinking.