Approach to Data Security in Snowflake Part 1: Introduction

Published in clouddataplatform, Oct 19, 2023

Data security is a prime concern for every enterprise in its data and analytics landscape. Building data pipelines, from ingesting source data to making it available to data consumers, is comparatively easier than creating a data governance and security strategy for those consumers. In any data and analytics (D&A) project, the involvement of business or data domain owners is crucial when creating roadmaps, strategies, and architectures for data security. However, many blogs and articles I’ve come across barely scratch the surface of the features that enable data classification based on the broad security categories an enterprise may have. Data classification is a broad term and is highly specific to each enterprise’s use cases. It is determined by factors such as scale, domains, governance, security, types of data, and how the enterprise manages its customers’ most sensitive data, such as PII or data covered by HIPAA or GDPR.

The aim of this blog series is to explain automations for Custom Data Classification. Part 1 covers only the features Snowflake offers for enabling data security, and most of it is drawn from the Snowflake documentation. The intent is not only to explain how easy it is in Snowflake to create TAGS and TAG-BASED MASKING policies, but also how, as a data security architect, you can start thinking about data security and classification based on automation. This approach avoids hand-writing SQL statements for TAGS and POLICIES, which may not scale in an enterprise-level D&A project where you have to deal with hundreds or thousands of tables and their columns.
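To make the idea of automation concrete, here is a minimal sketch in Python that generates the tagging SQL instead of writing it by hand. Everything specific in it is an assumption for illustration only: the keyword rules, the tag name `governance.tags.data_class`, and the table and column names are hypothetical, and a real project would most likely drive this from a column inventory and rules agreed with the data domain owners.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ColumnRef:
    """One column in the inventory (names here are hypothetical)."""
    table: str   # fully qualified table name, e.g. db.schema.table
    column: str


# Hypothetical rules: keyword found in the column name -> classification label.
CLASSIFICATION_RULES = {
    "email": "PII",
    "ssn": "PII",
    "diagnosis": "HIPAA",
}


def classify(column_name: str) -> Optional[str]:
    """Return the label of the first rule whose keyword appears in the name."""
    lowered = column_name.lower()
    for keyword, label in CLASSIFICATION_RULES.items():
        if keyword in lowered:
            return label
    return None


def build_tag_statements(
    columns: Iterable[ColumnRef],
    tag_name: str = "governance.tags.data_class",  # assumed tag name
) -> List[str]:
    """Generate one ALTER TABLE ... SET TAG statement per classified column."""
    statements = []
    for ref in columns:
        label = classify(ref.column)
        if label is None:
            continue  # columns without a matching rule are left untagged
        statements.append(
            f"ALTER TABLE {ref.table} MODIFY COLUMN {ref.column} "
            f"SET TAG {tag_name} = '{label}';"
        )
    return statements


if __name__ == "__main__":
    inventory = [
        ColumnRef("sales.public.customers", "EMAIL_ADDRESS"),
        ColumnRef("sales.public.customers", "CREATED_AT"),
    ]
    for statement in build_tag_statements(inventory):
        print(statement)
```

The sketch only builds SQL strings; reviewing them and executing them against Snowflake is left as a separate step, so the generated statements can be checked before anything is applied.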

Understanding security in Snowflake begins at the point where data enters the Snowflake platform. In the diagram above, I have attempted to outline the Snowflake features and functions that help implement security, based on how users or applications access Snowflake.

1. Users or apps connect to Snowflake using one of the different connectors (https://docs.snowflake.com/en/guides-overview-connecting)

2. Users or apps are authenticated using one or a combination of methods such as SSO and MFA, along with IP policies (https://docs.snowflake.com/en/guides-overview-secure)

3. Once access is granted, RBAC (the roles assigned to the user or app) determines which data is made available, with masking and other policies applied (we will cover this architecture in the blog)

4. Different features can then be applied for data engineers, analysts, applications, and consumers, based on TAGS, Row Level Security (RLS), Column Level Security (CLS), and Data Classification.

5. Usual COLUMN level masking policies (not a topic for this blog) https://docs.snowflake.com/en/user-guide/security-column-intro

6. Usual ROW level access policies (not a topic for this blog) https://docs.snowflake.com/en/user-guide/security-row-intro

7. Data Classification (https://docs.snowflake.com/en/user-guide/governance-classify-concepts)

a. Snowflake Data Classification

b. Classification Process

c. Custom Data Classification in Snowflake

8. Tags and tag-based masking policies in data security, expanding the architecture with automation (https://docs.snowflake.com/en/user-guide/tag-based-masking-policies)

a. Creating TAGS

b. Snowflake System TAGS

c. Tag Based Policies

d. Regular Policies
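As a rough illustration of how the pieces under point 8 fit together, the sketch below generates the three statements that tag-based masking typically needs: create a tag, create a masking policy, and attach the policy to the tag. The tag name, policy name, role names, and the simple `CURRENT_ROLE()` check are all assumptions made for this example; they are not taken from this series, and a real policy would follow whatever roles and masking rules the enterprise has defined.

```python
from typing import List, Sequence


def build_tag_policy_statements(
    tag_name: str,
    allowed_values: Sequence[str],
    policy_name: str,
    unmasked_roles: Sequence[str],
) -> List[str]:
    """Generate SQL to create a tag, a string masking policy, and bind them.

    All names passed in are hypothetical; nothing is executed here.
    """
    values_sql = ", ".join(f"'{value}'" for value in allowed_values)
    roles_sql = ", ".join(f"'{role}'" for role in unmasked_roles)
    return [
        # a. Create the tag and restrict it to the agreed classification values.
        f"CREATE TAG IF NOT EXISTS {tag_name} ALLOWED_VALUES {values_sql};",
        # c. Masking policy: listed roles see the value, everyone else a mask.
        (
            f"CREATE MASKING POLICY IF NOT EXISTS {policy_name} "
            "AS (val STRING) RETURNS STRING -> "
            f"CASE WHEN CURRENT_ROLE() IN ({roles_sql}) THEN val "
            "ELSE '***MASKED***' END;"
        ),
        # Attach the policy to the tag, so tagged columns of a matching
        # data type are protected by it.
        f"ALTER TAG {tag_name} SET MASKING POLICY {policy_name};",
    ]


if __name__ == "__main__":
    for statement in build_tag_policy_statements(
        tag_name="governance.tags.data_class",
        allowed_values=["PII", "HIPAA"],
        policy_name="governance.policies.mask_string",
        unmasked_roles=["DATA_STEWARD"],
    ):
        print(statement)
```

Generating the statements from a small set of inputs is the design choice that matters here: the same function can be called once per tag, instead of maintaining a separate hand-written script for every tag and policy.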

Parts 2 and 3 of this series will explain the approach to Point 7 and Point 8, along with the automations.

Part 2 — Snowflake and Custom Data Classification

Part 3 — Custom Data Classification using TAG automation approach

References:

Author's own Snowflake data security architecture experience
Snowflake Documentation: https://docs.snowflake.com/
Wikipedia, Data classification (data management): https://en.wikipedia.org/wiki/Data_classification_(data_management)

Written by Ramesh Sanap