Snowflake: Security — Framework SSFW: Access Layer (Part #3)

Cesar Segura
SDG Group

--

The goal of this story is to provide a better and deeper understanding of the Access layer of the Snowflake Security Framework (SSFW). This is the third part of the Snowflake Security series — Access Layer (read Part 1 and Part 2 here).

SSFW — Snowflake Security Framework — Access Layer

In this part, we are going to focus on the different security components that control access to the objects that contain the data. We are going to navigate through the different stack levels, following the order shown in the image above. This will give us an overview of the aspects we have to consider in our data platform in order to secure each of the security components that Snowflake offers us.

Whether we have to apply each of the security features shown here will depend on our own use case.

Access

This layer is the next step once all the connections have been accepted: we have to manage the accessibility of the different components for the verified users of our data platform. This layer is also known as IAM (Identity and Access Management).

Users / Role Management

In Snowflake you can provision your company's users manually (or develop your own tool to do so), but it is strongly recommended to use an identity provider that supports the SCIM standard.

The SCIM capability in Snowflake allows you to synchronize your enterprise users directly into it, so you won't have to specify each user one by one; your internal identity provider manages your users. The natively supported providers are Okta and Microsoft Azure AD, but you can also use other custom ones.

Since the interactions are API-based, the main use cases are:

  • Manage users: synchronize enterprise users.
  • Manage roles: synchronize enterprise roles.
  • Audit activity: monitor all synchronization interactions between your Snowflake account and your identity provider.

To make the setup easier to manage and improve the security options, we can use security integrations, explained in the Integration section later in this story.
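As a sketch of what SCIM provisioning setup looks like, here is a hypothetical security integration for Azure AD (the integration name and provisioner role name are illustrative; the role must already exist and hold CREATE USER / CREATE ROLE privileges):

```sql
-- Hypothetical SCIM security integration for Azure AD.
-- 'AAD_PROVISIONER' is the dedicated role the identity provider runs as.
CREATE SECURITY INTEGRATION aad_scim_provisioning
  TYPE = SCIM
  SCIM_CLIENT = 'AZURE'
  RUN_AS_ROLE = 'AAD_PROVISIONER';
```

Okta uses the same shape with `SCIM_CLIENT = 'OKTA'` and its own provisioner role.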

Authentication

This is one of the most important sections, so we will pay attention to several aspects to consider for our security platform.

Authentication Methods

The methods are listed below in Snowflake's recommended order of preference, with a brief explanation of the best-recommended use of each in security terms:

  • OAuth (Snowflake or External): Redirects users to an authorization page to generate access tokens (with refresh tokens) that are then used to access Snowflake. It can be handled natively by Snowflake, or delegated to an external service. Best recommended for enterprise users.
  • External Browser: For access from external client applications, it opens a browser prompt that lets you authenticate in a safer way. It is not designed for automation or service tools. Best recommended for enterprise users if OAuth is not an option.
  • Okta native: When you delegate authorization to an identity provider in order to access the service provider (Snowflake), you are using federated authentication. In Snowflake, the native methods are Okta (hosted service) and AD FS (Active Directory Federation Services). Snowflake also supports most non-native SAML providers: Google G Suite, Microsoft Azure Active Directory, OneLogin, and Ping Identity PingOne.
  • Key pair: Public/private access keys. Recommended for service applications. Snowpipe only accepts this method.
  • Password: Snowflake self-managed credentials.
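For the key-pair method, the setup on the Snowflake side is a single statement per user (the user name and key material below are hypothetical; the client then authenticates by signing a JWT with the matching private key):

```sql
-- Attach an RSA public key to a (hypothetical) service user.
ALTER USER etl_service SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';

-- A second key slot enables key rotation without downtime:
ALTER USER etl_service SET RSA_PUBLIC_KEY_2 = 'MIIBIjANBgkqh...';
```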

As you can see, Snowflake provides different methods of authentication, but the best one for your case will depend on whether you access from a service application, the SnowSQL client, etc. For example, as mentioned, you can't always use External Browser, so it wouldn't be the best fit for the service-application case.

SSO is the most recommended way to authenticate into Snowflake, and it can only be used with the SAML, OAuth, External Browser, and Okta native authentication methods. Try to avoid user/password scenarios.

Password policies

If we use the Snowflake password authentication method (user/password), Snowflake provides password policies. These policies let us manage rules about the format a password must follow and how it is maintained (e.g., how frequently it must be renewed). Password policies apply only at the account or user level.
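As an illustration, here is a hypothetical password policy (the database, schema, policy, and user names are placeholders; the property names follow Snowflake's password policy syntax):

```sql
-- Hypothetical password policy: 14+ characters, mixed character classes,
-- rotation every 90 days, lockout after 5 failed attempts.
CREATE PASSWORD POLICY security_db.policies.strict_pwd
  PASSWORD_MIN_LENGTH = 14
  PASSWORD_MIN_UPPER_CASE_CHARS = 1
  PASSWORD_MIN_LOWER_CASE_CHARS = 1
  PASSWORD_MIN_NUMERIC_CHARS = 1
  PASSWORD_MIN_SPECIAL_CHARS = 1
  PASSWORD_MAX_AGE_DAYS = 90
  PASSWORD_MAX_RETRIES = 5;

-- Apply at account level ...
ALTER ACCOUNT SET PASSWORD POLICY security_db.policies.strict_pwd;
-- ... or at user level:
ALTER USER jdoe SET PASSWORD POLICY security_db.policies.strict_pwd;
```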

Authentication policies

Depending on the way you are accessing Snowflake, these are the methods you can use:

You will have to take into consideration the order of preference established earlier for each method in order to choose the best one, based on the platform running your connections.

It is important to mention that a user can log in with multiple methods. Sometimes this option is very helpful for providing a good SLA on the authentication service.

But depending on the use case, you may be in a scenario where you want to manage who can use which method, so Snowflake provides authentication policies that allow/deny the use of the different authentication methods. This feature is currently in Public Preview.
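A hedged sketch of such a policy (names are hypothetical; since the feature is in preview, check the current docs for the exact property list):

```sql
-- Hypothetical authentication policy: allow only SSO-style methods
-- for interactive clients.
CREATE AUTHENTICATION POLICY security_db.policies.sso_only
  AUTHENTICATION_METHODS = ('SAML', 'OAUTH')
  CLIENT_TYPES = ('SNOWFLAKE_UI', 'SNOWSQL');

ALTER ACCOUNT SET AUTHENTICATION POLICY security_db.policies.sso_only;
```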

MFA (Multi-Factor Authentication)

To maximize authentication security in your company, Snowflake provides the capability to use this verification method, powered by the Duo service.

Approval by app, call, or passcode (multi-factor authentication (MFA))

MFA is not enabled by default; each user must enroll to enable it. However, it is highly recommended, at least for your ACCOUNTADMIN users. As a best practice, there should be at least two account admins enrolled in MFA, in order to act in recovery or restore scenarios. MFA is compatible with most authentication methods, including the use of connectors. For limitations and best practices, check the docs.

SECRET

Snowflake allows us to use this security object, which is mainly used to store credentials internally for authentication purposes. Depending on the intended use of the secret, it will have specific characteristics (which we will see in other sections). This object saves us from specifying credentials manually on other objects: once you have created a SECRET with the credentials, you only have to assign it to the object that needs it. Access to the SECRET itself can be restricted to a given role hierarchy, or to use in specific authentication methods.
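A minimal sketch (all names and values are placeholders; a basic-auth secret is shown, but other types such as OAuth tokens exist):

```sql
-- Hypothetical secret holding basic credentials for an external service.
CREATE SECRET integrations.vault.api_credentials
  TYPE = PASSWORD
  USERNAME = 'svc_reporting'
  PASSWORD = '<value-from-your-vault>';

-- Restrict who may use it (READ is the secret-specific privilege):
GRANT READ ON SECRET integrations.vault.api_credentials
  TO ROLE integration_admin;
```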

Sessions

When you have authenticated successfully into Snowflake, the connection gets a session generated by the service provider (Snowflake); this is the channel through which you will interact directly.

This session remains active while the user is active, but leaving many idle sessions open can affect the security and the performance of the warehouse you are using. Session policies manage this type of behavior, controlling which sessions may remain open and which may not.

We can use session policies at the account or user level. If different policies at different levels affect the same user, the user-level session policy takes precedence.

By default, if no policy is applied, the idle session timeout is 4 hours. Session policies can be managed when you use the options below to access Snowflake:
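A hypothetical session policy tightening that default (names are placeholders):

```sql
-- Cut idle sessions after 30 minutes (UI sessions after 15),
-- instead of the 4-hour default.
CREATE SESSION POLICY security_db.policies.short_idle
  SESSION_IDLE_TIMEOUT_MINS = 30
  SESSION_UI_IDLE_TIMEOUT_MINS = 15;

ALTER ACCOUNT SET SESSION POLICY security_db.policies.short_idle;
-- A user-level assignment would take precedence over the account-level one:
ALTER USER jdoe SET SESSION POLICY security_db.policies.short_idle;
```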

Authorization Objects

Once you have an active session established, we have to authorize the visibility of the securable objects for the user trying to access them.

In this section we will look at the features below, which we will take into consideration in order to reinforce our security:

  • Control Access methods
  • Account hierarchy roles
  • Custom hierarchy roles
  • Securable Objects
  • Control privileges
  • Managed Schemas
  • Package policies

Control Access methods

Snowflake combines two methods of applying privileges to securable objects to users.

  • DAC (Discretionary Access Control): One object has an owner that can apply privileges to it.
  • RBAC (Role-Based Access Control): Access privileges apply to roles that can be granted to users.

We will take into consideration the following roles:

  • Account Roles: Allow action on any securable object (including Custom Roles).
  • Database Roles: Allow action only on Database scope objects.
  • Instance Role: Allow access only to a class.

In Snowflake, you will have to define the different roles, establish your role hierarchy, and apply privileges on securable objects to the different roles.

Account hierarchy roles

In every account, this hierarchy is established by default:

The blue roles are the account ones, and the grey ones are where you will start building your custom role hierarchy (as you will see in the next section).

A brief description of each one:

  • Orgadmin: Manages operations at the organization level, like creating accounts. It can view usage information across all the accounts.
  • Accountadmin: Inherits the Securityadmin and Sysadmin roles. It has access to objects that those roles do not, such as budgets.
  • Securityadmin: Creates roles and users, and manages all privileges globally in the account.
  • Useradmin: Creates roles and users in the account.
  • Sysadmin: Dedicated to warehouses and databases (and the other objects inside them).
  • Public: By default, every user is assigned this role.
  • Custom roles: Take into consideration that every custom role should be granted to another role; this avoids the scenario where a role ends up unmanaged by any other role. By default, it should be granted to SYSADMIN for data-access purposes.

Custom hierarchy roles

In every use case, we will apply access privileges to roles named access roles. We then create functional roles that map to specific functionalities, establishing a role hierarchy; each role can inherit another role's privileges. Once this hierarchy is in place, we grant the specific functional roles to one or multiple functional users.

These roles are created under account roles but with a limited scope of securable objects. Here we have an example of an RBAC hierarchy:

Role Based Access Control in hierarchical view
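The hierarchy above can be sketched in SQL; all object and role names here are hypothetical:

```sql
-- Access role: holds the object privileges.
CREATE ROLE ar_sales_read;
GRANT USAGE ON DATABASE sales TO ROLE ar_sales_read;
GRANT USAGE ON SCHEMA sales.public TO ROLE ar_sales_read;
GRANT SELECT ON ALL TABLES IN SCHEMA sales.public TO ROLE ar_sales_read;

-- Functional role: inherits the access role and maps to a job function.
CREATE ROLE fr_sales_analyst;
GRANT ROLE ar_sales_read TO ROLE fr_sales_analyst;
GRANT ROLE fr_sales_analyst TO USER jdoe;

-- Attach the custom branch to the default hierarchy so it stays managed:
GRANT ROLE fr_sales_analyst TO ROLE SYSADMIN;
```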

Securable objects

All Snowflake objects can be secured; depending on the type, some extra permissions will have to be applied or will not apply. For the objects at the bottom level of the hierarchy, we will have to apply privileges to enable their correct usage.

This is a graphical view of the hierarchy of all securable objects:

Control privileges

We have a full list of privileges that apply depending on the securable object and the operation we want to perform. A wide catalog means fine-grained control of every action. You can follow this reference for the details.

Managed Schema

When you create a schema in your database, it is a regular schema by default. In security terms, this means the role that created an object in the schema is its owner and has the ability to grant privileges on it. For some use cases this is not desirable; in that case we have the option to create a managed schema, where by default only specific roles, such as the schema owner, can apply privileges.
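Creating one is a single clause (schema names below are hypothetical):

```sql
-- A managed access schema: object owners can no longer grant privileges;
-- only the schema owner (or roles with MANAGE GRANTS) can.
CREATE SCHEMA sales.restricted WITH MANAGED ACCESS;

-- An existing regular schema can be converted too:
ALTER SCHEMA sales.public ENABLE MANAGED ACCESS;
```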

Packages policies

Snowflake supports a wide range of programming languages. As part of this ecosystem, the Anaconda channel provides Python packages out of the box in Snowflake. To use Conda, you first have to accept the terms, which you should read carefully.

Depending on the region, sector, or internal culture and methodology of the company, access to some of these packages must be managed. Packages policies serve this goal, allowing or blocking access to these packages: you can not only select a list of libraries but also manage the allowed version of each one. These policies are defined at the account level.
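A hypothetical packages policy for Python (policy name, packages, and version pins are illustrative):

```sql
-- Allow specific libraries (optionally pinned to versions) and block others.
CREATE PACKAGES POLICY security_db.policies.python_packages
  LANGUAGE PYTHON
  ALLOWLIST = ('numpy==1.26.*', 'pandas')
  BLOCKLIST = ('requests');

-- Packages policies are set at account level:
ALTER ACCOUNT SET PACKAGES POLICY security_db.policies.python_packages;
```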

Masking Policies

Once your role has access to the securable objects needed to reach the data, we can control how field values are shown dynamically: partially masked, fully masked, or hidden entirely for sensitive information.

These strategies make sense for applying Data Governance rules in your Snowflake data platform. To achieve that, we have different methods to manage PII through masking policies, which manage that behavior without any change to the underlying data.

We can define different methods to apply data masking, depending on the strategy for managing the policies and rules applied to our PII fields.

These are the ones below:

  • Dynamic Data Masking
  • Tag-Based Masking
  • External Tokenization
  • Aggregation

Dynamic Data Masking

This is based on applying policies directly to tables/columns in a database; inside the policy, we manage access to the value for the role running the query.

Dynamic Data Masking
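A minimal sketch (the policy, role, table, and column names are hypothetical):

```sql
-- Only a (hypothetical) HR role sees the raw value; everyone else
-- gets a fixed mask. The stored data is never changed.
CREATE MASKING POLICY security_db.policies.email_mask
  AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('FR_HR') THEN val
    ELSE '*** MASKED ***'
  END;

ALTER TABLE hr.employees MODIFY COLUMN email
  SET MASKING POLICY security_db.policies.email_mask;
```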

Tag-based Masking

The process is similar to Dynamic Data Masking, but differs in that we don't have to apply a policy to every table/column. We can use tags, which can be applied at the account/schema/table/column level. We apply the masking policies directly to the tags, so, in a centralized way, data is automatically masked wherever the tags apply. We also get the advantage that we can monitor the tags.

Data Masking applying Tag-Based Masking
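A sketch of the tag-based variant, assuming a masking policy named `email_mask` already exists (tag and object names are hypothetical):

```sql
-- Define a PII tag and bind the masking policy to it once.
CREATE TAG governance.tags.pii ALLOWED_VALUES 'email', 'phone';

ALTER TAG governance.tags.pii
  SET MASKING POLICY security_db.policies.email_mask;

-- Tagging a column is now enough to mask it; no per-column policy needed.
ALTER TABLE hr.employees MODIFY COLUMN email
  SET TAG governance.tags.pii = 'email';
```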

External Tokenization

It is based on ciphering the data before loading it into Snowflake (tokenization), then using dynamic data masking with external functions (outside Snowflake), so that when you select the data it is deciphered (detokenization). Part of the security resides in a third-party service, and this can affect performance for large tables.

The policies applied in this case can also be applied tag-based (mentioned previously).

External Tokenization based on Dynamic Data Masking in Third Party Service

Aggregation

Aggregation policies restrict the type of queries that can be run on a table: queries must be aggregated, so you can't filter down to individual records. This method ensures that some roles can't identify individual personal data by retrieving raw rows. It is very helpful when you want to expose useful aggregate information to a broad set of stakeholders without incurring the risk of exposing PII.

This is an example with an aggregation policy with a minimum group size of 2 rows:
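The same example sketched in SQL (policy and table names are hypothetical):

```sql
-- Queries on the protected table must aggregate at least 2 rows
-- per group; row-level selections are rejected.
CREATE AGGREGATION POLICY security_db.policies.min_group_2
  AS () RETURNS AGGREGATION_CONSTRAINT ->
  AGGREGATION_CONSTRAINT(MIN_GROUP_SIZE => 2);

ALTER TABLE hr.employees
  SET AGGREGATION POLICY security_db.policies.min_group_2;
```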

Access Policies

Contrary to column-level dynamic masking policies, access policies are applied directly at the level of a specific securable object. In these cases, the policies check whether you can access specific information at all.

These ones are the below:

  • Row Access (RLS)
  • Projection
  • Secure definition on objects

Row Access Policy

Row access policies apply to tables and views (covering one or more fields). Row-level security (RLS) consists of filtering, at query time, the information retrieved by the user, using an expression. This expression typically checks a mapping table that contains the roles and the field values they are allowed to access. Optionally, you can skip the mapping table if the access logic is simpler than a complex hierarchy of areas and roles. Take into consideration that the more complex the expression used to check visibility, the worse your performance will be. So the inline-expression scenario is recommended when you need better performance, while the mapping-table scenario gives you better flexibility.

Row access policies based on a role-mapping table
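The mapping-table variant can be sketched like this (the mapping table, its columns, and all other names are hypothetical):

```sql
-- Rows are visible only if the current role is mapped to the row's region.
CREATE ROW ACCESS POLICY security_db.policies.region_rls
  AS (region STRING) RETURNS BOOLEAN ->
  EXISTS (
    SELECT 1
    FROM security_db.mappings.region_access m
    WHERE m.role_name = CURRENT_ROLE()
      AND m.region = region
  );

ALTER TABLE sales.orders
  ADD ROW ACCESS POLICY security_db.policies.region_rls ON (region);
```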

Projection Policy

A projection policy (PP) is a security method that hides a column from the output of queries over the table that contains it, allowing or hiding the projection for specified roles. The policy prevents a non-authorized user from inserting that information into another table.

A PP is a good candidate to consider if you want binary dynamic data masking on columns: see the data or don't see it at all. An example is a projection policy applied to the PII column Salary of a table, where only Susan, who can use Role 2, can access the Salary column.
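That example could be sketched as follows (role, table, and column names are hypothetical):

```sql
-- Only ROLE2 may project (SELECT) the column; other roles can still
-- filter or join on it, but not display its values.
CREATE PROJECTION POLICY security_db.policies.salary_projection
  AS () RETURNS PROJECTION_CONSTRAINT ->
  CASE
    WHEN CURRENT_ROLE() = 'ROLE2' THEN PROJECTION_CONSTRAINT(ALLOW => true)
    ELSE PROJECTION_CONSTRAINT(ALLOW => false)
  END;

ALTER TABLE hr.employees MODIFY COLUMN salary
  SET PROJECTION POLICY security_db.policies.salary_projection;
```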

Secure definition of objects

Although a role may be allowed to use certain objects, some of them can contain sensitive data in their structure. In that case, you won't want that information to be exposed. So Snowflake provides an extra access-security feature that consists of securing the data definition of those objects.

Any role that is not allowed won't be able to see that definition, unless explicitly granted the proper role or privilege.

There are different securable objects:
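Views are the most common case; the `SECURE` keyword is all it takes (the view and table names below are hypothetical):

```sql
-- A secure view hides its definition from non-owner roles and prevents
-- some optimizer paths that could leak underlying data.
CREATE SECURE VIEW sales.v_orders_eu AS
  SELECT order_id, amount
  FROM sales.orders
  WHERE region = 'EU';
```

Non-privileged roles will not see the view text in `SHOW VIEWS` or via `GET_DDL` for secure objects they don't own.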

Integration

This part of integration covers the logical controls Snowflake manages in order to allow or deny accessibility to certain resources once your connection has been accepted. It complements the Network Layer (which allows or denies specific connections).

  • STORAGE: You can allow or block access to specific paths in your external cloud storage. This method lets us control, for our external stages, which locations can be accessed.
  • API: You can allow or block access to specific external functions/APIs, using prefixes, that reside in the allowed cloud providers. So, when you use external functions, you control whether a given Azure Function, Lambda, etc. can be used.
  • CATALOG: You can allow or block access to the catalog you can reach in your external provider.
  • EXTERNAL ACCESS: You can allow or block access to specific external network locations directly from your handlers. Here we have different mechanisms to allow or block access: allowed API integrations and allowed secrets.
  • SECURITY: In this layer, we can use different security features depending on the service capabilities used (User/Role Management and Authentication). The applicable security features depend on the security component and how it is used:
SSF: ACCESS LAYER — Integration — Security Features
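As an illustration of the STORAGE case, here is a hypothetical storage integration restricting external stage access to specific S3 paths (bucket, paths, and the IAM role ARN are placeholders):

```sql
CREATE STORAGE INTEGRATION s3_landing
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'S3'
  ENABLED = TRUE
  STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::123456789012:role/snowflake-access'
  -- Stages built on this integration can only reach these paths:
  STORAGE_ALLOWED_LOCATIONS = ('s3://acme-landing/raw/')
  STORAGE_BLOCKED_LOCATIONS = ('s3://acme-landing/raw/secrets/');
```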

Secure Data Sharing

Introduction to Secure Data Sharing | Snowflake Documentation

Snowflake provides a secure way to share data between Snowflake accounts (organizations), or with non-Snowflake customers. This method uses data shares, which basically consist of sharing a database (and the objects allowed within it) directly with other accounts as a database SHARE object. This method doesn't copy any data, so the shared data is visible almost immediately in the shared database in the other account.

The only data objects you can share are the below ones (we highlight the Snowflake recommendations versus the other ones):

  • Databases
  • Tables
  • Dynamic tables
  • External tables
  • Iceberg tables
  • Secure View (Recommended)
  • Secure materialized views (Recommended)
  • Secure user-defined functions (UDFs) (Recommended)

There are different types of data sharing:

  • Direct Share: You can share database objects (a share) with another account in the same region and cloud provider, one consumer account at a time.
  • Listing: You can offer a share and other metadata as a product to one or more accounts. They can be located in other regions / cloud providers.
  • Data Exchange: You manage a group of accounts that are allowed to consume data, provide data, or both.

The way to share these objects with third parties that don't have their own Snowflake account is to provide a reader account, which can only be used to read the data shared with it.
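A hypothetical direct share exposing a secure view, plus a reader account for non-Snowflake consumers (all names and credentials are placeholders):

```sql
CREATE SHARE sales_share;
GRANT USAGE ON DATABASE sales TO SHARE sales_share;
GRANT USAGE ON SCHEMA sales.public TO SHARE sales_share;
GRANT SELECT ON VIEW sales.public.v_orders_eu TO SHARE sales_share;

-- Add a consumer account in the same region/cloud:
ALTER SHARE sales_share ADD ACCOUNTS = myorg.consumer_account;

-- For consumers without Snowflake, create a reader account instead:
CREATE MANAGED ACCOUNT partner_reader
  ADMIN_NAME = 'partner_admin',
  ADMIN_PASSWORD = '<strong-password>',
  TYPE = READER;
```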

Conclusions

I think the Access layer is the most complex layer we face in security terms, but it is also one of the most important. Access to the data can be managed in multiple ways, and governance rules, IT requirements, and data platform complexity will determine how we manage the different data models in our Snowflake account and share them with others. Snowflake has a full list of security capabilities, and each capability has multiple security features that ultimately support secure access to the data. You can go back to the initial Snowflake Security Framework (SSFW) (Part 1), or visit the previous layers:

See you soon in the next chapter: SSFW — Encryption Layer.

About me

I am an SME in different data technologies, with 20+ years of experience in data adventures, an experienced Snowflake Data Jedi, and a Data Vault Certified Practitioner. If you want to know more detail about the aspects seen here, or others, you can follow me on Medium || LinkedIn here.

I hope you have enjoyed this and that it helps you!


SME @ SDG Group || Snowflake Architect || Snowflake Squad Spotlight Member || CDVP Data Vault