Snowflake: Security — Framework SSFW: Encryption Layer(Part #4)

Cesar Segura
SDG Group

--

The goal of this story is to provide a better and deeper understanding of the first layers in the Snowflake Security Framework (SSFW) (Part #1). On this occasion, we look at the fourth part of the Snowflake Security series: the Encryption Layer.

You can also revisit the previous parts of the series:

SSFW — Snowflake Security Framework — Encryption Layer

In this part, we focus on the capabilities Snowflake provides to physically secure your data.

We will see that Snowflake provides its own mechanisms to secure all that information. Depending on your use case, you may or may not need to apply some of the security features shown here.

Encryption

Once someone has physical access to the data, past the whole Access Layer, Snowflake provides additional sophisticated mechanisms to keep the information encrypted. This avoids exposing data to people who have managed to get this far (at rest) and, by default, protects it while it is being transferred (in transit). This approach is called End-to-End Encryption (E2EE).

Snowflake provides additional capabilities to manage different scenarios:

  • Ingest data already encrypted by the customer, which can then be decrypted in Snowflake.
  • The customer can encrypt the data (at rest) with a new key by applying different SQL techniques.
  • Snowflake automatically encrypts all data (at rest). This can be done with Snowflake's own keys, customer keys, or a combination of both.

Data Physical Encryption

This set of features allows you to store the data encrypted in the database. We list two basic physical encryption options:

  • Ingesting client-side encrypted data (ICSED)
  • Customized Encryption

You can use one or combine them, depending on your security scenario.

> Ingesting client-side-encrypted data (ICSED)

In this part, we highlight that encrypted data can be ingested and managed using basically two methods:

  • Encrypted Files (directly into Stages)
  • Tokenized Data

> > ICSED — Encrypted Files (into STAGE)

This scenario covers security when you load files from the client side through a Snowflake stage. It is based on encrypting the files with a 128-bit or 256-bit AES key encoded in Base64.
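As a rough illustration of what such a key looks like, the sketch below generates a random 128-bit or 256-bit key and encodes it in Base64 using only the Python standard library. The function name `generate_master_key` is made up for this example, and in a real setup the key would come from your own key-management process.

```python
import base64
import secrets


def generate_master_key(bits: int = 256) -> str:
    """Return a random AES-sized key, Base64-encoded (hypothetical helper)."""
    if bits not in (128, 256):
        raise ValueError("expected a 128-bit or 256-bit key")
    raw = secrets.token_bytes(bits // 8)  # cryptographically strong random bytes
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    key = generate_master_key(256)
    # Decoding the Base64 string gives back the raw key bytes.
    print(len(base64.b64decode(key)), "raw key bytes")
```

The resulting string is the kind of value a stage definition expects as its master key; treat it as a secret and never commit it to source control.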

On an Internal Stage, you can choose automatic FULL encryption (encrypted twice, by CSE (Client-side Encryption) and SSE (Server-side Encryption)), or SSE only. The server side here is the cloud service where your Snowflake account is hosted.

On an External Stage, the type of encryption depends on the cloud provider that stores the files.

  • AWS: Accepts both CSE (with a Master Key) and SSE (S3-managed encryption, or KMS [Key Management Service], optionally with an AWS KMS-managed key used in the bucket).
  • Google: Accepts only SSE (optionally with a Cloud KMS-managed key used in the bucket).
  • Azure: Accepts only CSE (requires a 128-bit or 256-bit Master Key in Base64).

Example of client-side encrypted data load into Snowflake (Understanding end-to-end encryption in Snowflake | Snowflake Documentation)
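The options above map onto the ENCRYPTION clause of a CREATE STAGE statement. The following is a minimal sketch that renders that clause for each kind of stage. The helper `encryption_clause` and the lookup table are hypothetical, and the TYPE values are the ones I recall from the Snowflake CREATE STAGE documentation, so check them against the current docs before relying on them.

```python
from typing import Optional

# Encryption TYPE values per stage location (assumed from Snowflake's docs;
# TYPE = 'NONE' is left out of this sketch on purpose).
STAGE_ENCRYPTION_TYPES = {
    "internal": {"SNOWFLAKE_FULL", "SNOWFLAKE_SSE"},
    "aws": {"AWS_CSE", "AWS_SSE_S3", "AWS_SSE_KMS"},
    "gcs": {"GCS_SSE_KMS"},
    "azure": {"AZURE_CSE"},
}


def encryption_clause(location: str, enc_type: str,
                      master_key: Optional[str] = None,
                      kms_key_id: Optional[str] = None) -> str:
    """Render the ENCRYPTION = (...) clause of a CREATE STAGE statement."""
    if enc_type not in STAGE_ENCRYPTION_TYPES.get(location, set()):
        raise ValueError(f"{enc_type} is not valid for a {location} stage")
    parts = [f"TYPE = '{enc_type}'"]
    if enc_type.endswith("_CSE"):
        # Client-side encryption always needs the Base64 master key.
        if not master_key:
            raise ValueError("client-side encryption needs a Base64 master key")
        parts.append(f"MASTER_KEY = '{master_key}'")
    elif kms_key_id and enc_type.endswith("_SSE_KMS"):
        # The KMS key id is optional for KMS-based server-side encryption.
        parts.append(f"KMS_KEY_ID = '{kms_key_id}'")
    return "ENCRYPTION = (" + " ".join(parts) + ")"


if __name__ == "__main__":
    print(encryption_clause("internal", "SNOWFLAKE_SSE"))
    print(encryption_clause("azure", "AZURE_CSE", master_key="<base64 key>"))
```

Rejecting invalid combinations up front (for example, CSE on a Google stage) mirrors the provider restrictions listed above.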

> > ICSED — Tokenized Data

We covered this part in the previous SSFW Access Layer. Responsibility for encryption and decryption is delegated to a third-party service. That service sits outside Snowflake, in an external VPC/VNET. For each request in which a user needs to see the data, it receives the tokenized data from Snowflake (through an External Function) and returns the de-tokenized data back to Snowflake (through Policies), so that the required information reaches the appropriate roles/users.

External Tokenization based on Dynamic Data Masking in Third Party Service

> Customized Encryption

Snowflake also provides SQL functions to encrypt information. In that case, you can store the data in a customized way by applying SQL encryption transformations. This method is a good approach if you want to open an individual cipher channel from one role/user to another. Both peers will use:

  • A reserved passphrase, used to encrypt/decrypt the information.
  • Optionally, extra information (AAD, additional authenticated data), which helps authenticate the person trying to handle the encrypted information.
  • In addition, a cipher mode to encrypt the messages. The current ones are ECB, CBC, GCM, CTR, OFB and CFB. PKCS padding is supported only for ECB and CBC.

Only the AES algorithm is supported, optionally with a 128, 192 or 256-bit key, so you can apply the encryption strength that fits your security needs.

These functions are basically ENCRYPT and DECRYPT (there are some variants you can check), which use a FIPS-compliant cryptographic library to ensure effective encryption management. If you try to monitor those statements, for example through QUERY_HISTORY, the parameters are inaccessible: they are masked for security reasons.

It is not role-based: access depends only on sharing the passphrase and, optionally, the AAD and the cipher mode.

Example of ENCRYPT/DECRYPT using SQL functions syntax, with runtime PP+ provided by user
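A minimal sketch of how such calls might be issued from Python follows. It only builds parameterized statements, so the passphrase and AAD are supplied at runtime as bind values instead of being written into the SQL text. The helper names are hypothetical, the `%s` placeholders assume a DB-API cursor using the pyformat style (as the Snowflake Python connector does by default, to my knowledge), and the sketch fixes the method to AES-GCM because an AAD only makes sense with an authenticated mode.

```python
from typing import Tuple

# GCM is an authenticated mode, so it can carry the optional AAD.
METHOD = "AES-GCM"


def encrypt_query(value: str, passphrase: str, aad: str) -> Tuple[str, tuple]:
    """Return (sql, params) for a parameterized ENCRYPT call."""
    return "SELECT ENCRYPT(%s, %s, %s, %s)", (value, passphrase, aad, METHOD)


def decrypt_query(ciphertext: bytes, passphrase: str, aad: str) -> Tuple[str, tuple]:
    """Return (sql, params) for the matching DECRYPT call, decoded as UTF-8 text."""
    # DECRYPT returns BINARY, so it is converted back to text here.
    sql = "SELECT TO_VARCHAR(DECRYPT(%s, %s, %s, %s), 'utf-8')"
    return sql, (ciphertext, passphrase, aad, METHOD)


if __name__ == "__main__":
    sql, params = encrypt_query("penicillin", "my passphrase", "patient 42")
    print(sql)           # the statement text never contains the passphrase
    print(len(params))   # values travel separately as bind parameters
```

With an open connection, the pair could be passed on as `cursor.execute(*encrypt_query(...))`. Both peers must use the same passphrase, AAD and method for the DECRYPT call to succeed.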

Secured Management Key

All data at rest stored in micro-partitions is automatically encrypted. This way, any user who gains physical access to the data (directly on the hard drive, or by any means other than going through Snowflake) will not be able to read any information.

Snowflake provides different methods to manage the root key used to encrypt this information: Snowflake Managed, Customer Managed and Tri-Secret Secure (TriSecure).

In addition, starting from the root key, Snowflake uses different keys at 4 levels of the hierarchy of securable data objects. Each key is encrypted (wrapped) by the key of the level above it, so reaching the data means going through 4 independent keys, and each object has keys of its own. This strongly restricts access to the data.
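To make the hierarchy easier to picture, here is a small data-model sketch. Snowflake's documentation names the four levels as the root key, account master keys, table master keys and file keys. Everything else in the example (the `Key` class, the ids, the `unwrap_path` helper) is invented for illustration, and no real cryptography takes place.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Key:
    key_id: str
    level: str                 # "root", "account", "table" or "file"
    wrapped_by: Optional[str]  # id of the parent key; None for the root key


def unwrap_path(keys: Dict[str, Key], key_id: str) -> List[str]:
    """List the levels that must be unwrapped, from the given key up to the root."""
    path = []
    current: Optional[str] = key_id
    while current is not None:
        key = keys[current]
        path.append(key.level)
        current = key.wrapped_by
    return path


# Hypothetical ids, one key per level.
keys = {k.key_id: k for k in [
    Key("rk", "root", None),
    Key("amk-1", "account", "rk"),
    Key("tmk-7", "table", "amk-1"),
    Key("fk-42", "file", "tmk-7"),
]}

if __name__ == "__main__":
    print(unwrap_path(keys, "fk-42"))  # ['file', 'table', 'account', 'root']
```

Because each parent wraps only its own children, compromising one table master key in this model exposes the file keys of that table and nothing else.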

Snowflake Managed

Snowflake uses AES 256-bit encryption at all hierarchical levels. All the keys are generated automatically by Snowflake.

The Snowflake service rotates keys automatically once they are more than 30 days old: the active key is retired and a new key is generated, which is used only for newly written data. Retired keys are kept for decryption purposes only. Existing micro-partition files are not re-encrypted by rotation; that only happens with rekeying.

Retired keys should not stay around long enough for outsiders to analyze them and infer information. Snowflake lets you ensure that retired keys never become older than 1 year through periodic rekeying, which can be optionally activated. The process destroys those older retired keys, generates new ones and re-encrypts the data they protected. If you do not activate this option, Snowflake automatically determines when the retired keys can be destroyed.
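The life cycle described above can be summarized as a small state function. This is a simplified sketch: the thresholds come from the text (30 days, 1 year), the function `key_state` is hypothetical, and key age is measured from creation, which is an assumption made to keep the example short.

```python
from datetime import date, timedelta

ROTATION_AGE = timedelta(days=30)   # active keys are rotated after 30 days
REKEY_AGE = timedelta(days=365)     # retired keys older than 1 year get rekeyed


def key_state(created: date, today: date, periodic_rekeying: bool) -> str:
    """Classify a key as 'active', 'retired' or due for 'rekey' (illustrative)."""
    age = today - created
    if age <= ROTATION_AGE:
        return "active"    # still used to encrypt new data
    if periodic_rekeying and age > REKEY_AGE:
        return "rekey"     # data is re-encrypted with a new key, old key destroyed
    return "retired"       # kept for decryption only


if __name__ == "__main__":
    created = date(2024, 1, 1)
    for days in (10, 90, 400):
        print(days, key_state(created, created + timedelta(days=days), True))
```

In Snowflake itself, periodic rekeying is switched on through the account parameter PERIODIC_DATA_REKEYING; without it, the third state never occurs and the service decides on its own when retired keys are destroyed.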

Snowflake relies on the Hardware Security Module (HSM) of each hosting cloud provider to ensure that key storage and usage are secure. The HSM is in charge of the encryption/decryption process at the different levels of the hierarchy.

Customer Managed

Some customers may want to manage the master key themselves, to add an extra level of security. If the customer owns the root master key (MK), not even Snowflake can decrypt the data without it.

The MK is hosted on the same cloud provider, in your own account, in the key management service that each provider offers (for example AWS KMS, Google Cloud KMS or Azure Key Vault).

The customer-managed approach is more secure but carries more risk. Any uncontrolled action on this service, for example losing the keys, means you can no longer access your data.

TriSecure

This method provides the highest level of security, because it combines the two previous ones. A Snowflake root master key and a customer master key are combined into a Composite Master Key, which wraps all of the keys in the hierarchy. The composite master key itself is not used to encrypt data.

This service requires a Business Critical edition (or higher).

Conclusions

This layer is the last one, so it is in charge of managing the data to keep it safe. As we have seen, Snowflake itself encrypts data at rest and re-encrypts it periodically through rekeying, but it also provides multiple ways to reinforce this last phase of the data life cycle. You can not only ingest your own encrypted data, but also encrypt data yourself. Depending on your use case, you may need not just one of these methods but a combination of them.

You can go back to the initial Snowflake Security Framework (SSFW) (Part #1). Check out the rest of the layers here:

About me

I am an SME on different Data Technologies, with 20+ years of experience in Data Adventures. An experienced Snowflake Data Jedi and Data Vault Certified Practitioner. If you want to know more about the aspects covered here, or others, you can follow me on Medium || LinkedIn here.

I hope you have enjoyed it and that it helps you!

SME @ SDG Group || Snowflake Architect || Snowflake Squad Spotlight Member || CDVP Data Vault