GCP — BigQuery — Data Security at rest (Part 4)

Murli Krishnan
Google Cloud - Community
4 min readNov 15, 2022

So far in the part 3 of the blog, we discussed about the aead encryption functions that are available as part of GCP.

We will be discussing more on how to use DLP and cloud KMS for encryption of column data and design on having a common utility for encryption/decryption.
The focus of this blog is to show wider and richer integration of Bigquery with other GCP services.

This blog is a part of 5 part series on “Bigquery — Data Security at rest”.
Visit here for part 4 of the series
Visit here for the series menu

Bigquery Security Controls.

Encryption using cloud KMS and DLP

In this topic, we will be discussing on creating an re-usable utility which can perform the encryption/decryption requests not only from Bigquery but also any other client (REST, python).

Cloud KMS with Bigquery

In the part 3 of the blog, we saw in great deal on how to leverage AEAD encryption powered by tink library and cloud KMS for encryption/decryption of data.

In this section, we will see another approach of having a centralized cloud function which can be used as encryption/decryption utility leveraging remote functions.

Cloud KMS used for encryption/decryption

For getting started, we will create a cloud KMS key and cloud function that can perform encryption/decryption using the key.

For the Bigquery remote functions to work, the payload needs to meet the criteria as below.

Bigquery Remote Functions (Request)

The cloud function performing encryption/decryption using symmetric keys from cloud KMS is as below

Encryption using cloud KMS
Decryption using cloud KMS

The cloud function service account should be provided with required role to access the cloud KMS keys and perform the actions

Once the function is up and running, a BigQuery connection and Remote function can be created as below

A Bigquery connection needs to be created which provides connection service account to which the CloudFunctionInvoker role needs to be provided.

bq mk --connection --display_name='encrypt_decrypt' \
--connection_type=CLOUD_RESOURCE \
--project_id=<> \
--location=us-central1 \
encrypt_decrypt_connection

Big Query Remote function can be set up as below

CREATE FUNCTION `<>.<>`.symmetric_encrypt(data string) RETURNS string
REMOTE WITH CONNECTION `<>.us-central1.encrypt_decrypt_connection`
OPTIONS (endpoint = 'https://us-central1-<>.cloudfunctions.net/encrypt_decrypt_function',
user_defined_context= [("mode","encrypt"), ("key_type","symmetric")]);

CREATE FUNCTION `<>.<>`.symmetric_decrypt(data string) RETURNS string
REMOTE WITH CONNECTION `<>.us-central1.encrypt_decrypt_connection`
OPTIONS (endpoint = 'https://us-central1-<>.cloudfunctions.net/encrypt_decrypt_function',
user_defined_context= [("mode","decrypt"), ("key_type","symmetric")]);

The “user defined context” is used to indicate if the request is to encrypt or decrypt.
This can be considered as passing arguments to a function and based on arguments, action is taken.

Once the setup is completed, the bigquery encryption/decryption can be invoked via the remote function.

select name, 
`<>`.<>.symmetric_encrypt(name) as symmetric_encrypted,
from
(select name from `<>.dataset_us_central1.basic_df_users` limit 5);
Encryption Invocation from Bigquery

Decryption function can be invoked in a similar way.

select 
`<>`.<>.symmetric_decrypt(symmetric_encrypted) as symmetric_decrypted,
from
`<>.encrypted_table`;
Decrypted data output.

The cloud function can be invoked independently as well.

Cloud Function Test
Encrypted response

Considerations
1.
Cloud KMS provides capability to perform symmetric and asymmetric encryption however if the requirement is for deterministic encryption then aead or dlp should be used.
2. Cloud KMS libraries currently do not provide batching requests, so the performance implication should be checked for bulk encryption scenarios.

Encryption using cloud DLP

The encryption using cloud DLP follows the exact same setup as cloud KMS above.

Cloud DLP used for encryption/decryption

In this setup, the cloud DLP de-identify templates are created with Format Preserving encryption.

De-Identify Template

Rest of the setup stays similar to cloud KMS with only difference being cloud function invoking DLP API for encryption/decryption

Considerations
1.
There is cost involved to invoke DLP APIs which should be considered for the setup
2. The performance of the encryption/decryption should be monitored depending on the number of requests and quota limits.
3. DLP allows batching of requests which should be taken care during the implementation.

We will be continuing with the final part — Row Level Security in the part 5 of the series.

Please connect with me on https://www.linkedin.com/in/murli-krishnan-a1319842/ for any queries.

Happy Learning.

--

--