Implementing Tag-Based Access Control in BigQuery

Yusuke Enami(Kishishita)
4 min read · Sep 18, 2023

Introduction

BigQuery is a fully managed, petabyte-scale, cost-effective analytics data warehouse that lets you run analytics over vast amounts of data in near real time.

Data analytics often necessitates flexible data access control, especially when dealing with various data types that may include sensitive information, such as Personally Identifiable Information (PII). In this article, I will demonstrate how to use Resource Manager’s tags to implement access control in BigQuery.

Utilizing Tags in Google Cloud Resource Manager

Resource Manager tags let you label resources with Key-Value pairs. These pairs can then be referenced in Identity and Access Management (IAM) conditions to control access to the tagged resources.

For instance, to differentiate between various environments, you might establish environment as the Key and dev, stg, and prd as the corresponding Values. Similarly, if you aim to categorize different data types, you could set up dataset_type as the Key and non-pii and pii as the Values. Utilizing these Key-Value pairs allows you to fine-tune access permissions based on these conditions.
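Expressed in Terraform, which the rest of this article uses, the dataset_type example could look roughly like the sketch below. The resource names (dataset_type_key, pii_tag, non_pii_tag) and the descriptions are hypothetical, and local.organization.id is assumed to hold the numeric organization ID, as in the snippets later in this article.

```hcl
# Sketch only: a second tag key for classifying datasets by data type.
# Resource names and descriptions are illustrative assumptions.
resource "google_tags_tag_key" "dataset_type_key" {
  parent      = "organizations/${local.organization.id}"
  short_name  = "dataset_type"
  description = "Dataset type key"
}

# Value for datasets that contain Personally Identifiable Information.
resource "google_tags_tag_value" "pii_tag" {
  parent      = "tagKeys/${google_tags_tag_key.dataset_type_key.name}"
  short_name  = "pii"
  description = "Contains PII"
}

# Value for datasets that do not contain PII.
resource "google_tags_tag_value" "non_pii_tag" {
  parent      = "tagKeys/${google_tags_tag_key.dataset_type_key.name}"
  short_name  = "non-pii"
  description = "Does not contain PII"
}
```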

Creating Tags with Terraform

To kick things off, let’s start by creating the tags. First, we’ll focus on establishing the Key.

Note: In this example, the tags are created and managed at the organization level, so the tag key's parent is the organization.

# organization/tags_tag_key.tf

resource "google_tags_tag_key" "env_key" {
  parent      = "organizations/${local.organization.id}"
  short_name  = "environment"
  description = "Environment key"
}

Next, let’s generate the corresponding Value.

# organization/tags_tag_value.tf

resource "google_tags_tag_value" "dev_tag" {
  parent      = "tagKeys/${google_tags_tag_key.env_key.name}"
  short_name  = "dev"
  description = "Development tag"
}

resource "google_tags_tag_value" "stg_tag" {
  parent      = "tagKeys/${google_tags_tag_key.env_key.name}"
  short_name  = "stg"
  description = "Staging tag"
}

resource "google_tags_tag_value" "prd_tag" {
  parent      = "tagKeys/${google_tags_tag_key.env_key.name}"
  short_name  = "prd"
  description = "Production tag"
}
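The three value resources differ only in their short name and description, so they could also be generated from a single block with for_each. The following is a sketch of that alternative, not the layout used in the rest of this article; the resource name env_values and the local env_tag_values are hypothetical, and it would replace the three resources above rather than sit alongside them.

```hcl
# Alternative sketch: one resource block that creates all three values.
# "env_tag_values" and "env_values" are hypothetical names.
locals {
  env_tag_values = {
    dev = "Development tag"
    stg = "Staging tag"
    prd = "Production tag"
  }
}

resource "google_tags_tag_value" "env_values" {
  for_each = local.env_tag_values

  parent      = "tagKeys/${google_tags_tag_key.env_key.name}"
  short_name  = each.key   # dev, stg, or prd
  description = each.value # human-readable description
}
```

With this layout, an individual value is addressed as google_tags_tag_value.env_values["dev"].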

After applying this configuration, you can view the created Key-Value pairs in the Google Cloud Console under “IAM & Admin” -> “Tags”.
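If you prefer to check the result from Terraform rather than the console, output blocks along these lines could expose the generated names. This is a sketch: the output names are hypothetical, and it relies on the namespaced_name attribute that the Google provider exports for tag keys and tag values.

```hcl
# Sketch: expose the created tag key and one of its values after apply.
# Output names are illustrative assumptions.
output "environment_tag_key" {
  description = "Namespaced name of the environment tag key"
  value       = google_tags_tag_key.env_key.namespaced_name
}

output "dev_tag_value" {
  description = "Namespaced name of the dev tag value"
  value       = google_tags_tag_value.dev_tag.namespaced_name
}
```

The namespaced key name is also the form that the IAM condition later in this article passes to resource.matchTag.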

Creating BigQuery Datasets and Attaching Tags

To create the BigQuery datasets, one per environment, apply the following Terraform code:

# test-project/bigquery_dataset.tf

# The datasets reference the same project resource (project_one) that the
# tag bindings and the IAM member use in the later snippets.
resource "google_bigquery_dataset" "dataset_dev" {
  project    = google_project.project_one.project_id
  dataset_id = "dataset_dev"
  location   = "asia-northeast1"
}

resource "google_bigquery_dataset" "dataset_stg" {
  project    = google_project.project_one.project_id
  dataset_id = "dataset_stg"
  location   = "asia-northeast1"
}

resource "google_bigquery_dataset" "dataset_prd" {
  project    = google_project.project_one.project_id
  dataset_id = "dataset_prd"
  location   = "asia-northeast1"
}

Once the datasets exist, you can attach a tag to each of them using the google_tags_location_tag_binding resource.

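Note: At the time of writing, this resource was available only in the Google Beta provider, which is why each binding sets provider = google-beta. Check the provider documentation for its current status.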
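Because the bindings use the google-beta provider, that provider has to be declared in the configuration alongside the regular google provider. A minimal sketch of such a declaration follows; version constraints are deliberately left out, and the region setting is an assumption that simply mirrors the dataset location used in this article.

```hcl
# Sketch: declare both providers so that "provider = google-beta" resolves.
terraform {
  required_providers {
    google = {
      source = "hashicorp/google"
    }
    google-beta = {
      source = "hashicorp/google-beta"
    }
  }
}

provider "google-beta" {
  # Assumed default region, matching the datasets in this article.
  region = "asia-northeast1"
}
```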

# test-project/tags_location_tag_binding.tf

# Look up the tag key and values created in the organization configuration.
data "google_tags_tag_key" "env_key" {
  parent     = "organizations/${local.organization.id}"
  short_name = "environment"
}

data "google_tags_tag_value" "dev_tag" {
  parent     = "tagKeys/${data.google_tags_tag_key.env_key.name}"
  short_name = "dev"
}

data "google_tags_tag_value" "stg_tag" {
  parent     = "tagKeys/${data.google_tags_tag_key.env_key.name}"
  short_name = "stg"
}

data "google_tags_tag_value" "prd_tag" {
  parent     = "tagKeys/${data.google_tags_tag_key.env_key.name}"
  short_name = "prd"
}

# Bind each environment tag value to the matching dataset.
resource "google_tags_location_tag_binding" "dev" {
  provider  = google-beta
  parent    = "//bigquery.googleapis.com/projects/${google_project.project_one.project_id}/datasets/${google_bigquery_dataset.dataset_dev.dataset_id}"
  tag_value = data.google_tags_tag_value.dev_tag.id
  location  = "asia-northeast1"
}

resource "google_tags_location_tag_binding" "stg" {
  provider  = google-beta
  parent    = "//bigquery.googleapis.com/projects/${google_project.project_one.project_id}/datasets/${google_bigquery_dataset.dataset_stg.dataset_id}"
  tag_value = data.google_tags_tag_value.stg_tag.id
  location  = "asia-northeast1"
}

resource "google_tags_location_tag_binding" "prd" {
  provider  = google-beta
  parent    = "//bigquery.googleapis.com/projects/${google_project.project_one.project_id}/datasets/${google_bigquery_dataset.dataset_prd.dataset_id}"
  tag_value = data.google_tags_tag_value.prd_tag.id
  location  = "asia-northeast1"
}

You can then verify that each tag has been attached correctly by viewing the dataset in the Google Cloud Console.

The dev tag is attached to dataset_dev.

Associating IAM Roles with Tags

Finally, you can associate IAM roles with the tags you’ve created. In this example, the user is granted the BigQuery Data Viewer role only on datasets that have the dev tag attached.

# test-project/project_iam_member.tf

resource "google_project_iam_member" "dev_user" {
  project = google_project.project_one.project_id
  role    = "roles/bigquery.dataViewer"
  member  = "user:xxx@yyy.com"

  # The role applies only to resources tagged environment = dev.
  condition {
    title       = "only-dev"
    expression  = "resource.matchTag(\"${local.organization.id}/environment\", \"dev\")"
    description = "Only view the dataset with tag of dev."
  }
}

After applying this code, the user can see only dataset_dev. Because the granted role does not include permissions for managing tags, the user also cannot edit the tags to widen their own access.
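The same pattern extends to other principals and conditions. The sketch below shows two variations under stated assumptions: the member addresses are placeholders, the resource names are hypothetical, and the second binding assumes the tag key and value data sources from the tag binding file are available in the same configuration. The first combines two resource.matchTag checks with the logical OR operator; the second uses resource.matchTagId, which matches on the permanent tag key and tag value IDs instead of their names.

```hcl
# Sketch: a reviewer who may view datasets tagged stg or prd.
resource "google_project_iam_member" "stg_prd_user" {
  project = google_project.project_one.project_id
  role    = "roles/bigquery.dataViewer"
  member  = "user:reviewer@example.com" # placeholder principal

  condition {
    title       = "only-stg-or-prd"
    description = "Only view datasets tagged stg or prd."
    expression = join(" || ", [
      "resource.matchTag(\"${local.organization.id}/environment\", \"stg\")",
      "resource.matchTag(\"${local.organization.id}/environment\", \"prd\")",
    ])
  }
}

# Sketch: the dev condition written against permanent IDs
# (tagKeys/... and tagValues/...) rather than namespaced names.
resource "google_project_iam_member" "dev_user_by_id" {
  project = google_project.project_one.project_id
  role    = "roles/bigquery.dataViewer"
  member  = "user:analyst@example.com" # placeholder principal

  condition {
    title       = "only-dev-by-id"
    description = "Only view datasets tagged dev, matched by tag ID."
    expression  = "resource.matchTagId(\"tagKeys/${data.google_tags_tag_key.env_key.name}\", \"${data.google_tags_tag_value.dev_tag.id}\")"
  }
}
```

Matching by ID keeps a condition stable if a tag is later deleted and recreated under the same name, at the cost of a less readable expression.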

Summary

In this article, I’ve detailed the steps for using Resource Manager tags to implement granular access control in BigQuery. Combining IAM with tag conditions is a powerful way to fine-tune access permissions in BigQuery. I encourage you to try these capabilities in your own environment!

Yusuke Enami(Kishishita)

DevOps engineer at a Japanese company. I love Google Cloud, Kubernetes, machine learning, Raspberry Pi, and working out 🏋️‍♂️ https://bigface0202.github.io/portfolio/