Fine-grained Access Control with Permission Table in Databricks SQL

Databricks SQL SME
DBSQL SME Engineering
6 min readApr 3, 2024
Permission Table model — author screenshot

Author: Ian Liao, Specialist Solutions Architect @ Databricks

Introduction

Fine-grained access control (FGAC) is a security mechanism that allows organizations to precisely regulate who can access specific pieces of data and under what conditions. Unlike coarse-level access control, which typically rely on roles or group memberships to define access privileges for catalogs/databases/tables, FGAC enables administrators to define access policies within a table or a view based on a wide range of attributes, including user or group memberships, conditions, and data attributes.

A commonly asked question by Databricks data administrators is how to implement and efficiently operate FGAC. Common considerations include:

  • Data admins would like to view FGAC grants in a single pane of glass
  • FGAC grants can be applied to multiple data tables
  • How to minimize code change when business logic has changed
  • How to support complex business logic that includes mixture of conditions and data attributes

In this article we address this question with a design pattern called permission table. A permission table is a database table that stores FGAC grants information. It can be referenced by dynamic views, row filter functions and column mask functions — the features Databricks offers to implement FGAC. Data admins can use insert/update/delete SQL statement against a permission table to grant/modify/revoke fine-grained data access for a user or a group. We will walk through this concept using a few examples.

Related Databricks Documentation:

The Basics

Databricks offers two functions to identify a user or its role: current_user() and is_account_group_member().

current_user() returns the user name of the user who is executing a query.

SELECT current_user();
Result of running SELECT current_user()

is_account_group_member(group_name) returns true if the user belongs to the input group, and false if not. Since I am a member of the group, NA Sales Managers, it is true for me.

SELECT is_account_group_member('NA Sales Managers');
Result screenshot

On top of a country level sales table, data admins can create a dynamic view that only allows regional sales managers to have access to sales records in their region using is_account_group_member()

SELECT * FROM country_sales;
Result screenshot
CREATE OR REPLACE VIEW country_sales_rls_view AS
SELECT
*
FROM
country_sales
WHERE
CASE WHEN business_region = 'NA' THEN is_account_group_member('NA Sales Managers')
WHEN business_region = 'EMEA' THEN is_account_group_member('EMEA Sales Managers')
WHEN business_region = 'APAC' THEN is_account_group_member('APAC Sales Managers')
WHEN business_region = 'LATAM' THEN is_account_group_member('LATAM Sales Managers') END

As a member of NA Sales Managers group, I only have access to sales records for USA and Canada through the dynamic view.

SELECT * FROM country_sales_rls_view;
Result screenshot of dynamic view

Similar logic can be implemented using the row filter function as well.

Simplify and Scale with Permission Table

In the examples above, Fine Grain Access Control (FGAC) logic is hard-coded in the view definition. It is difficult to reuse and also requires refreshing view definition when making changes to the logic. Permission table solves these problems by putting Group and business_region mapping into a table and decouple it from view definition.

We can call a record in the permission table a ‘grant’, which grants a group privilege to access data in a particular region. Data admins can simply add new records in the permission table to grant privileges to groups, or delete records to revoke privileges without modifying view definition.

One permission table can be applied to multiple use cases that share the same FGAC logic, which is another great benefit that can reduce the workload of data admins.

SELECT * FROM FGAC_permission_table;
Result screenshot of the permission table

Querying the view below will return the same filtered result set as the example in the previous section.

CREATE OR REPLACE VIEW country_sales_rls_view AS
SELECT *
FROM
country_sales CS
WHERE
EXISTS(
SELECT 1
FROM
fgac_permission_table FPT
WHERE
CS.business_region = FPT.business_region
AND (is_account_group_member(group_name) OR User = current_user())
)

More Requirements & Capabilities

We can enhance the permission table and FGAC logic design to satisfy complicated requirements for real use cases. For example:

  • A user name column in the permission table can be used to grant access privilege per user alongside group based grants;
  • Country level grants can enable even finer access control;
  • Use ‘*’ as wildcard for all attributes;
  • Clearance level can be an additional requirement to meet.
SELECT * FROM FGAC_permission_table;
Updated permissions table with additional dynamic filtering

In the updated permission table, sales managers may lose access to some sales records due to clearance requirements. Record 5 grants me special access to GERMANY’s sales. Record 6 grants executives full access to the dataset.

We will need to adjust the FGAC logic accordingly as well. In this example, we will implement it using the row filter function approach.

CREATE OR REPLACE FUNCTION row_func (BusinessRegion STRING, CountryName STRING, ReqClearanceLevel INTEGER)
RETURN EXISTS(
SELECT 1 FROM FGAC_permission_table FPT
WHERE (is_account_group_member(FPT.group_name) or current_user() = FPT.user_name)
AND (FPT.business_region = BusinessRegion or FPT.business_region = '*')
AND (FPT.country = CountryName or FPT.country = '*')
AND (FPT.clearance_level >= ReqClearanceLevel)
);


CREATE OR REPLACE TABLE country_sales_w_row_func (
country STRING,
business_region STRING,
required_clearance_level INTEGER,
sales DOUBLE
)
WITH ROW FILTER row_func ON (business_region, country, required_clearance_level);


INSERT INTO country_sales_w_row_func VALUES
('USA', 'NA', '1', 100),
('Canada', 'NA', '2', 200),
('Germany', 'EMEA', '3', 300),
('Italy', 'EMEA', '0', 400),
('Japan', 'APAC', '1', 500),
('China', 'APAC', '2', 600),
('Brazil', 'LATAM', '1', 700),
('Argentina', 'LATAM', '0', 800)

Note that using the row filter function approach, read access is restricted to the target table even though I have ‘select’ access to the table, country_sales_w_row_func, and I inserted records to it.

While in the dynamic view approach, read access is restricted to the view. Users with select access to the base table, country_sales can still see all records. The best practice is to share the dynamic view country_sales_rls_view to the end users but not to share the base table country_sales.

As expected, I lost access to Canada sales due to clearance requirements but gained access to Germany sales from grant record 5.

SELECT * FROM country_sales_w_row_func;
Updated filtered tables with Row Masking Functions

Comprehensive Wrap-Up

In this article, we introduced the permission table design pattern as a means to implement fine-grained access control (FGAC) in Databricks. We explained the key benefits that the permission table brings to support FGAC, and demonstrated its design and implementation through several examples.

The Versatile Permission Table

The permission table is a highly useful design pattern that enables data administrators to easily set up and manage fine-grained access control within their Databricks environment. The use cases it can address extend far beyond the examples presented in this article.

Expanding the Functionality

The permission table design can be further enhanced to incorporate additional FGAC capabilities, such as:

  • Column-level data masking
  • Implementing both ALLOW and DENY policies
  • Supporting both AND and OR conditional logic

Invitation to Explore

I encourage you to explore the permission table design pattern and put it into practice in your own Databricks deployments. If you have any interesting use cases or insights to share, please feel free to reach out to us or leave a comment.

The permission table is a flexible and powerful tool that can help data teams establish robust access control mechanisms within their Databricks environment. By leveraging this design pattern, you can streamline the management of fine-grained permissions and unlock new possibilities for secure data governance.

--

--

Databricks SQL SME
DBSQL SME Engineering

One stop shop for all technical how-tos, demos, and best practices for building on Databricks SQL