Snowflake RBAC Implementation with Permifrost

Pascal Moreau
Yousign Engineering & Product

--

In a fast-growing scale-up company, the data platform is not always ready for a massive influx of new users. In this context, having a clean and scalable role management strategy is mandatory.

In this article, we will go through the implementation at Yousign of Role Based Access Control (RBAC) in Snowflake using Permifrost, an open-source project initiated by Gitlab.

How we designed our Role Based Access Control in Snowflake

Role-Based Access Control basics

RBAC (Role-Based Access Control) is a classic approach consisting of limiting access to objects via roles and permissions.

The following quotation by John Ryan (senior solution architect at Snowflake) is a good summary of what RBAC means in Snowflake.

“ To summarize, Role-Based Access Control (RBAC) is the method used by Snowflake to control access to data and compute resources. Users are granted access to Roles which in turn are granted access to Database Objects and at a minimum, you will need to be granted USAGE on the database and schema. ”

Below is a graphical summary of what an RBAC strategy should look like.

Snowflake RBAC principles, credits: John Ryan

If you are not familiar with RBAC I really encouraged the reading of this great trilogy of articles from John Ryan about RBAC on Snowflake: https://www.analytics.today/blog/introducing-snowflake-rbac https://www.analytics.today/blog/snowflake-system-defined-roles-best-practice https://www.analytics.today/blog/designing-snowflake-role-based-access-solutions

Define our RBAC strategy

This process can be long but it is an investment for the future of your data platform!

Here is how we defined our RBAC strategy at Yousign:

First, we defined a role hierarchy, and in our case, following Snowflake’s recommendations, we chose to have two types of roles:

  • Security roles: Roles giving permissions to data warehouse objects (databases, schemas, tables, warehouses…)
  • Functional roles: Roles according to rights related to a job function (Data Analyst, Analytics Engineer, Data Engineer…)

These roles are linked by hierarchical relationships: functional roles inherit one or more security roles.

Then, we matched those roles and our Snowflake objects going through the following steps:

  • List our Snowflake objects (from warehouses to schemas), their owners, and users
  • Create security roles for these objects
  • Create functional roles according to the roles in our company
  • Link security roles to functional roles

Once this strategy was defined, we looked into tools to implement it.

The granularity of permissions goes from databases to tables, however, Snowflake and the data community advise stopping at the schema level for basic use-cases RBAC strategy.

Benchmark of RBAC automation tools

The next step was to choose the implementation, we had some important requirements for the tool:

  • Code-based for version control
  • Easy to use
  • Quick implementation

The standard way of managing access is to create an SQL script with queries about roles and permissions and run it whenever a change is needed. This is the easiest solution to choose but the cons are that it is difficult to maintain and confusing because of the size of such a script.

As a result, we had a look at “Infrastructure as code” tools like Pulumi and Permifrost (more permissions as code but we will discuss below) to avoid managing accesses directly into SQL scripts.

We chose Permifrost for its fast integration and easy usage. Pulumi is really effective if you want to manage all your Snowflake infrastructure, but in our RBAC case, the ratio cost of development/usefulness was better for Permifrost.

Permifrost

Concept

Permifrost is an open-source project for managing Snowflake permissions and roles using a YAML configuration file. It is a Python package developed and open-sourced by Gitlab data team and is now supported by a great maintainers team from the community.

Permifrost has the capacity to:

  • Generate and run permissions on databases, schemas, tables, and warehouses (other objects are currently under development or are waiting for PR submissions)
  • Generate and grant ownership of objects to roles
  • Grant permissions to roles
  • Grant roles to users

The rest of this article will introduce the tool and how we use it at Yousign.

Configuration

Permifrost uses a YAML configuration file and runs queries in Snowflake to align the Snowflake permissions with the configuration.

The Permifrost YAML file consists of blocks defining:

  • Snowflake objects like databases, and warehouses (file format and integrations are under development):
  • Roles and users:

You will find all details about this configuration file on the Permifrost gitlab repository.

Usage

Installation

To be able to use Permifrost, you can follow the steps detailed below:

  • Create a user for Permifrost in Snowflake
  • In Snowflake, Choose and configure a connection method (password, OAuth, external browser SSO, key pair).
  • Install Permifrost: pip install permifrost
  • Set environment variables
  • Finally create a configuration file according to your RBAC strategy.

-> Permifrost can be included in Poetry for package management.

Run

Once the configuration file is done, run Permifrost locally with the CLI:

Permifrost will convert the YAML file to Snowflake queries as below then run them into Snowflake:

You can also run permissions only for a role or a user:

Implementations caveats

Some limitations worth mentioning (some could be mitigated in future versions):

  • Permifrost cannot create objects (users, tables etc.): we overcome this at Yousign with a short script generating CREATE queries from the YAML file of Permifrost.
  • Permifrost does not manage your Snowflake infrastructure: it does not create any databases nor warehouses, it grants them permissions. Some fields in the YAML file are informative even though they are mandatory, and they will need to be manually updated when you change something such as warehouse sizes.
  • Permifrost have a limitation with reserved keywords object names like: group, order, etc. (but why use them in the first place? 😉)

As this is an open source project, remember that all contributions are welcome!

Conclusion

Our new role management based on Permifrost is now an efficient layer of our scaling data platform. This project is also a good evangelization of database management in and outside our team.

On a personal note, it allowed me to do several open source contributions to the Permifrost project and to meet some of the maintainers, which was a very nice experience!

--

--