3 Steps to Find Access Patterns in Snowflake Using Hierarchical Queries
by Venkatesh Sekar
In my previous blog post, I demonstrated an approach to view Snowflake’s role hierarchy using visualization libraries and present it in a GUI. Although this approach was simple, I realized that a number of Snowflake customers, regardless of size, wouldn’t be able to adopt this approach due to some limiting factors:
- An expert UI/UX developer is needed for an eye-catching, convincing visualization and not all teams are necessarily equipped with a UI/UX developer
- Not all clients will allow a developer to install node, python, etc. on their laptop specifically for this purpose or there may be a significant approval process to allow installation of a “non-approved” tool
- A hosting environment is needed for the web GUI
- The timeline to develop, deliver, and host the solution on a webpage could be lengthy
- Although visually appealing, this interaction is still not the same as querying and filtering
- With a varying number of custom roles, the graph becomes complex and one has to zoom in and out, losing context now and then.
These and other considerations caused me to rethink how I could answer queries like:
- What are the different roles accessing a specific object (e.g. a schema)?
- How would I create a textual report that shows the path that looks like a Neo4j Cypher query?
- How should I filter out various access patterns where a specific custom role has access to a warehouse?
It came down to this:
In order to be more developer-friendly, is there a way that I can use SQL to answer these queries?
A Hierarchical Query Solution To Consider
As the title of this story reflects, I figured out a method to solve this with a solution using the hierarchical query clause ‘CONNECT BY’ and below I’ve referenced a Snowflake doc for you to take a look at.
Doc: Querying Hierarchical Data
For this post, I’ll walk through how you can answer the questions above using hierarchical queries to find access patterns in Snowflake using just the tried and true Snowflake Worksheet environment.
Example Access Patterns
As the saying goes, “A picture is worth a thousand words”, below is a screenshot that highlights various access patterns:
From this output record:
->ACCOUNTADMIN(ROLE)-(USAGE)->SYSADMIN(ROLE)-(USAGE)->
SANGEET_PYTHON_SERVICE_ACCOUNT_ROLE(ROLE)-(USAGE)->TEST_DB_SME_SCHEMA_ADMIN(ROLE)
-(USAGE)->TEST_DB.TEST_DB(DATABASE)
I can quickly determine the role hierarchy and the various roles which can access the database ‘TEST_DB’.
The screenshot also reflects the usage of SnowSQL to get this information.
Let’s Look at Roles and Grants
Snowflake provides the GRANTS TO ROLES views in the ‘SNOWFLAKE.ACCOUNT_USAGE’ schema. To access this schema you would need ‘IMPORTED PRIVILEGES.
Once granted access, the view can be queried. Below is a sample output of the view:
Let’s Do This
Okay, now if you are convinced, read on for a step-by-step guide:
Step 1: Create a Schema
You will need to create tables to store intermediary results and improve speed, so you need to create a schema where you can create a table.
Step 2: Cache the GRANTS_TO_ROLES Result to a Table
Based on my interactions, I find the data retrieval of GRANTS TO ROLES view is slow. This could vary across Snowflake environments due to the number of custom roles and grants that are involved or another factor such as Snowflake is building the dataset only on-demand.
I recommend capturing the result-set into a table using the query below:
CREATE OR REPLACE TRANSIENT TABLE GRANT_TO_ROLES_RS_CACHED as
SELECT *
FROM SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_ROLES
WHERE GRANTED_TO in ('ROLE','DATABASE','SCHEMA','WAREHOUSE')
AND DELETED_ON IS NULL ;
In order to use the CONNECT_BY clause with ease; I transform the records of GRANT_TO_ROLES into GRANT_TO_ROLES_RS_CACHED table as below:
CREATE OR REPLACE TRANSIENT TABLE HIERARCHY_PROJECTED AS SELECT * FROM(
WITH BASE AS(
SELECT PRIVILEGE ,GRANTED_ON ,NAME,
TABLE_CATALOG ,TABLE_SCHEMA ,GRANTED_TO,GRANTEE_NAME
FROM GRANT_TO_ROLES_RS_CACHED
WHERE GRANTED_ON IN ('ACCOUNT','ROLE','WAREHOUSE' ,'DATABASE','SCHEMA') ),PROJ AS(
SELECT
CONCAT(GRANTEE_NAME ,'(',GRANTED_TO,')') PARENT
,CONCAT( IFF( TABLE_CATALOG IS NOT NULL,TABLE_CATALOG||'.','' ) ,IFF( TABLE_SCHEMA IS NOT NULL , TABLE_SCHEMA || '.' , '' )
,NAME
,'(' ,GRANTED_ON ,')') AS OBJ
,PRIVILEGE FROM BASE
ORDER BY PARENT
)SELECT * FROM PROJ
UNION
SELECT NULL PARENT ,'ACCOUNTADMIN(ROLE)' OBJ ,NULL PRIVILEGE
);
Step 3: Execute a Query with CONNECT_BY
With the table built above, it is now easier to query using the ‘CONNECT_BY’ clause as below:
WITH ACCESS_PATH_TBL AS(
SELECT SYS_CONNECT_BY_PATH(OBJ,
IFF(PRIVILEGE IS NOT NULL,CONCAT('-(',PRIVILEGE')->'),'->')
)AS PATH,*
FROM HIERARCHY_PROJECTED
--WHERE OBJ LIKE '%(SCHEMA)%'
START WITH PARENT IS NULL
CONNECT BY PARENT = PRIOR OBJ
)SELECT PATH
FROM ACCESS_PATH_TBL
-- WHERE PATH LIKE '%RTE(ROLE) -(USAGE)%'
;
One observation I made is that the data retrieval is fast even on a Snowflake XS warehouse. You can see that the hierarchy shows all the way from the ACCOUNTADMIN role to the lowest level roles in the hierarchy.
That’s it!
That’s all it takes to create that report. You could also apply filters on the result such as:
- Where the end object in the access path is a specific schema:
Path like ‘%(USAGE)-> TEST_DB.VENKAT(SCHEMA)’
Path like '%RTE(ROLE) -(USAGE)%'
Why Consider This Approach?
If your team does not have the time or skills to develop a custom UI based approach and you need to quickly answer how a specific Snowflake object can be accessed, using the method above would save you time.
It’s fast and simple and it doesn’t require any special skillsets other than just good old SQL. It took a little less than an hour to get the result.
Try This Too
If you appreciate this approach and want to further explore hierarchical data, you could try the hierarchical query by combining GRANTS TO USERS and GRANTS TO ROLES views to answer queries like:
- What are the various roles that are granted to a user?
- What are the different roles by which a user can access a specific database object?
Final Thoughts
Although I demonstrated how to get the access pattern using CONNECT_BY, you could do more with these hierarchical query clauses. The more that I work with Snowflake’s industry-leading SaaS data platform, the more I realize that they constantly add to new service functionality daily; the more we understand those new adds, the better we will leverage the service and accelerate data-driven business outcomes.
Need Help with Your Cloud Initiatives?
If you are considering the cloud for migrating or modernizing data and analytics products and applications or if you would like help and guidance and a few best practices in delivering higher value outcomes in your existing cloud program, then please contact us.
Hashmap offers a range of enablement workshops and assessment services, cloud modernization and migration services, and consulting service packages as part of our Cloud (and Snowflake) service offerings.
Other Tools and Content for You
Feel free to share on other channels and be sure and keep up with all new content from Hashmap here.
Venkat Sekar is Regional Director for Hashmap Canada and is an architect and consultant providing Data, Cloud, IoT, and AI/ML solutions and expertise across industries with a group of innovative technologists and domain experts accelerating high-value business outcomes for our customers.