by Venkatesh Sekar
It was an Easy Decision
After researching the market for cloud data warehouses, you have convinced your organization to make a move and adopt Snowflake. As the leading cloud data warehouse in the market operating across all 3 major cloud vendors (AWS, Azure, and GCP), delivering independent virtual compute, and offering a consumption-based pricing model, you feel great about your decision — it was actually one of the easiest ones that you’ve made in your career.
In addition, you have also identified some high-value use cases which you and your team intend to implement and deliver over the next quarter. Your team proceeds with use case development and they are able to use a lot of the great tips and quick starts as demonstrated in the “Zero to Snowflake” events and other Snowflake how-to documentation. You are really pleased that Snowflake just works and works as advertised!
Fast forward six months: word has gotten around about the success of Snowflake, users are very happy, and a number of additional use cases and analytics applications are being developed quickly in Snowflake across business units.
But Don’t Skip Over Best Practices
Upon closer inspection you find some situations where teams didn’t take advantage of best practices or techniques — below are some examples:
- New databases were created on an “as needed” basis
- Tables were created and owned by SYSADMIN or ACCOUNTADMIN
- Databases were also created and owned by SYSADMIN or ACCOUNTADMIN
- There were no clear privileges and roles defined, i.e. everyone is a SYSADMIN
You get the idea — in the rush to get new analytics applications out the door and start taking advantage of Snowflake, there were no clear best practices that were adopted.
Snowflake has provided awesome documentation, videos, tutorials, etc. The ability to use SQL for almost everything has also made it easier for developers to quickly learn and implement solutions in the service.
You realize that no clear-cut direction was provided as to where to start, what needed to be implemented first, and how to proceed in order to maximize value for the overall organization.
Some of the clients who have called me in for assistance are in the above situation. I have helped review their environments, conveyed the best practices to be implemented, and usually ended up identifying, mentoring, and working with the client team on adopting a range of best practices for both the cloud overall and Snowflake specifically.
But Sometimes You Are Given a Blank Slate
In a recent engagement though, I was presented with a pleasant surprise. The client gave me a blank slate, meaning a fresh Snowflake account and no migrations required, and asked me to assist them in a way that would allow the team to ease into Snowflake adoption while using best practices and a “lessons learned” approach.
This client request really became my foundation for this post. I’ll lay out for you a series of steps for adopting Snowflake in an orderly fashion with best practices that will benefit you both today and going forward with what is sure to be the inevitable growth of the service across your organization.
If you’re already down the path with Snowflake, I hope I can help you better understand a few steps that you might want to consider.
For my scenario and this post, you’ll have the chance to follow the users of a company called “Los Pollos”.
Here’s the Checklist for Snowflake Adoption on Day 1 and 2
The following is my suggested approach for Snowflake adoption with a primary focus on the first two days for purposes of this post. You want to get things right in the beginning.
Day 1 (Administration)
- Ensure accountadmin roles are defined
- Enable MFA
- Define resource monitors
- Create roles and role hierarchy
- Create sandbox users
- Assign roles to sandbox users
Day 2 (Open the Gates to Snowflake)
- Creating warehouses
- Creating databases
- Creating schemas
- Defining tables
- Defining stages
- Loading tables
I’ll briefly explain each of these steps and point you to the essential (i.e., required reading) Snowflake documentation, which will give an even deeper perspective. As I review the steps, I’ll also provide sample code where appropriate.
While the names used (e.g. Breaking Bad users) or even the methodology used may not follow your specific existing enterprise standards, this should give your team a starting point for using and developing in Snowflake today.
When the appropriate rules, guidelines, and integrations are properly implemented (maybe at a later date for you), your team should know what to do, how to do it, and who will do it, giving you a much more predictable and governed adoption experience.
Day 1 (Administration)
Day 1 will be focused on basic administration and as such, the actors would typically be administrators.
Define the ACCOUNTADMIN Role
The account administrator (ACCOUNTADMIN) role is the most powerful role in the system. This role alone is responsible for configuring parameters at the account level. Users with the ACCOUNTADMIN role can view and operate on all objects in the account, can view and manage Snowflake billing and credit data, and can stop any running SQL statements.
Assign this role to at least 2 users
Per Snowflake’s documentation, Snowflake follows strict security procedures for resetting a forgotten or lost password for users with the ACCOUNTADMIN role, and those procedures can take up to two business days. Assigning the ACCOUNTADMIN role to more than one user avoids them entirely, because the users can reset each other’s passwords.
Reference: Using the ACCOUNTADMIN Role (Snowflake documentation)
Below are the steps for defining an account admin role:
#1 — Walter White (existing ACCOUNTADMIN) creates a user account for Skyler (Enterprise Architect).
```sql
USE ROLE SECURITYADMIN;

-- The login name contains "@" and ".", so it must be quoted.
CREATE USER "SWHITE@LOSPOLLOS.COM"
  PASSWORD = 'password1234~'
  EMAIL = 'SWHITE@LOSPOLLOS.COM'
  MUST_CHANGE_PASSWORD = TRUE;

GRANT ROLE ACCOUNTADMIN TO USER "SWHITE@LOSPOLLOS.COM";
```
#2 — Walter asks Skyler to log in immediately and validate her access. She is also forced to change her password on first login.
Enable Multi-Factor Authentication
You definitely need to protect the account admin logins. Whether general Snowflake users leverage MFA is a secondary decision, but MFA should definitely be enabled for anyone holding the ACCOUNTADMIN role.
I’ll set aside into a “technical parking lot” the discussion on which MFA method should be adopted, and will opt to start with Duo. Users do not need to separately sign up with Duo or perform any tasks, other than installing the Duo Mobile application, which is supported on multiple smartphone platforms (iOS, Android, Windows, etc.).
Go ahead and enable MFA for both ACCOUNTADMIN users, Walter and Skyler.
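Enrollment itself is self-service: each user enrolls from the Snowflake web UI under their user preferences. The commands below are a sketch of the admin-side pieces for verifying and managing MFA state once Walter and Skyler have enrolled; the user name follows the earlier example.

```sql
USE ROLE ACCOUNTADMIN;

-- Check whether the user has enrolled in MFA
-- (the EXT_AUTHN_DUO property will show as true).
DESCRIBE USER "SWHITE@LOSPOLLOS.COM";

-- Temporarily let a user bypass MFA, e.g. phone left at home.
ALTER USER "SWHITE@LOSPOLLOS.COM" SET MINS_TO_BYPASS_MFA = 30;

-- Reset MFA entirely so the user can re-enroll on a new device.
ALTER USER "SWHITE@LOSPOLLOS.COM" SET DISABLE_MFA = TRUE;
```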
Define resource monitors
If you are opening a credit card for one of your kids headed off to college for the first time, you would likely want to set some guidelines such as how much they are allowed to spend during a billing cycle on groceries. You’d probably also define monitoring rules for their account such as notifications if there is a transaction exceeding $100, the credit limit of $500 is reached, or a transaction was made across the country. Defining the rules upfront prevents accidental expenditure and tough conversations later on.
Similarly, it is better to define your thresholds in Snowflake ahead of time before opening up the gates to the users. Otherwise, you could burn through credits quicker than anticipated and be left wondering what happened at the end of the month.
Let’s define some arbitrary limits for today, knowing that we can come back on a later date to reset to appropriate levels:
```sql
USE ROLE ACCOUNTADMIN;

CREATE RESOURCE MONITOR ACCOUNT_LIMIT_250
  WITH CREDIT_QUOTA = 250
  FREQUENCY = MONTHLY
  START_TIMESTAMP = IMMEDIATELY
  TRIGGERS ON 100 PERCENT DO NOTIFY;

CREATE RESOURCE MONITOR ACCOUNT_LIMIT_300
  WITH CREDIT_QUOTA = 300
  FREQUENCY = MONTHLY
  START_TIMESTAMP = IMMEDIATELY
  TRIGGERS ON 100 PERCENT DO SUSPEND;

CREATE RESOURCE MONITOR ACCOUNT_LIMIT_375
  WITH CREDIT_QUOTA = 375
  FREQUENCY = MONTHLY
  START_TIMESTAMP = IMMEDIATELY
  TRIGGERS ON 100 PERCENT DO SUSPEND_IMMEDIATE;
```
You can also enable email notifications for account administrators; see the Snowflake documentation on working with resource monitors.
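Note that a resource monitor does nothing until it is assigned. The sketch below shows one way to assign the monitors defined above; only one monitor can be set at the account level (it caps the credits of all warehouses combined), while the others can be attached to individual warehouses once those exist. The pairing here is illustrative.

```sql
USE ROLE ACCOUNTADMIN;

-- The account-level monitor tracks credit usage across all warehouses.
ALTER ACCOUNT SET RESOURCE_MONITOR = ACCOUNT_LIMIT_375;

-- Warehouse-level monitors track individual warehouses
-- (PLAY_XS is created on Day 2).
ALTER WAREHOUSE PLAY_XS SET RESOURCE_MONITOR = ACCOUNT_LIMIT_250;
```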
Creating Roles and Role Hierarchy
In order for a user to access any database objects (tables, views, etc.) in Snowflake, the user must be granted a role, and that role must have the appropriate privileges (USAGE, SELECT, CREATE, etc.) on the object in question.
This is where we tend to see some missteps in the early days of getting set up. So, to keep it simple: if you are wondering what roles to create, instead of giving all the developers and analysts the SYSADMIN role, you could adopt the roles below as an initial pattern.
The role hierarchy looks like this:

```
SYSADMIN
└── DBA
    ├── DEVELOPER
    │   └── LOADER
    └── ANALYST
        └── VISUALIZER
```
Get started by creating the roles:
```sql
USE ROLE SECURITYADMIN;

CREATE ROLE DBA;
CREATE ROLE DEVELOPER;
CREATE ROLE LOADER;
CREATE ROLE ANALYST;
CREATE ROLE VISUALIZER;
```
Then define the role hierarchy:
```sql
USE ROLE SECURITYADMIN;

GRANT ROLE LOADER TO ROLE DEVELOPER;
GRANT ROLE DEVELOPER TO ROLE DBA;
GRANT ROLE VISUALIZER TO ROLE ANALYST;
GRANT ROLE ANALYST TO ROLE DBA;
GRANT ROLE DBA TO ROLE SYSADMIN;
```
Next, give base privileges to the DBA role:
```sql
USE ROLE SYSADMIN;

GRANT CREATE DATABASE ON ACCOUNT TO ROLE DBA;
GRANT CREATE WAREHOUSE ON ACCOUNT TO ROLE DBA;
```
Create Sandbox Users
It can take some time to work with your enterprise security team to set up that environment (AD, Okta, etc.), but in the interim, your developers, BI analysts, and data scientists can start experimenting in Snowflake and make progress on developing a sample use case.
Create some “play users” assigned to specific developers and proceed as intended. Once the AD or Okta integration is complete, the developers can be granted the appropriate roles and the play users can be dropped. Progress continues without any interruption, since database objects are owned by roles rather than users.
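To see why role ownership matters, here is a hypothetical cleanup once the AD/Okta integration is live: dropping a play user leaves every object created in that user’s sessions intact, because ownership sits with the role that was active (e.g. DEVELOPER), not with the user.

```sql
USE ROLE SECURITYADMIN;

-- Jesse's play user goes away; tables he created under the
-- DEVELOPER role remain, now usable by the SSO-backed users.
DROP USER IF EXISTS JPINKMAN;
```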
Here’s how you would create those users:
```sql
USE ROLE SECURITYADMIN;

CREATE USER SGOODMAN
  PASSWORD = 'password1234~'
  EMAIL = 'SGOODMAN@LOSPOLLOS.COM'
  MUST_CHANGE_PASSWORD = TRUE;
```
Use the same statement to create accounts for the other Los Pollos users: HSCHRADER, JPINKMAN, and GGUS.
Assign Roles to Play Users
Here’s what you need to do to assign roles to the play users:
```sql
USE ROLE SECURITYADMIN;

GRANT ROLE SECURITYADMIN TO USER SGOODMAN;
GRANT ROLE DBA TO USER HSCHRADER;
GRANT ROLE DEVELOPER TO USER JPINKMAN;
GRANT ROLE ANALYST TO USER GGUS;
```
In addition to granting roles to users, take the extra step of setting the default role for these users as outlined below:
```sql
USE ROLE SECURITYADMIN;

ALTER USER SGOODMAN SET DEFAULT_ROLE = SECURITYADMIN;
ALTER USER HSCHRADER SET DEFAULT_ROLE = DBA;
ALTER USER JPINKMAN SET DEFAULT_ROLE = DEVELOPER;
ALTER USER GGUS SET DEFAULT_ROLE = ANALYST;
```
Simple enough! Go ahead and ask the users to log in to Snowflake and change their passwords. After that, they can interact with the sample databases that Snowflake provides, review documentation, and use the existing DEMO_WH warehouse as a starting point for issuing SELECT statements against the sample databases.
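As a first smoke test, any of the new users can run something like the query below against the Snowflake-provided sample data (the share typically appears as SNOWFLAKE_SAMPLE_DATA, though the exact name can vary by account):

```sql
USE WAREHOUSE DEMO_WH;

SELECT C_MKTSEGMENT, COUNT(*) AS CUSTOMERS
FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.CUSTOMER
GROUP BY C_MKTSEGMENT
ORDER BY CUSTOMERS DESC;
```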
This is a good stopping point for Day 1 and sets you up nicely for Day 2.
Day 2 (Open the Gates to Snowflake)
Today is about creating warehouses, databases, and schemas, and allowing users to start exploring the environment with some initial development. Most of today’s activities will be done by a DBA.
Creating Snowflake Warehouses
It’s common to want to start out with Medium, Large, or X-Large warehouses, burning up credits even though the same query or process could be handled by a smaller warehouse. I highly recommend starting small and then incrementally bumping up the size as needs dictate.
Start by defining some basic warehouses that your users will have access to initially. In the future, you could always create warehouses for larger sizes and assign users accordingly.
```sql
USE ROLE DBA;

CREATE WAREHOUSE PLAY_XS WITH
  WAREHOUSE_SIZE = XSMALL
  MAX_CLUSTER_COUNT = 2
  SCALING_POLICY = ECONOMY
  AUTO_SUSPEND = 120
  AUTO_RESUME = TRUE
  COMMENT = 'Use this as the default warehouse for development initially.';

CREATE WAREHOUSE PLAY_SMALL WITH
  WAREHOUSE_SIZE = SMALL
  MAX_CLUSTER_COUNT = 2
  SCALING_POLICY = ECONOMY
  AUTO_SUSPEND = 120
  AUTO_RESUME = TRUE
  COMMENT = 'Use this warehouse if you want to improve query performance.';
```
Once the warehouse has been created, grant usage to the warehouse for the various roles:
```sql
USE ROLE DBA;

GRANT USAGE, OPERATE ON WAREHOUSE PLAY_XS TO ROLE LOADER;
GRANT USAGE, OPERATE ON WAREHOUSE PLAY_XS TO ROLE VISUALIZER;
GRANT USAGE, OPERATE ON WAREHOUSE PLAY_SMALL TO ROLE LOADER;
GRANT USAGE, OPERATE ON WAREHOUSE PLAY_SMALL TO ROLE VISUALIZER;
```
Since the grants were assigned to the base roles, they will be inherited by DBA, DEVELOPER, and ANALYST via role privilege inheritance. You should also lock down modification of the Snowflake-provided warehouses:

```sql
USE ROLE ACCOUNTADMIN;

REVOKE MODIFY ON WAREHOUSE DEMO_WH FROM ROLE PUBLIC;
REVOKE MODIFY ON WAREHOUSE LOAD_WH FROM ROLE PUBLIC;
```
Set the warehouse default by user as below:
```sql
USE ROLE SECURITYADMIN;

ALTER USER SGOODMAN SET DEFAULT_WAREHOUSE = PLAY_XS;
ALTER USER HSCHRADER SET DEFAULT_WAREHOUSE = PLAY_XS;
ALTER USER JPINKMAN SET DEFAULT_WAREHOUSE = PLAY_XS;
ALTER USER GGUS SET DEFAULT_WAREHOUSE = PLAY_XS;
```
Creating Snowflake Databases
For proper adoption of Snowflake, think about defining policies on data loading, staging, zoning, environments, etc. You’ll need to start somewhere, and an initial starting point is provided below.
We start by defining a sandbox database in which the users can begin their initial activities. The database is purposely defined as TRANSIENT so that this throwaway work carries no Fail-safe period and keeps Time Travel retention (and storage costs) to a minimum.
```sql
USE ROLE DBA;

CREATE OR REPLACE TRANSIENT DATABASE SANDBOX_PLAYDB;

GRANT USAGE ON DATABASE SANDBOX_PLAYDB TO ROLE LOADER;
GRANT USAGE ON DATABASE SANDBOX_PLAYDB TO ROLE VISUALIZER;
```
Creating Snowflake Schemas
Next, define a schema where tables are created and the users start deeper exploration. Use the SQL statements below as a guide that demonstrates creating the schema and defining appropriate grants.
```sql
USE ROLE DBA;
USE DATABASE SANDBOX_PLAYDB;

CREATE OR REPLACE TRANSIENT SCHEMA LAUNDROMAT;
```

Define the grants as follows:

```sql
GRANT USAGE ON SCHEMA LAUNDROMAT TO ROLE LOADER;
GRANT USAGE ON SCHEMA LAUNDROMAT TO ROLE VISUALIZER;
GRANT ALL PRIVILEGES ON FUTURE TABLES IN SCHEMA LAUNDROMAT TO ROLE LOADER;
GRANT SELECT ON FUTURE TABLES IN SCHEMA LAUNDROMAT TO ROLE VISUALIZER;
GRANT ALL PRIVILEGES ON FUTURE VIEWS IN SCHEMA LAUNDROMAT TO ROLE LOADER;
GRANT SELECT ON FUTURE VIEWS IN SCHEMA LAUNDROMAT TO ROLE VISUALIZER;
GRANT ALL PRIVILEGES ON FUTURE STAGES IN SCHEMA LAUNDROMAT TO ROLE LOADER;
GRANT ALL PRIVILEGES ON FUTURE FILE FORMATS IN SCHEMA LAUNDROMAT TO ROLE LOADER;
```
For the rest of Day 2, I recommend focusing on the following activities:
- Using the sample data
- Creating tables
- Creating internal stages
- Loading data from stage
- Interacting with the data and getting to know Snowflake
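To make these activities concrete, here is a sketch of a first end-to-end load, run as Hank (DBA, who owns the schema and so can create objects in it). The table, stage, file format, and file names are hypothetical:

```sql
USE ROLE DBA;
USE WAREHOUSE PLAY_XS;
USE SCHEMA SANDBOX_PLAYDB.LAUNDROMAT;

-- A hypothetical target table.
CREATE OR REPLACE TABLE CAR_WASH_SALES (
  SALE_DATE DATE,
  AMOUNT    NUMBER(10, 2)
);

-- An internal named stage with a simple CSV file format.
CREATE OR REPLACE FILE FORMAT CSV_STD TYPE = CSV SKIP_HEADER = 1;
CREATE OR REPLACE STAGE SALES_STAGE FILE_FORMAT = CSV_STD;

-- PUT is a client-side command, so run it from SnowSQL:
-- PUT file:///tmp/sales.csv @SALES_STAGE;

-- Then load the staged file(s) into the table.
COPY INTO CAR_WASH_SALES FROM @SALES_STAGE;
```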
The steps above provide some initial Day 1 and Day 2 guidance to start your journey with Snowflake, but there will be much more to accomplish and refine in the days ahead. Below are some additional suggestions for you to start exploring and adopting:
- Integration with Active Directory and Okta plus User Management. Reference: SCIM
- Defining service accounts either in Snowflake or from the enterprise Active Directory
- Defining and integrating with External Stages (Choose the cloud provider-specific details)
- Defining roles, role hierarchy, and policies Reference: security-access-control-overview
- Data Staging and Data Zoning strategy
- Defining Environments (Dev, QA, Production)
- Integrating with Data Acquisition tools
- Integrating with Data Visualization tools
- Data modeling adoption, e.g. Data Vault
- A wide range of challenging use cases and analytics applications
It would be great to hear if this helped you and what your next set of moves is with Snowflake!
Some of My Other Snowflake Stories
I hope you’ll check out some of my other recent stories also…
How to Capture Snowflake Users, Roles, and Grants Into a Table
Consider Two Options: #1 Snowflake Stored Procedures and #2 Python
Doing DevOps for Snowflake with dbt in Azure
How to Use CI/CD to Deploy Your Snowflake Database Scripts with dbt in Azure DevOps
Don’t Do Analytics Engineering in Snowflake Until You Read This (Hint: dbt)
Using dbt To Create Tables Using Custom Materialization
Feel free to share on other channels, and be sure to keep up with all new content from Hashmap here.
Venkat Sekar is Regional Director for Hashmap Canada and is an architect and consultant providing Data, Cloud, IoT, and AI/ML solutions and expertise across industries with a group of innovative technologists and domain experts accelerating high-value business outcomes for our customers.