Building a Centralized Data Warehouse Security Framework — Pt. 1

Mulesoft Data Engineering
4 min readOct 1, 2022

--

Mulesoft Data Engineering builds and maintains a Snowflake data warehouse for our analytics teams and business partners to help them unlock deeper insights surrounding our customers, sales, and financials. Due to the nature of this data, it is imperative that best security practices are applied to ensure teams and users aren’t exposed to sensitive data outside their jurisdiction. We are a small, agile team and we do not have a dedicated database administrator (DBA) for these tasks. In the first part of the article, we will first talk about creating an interface and entrypoint for users to submit requests.

Self-Service for Users

We have developed an internal web application for users to submit a variety of intake requests. Upon user submission, requests are logged and a data approver is notified. The data approver may then approve or reject the request. A request approval will trigger custom provisioning processes based on the type of request. This involves executing the provisioning SQL and updating our metadata control tables (more on those in later parts). A process diagram is shown below:

We are leveraging Streamlit as a self-service application connecting to Snowflake. The internal tool we have built has capabilities including but not limited to:

  • SSO User Authentication
  • Onboarding a new Snowflake user
  • Snowflake Object Catalog
  • Requesting access to Snowflake objects
  • New Team Onboarding
  • New Service Account Onboarding
  • Warehouse Resizing Request
  • Data Steward Approval Workflow

Below are a few workflow screenshots of the application from the user perspective.

The user authenticates in and chooses to request access to a Snowflake object. They are greeted by the following screen:

Because the user authenticated via SSO, their user information is already filled in and cross-referenced. They may choose to select access to an object (table, database, schema, column) or request access to a team or user role (more on that in the below section).

Once they request access to an object, the application will pull all roles assigned to their user and a list of databases, schemas, and tables that dynamically populate based on user choice. The user can then choose to request access to any object for any role that is assigned to them.

In the screenshot below, USER_SUMESHSHARMA is a unique user role assigned to the user, and they would like access to all tables in a particular schema in the Marketing database. They can also review their request details below:

This tells the user what they are asking for in layman’s terms.

Once the user submits, the request is stored and an approver gets a notification via email.

The user can then go to a different page in the application to view the status of the request:

Note that DECISION is blank as an approver has not yet made a decision on the request yet.

If the request is approved, the user will get an email saying their request has been approved and they can access their tables. All requests and approvals are logged for audit purposes.

This concludes Part 1 of building out a centralized security framework for our data warehouse. Here, we explored the user-facing layer of our framework, and showed how to leverage Streamlit to provide a seamless experience for user requests. In Part 2, we will be diving into our RBAC Standards and policies and how they tie into the security framework.

--

--