Snowflake Builders Blog: Data Engineers, App Developers, AI/ML, & Data Science

Best practices, tips & tricks from Snowflake experts and community

Safeguarding Nonhuman Identities: Snowflake’s Path to Workload Identity and Access Management (Workload IAM)

Authors:

, , Yogesh Gupta

I’m thrilled to unveil some of the initiatives Snowflake has been diligently working on to enhance the security of our application ecosystem, thereby ensuring Snowflake remains a robust and secure platform for your data and AI strategies.

As avid supporters of automation, we have heavily invested in the development and implementation of tools to simplify and optimize our team’s daily tasks. As our ecosystem of tools grew more robust and capable, it became crucial for these tools to securely access Snowflake in a controlled, compliant, and automated way. This need led us to develop a Workload Identity and Access Management (Workload IAM) strategy, which eventually resulted in our collaboration with Aembit.

As we previously blogged about, we use Aembit to secure nonhuman access to our own instance of Snowflake. Aembit’s approach of managing access without secrets inspired us to deepen our partnership. As a result, we have integrated their solution into our GitLab environment to enhance its security. This article aims to shed light on the technical aspects of this integration.

Why focus on GitLab?

Secrets in your CI/CD platform are a significant problem. If your pipeline or infrastructure contains secrets, they can potentially be observed. This can happen when secrets are accidentally written to logs, committed to code repositories in plain text, or stolen by poisoned code designed for that purpose. To address this, the vision is to replace static, long-lived secrets with short-lived, identity-based credentials that are provided just in time. This functionality is essential for teams using CI/CD tools like GitHub Actions and GitLab Jobs, as it directly impacts the safety, compliance, and efficiency of software deployment processes.
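To make the contrast with static secrets concrete, here is a minimal sketch of a short-lived credential in Python. The class name, the `issue_credential` helper, and the 300-second lifetime are all hypothetical choices for illustration; they do not describe how Aembit or Snowflake actually mint credentials.

```python
from dataclasses import dataclass
import secrets
import time


@dataclass(frozen=True)
class ShortLivedCredential:
    """A credential that is only valid for a short window after issuance."""
    token: str
    issued_at: float    # seconds since the epoch
    ttl_seconds: float  # lifetime chosen by the issuer (hypothetical value)

    def is_valid(self, now: float) -> bool:
        # Valid from the moment of issuance until the lifetime has elapsed.
        return self.issued_at <= now < self.issued_at + self.ttl_seconds


def issue_credential(now: float, ttl_seconds: float = 300.0) -> ShortLivedCredential:
    """Mint a fresh random token just in time for a single job."""
    return ShortLivedCredential(secrets.token_urlsafe(32), now, ttl_seconds)


if __name__ == "__main__":
    now = time.time()
    cred = issue_credential(now)
    print(cred.is_valid(now))         # usable while the job runs
    print(cred.is_valid(now + 3600))  # a copy leaked into a log has expired an hour later
```

The point of the sketch is that a credential leaked through a log or a repository stops being useful once its window closes, whereas a static secret stays valid until someone rotates it.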

Aembit supports both two-legged OAuth (2LO) and three-legged OAuth (3LO) for access using short-lived tokens, and it has a centralized access model in which both flows are administratively managed. Aembit can therefore grant short-lived tokens to both 2LO apps and 3LO apps, which gives Snowflake very good coverage of its app ecosystem.
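For readers less familiar with the two terms, the sketch below shows how they are commonly mapped onto OAuth 2.0 grants: 2LO onto the client credentials grant (application to service, no user involved) and 3LO onto the authorization code grant (a user consents first). That mapping is an assumption made for illustration, and the function names and parameter values are hypothetical; the article does not say which grants Aembit uses for each service.

```python
from urllib.parse import urlencode


def two_legged_request(scope: str) -> str:
    """Form body for a client credentials token request (no user in the loop)."""
    return urlencode({"grant_type": "client_credentials", "scope": scope})


def three_legged_request(code: str, redirect_uri: str, client_id: str) -> str:
    """Form body for exchanging a user-approved authorization code for a token."""
    return urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
    })


if __name__ == "__main__":
    # Hypothetical values; a real request is POSTed to the service's token endpoint.
    print(two_legged_request("read"))
    print(three_legged_request("abc", "https://example.com/cb", "ci-app"))
```

In both cases the result of the exchange is a token with a limited lifetime, which is what lets a central broker manage the two kinds of apps in the same way.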

Managing GitLab Secrets without Aembit:

When GitLab users needed access to another service, they submitted a request to the Snowflake IT cloud operations team, which created the necessary service account and access credentials for each service and delivered them securely. These credentials could be stored in Active Directory and synced with the identity provider.

However, this process introduced ongoing maintenance and security challenges, including the need to securely store and catalog service accounts and credentials, regularly rotate credentials, and ensure compliance through audits and visibility into the entire process.

In addition, we always lived with potential security risks:

  • long-lived access credentials were used for multiple workloads
  • the workload posture was unknown
  • highly privileged credentials were touched by multiple humans

Aembit Methodology:

Aembit provides two methods of integration with our workloads: a per-application proxy approach and an API-based approach. We chose the API approach here since we have full control of the GitLab pipelines and we wanted to minimize additional infrastructure. We use the proxy approach in the earlier use case mentioned above.

Below is a simple view of the API-based integration methodology from Aembit, in which your workloads take advantage of the SDK provided by Aembit. The SDK handles all the auth work and requires no additional components or infrastructure.

Managing GitLab Secrets with Aembit:

Aembit brokers identities between GitLab on one side and the wide range of workloads it connects to (GitGuardian, Beyond Identity, Jira, Snyk, Slack) on the other.

The first thing Aembit did for Snowflake was set up a trust relationship between Aembit Cloud and the Snowflake GitLab instance, as well as between Aembit Cloud and a few of the other services that the GitLab instance connects to. This allowed Aembit to capture and share information in a highly trusted way.

When GitLab requests access to these services:

  • Aembit sees the access request via the API and intercepts it. It requests the service account token that identifies GitLab, sends it to Aembit Cloud, and cryptographically attests GitLab's identity using metadata, without adding a secret or credential to the workload itself.
  • It then evaluates an access policy that validates GitLab's access to those services, with an additional layer of conditional access policies. Once the evaluation passes, Aembit issues short-lived credentials on behalf of the service in question and injects them into the original request from the workflow.
  • After this, communication between GitLab and the other workloads proceeds normally, without Aembit Cloud in between. Aembit Cloud logs the metadata of every transaction and access request.
  • This eliminated the need for anyone, including Aembit Cloud, to see or store any credentials, provided a much stronger security posture, and simplified the credential rotation process.
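The attest, evaluate, and issue steps above can be sketched in a few lines of Python. This is a simplified stand-in, not Aembit's SDK or API: `TRUST_KEY`, `POLICY`, and every function name are hypothetical, and an HMAC over a JSON document is used in place of the signed identity token that a real CI platform would present.

```python
import hashlib
import hmac
import json
import secrets
from typing import Optional

# Hypothetical key standing in for the trust relationship between the CI
# platform and the broker. A real setup would verify an asymmetric signature.
TRUST_KEY = b"demo-trust-key"

# Hypothetical access policy: which project may reach which service.
POLICY = {("snowflake/demo-project", "jira")}


def sign_identity(claims: dict) -> dict:
    """Stand-in for the CI platform issuing a signed identity document."""
    payload = json.dumps(claims, sort_keys=True).encode()
    sig = hmac.new(TRUST_KEY, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}


def attest(identity: dict) -> Optional[dict]:
    """Return the claims only if the signature checks out."""
    expected = hmac.new(TRUST_KEY, identity["payload"], hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, identity["sig"]):
        return None
    return json.loads(identity["payload"])


def broker_access(identity: dict, service: str, now: float, ttl: float = 300.0) -> Optional[dict]:
    """Attest the workload, evaluate policy, then issue a short-lived credential."""
    claims = attest(identity)
    if claims is None:
        return None  # identity could not be verified
    if (claims["project"], service) not in POLICY:
        return None  # policy does not allow this workload to reach the service
    return {"service": service, "token": secrets.token_urlsafe(32), "expires_at": now + ttl}


if __name__ == "__main__":
    job = sign_identity({"project": "snowflake/demo-project", "ref": "main"})
    print(broker_access(job, "jira", now=1000.0) is not None)  # allowed by policy
    print(broker_access(job, "slack", now=1000.0) is None)     # denied by policy
```

Note that the pipeline never holds a long-lived secret in this sketch: it presents an identity, and the only credential it ever receives already carries an expiry.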

At Snowflake, the engineering team prefers automation for all its tasks and relies heavily on GitLab pipelines. With approximately 200 intricate pipelines, each containing several policies tailored to specific requirements, the idea was to simplify governance and compliance by using Aembit to eliminate the need for multiple policies. Aembit helped us achieve this by implementing a single, unified policy across all pipelines, streamlining the process and reducing complexity.
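One way to picture a unified policy is as a single rule set that is evaluated against each pipeline's identity, instead of a separate policy stored with every pipeline. The sketch below assumes a hypothetical rule format based on project and branch patterns; it is not Aembit's policy model, and the rules themselves are invented for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical unified policy: one rule set evaluated for every pipeline.
UNIFIED_POLICY = [
    {"project": "snowflake/*", "ref": "main", "services": {"jira", "slack", "snyk"}},
    {"project": "snowflake/*", "ref": "*", "services": {"snyk"}},
]


def allowed_services(project: str, ref: str) -> set:
    """Collect every service granted to a pipeline by the shared rules."""
    granted = set()
    for rule in UNIFIED_POLICY:
        if fnmatchcase(project, rule["project"]) and fnmatchcase(ref, rule["ref"]):
            granted |= rule["services"]
    return granted


if __name__ == "__main__":
    print(sorted(allowed_services("snowflake/app-a", "main")))
    print(sorted(allowed_services("snowflake/app-b", "feature/x")))
```

With this shape, adding a new pipeline requires no new policy as long as it matches an existing rule, and an audit only has to review one rule set.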

Before and After:

Forward-looking with Aembit:

Aembit not only enhances security but also automates various processes, potentially saving 5–10 hours daily in tasks such as credential issuance, management, compliance reviews, reporting, auditing, and process management as its usage expands within Snowflake.

We have started our Workload IAM journey and plan to expand its usage to other SaaS applications in our ecosystem and also cover areas like Software Supply Chain, dynamic policies, and a central system of record for workload-to-workload access. This will not only secure Snowflake internally but also provide a battle-tested solution for customers to secure their Snowflake access from workloads. Additionally, it can play a significant role in Snowflake Native Apps as they become part of the data & AI fabric.

We are excited to continue our journey with Aembit and explore its potential to enhance your experience.
