[2024] Snowflake Shorts: Revisiting Snowflake’s Architecture

In this article, we revisit the architecture of the Snowflake Data Cloud platform. It should be useful to readers preparing for the Snowflake SnowPro Core and SnowPro Advanced certifications, and it presents the key points as bullets for quick revision.

Key Concepts That You Should Be Aware Of

  • Snowflake Data Cloud is powered by an advanced data platform that is provided as a self-managed service.
  • The Snowflake Data Cloud platform is not built on any existing database or warehousing technology. Rather, it pairs a new SQL query engine with an innovative architecture designed natively for the cloud.
  • It provides all the functionality of an enterprise-level analytical database, along with additional features and unique capabilities.

What Makes Snowflake a Data Platform as a Self-Managed Service?

  • You do not need to select, install, configure, or manage any hardware (physical or virtual) or software.
  • The platform’s management, maintenance, upgrades, and tuning are handled entirely by Snowflake.
  • Snowflake runs completely on cloud infrastructure. It is not packaged software that a user installs manually.
  • Snowflake cannot run on private cloud infrastructure.

Snowflake’s Architecture Design

People often get confused about which type of architecture Snowflake uses. Let’s break it down in bullet points.

  • Snowflake’s architecture is a hybrid of the traditional shared-disk and shared-nothing database architectures.
  • Like a shared-disk design, Snowflake uses a central data repository for persisted data that is accessible from all compute nodes in the platform.
  • Like a shared-nothing design, Snowflake processes queries using MPP (massively parallel processing) compute clusters, in which each node stores a portion of the entire data set locally.
  • This approach combines the simpler data management of a shared-disk design with the scale-out and performance benefits of a shared-nothing design.
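
To make the hybrid idea concrete, here is a minimal Python sketch. It is a toy model, not Snowflake’s implementation: the names CentralStorage, ComputeNode, and parallel_sum are hypothetical, the "partitions" are small lists of integers, and the nodes run one after another rather than truly in parallel. It only shows one shared repository (the shared-disk side) feeding nodes that each work on their own locally cached slice (the shared-nothing side).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class CentralStorage:
    """Shared-disk side: one repository that every compute node can read."""
    partitions: Dict[int, List[int]]

    def read(self, partition_id: int) -> List[int]:
        return self.partitions[partition_id]


@dataclass
class ComputeNode:
    """Shared-nothing side: each node caches only the partitions it works on."""
    storage: CentralStorage
    local_cache: Dict[int, List[int]] = field(default_factory=dict)

    def partial_sum(self, partition_ids: List[int]) -> int:
        total = 0
        for pid in partition_ids:
            if pid not in self.local_cache:
                # Fetch from the central repository once, then reuse locally.
                self.local_cache[pid] = self.storage.read(pid)
            total += sum(self.local_cache[pid])
        return total


def parallel_sum(storage: CentralStorage, node_count: int) -> Tuple[int, List[ComputeNode]]:
    """Split the partitions across nodes and combine their partial results."""
    nodes = [ComputeNode(storage) for _ in range(node_count)]
    partition_ids = sorted(storage.partitions)
    # Round-robin assignment: node i handles partitions i, i + node_count, ...
    partials = [
        node.partial_sum(partition_ids[i::node_count])
        for i, node in enumerate(nodes)
    ]
    return sum(partials), nodes


storage = CentralStorage({0: [1, 2], 1: [3, 4], 2: [5, 6], 3: [7, 8]})
total, nodes = parallel_sum(storage, node_count=2)
print(total)  # 36
```

Adding nodes in this sketch changes only how the partitions are divided; the data itself stays in one place, which is the point of the hybrid design.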

Understanding Snowflake Data Cloud Key Layers

Figure: Snowflake Cloud Layers

Snowflake’s architecture has three key layers.

1. Database Storage Or Centralized Storage

  • When data is loaded into Snowflake, Snowflake reorganizes it into its internally optimized, compressed, columnar format.
  • The data objects stored by Snowflake are neither directly visible nor accessible to customers; they can be accessed only by running SQL queries in Snowflake.
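
As a rough illustration of what "compressed, columnar" means, the sketch below pivots a few rows into per-column lists and compresses each column separately. Snowflake’s real storage format is internal and is not what is shown here: JSON and zlib are stand-ins chosen only because they ship with Python, and the function names and sample rows are hypothetical. The sketch also assumes every row has the same keys.

```python
import json
import zlib
from typing import Dict, List


def to_columnar(rows: List[dict]) -> Dict[str, list]:
    """Pivot row-oriented records into one list of values per column."""
    columns: Dict[str, list] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def compress_columns(columns: Dict[str, list]) -> Dict[str, bytes]:
    """Compress each column independently so it can be read on its own."""
    return {
        name: zlib.compress(json.dumps(values).encode("utf-8"))
        for name, values in columns.items()
    }


def read_column(stored: Dict[str, bytes], name: str) -> list:
    """Decompress a single column without touching the others."""
    return json.loads(zlib.decompress(stored[name]).decode("utf-8"))


rows = [
    {"id": 1, "region": "EU", "amount": 10.5},
    {"id": 2, "region": "EU", "amount": 7.0},
    {"id": 3, "region": "US", "amount": 3.25},
]
stored = compress_columns(to_columnar(rows))
print(read_column(stored, "region"))  # ['EU', 'EU', 'US']
```

A query that needs only the region column reads and decompresses only that column, which is the general motivation for columnar layouts in analytical systems.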

2. Query Processing Or Multi-Cluster Compute

  • Query execution is performed in this processing layer via virtual warehouses.
  • Each virtual warehouse is an MPP compute cluster made up of one or more compute nodes that Snowflake allocates from a cloud provider.
  • Virtual warehouses are independent compute resources and do not share their compute or local storage with other warehouses. As a result, multiple warehouses do not interfere with each other or degrade one another’s performance.
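
One practical consequence of this independence is that separate workloads can each get their own warehouse. The Python sketch below only builds CREATE WAREHOUSE statements as strings; it does not connect to Snowflake. The helper create_warehouse_sql and the warehouse names ETL_WH and BI_WH are hypothetical, and VALID_SIZES lists just a subset of the sizes Snowflake accepts.

```python
# A subset of the warehouse sizes Snowflake accepts; extend as needed.
VALID_SIZES = {"XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE"}


def create_warehouse_sql(name: str, size: str = "XSMALL", auto_suspend_seconds: int = 60) -> str:
    """Build a CREATE WAREHOUSE statement for one independent compute cluster."""
    size = size.upper()
    if size not in VALID_SIZES:
        raise ValueError(f"unsupported warehouse size: {size}")
    # Simple guard so only plain identifiers end up in the statement.
    if not name.isidentifier():
        raise ValueError(f"not a simple identifier: {name}")
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WITH WAREHOUSE_SIZE = '{size}' "
        f"AUTO_SUSPEND = {auto_suspend_seconds} "
        "AUTO_RESUME = TRUE"
    )


# Hypothetical workloads, each given its own warehouse so they do not compete.
statements = [
    create_warehouse_sql("ETL_WH", size="MEDIUM"),
    create_warehouse_sql("BI_WH", size="SMALL", auto_suspend_seconds=300),
]
for statement in statements:
    print(statement)
```

Running the generated statements in a Snowflake session would give the loading jobs and the reporting queries separate compute, so a heavy load on one should not slow the other.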

3. Cloud Services

  • This layer is a collection of services that coordinate activities across Snowflake.
  • The layer runs on compute instances that Snowflake provisions from the cloud provider.
  • These services include:
  1. Authentication
  2. Infrastructure Management
  3. Metadata Management
  4. Query Parsing and Optimization
  5. Access Control
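
To picture how some of these services fit together, here is a small Python sketch of a request passing through authentication, parsing, access control, and a metadata lookup before any compute is involved. It is purely illustrative: the in-memory dictionaries, the plan_query function, the user "alice", and the "sales" table are all hypothetical, the parser accepts a single query shape, and infrastructure management is not modeled at all.

```python
from typing import Dict, Set

# Hypothetical in-memory state standing in for what the services layer manages.
USERS: Dict[str, str] = {"alice": "s3cret"}          # authentication (toy: plain-text password)
GRANTS: Dict[str, Set[str]] = {"alice": {"sales"}}   # access control: user -> readable tables
ROW_COUNTS: Dict[str, int] = {"sales": 3}            # metadata: table -> row count


def plan_query(user: str, password: str, sql: str) -> dict:
    """Run a query through authentication, parsing, access control, and planning."""
    if USERS.get(user) != password:
        raise PermissionError("authentication failed")

    # Toy parser: accepts only 'SELECT * FROM <table>'.
    tokens = sql.strip().rstrip(";").split()
    if len(tokens) != 4 or [t.upper() for t in tokens[:3]] != ["SELECT", "*", "FROM"]:
        raise ValueError("unsupported query")
    table = tokens[3].lower()

    if table not in GRANTS.get(user, set()):
        raise PermissionError(f"{user} may not read {table}")

    # "Optimization" here is just attaching known metadata to the plan.
    return {"action": "scan", "table": table, "estimated_rows": ROW_COUNTS[table]}


plan = plan_query("alice", "s3cret", "SELECT * FROM sales;")
print(plan)  # {'action': 'scan', 'table': 'sales', 'estimated_rows': 3}
```

In this toy flow, a request that fails authentication or access control is rejected before a plan is ever produced.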

About Me:

Hi there! I am Divyansh Saxena

I am an experienced Cloud Data Engineer with a proven track record of success in Snowflake Data Cloud technology. I am highly skilled in designing, implementing, and maintaining data pipelines, ETL workflows, and data warehousing solutions. With advanced knowledge of Snowflake’s features and functionality, I am a Snowflake Data Superhero and a Snowflake SnowPro Core SME. Having spent most of my career working with Snowflake Data Cloud, I have a deep understanding of cloud-native data architecture and can leverage it to deliver high-performing, scalable, and secure data solutions.

Follow me on Medium for regular updates on Snowflake Best Practices and other trending topics:

Also, I am open to connecting with data enthusiasts across the globe on LinkedIn:

https://www.linkedin.com/in/divyanshsaxena/

New Ways To Stay Connected

I’ve been getting a lot of DMs asking for guidance, so I decided to take action on it.

I’m excited to help folks out and give back to the community via Topmate. Feel free to reach out if you have any questions or just want to say hi!

https://topmate.io/divyansh_saxena11
