An introduction to Snowflake

Published in Sinch Blog · 4 min read · May 20, 2022

Year after year, more and more data tools are launched on the market, each with new features or a new approach to solving problems we already know well. As a consequence, businesses tend to switch from one tool to another from time to time: “if there is a new tool that can solve my problems more easily and possibly more cheaply, why not?!”

Over the past few years here at Sinch, I have experienced many different approaches to working with data, and I have used data tools on almost every cloud platform. To name a few: BigQuery, Dremio, Metabase, Redshift, PostgreSQL, MySQL, Athena, and a few more :).

We have now started working with Snowflake, looking for something more integrated and unified. While learning it and getting used to it, I decided to write and share a little about the platform and how it works, so maybe you can use it too.

This will be split into a couple of articles. Let’s get started!

What is Snowflake?

Founded in 2012, Snowflake is a fully managed SaaS (software as a service) offering that provides a single platform for data warehousing, data lakes, data engineering, data science, data application development, and secure sharing and consumption of real-time / shared data.

One platform, many workloads, no data silos

You will be able to:

  • Access a World of Data and Services
  • Gain Modern Data Governance and Security
  • Build and Drive Your Business Forward with Data
  • Connect locally and globally with Snowflake’s platform

Snowflake is a true SaaS offering

  • There is no hardware (virtual or physical) to select, install, configure, or manage.
  • There is virtually no software to install, configure, or manage.
  • Ongoing maintenance, management, upgrades, and tuning are handled by Snowflake.

Architecture

Snowflake uses a central data repository for persisted data that is accessible from all compute nodes in the platform.

Figure: Snowflake Architecture

Three Architectural Layers:

  • Storage Layer

The storage layer holds the physical data files that make up Snowflake’s logical tables, stored in a hybrid columnar format with automatic micro-partitioning.

  • Compute (Virtual Warehouse) Layer

Snowflake processes queries using “virtual warehouses”. Each virtual warehouse is an MPP compute cluster composed of multiple compute nodes allocated by Snowflake from a cloud provider.

  • Cloud Services Layer

The cloud services layer manages and coordinates activities across Snowflake, such as authentication, metadata management, and query optimization.
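
To make the separation between storage and compute more concrete, here is a minimal sketch in Python that only builds SQL strings. The warehouse names (etl_wh, bi_wh) and the table name are hypothetical, and the statements would still need to be sent to Snowflake through a client of your choice. The idea is that two independent virtual warehouses can query the same stored table.

```python
from typing import List


def create_warehouse(name: str, size: str = "XSMALL", auto_suspend_s: int = 60) -> str:
    """Build a CREATE WAREHOUSE statement for an independent compute cluster."""
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WITH WAREHOUSE_SIZE = '{size}' "
        f"AUTO_SUSPEND = {auto_suspend_s} AUTO_RESUME = TRUE;"
    )


def query_with(warehouse: str, sql: str) -> List[str]:
    """Select the compute cluster first, then run the query on shared storage."""
    return [f"USE WAREHOUSE {warehouse};", sql]


# Hypothetical names: two teams, two warehouses, one shared table.
setup = [create_warehouse("etl_wh", "LARGE"), create_warehouse("bi_wh")]
etl_session = query_with("etl_wh", "SELECT COUNT(*) FROM sales.public.orders;")
bi_session = query_with("bi_wh", "SELECT COUNT(*) FROM sales.public.orders;")

for stmt in setup + etl_session + bi_session:
    print(stmt)
```

Both sessions read the same table, but each one uses its own compute, so a heavy ETL job should not slow down the BI queries.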

Key features

  • Cloud Provider Agnostic

Snowflake is available on three major cloud providers, AWS, Azure, and GCP, with the same end-user experience on each.

  • Time Travel

Who has never updated a table without a WHERE clause, or even dropped a table by mistake? On Snowflake, these mistakes can be fixed thanks to the Time Travel feature, which makes it possible to “recreate” a table from an older version of it (within the configured retention period) and also to UNDROP a dropped table.
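
As a rough illustration of Time Travel, the sketch below builds the kind of SQL statements involved, using Python only to assemble the strings. The table names are made up, and the offsets are arbitrary. How far back you can go depends on the retention period of your account and table.

```python
def time_travel_select(table: str, seconds_ago: int) -> str:
    """Query a table as it looked `seconds_ago` seconds in the past."""
    return f"SELECT * FROM {table} AT(OFFSET => -{seconds_ago});"


def restore_as_clone(table: str, restored: str, seconds_ago: int) -> str:
    """Recreate an older version of `table` under a new name."""
    return f"CREATE TABLE {restored} CLONE {table} AT(OFFSET => -{seconds_ago});"


def undrop_table(table: str) -> str:
    """Bring back a table that was dropped by mistake."""
    return f"UNDROP TABLE {table};"


# Hypothetical scenario: an UPDATE without WHERE ran on "orders" 5 minutes ago.
print(time_travel_select("orders", 300))
print(restore_as_clone("orders", "orders_restored", 300))
print(undrop_table("customers"))
```

Cloning into a new table first lets you compare the old and current versions before deciding what to keep.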

  • Scalability

A multi-cluster, shared-data architecture that separates compute and storage resources lets users scale resources up when needed and back down when the process is finished, without any interruption to service.
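
The scale-up and scale-down pattern could look something like the sketch below. It is only an assumption of how a job might be wrapped: the warehouse name and the job SQL are hypothetical, and the size list is a subset of the sizes Snowflake offers.

```python
from typing import List

# A subset of warehouse sizes, kept short for the example.
VALID_SIZES = ("XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE")


def resize_warehouse(name: str, size: str) -> str:
    """Build an ALTER WAREHOUSE statement that changes the warehouse size."""
    size = size.upper()
    if size not in VALID_SIZES:
        raise ValueError(f"unknown warehouse size: {size}")
    return f"ALTER WAREHOUSE {name} SET WAREHOUSE_SIZE = '{size}';"


def scaled_job(warehouse: str, job_sql: str, big: str = "LARGE", normal: str = "XSMALL") -> List[str]:
    """Scale up, run the heavy statement, then scale back down."""
    return [
        resize_warehouse(warehouse, big),
        job_sql,
        resize_warehouse(warehouse, normal),
    ]


# Hypothetical nightly job on a hypothetical warehouse.
plan = scaled_job("etl_wh", "CALL rebuild_daily_aggregates();")
for stmt in plan:
    print(stmt)
```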

  • Semi-Structured Data

Many other tools struggle with semi-structured data. Snowflake handles it with a schema-on-read data type called VARIANT. With VARIANT, the user can load semi-structured data such as JSON and then work with it or transform it into structured data.
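
Schema on read means the structure is applied when you query, not when you load. The small sketch below shows the idea twice: once in plain Python on a JSON document, and once as the Snowflake path expression you would write against a VARIANT column. The table raw_events, the column payload, and the JSON fields are all invented for the example.

```python
import json
from typing import Any, List


def variant_path(column: str, path: List[str], cast: str) -> str:
    """Build a Snowflake expression such as payload:user.name::STRING."""
    return f"{column}:{'.'.join(path)}::{cast}"


def read_path(document: Any, path: List[str]) -> Any:
    """Plain-Python equivalent: walk the nested document at read time."""
    for key in path:
        document = document[key]
    return document


# The raw JSON is stored as-is; no schema was declared up front.
raw = '{"user": {"name": "Ana", "plan": "pro"}, "event": "login"}'
doc = json.loads(raw)
user_name = read_path(doc, ["user", "name"])

expr = variant_path("payload", ["user", "name"], "STRING")
query = f"SELECT {expr} AS user_name FROM raw_events;"

print(user_name)
print(query)
```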

  • Security

Data is always encrypted automatically, and access can be restricted with object-level access control, column-level security, and row access policies.
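
One way column-level security can be set up is with a masking policy. The sketch below is my own illustration rather than a recommended setup: the policy name, role name, table, and column are hypothetical, and Python is used only to assemble the SQL text.

```python
from typing import List


def masking_policy(policy: str, allowed_roles: List[str], mask: str = "*********") -> str:
    """Build a masking policy that shows the real value only to some roles."""
    roles = ", ".join(f"'{role}'" for role in allowed_roles)
    return (
        f"CREATE MASKING POLICY {policy} AS (val STRING) RETURNS STRING -> "
        f"CASE WHEN CURRENT_ROLE() IN ({roles}) THEN val ELSE '{mask}' END;"
    )


def attach_policy(table: str, column: str, policy: str) -> str:
    """Build the statement that attaches the policy to a column."""
    return f"ALTER TABLE {table} MODIFY COLUMN {column} SET MASKING POLICY {policy};"


# Hypothetical names: only the ANALYST role sees real e-mail addresses.
create_stmt = masking_policy("email_mask", ["ANALYST"])
attach_stmt = attach_policy("user_info", "email", "email_mask")

print(create_stmt)
print(attach_stmt)
```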

  • Data Sharing

You can share your data with other accounts (internal clients) or even with a new reader account (external clients), without duplicating the data or paying more for storage. Accounts with access to the shared data pay only for query processing, and the data they see stays synchronized with the main source.
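
To give a feeling for how little is needed to share a table, here is a hedged sketch of the statements on the provider side, plus the one statement a consumer would run to mount the share. All names (the share, database, table, and account identifiers) are placeholders, and the Python code only builds the SQL strings.

```python
from typing import List


def share_table(share: str, database: str, schema: str, table: str, consumer: str) -> List[str]:
    """Provider side: create a share, grant access to one table, add a consumer."""
    return [
        f"CREATE SHARE {share};",
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share};",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO SHARE {share};",
        f"GRANT SELECT ON TABLE {database}.{schema}.{table} TO SHARE {share};",
        f"ALTER SHARE {share} ADD ACCOUNTS = {consumer};",
    ]


def mount_share(local_db: str, provider: str, share: str) -> str:
    """Consumer side: expose the share as a read-only database."""
    return f"CREATE DATABASE {local_db} FROM SHARE {provider}.{share};"


# Placeholder identifiers, for illustration only.
provider_stmts = share_table("sales_share", "sales_db", "public", "orders", "xy12345")
consumer_stmt = mount_share("shared_sales", "ab67890", "sales_share")

for stmt in provider_stmts + [consumer_stmt]:
    print(stmt)
```

No data is copied in any of these steps; the consumer queries the provider's table through the share.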

Classic Console vs Snowsight

Snowflake has two different UIs with many features in common. The more robust one is the Classic Console, which still has some exclusive features (as of 2022), and the newer one is Snowsight, which is more user-friendly in my opinion. Most tasks can be done in either, so it is up to you to decide which one you prefer.

Figures: Classic Console and Snowsight

For now, this is enough.

In subsequent parts of this series, I’ll talk about Worksheets, Tasks, Dashboards, Snowpipe, Data Marketplace, and more.

Coming soon.

References:

Welcome to Snowflake Documentation — Snowflake Documentation
Snowflake Data Cloud | Enable the Most Critical Workloads
