This winter we switched from Redshift to Snowflake for our data analytics warehouse. Our top drivers in switching to Snowflake were:
Real-time data. Snowflake’s architecture “separates storage from compute”, meaning that reading and writing can occur in complete parallel without interfering with each other. With Snowflake, there is no performance impact if we have real-time data syncing - all data in our warehouse is current to within 30 minutes.
Handling concurrent queries. A Looker user refreshing a dashboard might generate 15–25 queries at once. Snowflake can automatically “spin up” more computing resources and run all of the concurrent queries in parallel with zero drag on the execution speed of any individual query.
Database administration. Snowflake’s knobs are directly tied to our performance needs and extremely simple. Higher concurrency? Just increase the cluster count limit. More power? Just increase the cluster size. The amount of database admin required with Snowflake is negligible.
Easily providing extra computing power to certain users: Certain data users of the company are running lots of heavier queries — we wanted to have the option of being able to easily crank the power dial specifically for them should query run time become an issue.
Note that our switch to Snowflake was coupled with introducing new data syncing vendor. We will describe our trials with this particular piece of infrastructure in later posts.
Possible Alternatives to Snowflake
The main competitor to Snowflake that was briefly considered in the switch was Google BigQuery. BigQuery is geared toward append-only paradigms; this was a non-starter given that much of our data model currently involves mutable fields.
So far, Snowflake has been everything that was promised and then some. We are quite pleased with the outcome of our switch.
In addition to having the afore-mentioned core needs fully met, we also found out that some of what we originally considered to be Snowflake “bells and whistles” are actually quite useful, especially JSON features and the “variant” data type.
Important note: No one from Snowflake was involved with this article in any way. We really wrote this simply because we are pleased with our new setup and want to share what we’ve learned.