How to Build a Data App on Snowflake

Thinking of building a data app? Snowflake might be a good way to start.

Update on 7/13/2022: At the time that this was published, I was an employee at MessageGears. I now work for Snowflake.

Previously, I wrote about some principles for creating a connected (or data) app, and that post includes some resources about what data apps are. Snowflake has called them “connected applications” in the past, but it seems like everyone has finally settled on “data applications” (or “data apps”). If you don’t want to read that first entry, I get it, so here is a generic, one-sentence overview: data apps are applications that connect directly to your data warehouse to perform a data-intensive service. Here’s another detailed, and much more “real-time” focused, take from Rockset.

Benn Stancil recently wrote about what our data app future could look like in what has arguably been my favorite newsletter entry of the past six months. Essentially, he’s prophesying that Snowflake, with their purchase of Streamlit, is creating a data app framework for their eventual data app store. While that framework will likely make building a data app easier, I don’t think you need to wait for it. Here are a couple of different ways you could build a data app with Snowflake today.

Serverless Data App (Snowflake External Functions)

We always need new acronyms, right? Introducing SEFaaS, or Snowflake External Function-as-a-Service. The beautiful thing about this method for a data app is that it requires absolutely zero infrastructure, because it’s powered by Snowflake and serverless functions. Just last week, AWS announced that they’re making this process even easier by offering Lambda Function URLs, meaning you shouldn’t have to manage API Gateway on top of your function(s). If you’re not familiar with a Snowflake External Function, here’s how it works:

Image from Snowflake Documentation. New Lambda Function URLs effectively remove AWS API Gateway.
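To make the request/response contract a bit more concrete, here is a minimal sketch of the serverless side in Python. It assumes the Lambda proxy-style event (the request body arrives as a JSON string under "body") and relies on the batch format Snowflake documents for external functions: rows come in as {"data": [[row_number, arg1, ...], ...]} and go back as {"data": [[row_number, result], ...]}. The score_row function is a hypothetical stand-in for whatever service your data app actually performs.

```python
import json


def score_row(text: str) -> int:
    """Hypothetical per-row service; here it simply returns the text length."""
    return len(text)


def lambda_handler(event, context):
    """Handle one batch of rows sent by a Snowflake external function.

    Assumes a proxy-style event where the HTTP body is a JSON string.
    """
    try:
        rows = json.loads(event["body"])["data"]
        # Each row is [row_number, arg1, ...]; echo the row number back
        # so Snowflake can match every result to its input row.
        results = [[row[0], score_row(row[1])] for row in rows]
        return {"statusCode": 200, "body": json.dumps({"data": results})}
    except (KeyError, IndexError, TypeError, ValueError) as exc:
        # Malformed batches get a 400 instead of an unhandled exception.
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}
```

The handler is stateless and works on whole batches, which is what keeps this approach infrastructure-free: Snowflake decides the batching and the function only maps rows to results.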

There are a couple of startups that are seeing success with this model. Affinio has built several apps, primarily for AI/ML, on customer data for marketing purposes. Another is Omnata, which is creating an alternative to the Reverse-ETL approach.

The SEFaaS approach has a low barrier to entry, but that doesn’t necessarily mean it’s easy to develop. The payoff is that it will be easy to onboard new clients. You’ll want to familiarize yourself with the restrictions and limitations that Snowflake has documented. Omnata has done an excellent job with security: they use UDTFs to achieve full end-to-end encryption by encrypting the data before it reaches the external serverless function (e.g., AWS Lambda).

Data App via Snowflake Data Share

I’m intentionally not creating an acronym for this one; it’s not nearly as fun as SEFaaS. Snowflake Data Share should be the obvious go-to for established, managed SaaS apps already using Snowflake. With Snowflake Data Share, SaaS providers and their customers can easily share data back and forth between Snowflake accounts without (Reverse) ETLs involved.

Image also from Snowflake Documentation

In this model, the data app is both a consumer of the customer’s Snowflake data and a provider of processed data, events, and/or analytics back to the customer. It’s probably very obvious, but in case it isn’t, here’s what this could look like for a single customer / provider relationship:

As you can probably tell — I made this one!
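For a rough idea of what the two-way relationship involves, here is a small Python sketch that only builds the Snowflake SQL statements for each side; it doesn't connect to anything. The statements themselves (CREATE SHARE, GRANT ... TO SHARE, ALTER SHARE ... ADD ACCOUNTS, CREATE DATABASE ... FROM SHARE) are standard Snowflake SQL, but every object and account name below is a hypothetical placeholder, and the helper functions are mine, not part of any Snowflake library.

```python
from typing import List


def provider_share_sql(share: str, database: str, schema: str,
                       table: str, consumer_account: str) -> List[str]:
    """Statements the sharing side runs to expose one table to another account.

    Names are interpolated as-is, so only pass trusted identifiers.
    """
    return [
        f"CREATE SHARE {share}",
        f"GRANT USAGE ON DATABASE {database} TO SHARE {share}",
        f"GRANT USAGE ON SCHEMA {database}.{schema} TO SHARE {share}",
        f"GRANT SELECT ON TABLE {database}.{schema}.{table} TO SHARE {share}",
        f"ALTER SHARE {share} ADD ACCOUNTS = {consumer_account}",
    ]


def consumer_mount_sql(local_database: str, provider_account: str,
                       share: str) -> str:
    """Statement the receiving side runs to mount the share as a database."""
    return f"CREATE DATABASE {local_database} FROM SHARE {provider_account}.{share}"


# Direction 1: the customer shares raw data with the data app (hypothetical names).
customer_to_app = provider_share_sql(
    "customer_events_share", "customer_db", "public", "events", "app_account")

# Direction 2: the data app shares processed results back to the customer.
app_to_customer = provider_share_sql(
    "app_results_share", "app_db", "results", "scores", "customer_account")
```

The same helper covers both directions, which is really the point of this model: each party is a provider for one share and a consumer of the other, and no data is copied between accounts.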

SIEM-replacements like Hunters, Securonix, and Panther have been Snowflake’s primary examples of this model for security. In marketing, Customer Data Platforms (CDPs) like Simon Data and Zeta are Snowflake customers adopting this model to varying degrees to better serve their own customers on Snowflake.

To me, Snowflake Data Share is the best argument for SaaS providers to switch their data warehouse to Snowflake if they haven’t already. That said, as with SEFaaS, you are limited to serving customers who are on Snowflake: reader accounts only work one way, so a customer without a Snowflake account of their own can read what you share but can’t share data back to you.

Warehouse-agnostic Data App

After you’ve proved that you can do it with Snowflake, you might be ready to branch out to other modern data warehouses. Luckily, you can probably reuse much of what you built using SEFaaS or Snowflake Data Share (especially if your internal data warehouse is Snowflake!). For most data apps, the key will be in how you architect your solution. You’re almost certainly going to need an additional data store for staging and/or caching data to handle low-latency requests that would be too inefficient, or simply too expensive, for Snowflake and other OLAP data warehouses like BigQuery and Redshift.

Suggested Data App Architecture

Leveraging a data lake, OLTP database, in-memory database / cache, or a headless BI service like Cube will be key. Be careful, though: if they’re used for persistent storage, you really don’t have much of a data app anymore; it’s just a managed application with ETLs. The difficult part is deciding who owns the infrastructure. Is it single-tenant, multi-tenant, or a hybrid approach? Managed or self-deployed? You’ll need to answer these questions to build a successful data app.
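As a rough illustration of the caching idea, here is a minimal in-memory sketch in Python. It assumes a hypothetical run_query callable that stands in for whichever warehouse client you use, and it expires entries after a time-to-live so the cache never becomes the persistent copy of the customer’s data. QueryCache and its parameters are names I made up for this example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class QueryCache:
    """Short-lived cache in front of a warehouse query function."""

    def __init__(self, run_query: Callable[[str], Any], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._run_query = run_query  # hypothetical warehouse call
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, sql: str) -> Any:
        """Return a cached result if it is still fresh, else hit the warehouse."""
        now = self._clock()
        cached = self._entries.get(sql)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        result = self._run_query(sql)
        self._entries[sql] = (now, result)
        return result
```

In a real data app you would likely swap the dictionary for a shared store, but the shape stays the same: low-latency reads are served from the cache, and the warehouse remains the source of truth.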

Hex has been one of my favorite tools recently and a great model for those looking to build data apps. With the power of cloud storage for caching, and DuckDB for in-memory processing, it’s made notebooks easy and fun. MessageGears (disclaimer: my employer) leverages a data lake for its Message product and an OLTP database and in-memory cache for Engage, a headless API for marketing.

What industry will you disrupt?

The market is, or will very soon be, ripe for disruption. Starting your data app now will give you a head start. Sure, data apps will be a tough sell for a prospect with dated infrastructure, but that won’t last long with the rapid progression of modern data stack adoption.

If you’re building a data app, let me know! I’d love to learn more about what you’re building and how you’re building it. I also enjoy chatting about data apps and marketing technology with the Astorik community. Come join us as we explore the modern data stack tools together.

Luke Ambrosetti
Snowflake Builders Blog: Data Engineers, App Developers, AI/ML, & Data Science

Partner Solutions Engineer @ Snowflake. data apps + martech. sweet tea and fried chicken connoisseur. drummer’s syndrome survivor.