Crossing the Streams: Snowflake Unistore

Duncan Beeby
9 min read · Jul 25, 2022

I’m Duncan. I’m a Senior Sales Engineer at Snowflake. Opinions expressed here are solely my own and do not represent the views or opinions of my employer.

This week I’m off to New York. It’s been a couple of years since I’ve been to America. For me, this looks like a great sign that the world is starting to get back on its feet after COVID-19.

Now it just so happens that New York is where one of my favourite films is set. Ghostbusters.

For the uninitiated, Ghostbusters is about three scientists, Peter Venkman, Ray Stantz and Egon Spengler, who work at Columbia University, where they delve into the paranormal and fiddle with many unethical experiments on their students. After they are kicked out of the University, they realise they really do understand the paranormal and go into business for themselves. Under the new business name of ‘Ghostbusters’, and living in the old firehouse building they work out of, they are called on to rid New York City of paranormal phenomena at everyone’s whim, for a price.

They make the national press as the media reports that the Ghostbusters are the cause of it all. After the EPA has them thrown in jail, the mayor takes a chance and calls on them to help save the city. Unbeknownst to all, a long-dead Gozer worshipper (Ivo Shandor) erected a downtown apartment building which is the cause of all the paranormal activity. They find out the building could resurrect the ancient Hittite god, Gozer, and bring an end to all of humanity. Who are you gonna call to stop this terrible world-ending menace?

Crossing the streams

If there’s one constant within the Ghostbusters universe, it’s that you should never cross the streams. Why? Well, it would be bad. How bad? Well, try to imagine all life as you know it stopping instantaneously and every molecule in your body exploding at the speed of light. That’s bad.

Egon Spengler: There’s something very important I forgot to tell you.
Peter Venkman: What?
Spengler: Don’t cross the streams.
Venkman: Why?
Spengler: It would be bad.
Venkman: I’m fuzzy on the whole good/bad thing. What do you mean, “bad”?
Spengler: Try to imagine all life as you know it stopping instantaneously and every molecule in your body exploding at the speed of light.
Ray Stantz: Total protonic reversal!
Venkman: Right. That’s bad. Okay. All right. Important safety tip. Thanks, Egon.

In some ways, this reminds me of what’s happened in the past when database vendors and customers have tried to cross the Analytical and Transactional streams. It’s never ended well, until now. Let me explain why.

The goal of running transactions and analytics on the same data has been around for decades, but has never been fully realised because of technology limitations. Today, businesses can no longer afford to miss the real-time insights held in their transactional systems: unless decisions are made on the latest data, they risk losing their competitive edge.

As a result, in recent years there’s been an effort to address this problem by designing techniques that combine transactional and analytical capabilities and integrate them in a single hybrid transactional and analytical processing (HTAP) system.

Before data can be put to use, it must be processed. Online analytical processing (OLAP) and online transactional processing (OLTP) are the two primary data processing systems used in data science.

  • OLAP is designed to analyse multiple data dimensions at once, helping teams better understand the complex relationships in their data. This system is ideal for uncovering valuable business insights. The data structures used are optimised for storing and accessing large volumes of data that must be transferred between the storage layer (disk or memory) and the processing layer (e.g., CPUs, GPUs, FPGAs). Analytical DBMSs store data in column stores, whose fast scanning capability is gradually eliminating the need to maintain indexes. Furthermore, to keep performance high and to avoid the overhead of concurrency control, these systems only apply updates in batches at predetermined intervals. This, however, limits the freshness of the data visible to analytical queries.
  • OLTP is a simple transactional system ideal for handling online transactions at scale.
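
To make the contrast between the two layouts concrete, here is a minimal Python sketch. It is only an illustration: the sample orders, the `get_order` and `total_amount` helpers, and the in-memory dictionaries are all made up for this post and say nothing about how any real database engine is built.

```python
from collections import defaultdict

# Hypothetical sample data, stored one record per row as an OLTP system might.
row_store = [
    {"order_id": 1, "customer": "dana", "amount": 40.0},
    {"order_id": 2, "customer": "egon", "amount": 15.5},
    {"order_id": 3, "customer": "dana", "amount": 24.5},
]

# Row layout plus a key index: a point lookup touches exactly one record.
pk_index = {row["order_id"]: row for row in row_store}

def get_order(order_id):
    return pk_index[order_id]

# Column layout: the same data pivoted so each column sits together.
column_store = defaultdict(list)
for row in row_store:
    for name, value in row.items():
        column_store[name].append(value)

def total_amount():
    # An aggregate scans one column and never reads the others.
    return sum(column_store["amount"])

print(get_order(2))
print(total_amount())
```

The row layout suits "fetch or update this one order", while the column layout suits "add up every amount", which is the tension the rest of this post is about.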

Therefore, given the distinct properties of transactional data and (stale) analytical data, most enterprises have opted for a solution that separates the management of transactional and analytical data.

In such a setup, analytics is performed as part of a specialised decision support system (DSS) in isolated data warehouses. The DSS executes complex, long-running queries on data at rest, and updates from the transactional database are propagated via an expensive and slow extract-transform-load (ETL) process.

The ETL process transforms data from transactional-friendly to analytics-friendly layout, indexes the data, and materialises selected pre-aggregations. Today’s industry requirements are in conflict with such a design. Applications want to interface with fewer systems and try to avoid the burden of moving the transactional data with the expensive ETL process to the analytical warehouse. Furthermore, systems try to reduce the amount of data replication and the cost that it brings. More importantly, enterprises want to improve data freshness and perform analytics on operational data.
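
The freshness problem is easy to see in a toy version of that pipeline. The sketch below is an assumption-laden illustration: `oltp_orders`, `warehouse_daily_sales`, `place_order` and `run_etl` are hypothetical in-memory stand-ins, not part of any real ETL tool.

```python
# Hypothetical in-memory stand-ins for an OLTP database and a warehouse.
oltp_orders = []            # transactional side: appended on every order
warehouse_daily_sales = {}  # analytical side: pre-aggregated per day

def place_order(day, amount):
    oltp_orders.append({"day": day, "amount": amount})

def run_etl():
    """Extract all orders, transform them into per-day totals, load the result."""
    totals = {}
    for order in oltp_orders:
        totals[order["day"]] = totals.get(order["day"], 0) + order["amount"]
    warehouse_daily_sales.clear()
    warehouse_daily_sales.update(totals)

place_order("mon", 100)
run_etl()
place_order("mon", 50)  # arrives after the batch has already run

stale = warehouse_daily_sales["mon"]  # what the dashboard sees
fresh = sum(o["amount"] for o in oltp_orders if o["day"] == "mon")
print(stale, fresh)
```

Until the next batch runs, the warehouse total lags behind what the transactional system already knows.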

Ideally, systems should enable applications to react immediately to facts and trends learned by posing an analytical query within the same transactional request. In short, the choice to separate OLTP and OLAP is becoming obsolete in light of the exponential increase in data volume and velocity and the need for enterprises to operate on real-time insights. This is where Snowflake’s Unistore capability steps in.

Snowflake Unistore

For decades, transactional and analytical data have remained separate, significantly limiting how fast organizations could evolve their businesses.

Unistore is a new workload that delivers a modern approach to working with transactional and analytical data together in a single platform. Unistore was created for many reasons.

Our customers are tired of moving data between their systems. They no longer want to manage redundant datasets across multiple solutions. They want to access data when they need it, and be able to work with virtually all their data in one place.

But Unistore’s impact is far more significant than unifying data. Teams can now build transactional business applications directly on Snowflake, run real-time analytical queries on their transactional data, and get a consistent approach to governance and security.

Snowflake customers such as Adobe, UiPath, IQVIA, Novartis, and Wolt are all early adopters of Unistore. We’ve seen them use Unistore for use cases such as storing application state for pipelines, handling data serving or powering online feature stores, and even backing enterprise transactional applications. Early feedback has been excellent and our customers are excited that Snowflake can now support these transactional use cases. Customers are eager and ready to take advantage of the many Unistore benefits, including:

  • A single dataset to power the future of modern development

Act on transactional data almost immediately, build better customer experiences, and get new insights by integrating transactional and analytical data in a single dataset.

  • Simple and streamlined transactional app development on Snowflake

Build enterprise transactional apps and more with the same simplicity, performance, and ease you expect from Snowflake’s Data Cloud.

  • Consolidated transactional and analytical systems

Simplify architectures and standardise security and governance controls on a single platform, while eliminating the need to move or copy data.

Hybrid Tables for transactional use cases

Hybrid Tables (currently in private preview) are a new Snowflake table type powering Unistore. A key design principle when Snowflake built Hybrid Tables was the need to support the most common transactional capabilities that application developers have come to rely on. Clearly, performance is a critical aspect of any transactional application, especially for fast single-row operations. To support that, we’ve developed an entirely new row-based storage engine so enterprise transactional applications can now be built directly on Snowflake.

Getting started with Hybrid Tables is easy. Simply create a table the same way you would any other traditional Snowflake table. But to support these transactional workloads, Hybrid Tables require a primary key and Snowflake will now enforce the uniqueness of your applications’ primary keys.
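
To show what an enforced primary key means for an application, here is a small sketch that runs locally. It uses Python's built-in `sqlite3` module purely as a stand-in, since it also enforces primary keys; it is not Snowflake, and the `orders` table, its columns and the `try_insert` helper are hypothetical. The exact Snowflake DDL for Hybrid Tables is not shown here and may differ.

```python
import sqlite3

# SQLite is a local stand-in here, not Snowflake.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer TEXT NOT NULL,"
    " amount REAL NOT NULL)"
)
conn.execute("INSERT INTO orders VALUES (1, 'dana', 40.0)")

def try_insert(order_id, customer, amount):
    """Return True if the row was accepted, False if the key already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders VALUES (?, ?, ?)",
                (order_id, customer, amount),
            )
        return True
    except sqlite3.IntegrityError:
        return False

print(try_insert(2, "egon", 15.5))  # new key
print(try_insert(1, "ray", 99.0))   # duplicate key
```

The point is that the database, rather than the application code, rejects the duplicate order.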

Analytics on transactional data

While the ability to build transactional applications directly on Snowflake is exciting in its own right, the power of Unistore doesn’t stop there. Unistore unlocks the full potential of your data by enabling you to perform analytics directly on your transactional data.

This data holds a lot of value for analytics, which is why we see many of our customers load their transactional data into Snowflake from external databases in order to mine business insights. But you can get even more powerful insights if you run analytics on transactions as they happen. Imagine having an orders table with a billion records instantly incorporated into a dashboard that reports weekly sales trends.

Using Hybrid Tables, you simply run the analytical query directly on your transactional data and the results are returned with the analytical performance you’d expect from Snowflake.

In addition to powering analytical queries directly on your transactional data, Hybrid Tables allow you to break down data silos between your transactional and historic data.

You can join Hybrid Tables with your other data that’s already in Snowflake — existing Snowflake tables, data from the Snowflake Marketplace, or data shared from other teams. For example, you can overlay your orders data with information from existing marketing campaigns, all without having to move any data between systems.
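
Here is a rough sketch of that orders-and-campaigns idea. Again, `sqlite3` is only a local stand-in for illustration, not Snowflake, and the table names, columns and sample rows are invented. What it shows is the shape of the workflow: the analytical query reads the same table the transactions write to, so a new order shows up in the very next result without a copy step.

```python
import sqlite3

# SQLite is a local stand-in here; tables and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        campaign_id INTEGER,
        week TEXT NOT NULL,
        amount REAL NOT NULL
    );
    CREATE TABLE campaigns (
        campaign_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    INSERT INTO campaigns VALUES (10, 'summer'), (20, 'autumn');
    INSERT INTO orders VALUES
        (1, 10, '2022-W29', 40.0),
        (2, 10, '2022-W29', 10.0),
        (3, 20, '2022-W30', 25.0);
""")

WEEKLY_SALES = """
    SELECT o.week, c.name, SUM(o.amount) AS total
    FROM orders AS o
    JOIN campaigns AS c ON c.campaign_id = o.campaign_id
    GROUP BY o.week, c.name
    ORDER BY o.week, c.name
"""

def weekly_sales():
    return conn.execute(WEEKLY_SALES).fetchall()

before = weekly_sales()
# A new transaction lands; the next analytical query sees it straight away.
conn.execute("INSERT INTO orders VALUES (4, 20, '2022-W30', 5.0)")
after = weekly_sales()
print(before)
print(after)
```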

All in the Data Cloud

Perhaps the most important feature of Unistore is that it’s one of a number of workloads powered by Snowflake’s Data Cloud. This means you can reap certain benefits such as:

  • Consistently enforced data governance and security controls across your data
  • True cloud performance at scale with Snowflake’s elastic performance engine
  • No need to manage infrastructure, query tuning, updates, or data continuity with the simplicity of Snowflake
  • Seamless integration with data shared across clouds and regions, without having to copy or move data

Snowflake provides this level of integration and consistency by reducing the number of concepts to learn, technologies to deal with, and knobs to turn. That’s the true Snowflake way.

Hybrid Tables are just the beginning of what Unistore will become. Snowflake has continually delivered many innovations over the previous years, and now the day has come when it’s possible for organisations to use a single platform for both their transactional and analytical data. It’s just a matter of time before acquiring insights that were once difficult, or even unthinkable, becomes the norm. So please join us in building the unthinkable with Unistore.

In summary

In the film, everything told the Ghostbusters not to cross the streams. It had been tried before and almost ripped a hole in the fabric of space. So when new thinking and technology come along and change what we know to be true, it’s worth taking that opportunity. Crossing the streams saved New York and the world as we know it! Choosing Snowflake and its Unistore capabilities could just do the same for you!

Recap:

Hybrid Tables for transactional use cases

Hybrid Tables, currently in private preview, are a new Snowflake table type to power Unistore with fast, single-row operations. That means teams can build transactional business apps directly on Snowflake with:

  • Required primary keys and unique constraints enforced
  • Indexes for accelerated lookup
  • Primary key and foreign key relationships with referential integrity constraints
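
For readers less familiar with the last two bullets, the sketch below shows a secondary index and a foreign key with referential integrity in action. It uses `sqlite3` as a local stand-in, not Snowflake; the `customers` and `orders` tables, the index name and the `orphan_rejected` helper are all hypothetical, and Snowflake's own syntax and behaviour may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        email TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount REAL NOT NULL
    );
    CREATE INDEX idx_orders_customer ON orders(customer_id);
    INSERT INTO customers VALUES (1, 'dana@example.com');
    INSERT INTO orders VALUES (100, 1, 40.0);
""")

def orphan_rejected():
    """Try to insert an order for a customer that does not exist."""
    try:
        with conn:
            conn.execute("INSERT INTO orders VALUES (101, 999, 5.0)")
        return False
    except sqlite3.IntegrityError:
        return True

# Ask the planner how it would look up one customer's orders.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = ?", (1,)
).fetchall()
uses_index = any("idx_orders_customer" in row[-1] for row in plan)

print(orphan_rejected())
print(uses_index)
```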

Analytics directly on transactional data

Run fast analytical queries on transactional and historical data for immediate context:

  • Get powerful insights by running analytics on transactions as they happen
  • Merge data — from existing Snowflake tables, data from the Snowflake Marketplace, or data shared from other teams — for actionable, near real-time insights
  • Run analytical scans directly on data for holistic, end-to-end views

All in the Data Cloud

The greatest feature of Unistore? It’s all in Snowflake. Which means:

  • Consistently enforce data governance and security controls
  • True cloud performance at scale with Snowflake’s elastic performance engine
  • No need to manage infrastructure, query tuning, updates, or data continuity with the simplicity of Snowflake
  • Seamless integration with data shared across clouds and regions, without having to copy or move data

For more information on Snowflake’s cool new capability, check this page out:

Next Month:

Join me for next month’s blog on Value Per Credit — Snowflake, the Value Broker.

Duncan!