Optimizing continuous data pipelines for low latency using the Snowpipe Streaming API in Striim: Benchmarking and Cost Analysis Summary

Snowpipe Streaming is an exciting feature that enables low-latency loading of streaming rows while optimizing both cost and performance.

For analytics, reports, and data-driven applications (Data Apps) that need near real-time data freshness, the batch latency of traditional ETL and ELT pipelines can be too high to meet certain business requirements. Streaming systems are gradually augmenting batch ETL with pipelines that deliver continuous streams of data across multiple systems with low latency, enabling real-time analytics use cases across retail, banking and finance, healthcare, manufacturing, and logistics.

In this blog, we partnered with Alok Pareek and the team at Striim to profile the performance and analyze the costs of Snowpipe Streaming and found the results to be very compelling.

Striim is a unified data streaming and integration platform that can be used to ingest data to Snowflake with a broad array of streaming source connectors such as Oracle, Microsoft SQL Server, MongoDB, PostgreSQL, IoT streams, Kafka, and many others.

Striim benchmarked the Snowpipe Streaming API in their new Snowflake Writer with a data pipeline that performs Oracle change data capture and replicates changes to Snowflake. The results demonstrate a cost reduction of over 95% with an average P95 latency of 3 seconds for high-traffic tables. Striim has also created a simple click-through experience to stream data into Snowflake at high volumes with the Snowpipe Streaming API in their Striim for Snowflake product.
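To make the mechanics concrete, here is a minimal sketch of pushing rows through the Snowpipe Streaming API using Snowflake's Ingest SDK for Java. This is an illustrative example, not Striim's actual Snowflake Writer implementation: the connection properties, database, table, and column names are placeholders, and the offset token stands in for a source change number coming from CDC.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

import net.snowflake.ingest.streaming.InsertValidationResponse;
import net.snowflake.ingest.streaming.OpenChannelRequest;
import net.snowflake.ingest.streaming.SnowflakeStreamingIngestChannel;
import net.snowflake.ingest.streaming.SnowflakeStreamingIngestClient;
import net.snowflake.ingest.streaming.SnowflakeStreamingIngestClientFactory;

public class StreamingIngestSketch {
  public static void main(String[] args) throws Exception {
    // Connection properties: account URL, user, and key-pair auth.
    // All values are placeholders; real deployments load them from config.
    Properties props = new Properties();
    props.put("url", "https://<account>.snowflakecomputing.com");
    props.put("user", "<user>");
    props.put("private_key", "<pkcs8-private-key>");
    props.put("role", "INGEST_ROLE");

    // One client per process is typical; channels are lightweight, clients are not.
    try (SnowflakeStreamingIngestClient client =
        SnowflakeStreamingIngestClientFactory.builder("CDC_CLIENT")
            .setProperties(props)
            .build()) {

      // A channel maps to one target table; offset tokens let a writer resume after restarts.
      OpenChannelRequest request = OpenChannelRequest.builder("ORDERS_CHANNEL")
          .setDBName("ANALYTICS")
          .setSchemaName("PUBLIC")
          .setTableName("ORDERS")
          .setOnErrorOption(OpenChannelRequest.OnErrorOption.CONTINUE)
          .build();
      SnowflakeStreamingIngestChannel channel = client.openChannel(request);

      // Each change event becomes a row; the offset token tracks the source position (e.g., an SCN).
      for (long scn = 1; scn <= 1000; scn++) {
        Map<String, Object> row = new HashMap<>();
        row.put("ORDER_ID", scn);
        row.put("STATUS", "SHIPPED");
        InsertValidationResponse resp = channel.insertRow(row, String.valueOf(scn));
        if (resp.hasErrors()) {
          throw resp.getInsertErrors().get(0).getException();
        }
      }

      // Closing the channel flushes buffered rows and commits the latest offset token.
      channel.close().get();
    }
  }
}
```

Because rows flow through long-lived channels and are committed with offset tokens rather than being staged as files and copied by a warehouse, the per-file overhead of file-based loading disappears, which is broadly where the latency and cost gains come from.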

As you can see in the graphic below, the benchmark demonstrates significant Snowflake cost savings when using the ‘Streaming Upload’ option in Striim versus standard file-based replication.

Streaming ingestion also delivers high ingest throughput at large scale and the ability to scale ingest workloads horizontally:

We’re excited to see that Ciena — a joint customer of Snowflake and Striim — shared that they will adopt additional real-time use cases given the performance of the Snowpipe Streaming API.

We chose Striim to stream real-time events from our multiple internal systems. We are pleased with our decision as it has proven to support the scale and volume of data we operate. Striim’s capabilities have been crucial in managing our data effectively.
- Sudeep Kumar, Global Head of Enterprise Data & Analytics at Ciena.

Conclusion and additional resources

While benchmarking the Snowpipe Streaming API, Striim demonstrated a cost reduction of over 95% with an average P95 latency of 3 seconds. Snowpipe Streaming can also be combined with Dynamic Tables to support automatic incremental updates on streaming data and to generate Slowly Changing Dimensions, as sketched below.
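As one illustration of that combination, the sketch below creates a dynamic table over a table fed by Snowpipe Streaming so that only the latest version of each row is kept (a Type 1 style dimension). It issues the DDL through the Snowflake JDBC driver; the account URL, warehouse, table, and column names are assumptions for the example, not part of the benchmark setup.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.Properties;

public class DynamicTableSketch {
  public static void main(String[] args) throws Exception {
    // Placeholder connection details; a real pipeline would read these from config.
    Properties props = new Properties();
    props.put("user", "<user>");
    props.put("password", "<password>");
    props.put("warehouse", "TRANSFORM_WH");
    props.put("db", "ANALYTICS");
    props.put("schema", "PUBLIC");

    String url = "jdbc:snowflake://<account>.snowflakecomputing.com/";

    // A dynamic table that keeps only the latest version of each order,
    // refreshed incrementally from the rows streamed into ORDERS.
    String ddl =
        "CREATE OR REPLACE DYNAMIC TABLE ORDERS_CURRENT "
      + "  TARGET_LAG = '1 minute' "
      + "  WAREHOUSE = TRANSFORM_WH "
      + "AS "
      + "  SELECT * FROM ORDERS "
      + "  QUALIFY ROW_NUMBER() OVER (PARTITION BY ORDER_ID ORDER BY UPDATED_AT DESC) = 1";

    try (Connection conn = DriverManager.getConnection(url, props);
         Statement stmt = conn.createStatement()) {
      stmt.execute(ddl);
    }
  }
}
```

With the dynamic table in place, Snowflake refreshes ORDERS_CURRENT incrementally within the configured TARGET_LAG, so downstream reports see a deduplicated, near real-time view without hand-written MERGE jobs.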

You can review a detailed, 31-page eBook covering the benchmark, including machine sizing, app design, and data distribution, here.

You can also follow this tutorial to try Snowpipe Streaming with Striim’s free data streaming offerings: a free trial of Striim Cloud or Striim’s ‘free forever’ Developer tier.
