Putting Snowflake Hybrid Tables to the Test at MarketWise Solutions

This post was written by Ron Stiffler, principal architect at MarketWise Solutions.

MarketWise Solutions is the services division of MarketWise, providing world-class shared services, including technical systems, legal, accounting, and finance, as well as human resources. MarketWise Solutions is also an avid participant in Snowflake private previews, so when Snowflake announced the private preview of Hybrid Tables, we jumped at the opportunity to join. We already had a few use cases in mind and were eager to see how Hybrid Tables would handle the challenge.

We started out slowly, kicking the tires a bit by checking out Hybrid Tables’ performance with some test tables. We decided to use a table from a different OLTP database that had 4 million records and see what it would take to lookup by ID, insert a new record, and so on, as well as exploring how long it took to perform an analytic query.

The performance impressed us and the single row operations were great. We did a lot of analytical analysis on the table as we wanted some size to play with when testing, and the results were so good that we decided to leverage Hybrid Tables for a few other projects.

Use case #1: Ingest ETL jobs

The first use case we implemented with Hybrid Tables was tracking the next start date for our ingest ETL jobs. We have processes that extract data from Snowflake and send it to other systems, and other processes that call APIs to ingest data back into Snowflake.

In both cases, we need to know the timestamp of the last edit or update. For example, on the extract side, we might say:

  • Give me all of the data where update date > <FOO>
  • Get <FOO> from what we store in this specific tracking table
  • Find new latest date from the data and update sync table

We have tried a few different methods to handle our ingest workload, most recently using DynamoDB. Our workload volume has increased over time, to the point that we now have hundreds of ingest workflows going per minute, all trying to write to the same file at the same time.

The DynamoDB solution was able to handle the concurrent writes, but we had the overhead of maintaining another database outside of Snowflake to meet our needs. We also were not getting the seamless reporting functionality we wanted, as we had to ship log files around in yet another ETL process to keep track of what we were doing in DynamoDB.

Moving the ingest ETL jobs to Hybrid Tables enabled us to reduce our infrastructure overhead and eliminate unnecessary data movement. Our team loves using Hybrid Tables for this work because we are able to report and alert on the information gathered directly through Snowflake. Plus, Hybrid Tables can handle the large volumes of writes while allowing for the analytical queries we need for reporting — and we have the ability to look up and update individual rows quickly. As of now we have been up and running with this use case for over nine months.

Use case #2: Export ETL jobs

Our next use case for Hybrid Tables involved our export ETL jobs. For this workload, we store a detailed output from file export. We call an API to trigger an automation that ingests data and feeds it back to Hybrid Tables with updates. This process allows us to have an accurate audit log of what data is leaving our system, when it leaves, and where it is going.

Previously, we did not have a robust solution to store, update, and maintain this information. We performed mostly manual logging, which prohibited later updates and made the information less actionable and valuable for us.

The takeaway: Impressive performance, simplified import/export workflows

In short, Hybrid Tables just made everything much simpler for us. We reduced our architectural footprint, centralized our data on one platform, and accomplished jobs that were previously unachievable. Overall we are very happy with the performance we have seen and are eager to continue to leverage Hybrid Tables in both our import and export ETL jobs. Beyond our current implementation, we look forward to testing out and implementing new use cases around ingesting data from external operational source systems as well as having an operational data store leveraging Hybrid Tables.

About MarketWise Solutions

MarketWise Solutions’ mission is to provide MarketWise with the systems, support and information to fuel the best-in-class subscription-based publication of financial information and software to millions of investors globally. MarketWise’s products are a trusted source for high-value financial research, education, actionable investment ideas, and investment software. MarketWise is a 100% digital, direct-to-customer company offering its research across a variety of platforms including mobile, desktops, and tablets.

--

--