Snowflake Summit Recap for Cybersecurity Vendors

This month’s Snowflake Summit was a watershed moment for Snowflake’s role in cybersecurity. Across several recorded sessions, presenters demonstrated technical features with impressive performance metrics on security log data. That’s no coincidence. We are already seeing a wave of solutions re-platforming to Snowflake to simplify their backend and deliver more value to users.

If you’re building a cybersecurity product and didn’t attend the event, don’t worry: the sessions were recorded and I took notes.

What’s New: Performance and Core Engine Improvements in Snowflake

Recording link

This session featured members of Snowflake product leadership and the SVP of Engineering at HUMAN (formerly White Ops). First, Snowflake listed significant performance improvements that have been rolled out to boost concurrency and cut latency for ingest and queries. This progress unlocks opportunities for Snowflake to be used under the hood for new interactive use cases, including applications that previously relied on Druid or Elasticsearch.

Highlight: Cybersecurity vendor moves their customer-facing search interface to run on Snowflake

HUMAN, makers of a popular web app security solution, then shared how they had been using a Druid database for interactive user searches and reports. They needed it for low-latency searches, but it limited the kinds of analytics that their console could support. Having a dedicated database just for customer searches and reports also added complexity and overhead. Snowflake’s latest advances have brought it to the point where HUMAN could consolidate databases and remove Druid from the mix.

As seen in their benchmarking below, this simplified architecture helps more than just the HUMAN engineering team. Their customers will enjoy better accuracy, faster searches, and more flexible filtering thanks to this change.

HUMAN’s customers can now also access their data directly via Snowflake Data Marketplace. That matters to enterprise customers that have internal data science teams and want vendors to provide an easy path to advanced analytics.

Highlight: Indexing for fast search

Snowflake’s architecture has been unique from day one in that all data resides in cloud buckets, meaning storage is cheap and limitless. Automatically managed micro-partitions made analysis at scale performant, but this did not translate into fast search, an important use case for many cybersecurity vendors. Customers might not be looking to crunch large datasets so much as find log records that contain a particular IP address or domain name.
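
To make the gap concrete, here is a toy model of pruning by per-partition minimum and maximum values. It is only an illustration: the partition contents, the `partitions_to_scan` helper, and the integer stand-ins for IP addresses are all made up, and real micro-partition metadata is richer than this.

```python
from typing import Dict, List

# A partition is modeled as a mapping from column name to the values it holds.
Partition = Dict[str, List[int]]


def partitions_to_scan(partitions: List[Partition], column: str, needle: int) -> int:
    """Count partitions whose [min, max] range for `column` could contain `needle`."""
    return sum(
        1 for p in partitions
        if min(p[column]) <= needle <= max(p[column])
    )


# Toy data (hypothetical): `ts` arrives in order, `ip` is effectively random.
PARTITIONS: List[Partition] = [
    {"ts": [1, 2, 3], "ip": [50, 7, 93]},
    {"ts": [4, 5, 6], "ip": [12, 88, 3]},
    {"ts": [7, 8, 9], "ip": [64, 5, 99]},
]

print(partitions_to_scan(PARTITIONS, "ts", 5))   # time filter prunes well
print(partitions_to_scan(PARTITIONS, "ip", 64))  # IP lookup prunes nothing
```

In this toy data a filter on `ts` touches a single partition, while a lookup on `ip` falls inside every partition's range and forces a scan of all three. That is the shape of the "needle in a haystack" problem for log search.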

Snowflake’s newly available Search Optimization service changes the game. By adding indexing, Snowflake now supports fast search within the same platform. This is an optional feature that introduces additional compute costs but does not require a separate copy of the data or management overhead. And since Snowflake’s pricing model is time-based, queries that return faster also cut costs.
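
As a rough sketch of what adopting the feature might look like from application code, the helpers below build the one-time `ALTER TABLE ... ADD SEARCH OPTIMIZATION` statement and an ordinary point lookup. The table and column names (`security_logs`, `src_ip`) and the helper functions are hypothetical, and the `%s` placeholder style is an assumption about the database driver in use.

```python
import re

# Only plain unquoted identifiers are accepted, so names can be interpolated safely.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")


def check_ident(name: str) -> str:
    """Return `name` unchanged if it is a simple SQL identifier, else raise."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def enable_search_optimization_sql(table: str) -> str:
    """One-time statement that turns the service on for a table."""
    return f"ALTER TABLE {check_ident(table)} ADD SEARCH OPTIMIZATION"


def point_lookup_sql(table: str, column: str) -> str:
    """Plain equality lookup; the value is bound by the driver (assumed %s style)."""
    return f"SELECT * FROM {check_ident(table)} WHERE {check_ident(column)} = %s"


print(enable_search_optimization_sql("security_logs"))
print(point_lookup_sql("security_logs", "src_ip"))
```

The lookup is the same SQL an application would send without the feature enabled, so the change is confined to a single administrative statement per table.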

As expected, point lookup queries that can take advantage of the index gain dramatic speedups. A boost of 150x is the kind of step-change that opens up new use cases and makes this feature the most exciting release from my perspective. It was also nice to see how security use cases were featured among early wins for this capability.

Additional search capabilities were highlighted that further extend Snowflake’s overlap with Elasticsearch. Support for wildcard searches in JSON records, for example, is particularly relevant to working with log data. The fast-growing XDR sector will benefit from the flexibility and scalability that this provides. Note that features in private preview are not available by default but may be enabled on request through your Snowflake account team. As a vendor building on Snowflake, you want to stay on top of these developments so that you can bake them into your product ahead of their general availability.

The recording of this session is worth watching for the demo. Stellar production values. The engineering team used data provided (instantly via secure data sharing) by Panther to showcase an incident response scenario. In this realistic situation, a developer’s API key was leaked and billions of AWS CloudTrail log events had to be searched to determine the extent of the breach. Snowflake Search Optimization delivered sub-second search times. Note that the query optimizer automatically recognizes that the index should be used for these queries, so no syntax changes or special flags need to be provided by the user or application.

Search results in 0.8 seconds

Before Summit, we hadn’t seen much benchmarking for Snowflake’s search performance. This is still an area where a side-by-side comparison to Elasticsearch would be illuminating, but the demo results shown below should be enough to prompt many an evaluation among cybersecurity vendors.

Using a small warehouse means the price/performance gains are even better than the speed improvement shown
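
The caption's point is simple arithmetic: with time-based billing, a faster query on a smaller warehouse compounds both savings. The sketch below assumes the standard per-hour credit rates that double with each warehouse size, ignores the per-resume minimum charge, and uses made-up timings rather than figures from the session.

```python
# Assumed credits per hour for standard warehouse sizes (each size doubles).
CREDITS_PER_HOUR = {"XSMALL": 1, "SMALL": 2, "MEDIUM": 4, "LARGE": 8, "XLARGE": 16}


def credits_used(size: str, busy_seconds: float) -> float:
    """Credits consumed while a warehouse of `size` is busy for `busy_seconds`."""
    return CREDITS_PER_HOUR[size] * busy_seconds / 3600


def price_performance_gain(
    before_size: str, before_seconds: float, after_size: str, after_seconds: float
) -> float:
    """How many times cheaper the 'after' query is than the 'before' query."""
    return credits_used(before_size, before_seconds) / credits_used(after_size, after_seconds)


# Hypothetical numbers: a 120 s scan on LARGE vs. a 1.2 s indexed lookup on SMALL.
speedup = 120.0 / 1.2
gain = price_performance_gain("LARGE", 120.0, "SMALL", 1.2)
print(f"speedup: {speedup:.0f}x, cost reduction: {gain:.0f}x")
```

With these illustrative inputs the query is 100 times faster but roughly 400 times cheaper, because the smaller warehouse also bills at a quarter of the rate.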

Accelerating Analytical Workloads with Snowflake Performance

Recording link

This session kicked off with a recap of Snowflake’s design principles. These boil down to “it just works.” For cybersecurity solutions, that means Snowflake is making it possible for customers to move off of complex, expensive, and limited solutions like Elasticsearch and Presto. Anyone who’s built on those tools will appreciate the benefit of this approach:

Still, there’s a reason why Elasticsearch and other search engines are used behind the scenes for so many cybersecurity solutions. They can return results fast. So to be a viable alternative, Snowflake needed to develop some latency-reducing capabilities:

These new features are all operating within the same single platform and on a single copy of data. Are they enough to power interactive user analytics in a modern B2B SaaS application? Cybersecurity vendor HYAS shared learnings from their in-depth evaluation of Snowflake for their user-facing investigation console.

The requirements that HYAS had going into their evaluation of an alternative to Postgres are common throughout the cybersecurity industry. Scale, performance, and cost would all need to align for a user experience that customers love and the CFO appreciates.

In their candid presentation, the HYAS team shared that customers were getting frustrated with reports that never finished loading. These customers were looking for a needle in the haystack as part of security investigations. Accuracy and fast results across large and varied datasets were a requirement for this product.

The application in question contains large tables storing dimensions and facts, with lots of variety and some with records constantly streaming in. Even if Snowflake could address the shortcomings of the existing backend, HYAS could only migrate if it was easy and caused minimal disruption.

This indeed proved to be the case. Migration was quick and easy, delivering big savings in cost and overhead. However, to meet performance requirements for all use cases, HYAS needed to use Snowflake’s new Search Optimization Service. The addition of indexing to the data platform boosted performance gains over the previous solution to 10x and made the initiative a big success.

While no code or data changes were needed to take advantage of Search Optimization, the engineers at HYAS added optimizations within their application to tweak their users’ searches en route to Snowflake. Watch the recording for a detailed explanation of their “timebox” optimization technique. Compared with the previous backend, Snowflake with its latest performance improvements returned results over 300x faster!
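
The details of the HYAS technique are in the recording and are not reproduced here. Purely as an illustration of what application-side time boxing could look like, the sketch below is my own guess at the general idea, not their implementation: search the most recent window first and widen the window only while more results are needed. The `run_query` callback, window sizes, and row shape are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Callable, List, Tuple

Row = Tuple[datetime, str]
# run_query(start, end) is assumed to return rows with start <= timestamp < end.
QueryFn = Callable[[datetime, datetime], List[Row]]


def timeboxed_search(
    run_query: QueryFn,
    newest: datetime,
    oldest: datetime,
    first_window: timedelta,
    limit: int,
) -> List[Row]:
    """Query recent windows first, doubling the window until `limit` rows are found."""
    results: List[Row] = []
    window = first_window
    end = newest
    while end > oldest and len(results) < limit:
        start = max(oldest, end - window)
        results.extend(run_query(start, end))
        end = start   # next window ends where this one began
        window *= 2   # look further back in bigger steps
    return results[:limit]
```

A caller would pass a function that issues the real, time-bounded query. When recent data satisfies the request, older data is never scanned at all, which is where the savings would come from under this assumption.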

The takeaway here is that while great performance is available out of the box, further optimizations can be made based on how users tend to query the data. Searches in the HYAS Insight solution now return in a few seconds even across very large datasets. As always with Snowflake, storage at scale is extremely cheap and the whole solution is delivered as a reliable service. With these new search performance metrics, many cybersecurity vendors will no longer need to manage a separate Elasticsearch cluster for their customer-facing search interface.

Lacework: Cloud Security Powered by Snowflake

Recording link

Earlier this year, Lacework raised a $525 million Series D round to grow its cloud security and compliance business. Before it had IPO buzz, Lacework built its solution on Snowflake. Its co-founder and CTO, Vikram Kapoor, shared his vision on how quality alerts depend on collecting and processing enough data. That meant his application stack needed to support data science for accurate insights as well as historical investigations that take into account what happened months ago.

He described Lacework’s requirements in the early days that led them to select Snowflake for their backend six years ago:

  • SQL declarative approach helps in keeping change velocity high
  • Columnar data structure enables analytics at scale
  • SaaS data platform reduces overhead and frees up engineering resources
  • Cost-effective solution enables achieving margins and keeping a consistent architecture as the company scales
  • Multi-tenant design means adding new customers can be done quickly and efficiently

While Lacework did experiment with open source and cloud vendor solutions, Vikram shared that they chose Snowflake for its:

  • Independent CPU/storage scaling
  • Dynamic warehouse provisioning and scaling
  • JSON support
  • 24x7 availability with no maintenance overhead

As a result, their application can process data quickly in Snowflake while also retaining it for a long time to meet customer requirements. This is a critical point when evaluating data platforms. Operating effectively at cloud scale requires separating compute from storage. Also, the ability to scale up on the fly and without a maintenance window is a business enabler.

The session included this architecture diagram where multiple warehouses are used separately for each phase of the data lifecycle. A sharding approach was implemented to ensure that just enough compute power is available while avoiding waste and easily scaling when needed.
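
One way to picture this kind of layout is a router that maps each lifecycle phase and tenant to a warehouse shard. The sketch below is a hypothetical illustration, not Lacework's design: the phase names, shard counts, and warehouse naming scheme are invented for the example.

```python
import hashlib
from typing import Dict

# Hypothetical number of warehouse shards per lifecycle phase.
SHARDS_PER_PHASE: Dict[str, int] = {"ingest": 4, "analyze": 2, "serve": 1}


def warehouse_for(phase: str, tenant_id: str, shards_per_phase: Dict[str, int]) -> str:
    """Pick a warehouse name for a tenant within a phase using a stable hash."""
    shards = shards_per_phase[phase]
    # hashlib gives the same result across processes, unlike the built-in hash().
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:8], "big") % shards
    return f"{phase.upper()}_WH_{shard}"


print(warehouse_for("ingest", "customer-42", SHARDS_PER_PHASE))
print(warehouse_for("serve", "customer-42", SHARDS_PER_PHASE))
```

Keeping the shard count per phase in one mapping means capacity can be adjusted for a single phase without touching the others, though changing a count with this simple modulo scheme would also move some tenants between shards.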

Vikram described three freedoms of scalability. They’re listed in the slide below and together enabled Lacework to meet SKU and SLA requirements as they’ve taken on more and larger customers. He also mentions that while Snowflake supports changes to data records, Lacework only uses “append” and they drop records automatically past a certain age depending on the customer.
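
An append-only table with age-based cleanup can be sketched as a periodic job that deletes rows older than each customer's retention period. The table name, column names (`customer_id`, `event_time`), retention values, and the `%s` placeholder style below are assumptions made for illustration, not details from the session.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

Statement = Tuple[str, Tuple[str, datetime]]


def retention_cutoff(now: datetime, retention_days: int) -> datetime:
    """Rows with a timestamp before this moment are past retention."""
    return now - timedelta(days=retention_days)


def purge_statements(
    table: str, retention_by_customer: Dict[str, int], now: datetime
) -> List[Statement]:
    """Build one parameterized DELETE per customer, in a stable (sorted) order."""
    sql = f"DELETE FROM {table} WHERE customer_id = %s AND event_time < %s"
    return [
        (sql, (customer, retention_cutoff(now, days)))
        for customer, days in sorted(retention_by_customer.items())
    ]


# Hypothetical retention settings, in days, per customer.
for stmt, params in purge_statements(
    "events", {"acme": 90, "globex": 365}, datetime(2021, 6, 30)
):
    print(stmt, params)
```

Because writes only ever append and cleanup only ever removes old rows, the job can run on its own schedule without coordinating with ingest.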

What scale has Lacework achieved for its hundreds of customers? From one warehouse and under 100GB of data in its first six months of operation, they’ve achieved serious scale on Snowflake:

As Vikram describes it, they haven’t had to do much beyond adding and resizing warehouses to scale up. This is a testament to Snowflake being a data platform that “just works” at a scale that would require major engineering effort with traditional open-source data products.

Beyond meeting Lacework’s requirements at scale, Vikram mentioned that he doesn’t remember the last time there was an outage or data corruption issue. I expect that reliability like this is an important property for every cybersecurity vendor and one that’s tough to evaluate without having years of experience with the product. Developer productivity is another property like that, so I’m glad that he shared a customer’s perspective.

Finally, Vikram laid out the go-to-market benefits of running on Snowflake. This is something that I’m personally focused on so it was great to hear his perspective on how Snowflake goes beyond being just a backend to helping grow the business. Lacework is increasingly seeing its customers use Snowflake as a security data lake and realizing big benefits from this architecture.

By making its data available to customers on the Data Cloud, Lacework leans into this trend and gives customers more value for use cases like incident response and self-service security dashboards. Customers use Lacework’s unique datasets like comprehensive connection tracking as an alternative to trying to piece together raw logs themselves. Cloud compliance records are also relevant for more than just the SecOps team.

Vikram shared these screenshots as examples of what customers are building in their BI tool of choice with shared Lacework data:

Vikram’s conclusions summarized both Lacework’s value to security teams and Snowflake’s value to Lacework. Security and compliance in the cloud are recognized as data problems and make data platform selection a critical decision for cybersecurity vendors.

Introducing the “Powered by Snowflake” Partner Program and BlackRock’s Journey as Founding Partner

Recording link

Finally, there were several sessions about building and scaling applications on Snowflake. Check out the Summit agenda for all the recordings, but the session on the new Powered by Snowflake partner program is the big one. There are now dedicated resources available to help software vendors evaluate, design, build, and scale with Snowflake. It’s a great program that also commits GTM support to help with selling into Snowflake’s 4,500 customers.

In conclusion

The new features that Snowflake released at Summit go way beyond data warehousing, covering areas that until now required separate databases and search engines with multiple copies of data and the complexity that brings. Now for the first time, fast search, data engineering, and petabyte-scale analytics are served by one cloud-native data platform.

The capabilities showcased at Summit open a window of opportunity for cybersecurity vendors to drastically simplify their architecture while delivering better user experiences. Snowflake is helping developers to seize this opportunity through the Powered by Snowflake program, which also includes significant GTM benefits. I can’t wait to see the innovations and customer success stories that our partners unleash in the cybersecurity industry!

Omer Singer
Snowflake Builders Blog: Data Engineers, App Developers, AI/ML, & Data Science

I believe that better data is the key to better security. These are personal posts that don’t represent Snowflake.