The Year Security Data Lakes Go Mainstream

By Omer Singer
Published in Digital Diplomacy, Jan 4, 2021 (5 min read)

Last January, I predicted that 2020 would be The Year of the Security Platform. Did that end up happening? And what should we expect for 2021?

2020: Many integrations, few actual platforms

The year was packed with partnership announcements aimed at consolidating security solutions into a cohesive stack. For example, Obsidian announced that CrowdStrike alerts would be available natively in its solution for combined visibility across SaaS and endpoints.

AWS, probably the most influential tech platform, showed what the future of observability platforms might look like with its Managed Service for Grafana. Unlike its hostile hosting of Elasticsearch (with Elastic screaming bloody murder), AWS took a collaborative approach this time. Grafana Labs gets paid for its software while customers enjoy an embedded experience within AWS.

Did any of the major cybersecurity vendors break through with one platform to rule them all? It doesn’t appear so. Palo Alto Networks, for example, downplayed its Data Lake and Hub products in favor of its proprietary Extended Detection and Response (XDR) offering.

A true platform is like an operating system that empowers independent software vendors to succeed in new and bigger ways. Think Windows or the iPhone. We didn’t see the cyber industry deliver that last year.

Snowflake emerged as a security platform

In a year full of exciting developments for Snowflake, from a record-breaking IPO to launching the Data Cloud, its newfound relevance to security teams was remarkable. As early adopters successfully rolled out security data lake projects, nimble vendors noticed and embraced the new architecture.

Customer highlights from the past year include:

To accelerate security analytics at these customers and others, Snowflake invested (figuratively and literally) in partnerships with a new generation of security vendors. These partners delivered their solution on top of the customer’s own data platform, where all security data could be centralized and used for a variety of use cases.

With dozens of security data lake customers and a vibrant ecosystem, Snowflake finished the year as a proven security platform. What does this mean for the future of the security industry?

2021: Security Data Lakes Go Mainstream

A single data layer with virtually unlimited scale and analytics power. Today, that’s the reality for around 1% of security teams across all Snowflake customers. The ratio is likely similar at companies where BigQuery, Synapse or Redshift/Athena serve as the enterprise data platform. Compare that fraction to the vast majority of security teams that rely on a standalone SIEM for log aggregation. While this arrangement has been the way things are done for a decade or two, in 2021 best practices will shift dramatically.

What might bring security data lakes from the domain of elite security engineers to SOCs everywhere?

The need will be bigger than ever. The proliferation of niche solutions for rapidly multiplying use cases like Cloud Infrastructure Entitlements Management and Zero Trust Network Access will place a premium on having a consolidated data layer to support effective detection, response and metrics. There’s also the challenge of sorting through terabytes of new log data generated as Covid and working from home (WFH) accelerate cloud migration schedules by years.

Meanwhile, fierce competition in the data platform space will drive speedy innovation that coincidentally closes the gap between generic data platforms and cyber solutions like LogRhythm and QRadar. When Redshift announces support for semistructured data types (think JSON logs) and Snowflake expands support for string search, security teams are left with fewer excuses to live with the limitations of traditional SIEMs.
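
To make the point about semi-structured data and string search concrete, here is a minimal sketch of the idea in Python. It uses the standard library's in-memory SQLite database as a stand-in for a cloud data platform, and the log lines, field names, and helper functions are all hypothetical. A real deployment would load JSON into the platform's own semi-structured types rather than flattening it by hand.

```python
import json
import sqlite3

# Hypothetical JSON log lines; the field names are illustrative only.
RAW_LOGS = [
    '{"ts": "2021-01-04T10:00:00Z", "username": "alice", "event": "login", "detail": "success from 10.0.0.5"}',
    '{"ts": "2021-01-04T10:01:00Z", "username": "bob", "event": "login", "detail": "failed password from 203.0.113.7"}',
    '{"ts": "2021-01-04T10:02:00Z", "username": "bob", "event": "exec", "detail": "powershell -enc SQBFAFgA"}',
]


def load_logs(conn, raw_lines):
    """Parse each JSON line and flatten it into a plain SQL table."""
    conn.execute(
        "CREATE TABLE logs (ts TEXT, username TEXT, event TEXT, detail TEXT)"
    )
    rows = []
    for line in raw_lines:
        rec = json.loads(line)
        # .get() tolerates records that lack a field (stored as NULL).
        rows.append(
            (rec.get("ts"), rec.get("username"), rec.get("event"), rec.get("detail"))
        )
    conn.executemany("INSERT INTO logs VALUES (?, ?, ?, ?)", rows)


def search_detail(conn, needle):
    """Substring search over the free-text detail column."""
    cur = conn.execute(
        "SELECT ts, username, event FROM logs WHERE detail LIKE ? ORDER BY ts",
        (f"%{needle}%",),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
load_logs(conn, RAW_LOGS)
hits = search_detail(conn, "powershell")
```

Here `hits` holds the single `exec` event whose detail mentions PowerShell. The same two steps, landing JSON logs in a table and searching them with SQL, are what a general-purpose data platform needs to do well before it can stand in for a SIEM's log search.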

This could also be the year that industry analysts recognize the importance of security data lakes to cloud-centric security programs. Gartner has already presented the following security stack, with a data lake at its core:

[Figure: security stack diagram with a data lake at its core. Source: Gartner Top Security and Risk Management Trends]

But that insight was limited to the architecture of XDR, rather than the entire security and governance program. Analyst reports continue to see security architecture through the old lens of a unique and independent problem/solution space.

In fact, data normalization, storage and correlation capabilities are all much further along in a different “Magic Quadrant” altogether. In 2021, expect industry analysts to shift their focus from vertically integrated solutions (Best SIEM, Top SAO) to use-case and outcome-specific solutions (Most Accurate Threat Detection, Fastest Vendor Risk Assessment). This change will be facilitated by the abstraction of data pipelines and storage away from the security application.

Success stories from last year’s early adopters will spread throughout the industry in 2021. As a result of this and the other trends described above, we’ll see widespread confidence in using cloud-native data platforms as well as new guidelines, best practices, and off-the-shelf vendor solutions. These will combine in a cycle that pulls security data lakes into the mainstream.

The hot list for 2021 🔥

What will happen when 99% of security teams find themselves on the wrong side of the new best practice? Things will heat up in a number of areas:

  • SQL becomes a hot skill for security analysts
  • Data portability becomes a hot feature for security products
  • Metrics become a hot initiative for security leaders
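
As a rough illustration of the first and third items, here is the kind of query an analyst might write. This is a sketch only: it uses Python's built-in SQLite module in place of a cloud data platform, and the `auth_events` table, its columns, the sample rows, and the threshold are all assumptions made for the example.

```python
import sqlite3

# Hypothetical authentication events: (timestamp, username, outcome).
EVENTS = [
    ("2021-01-04T09:00:00Z", "alice", "login_failed"),
    ("2021-01-04T09:00:30Z", "alice", "login_ok"),
    ("2021-01-04T09:01:00Z", "bob", "login_failed"),
    ("2021-01-04T09:01:05Z", "bob", "login_failed"),
    ("2021-01-04T09:01:10Z", "bob", "login_failed"),
    ("2021-01-04T09:02:00Z", "carol", "login_ok"),
]


def build_db(events):
    """Create an in-memory table of authentication events."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE auth_events (ts TEXT, username TEXT, outcome TEXT)")
    conn.executemany("INSERT INTO auth_events VALUES (?, ?, ?)", events)
    return conn


def users_over_failure_threshold(conn, threshold):
    """Return (username, failures) for users with at least `threshold` failed logins."""
    query = """
        SELECT username, COUNT(*) AS failures
        FROM auth_events
        WHERE outcome = 'login_failed'
        GROUP BY username
        HAVING COUNT(*) >= ?
        ORDER BY failures DESC, username
    """
    return conn.execute(query, (threshold,)).fetchall()


conn = build_db(EVENTS)
flagged = users_over_failure_threshold(conn, 3)
```

With the sample rows above, `flagged` contains only `bob`, who has three failed logins. Lowering the threshold to 1 turns the same query from a detection rule into a simple metric: failed logins per user.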

These changes will be for the better, enabling big gains for security posture with less busywork and lower costs. They are inevitable because of the incredible pace of innovation in cloud data platforms. And they’ll be enjoyable because infosec will find itself with new allies (such as data engineers) and new challenges (such as context-aware threat modeling).

Cheers to a great new year!

I believe that better data is the key to better security. These are personal posts that don’t represent Snowflake.