By Julie Kainz and David Arndt
The data infrastructure market is enormous, and largely untapped. The estimated TAM for cloud data platforms stands at $81bn, and even leading innovators have achieved minimal penetration. With Gartner predicting data infrastructure spend to grow by 6 percent in 2021, the opportunity for challengers, and the investors who can identify them, is vast.
Snowflake has taken an early (and significant) lead. But the company also provides a blueprint for the next generation of data infrastructure players to build and iterate upon. Data warehouses and query engines that can innovate both their product and their go-to-market strategy are the ones to watch.
The evolution of data warehousing
The data infrastructure landscape is increasingly complex. It’s evolved significantly from the relational databases pioneered by IBM in the 1970s and commercialised in the 1980s, and the early warehouses of the 1990s.
That evolution has gathered speed over the last ten years: businesses have progressed from collecting and storing unstructured data on-premise to more sophisticated models that allow them to mine data for insights and improvements. Warehouses and query engines in particular have only taken off in that period, with a focus on boosting speed, performance and cost optimisation.
A handful of technologies — and this is by no means an exhaustive list — are responsible for driving that change.
What will data infrastructure 3.0 look like?
So in a rapidly developing market, how do we identify the next big data infrastructure success story? Obviously, product innovation is important.
Innovation could come either as an improvement to a sub-element of the existing stack, or as technology that completely rethinks warehousing architecture. Either way, it will need to be significant: new entrants will have to prove a 10–100x performance improvement to gain initial market share.
One way for them to take an early lead is by adding speed and efficiency. Today’s solutions don’t fully leverage architectural improvements that would allow queries to run faster, and that leaves a gap for new entrants.
It’s also critical to look at where new solutions fit into the existing data stack. Positioning-wise, the big winners will be sticky. Addressing a range of use cases, areas of business and data sets will encourage uptake by different teams across organisations. Likewise, the ability to offer dynamic tooling, so users can invoke whichever tool works best for a given query, along with a range of products that boost the performance of that tooling, will be key.
That’s because current cloud data warehouses are built to be expensive; every incremental query costs, and users can’t currently pick their own resources. New entrants that offer more flexibility could be very attractive.
There are up-and-coming players in both the query engine and warehousing spaces challenging the incumbents.
Go-to-market and business model innovation
But to really see scale, innovators will need to match a strong product with a strong go-to-market strategy.
Land and expand was a model that worked well for Snowflake. Large enterprise customers are willing to buy a diversified stack, but tend to stick with products long-term once they commit. Instead of asking for a costly rip-and-replace, Snowflake started with one subset of data or one business unit. It made getting set up easy with “Snowpipe”, which delivers data from source to cloud, as well as with data prep partnerships. New entrants might struggle to replicate Snowflake’s success unless they, too, make it easy to get started with specific use cases early on.
Snowflake’s pay-as-you-go pricing has also been popular with customers who prefer a consumption-based model, despite the higher cost at scale. New entrants might be able to gain market share by doing things differently and giving customers more cost control. Under current pricing structures, data can become prohibitively expensive, curtailing the full potential of infrastructure products. Scaling without concern over cost constraints could help companies make more use of their data, so providers could create compounding value.
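To make the trade-off between consumption pricing and cost control concrete, here is a minimal sketch comparing a pay-as-you-go bill with a committed plan. All rates, fees and usage volumes are hypothetical and chosen purely for illustration; they do not reflect Snowflake’s or any other vendor’s actual pricing.

```python
def pay_as_you_go_cost(units: float, rate_per_unit: float) -> float:
    """Pure consumption pricing: every unit of usage is billed at the same rate."""
    return units * rate_per_unit


def committed_cost(units: float, commitment_fee: float,
                   included_units: float, overage_rate: float) -> float:
    """Flat commitment covering `included_units`, plus a lower rate on any overage."""
    overage = max(0.0, units - included_units)
    return commitment_fee + overage * overage_rate


# Hypothetical price points (assumptions, not real vendor prices).
PAYG_RATE = 3.0          # cost per usage unit, pay-as-you-go
COMMIT_FEE = 20_000.0    # flat fee for the committed plan
INCLUDED = 10_000.0      # usage units covered by the flat fee
OVERAGE_RATE = 2.0       # cost per unit beyond the included usage

for usage in (1_000, 10_000, 50_000):
    payg = pay_as_you_go_cost(usage, PAYG_RATE)
    committed = committed_cost(usage, COMMIT_FEE, INCLUDED, OVERAGE_RATE)
    cheaper = "pay-as-you-go" if payg < committed else "committed"
    print(f"{usage:>6} units: pay-as-you-go={payg:>9.0f} "
          f"committed={committed:>9.0f} -> {cheaper}")
```

Under these assumed numbers, pay-as-you-go is cheaper for a small pilot, while the committed plan wins once usage grows, which is the dynamic that gives challengers room to compete on cost control.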
Companies are never going to have less data. This means the value of contracts can expand over time, which is why net dollar retention is such a potent measure for software companies.
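For readers less familiar with the metric, net dollar retention is commonly calculated as the recurring revenue retained from an existing customer cohort over a period, including expansion and net of downgrades and churn, divided by that cohort’s starting revenue. The sketch below uses that common definition; exact definitions vary by company, and the cohort figures are hypothetical.

```python
def net_dollar_retention(starting_revenue: float, expansion: float,
                         contraction: float, churn: float) -> float:
    """Return NDR as a ratio for a fixed cohort of existing customers.

    starting_revenue: recurring revenue from the cohort at the start of the period
    expansion:        added revenue from upsell or increased usage within the cohort
    contraction:      revenue lost to downgrades or reduced usage
    churn:            revenue lost from customers who left entirely
    """
    if starting_revenue <= 0:
        raise ValueError("starting_revenue must be positive")
    return (starting_revenue + expansion - contraction - churn) / starting_revenue


# Hypothetical cohort: revenue from new customers is deliberately excluded.
ndr = net_dollar_retention(
    starting_revenue=1_000_000,
    expansion=300_000,
    contraction=50_000,
    churn=100_000,
)
print(f"Net dollar retention: {ndr:.0%}")
```

A value above 100 percent means the existing customer base is growing in revenue terms even before any new customers are added, which is what expanding data volumes make possible.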
Where Snowflake has played by the book is in its enterprise sales approach. Though a lot of customers initially try the product self-service, Snowflake still typically sells top-down. It’s worked to date because data infrastructure investments are big decisions, made once and expected to last a long time. But as enterprises continue to add tools (most now use over 700, with multiple first-users across the business), there could be an opportunity to sell more bottom-up. Snyk has pioneered this approach in the security space.
Market leaders currently hold only about 10–15 percent of the very narrowly defined data warehouse market. Since enterprises are open to buying more than one solution, there’s a massive opportunity for those that can prove significant performance improvements or commercial incentives.
Dawn and data infrastructure
At Dawn, we are proud to have backed many unicorns and rising stars in the wider data infrastructure landscape over the years. From data science & analytics (Dataiku, Quantexa), to data exchanges & marketplaces (Harbr), data governance (Collibra), big data cost infrastructure (Granulate), data integration tools, and cloud data warehouses — we continue to follow the rapid developments in the space with the utmost excitement.