Materialize: Streaming Without Compromises

Ravi Mhatre
Lightspeed Venture Partners
Nov 30, 2020

Cloud-native data infrastructure has arrived. Years of research, development, and innovation have led to the proliferation of accessible, scalable databases, running on distributed machinery in the cloud. Further, new paradigms like real-time event streaming have taken root, enabling a new generation of modernized applications and services.

But among all this change and upheaval, the core needs of data-intensive organizations haven’t changed.

Teams don’t want to wait overnight for batch jobs to complete. Analysts want to use a declarative query language, and organizations demand interoperability with other standard protocols. Latency is key. Cloud-native deployment is critical but should not prevent strong data consistency.

Put simply, speed and deployment modernity should not come at the cost of other functionality.

However, most existing data infrastructure approaches make trade-offs between performance and expressiveness:

  • Batch platforms are scalable but slow, constrained by the frequency of the ETL process, yielding stale results
  • Existing streaming solutions are fast but expensive and limited in capabilities, lacking the ability to handle multi-way joins or updates without additional overhead
  • Hybrid Lambda architectures combine approaches, resulting in meaningful implementation and maintenance complexity

Further, traditional data processing infrastructure is designed to repeatedly ask about the current state of the world via periodic, repeated queries, rather than nimbly react to changes as they occur (and importantly, only the changes).
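To make the contrast concrete, here is a minimal Python sketch of the two styles for a single running total per customer. It is an illustration only, not Materialize's implementation: the names `recompute_totals` and `IncrementalTotals`, and the `(customer, amount)` row shape, are assumptions made up for this example.

```python
from collections import defaultdict


def recompute_totals(orders):
    """Polling style: rescan every row each time the question is asked."""
    totals = defaultdict(int)
    for customer, amount in orders:
        totals[customer] += amount
    return dict(totals)


class IncrementalTotals:
    """Change-driven style: fold each change into a maintained answer."""

    def __init__(self):
        self.totals = defaultdict(int)
        self.counts = defaultdict(int)  # live rows per customer

    def apply(self, customer, amount, diff):
        # diff is +1 when a row is inserted and -1 when it is retracted.
        self.totals[customer] += amount * diff
        self.counts[customer] += diff
        if self.counts[customer] == 0:
            # No rows remain for this customer, so drop it from the answer.
            del self.totals[customer]
            del self.counts[customer]

    def result(self):
        return dict(self.totals)
```

The polling version does work proportional to the size of the whole table on every query, while `apply` does a constant amount of work per change, which is the property the paragraph above is pointing at.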

Data still isn’t moving quickly enough, and making traditional data processing infrastructure merely faster isn’t the answer.

Avoiding these compromises requires a fundamentally new architecture. We believe Materialize is that next-generation technology.

Materialize lets users ask questions of live, streaming data, connecting directly to existing event streaming infrastructure, like Kafka, and to client applications. Results are refreshed in milliseconds, providing correct and consistent answers to queries. Materialize allows users to perform complex computations over streaming data, with no need to build and maintain their own streaming microservices.

To clients, Materialize presents as a standard PostgreSQL interface, enabling plug-and-play integration of existing tooling. SQL queries are recast as data flows, which can react efficiently to changes in underlying data as they happen, avoiding expensive and wasteful recomputation. This enables interactive data exploration and data warehouse-like analytics against live relational data, which is typically not possible.
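As a rough illustration of what it means for a query to react to changes rather than recompute, the Python sketch below maintains the result of a two-table equi-join as changes arrive on either side. It is a simplified sketch under stated assumptions, not Materialize's or Timely Dataflow's actual code: the class `IncrementalJoin`, its method names, and the `(key, value, diff)` change format are all hypothetical.

```python
from collections import Counter, defaultdict


class IncrementalJoin:
    """Maintains `left JOIN right ON key` as a multiset of (key, l, r) rows."""

    def __init__(self):
        self.left = defaultdict(Counter)   # key -> multiset of left values
        self.right = defaultdict(Counter)  # key -> multiset of right values
        self.output = Counter()            # (key, l, r) -> multiplicity

    def update_left(self, key, value, diff):
        # A left-side change only touches right-side rows with the same key.
        for rval, rcount in self.right[key].items():
            self._emit((key, value, rval), diff * rcount)
        self._index(self.left, key, value, diff)

    def update_right(self, key, value, diff):
        # Symmetric case: match the change against the indexed left side.
        for lval, lcount in self.left[key].items():
            self._emit((key, lval, value), diff * lcount)
        self._index(self.right, key, value, diff)

    @staticmethod
    def _index(side, key, value, diff):
        side[key][value] += diff
        if side[key][value] == 0:
            del side[key][value]  # forget fully retracted rows

    def _emit(self, row, diff):
        self.output[row] += diff
        if self.output[row] == 0:
            del self.output[row]
```

Inserting a customer with `update_left(1, "alice", +1)` and then two orders with `update_right` adds two joined rows to `output`; retracting the customer with a diff of `-1` removes both, and in each case only the rows sharing that key are visited.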

Materialize has been thoughtfully architected to address the weaknesses of existing approaches. Timely Dataflow (TDF), the stream processing engine at the core of Materialize, is a distributed data-parallel compute engine, capable of scaling up from a single thread running on a laptop to execution across clusters of distributed computers. TDF has been in open-source development since 2014 and has since been battle-tested in production at Fortune 1000-scale companies.

As a founding team, Arjun Narayan, CEO, and Frank McSherry, Chief Scientist, have consistently impressed us with their clarity of vision and understanding of their problem space. With top-tier academic and industry backgrounds, both bring deep technical expertise and first-hand experience building cloud-native database technologies.

Like streaming data itself, the data infrastructure market is moving quickly. To keep up, the Materialize team has laid out an exciting product roadmap that complements its cloud-native, streaming approach, including stream persistence, active-active replication, and tiered storage, among other features.

At its heart, Materialize enables streaming without compromises. As believers in this mission, we’re excited to share the news of Materialize’s public launch along with the closing of its Series B round for a combined $40M in funding to date. Lightspeed led the company’s previously unannounced Series A round and has participated in the Series B round led by Kleiner Perkins.

We at Lightspeed believe Materialize is a fundamentally innovative technology and has the opportunity to transform not just streaming analytics, but the entire data pipeline. We’re incredibly excited to partner with Arjun and Frank on this journey.

By Ravi Mhatre, Raviraj Jain, and Nnamdi Iregbulem
