#PowerToTheDeveloper: CRV’s Investment in LanceDB

CRV
Team CRV
May 15, 2024

By Murat Bicer, Brittany Walker and Brian Zhan

AI workloads are all about the data. In line with scaling laws, more data has continued to improve model performance and will keep doing so for the foreseeable future. When we investigated today’s emerging landscape of AI infrastructure tools, however, it became clear that every company struggled to store and process multimodal AI data at scale.

Multimodal AI companies often deal with complex data types like vectors, images, videos and audio, but traditional tabular data formats are not designed for working with such data at scale. Moreover, retrieval is a bottleneck — most vector databases aren’t optimized for multimodal use cases, making it difficult to efficiently retrieve and analyze diverse data types.

Building for Scale with Lance and LanceDB

Throughout our diligence process, we heard time and again that LanceDB was different from existing vector storage and retrieval solutions. That is why we’re proud to announce today that CRV led LanceDB’s Seed round, bringing the company’s total funding to $11 million.

The open source Lance format offers substantial performance improvements, with up to 100 times faster random access than Parquet, and supports the complex data structures integral to AI applications. LanceDB’s open source multimodal vector database, in turn, can index billions of vectors and petabytes of text, images, and videos stored in the Lance data format, at a fraction of the cost of other databases.
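For readers less familiar with vector retrieval, here is a minimal sketch in plain Python of what a nearest-neighbor lookup does. It is not LanceDB's API or implementation: the file names and three-dimensional embeddings are made up for illustration, and the function scans every row, which is the brute-force approach that stops being practical at billions of vectors and that a vector index is meant to avoid.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=2):
    """Return the ids of the k rows most similar to the query (full scan)."""
    scored = [(cosine_similarity(query, row["vector"]), row["id"]) for row in rows]
    scored.sort(reverse=True)  # highest similarity first
    return [row_id for _, row_id in scored[:k]]


# Hypothetical multimodal rows: each asset is stored alongside its embedding.
rows = [
    {"id": "cat.jpg", "vector": [0.9, 0.1, 0.0]},
    {"id": "dog.mp4", "vector": [0.8, 0.3, 0.1]},
    {"id": "invoice.txt", "vector": [0.0, 0.2, 0.9]},
]

nearest = top_k([1.0, 0.0, 0.0], rows, k=2)
print(nearest)
```

The point of the sketch is only that images, video and text all become comparable once they are embedded as vectors; the cost of the scan grows linearly with the number of rows, which is why storage layout and indexing matter at scale.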

The two pieces of infrastructure are a powerful combination: LanceDB has become the go-to choice for leading companies like Midjourney, Character AI, Airtable and Hex that are building the next wave of AI applications. Midjourney found that LanceDB was the only solution capable of meeting their high-traffic, large-scale requirements, and Character AI observed a reduction in P90 latency of over 90 percent after migrating to LanceDB.

“Midjourney generates breathtaking imagery for millions of users worldwide. Vector search is critical infrastructure that allows us to better serve our users. We evaluated multiple solutions and LanceDB was the only one that could meet the high-traffic and large scale requirements we had.” — Nadia Ali, CFO of Midjourney

Brian Zhan, Brittany Walker and Murat Bicer of CRV Working Side by Side With LanceDB’s Chang She and Lei Xu in CRV’s San Francisco Office

It’s All About the Founders

Of course, building an exceptional startup requires exceptional entrepreneurs. Chang She, Lei Xu, and their remarkable team are not new to this arena. Both Chang and Lei have extensive experience building data and AI systems that scale — Chang is one of the original co-authors of pandas and Lei was a key member of the data infrastructure team at autonomous driving startup Cruise. We are thrilled to see them apply their expertise to solving the challenges of scaling this next wave of AI applications.

Chang She and Lei Xu, Co-Founders of LanceDB

At CRV, we feel privileged to have the opportunity to invest in founders with such strong founder-market fit during the most exciting platform shift of our time. If you’re an early-stage founder with a strong perspective on today’s AI infrastructure landscape, reach out to our team.

CRV is a VC firm that invests in early-stage Seed and Series A startups. We’ve invested in over 600 startups including Airtable, DoorDash and Vercel.