6 Perspectives on How to Fix VC with Data-driven Solutions

A Clubhouse conversation explores AI in early-stage investment

Elisheva Marcus
Feb 1 · 9 min read

Maybe you’ve heard the phrase ‘Venture Capital is broken’, but what’s behind that grim diagnosis, and what can be done to cure it? Recently, six thought leaders in venture capital came together on the Clubhouse app to talk about what might help.

So what’s actually going on here?

Increasingly unstructured data and a lack of data present a massive dual problem and, simultaneously, an opportunity to fix VC. That’s precisely what brought together these six forward-thinkers in early-stage VC.

Behavioral scientist Eva-Valérie Gfrerer of MorphAIs Technologies kicked off the conversation by stressing how urgent and relevant the topic is. The discussion unpacked challenges the speakers share, hinted at solutions, and made clear the need to meet again to continue the conversation. In the room were at least 190 people interested in tech, VC, ethics, data, AI, and investment.

Across the VC industry, decision-making processes are often manual, subjective, and inefficient. This motivated the speakers to gather and do something. According to Eva’s research in behavior, there is overconfidence among investors (as high as 96% in estimating correct decision-making results) and this doesn’t end well. Decisions are also, of course, driven by humans, resulting in similarity biases that can have negative impacts on the investment. She says the time has come to talk about it.

VCs face similar challenges, but how do they react differently?

Andre Retterath from Earlybird VC identifies the availability of verified data as one of the major challenges in early-stage investing. Startups and most of their investors are private and thus have no obligation to disclose information. This is different in public markets, where hedge funds can leverage real-time financial data as well as alternative data to evaluate listed companies at any point in time. Moreover, Andre points to the diverse spectrum of data availability: “The earlier you invest, the more qualitative; the later you invest the more quantitative.” So how do early-stage investors move ahead? While ARR and unit economics are often missing, early-stage investors must ‘triangulate’, or try to figure out whether there is something interesting in this founding team, budding company, or business model that deserves a closer look. Core challenges, therefore, include finding private data, verifying that data, and combining it to make sense of qualitative information.

“The earlier you invest, the more qualitative; the later you invest the more quantitative.” — Andre Retterath, Earlybird

Francesco Corea of Balderton Capital looks at the situation from an academic perspective and shares a similar view to Andre’s. He says the biggest challenge is the lack of standardized data; obtaining data is also an issue, and so is cleaning that data up. Plus, there is the high cost of data providers to factor in. Francesco feels that analyzing the data is the less difficult part. Also, once you do have published data, even if limited: is there a way for investors to figure out KPIs and flags that indicate a company’s potential?

This brought him to ask: can we look back at other companies to see what made them successful? Let’s build some models around that. (No neural nets etc., just simpler things like random forests and support vector machines, or SVMs.)

He warns you might run into weird things though: like the fact that having a phone number on Crunchbase is apparently a likely indicator for success. So you have to add ‘financial flavor on top’ and have humans make sense of the results.
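To make Francesco’s warning concrete, here is a minimal sketch of the kind of split a tree-based model evaluates. It is not his model: the records, the feature names (`has_phone_number`, `repeat_founder`), and the outcomes are all invented for illustration, and a real random forest would combine many such splits across many trees.

```python
from collections import Counter

# Hypothetical toy records (NOT real Crunchbase data), invented for illustration.
COMPANIES = [
    {"has_phone_number": 1, "repeat_founder": 1, "success": 1},
    {"has_phone_number": 1, "repeat_founder": 0, "success": 1},
    {"has_phone_number": 1, "repeat_founder": 1, "success": 1},
    {"has_phone_number": 0, "repeat_founder": 1, "success": 0},
    {"has_phone_number": 0, "repeat_founder": 0, "success": 0},
    {"has_phone_number": 1, "repeat_founder": 0, "success": 0},
    {"has_phone_number": 0, "repeat_founder": 0, "success": 0},
    {"has_phone_number": 0, "repeat_founder": 0, "success": 1},
]


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(rows, feature, target="success"):
    """Impurity reduction from splitting the rows on one binary feature."""
    parent = [r[target] for r in rows]
    without = [r[target] for r in rows if r[feature] == 0]
    with_feature = [r[target] for r in rows if r[feature] == 1]
    n = len(rows)
    weighted = (len(without) / n) * gini(without) + (len(with_feature) / n) * gini(with_feature)
    return gini(parent) - weighted


def rank_features(rows, features):
    """Order features by how well a single split separates successes from failures."""
    return sorted(features, key=lambda f: gini_gain(rows, f), reverse=True)


ranking = rank_features(COMPANIES, ["repeat_founder", "has_phone_number"])
```

On this made-up sample the phone-number flag ranks first, which says something about the sample rather than about startups. That is the point at which a human has to add the ‘financial flavor’ and decide whether a signal makes sense.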

Sarah Guemouri from Atomico, who also invests in high-tech companies, believes you have to build a healthy core of data. Because the infrastructure has been missing so far, she notes, data has been under-leveraged, and that’s before you can even get into AI. She sees tremendous progress with this group of speakers though. Sarah suggests focusing on building a high-quality database of early-stage companies based on real data and qualitative input from one’s investment team.

She adds you should build institutional knowledge and build feedback loops. Follow the evolution of companies, noting how they move through their funding rounds. Leverage external data sets to offer the team additional insights. Tech should not replace people but rather help them prioritize. Investment teams have so many companies to look at, so help them know where to focus.

“…build institutional knowledge and build feedback loops.” — Sarah Guemouri, Atomico

Henrik Landgren from EQT Ventures points out other data issues: there is no one known truth, but instead pockets of data. The question is how to match or mash the different sources, especially when they inevitably conflict. In his work, he has realized that once data is collected and structured, it needs to be easily accessible for the investment team to actually rely on it. The holy grail is to build a platform that is easy to use and people want to use it to manage their whole day. Then he can use the feedback loops to train the screening algorithms, help people take the right actions to jump on the best deals, and fight the bias and the noise (like who is screaming loudest, whether that is a VC or the companies themselves).

The goal is to ‘get over the line,’ to provide the best recommendations to the investors, but first, you have to do the work. You have to accept that the first data might be poor until you have more data to work with. To do the work, you need to get a proper budget and do it step-by-step and it takes time. But he’s convinced that is healthy learning.
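The ‘pockets of data’ problem Henrik describes can be sketched in a few lines. This is only one possible approach, not EQT’s: the source names and their priority order are hypothetical, and the rule used here (trust the highest-priority source, but log every disagreement for a human to review) is an assumption.

```python
# Hypothetical source names, ordered from most to least trusted.
SOURCE_PRIORITY = ["internal_crm", "paid_provider", "web_scrape"]


def merge_records(records_by_source, priority=SOURCE_PRIORITY):
    """Merge one company's fields from several sources.

    Returns (merged, conflicts): merged keeps the value from the most trusted
    source that has the field; conflicts lists lower-priority values that
    disagreed, so that someone can review them later.
    """
    chosen = {}     # field -> (value, source that supplied it)
    conflicts = {}  # field -> [(source, conflicting value), ...]
    for source in priority:
        for field, value in records_by_source.get(source, {}).items():
            if field not in chosen:
                chosen[field] = (value, source)
            elif chosen[field][0] != value:
                conflicts.setdefault(field, []).append((source, value))
    merged = {field: value for field, (value, _) in chosen.items()}
    return merged, conflicts


# Made-up example: two sources disagree on headcount.
merged, conflicts = merge_records({
    "web_scrape": {"employees": 12, "city": "Berlin"},
    "internal_crm": {"employees": 15},
})
```

Keeping the conflicts instead of silently overwriting them gives the investment team a reason to trust the merged record, and a place to feed corrections back in.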

“The holy grail is to build a platform that is easy to use and people want to use it to manage their whole day.” — Henrik Landgren, EQT Ventures

How to Create Trust

Eva asks: How can you create the trust necessary to get investors to change their behavior and to accept an AI recommendation? Henrik admits it takes time to convince people because they have had success doing what they did before, so they are not anxious to change.

(In my own view, that’s exactly when you know innovation is most needed!)

Andre initially faced some skepticism when speaking to VC experts about his data-driven approach. But at Earlybird, he convinced the partners first, then the investment team that this works. He says that it is important to have the buy-in of the full team before starting to build such an infrastructure. This top-down approach was a necessary route for an investment firm with an established 24-year successful history.

For Henrik, the change at EQT was gradual, but today they all want it to succeed, so usage is growing. To build trust, you have to see the small things that make you faster. He is a product owner for this, along with his colleague, Anton. They work with real investment decisions and process improvement. It is with that dual lens that they can keep prototyping and doing things better. Their goal is for users to feel: ‘If I start to use the platform, I can get value back.’ In that way, they continuously feed the tech with data.

From Day 1, Cedric Waldburger from Tomahawk Ventures focused on a data-driven vision and a transparent, process-driven approach, especially with clear steps for founders. He was on the founder's side already. But not all investors are respectful of founder time, time that could be used to make users or customers happy. He reminisces about when Terminal was his most used application. He is quite curious about Henrik’s UX, actually!

Cedric pursues a mental model of discovery and exploration, even if it is hard to build new models to score opportunities reliably. He maintains a strict focus (FinTech and DeFi) and builds more models to score companies.

Eva is also a builder. She is building AI to focus on early-stage investment pattern intelligence. She has focused on founder intelligence, asking how the team is set up and whether there is diversity in the founding team; then they run accurate models. They have created a “morph score” of 0–100 to predict the likelihood of a startup’s success.

She’s keen on finding the startup founders who will be successful and on finding the unseen opportunities: not necessarily founders with the classical McKinsey background, but rather the disregarded founders. With tech at the core, they plan to raise a fund for early-stage investment.

Buy or Build?

Andre reminds everyone that his goal is to ‘Get a company on the radar as soon as the founder is thinking of it.’ In his analogy of an Excel sheet, this translates into many rows (the identification part). Once a company is identified, he tries to enrich these entries and collect as much information per company as possible, which translates into many columns. Only then does he start to combine deterministic and ML-based filters to score the companies. Based on this three-stage structure, Andre asked everyone whether they would buy or build the tech they need. To that, Sarah says buy the identification and enrichment data but build your own integration and scoring in-house. She warns that it is not going to be pretty at first, nor have a nice front end. Similarly, Henrik says they prefer to buy data sets but build their own platform.
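Andre’s three stages (rows, columns, then scoring) can be sketched as a tiny pipeline. This is an illustrative sketch, not Earlybird’s system: the field names (`stage`, `team_signal`, `traction_signal`), the seed-only rule, and the weights are all hypothetical, and `model_score` is a hand-written stand-in for whatever trained model a fund would actually use.

```python
def identify(sources):
    """Stage 1 (rows): collect company names and de-duplicate them."""
    rows = {}
    for names in sources:
        for name in names:
            key = name.strip().lower()
            rows.setdefault(key, {"name": name.strip()})
    return rows


def enrich(rows, datasets):
    """Stage 2 (columns): attach extra fields to companies already identified."""
    for dataset in datasets:
        for key, columns in dataset.items():
            if key in rows:
                rows[key].update(columns)
    return rows


def passes_rules(row):
    """Deterministic filter. Hypothetical rule: only seed-stage companies."""
    return row.get("stage") == "seed"


def model_score(row):
    """Stand-in for an ML model; the weights are made up, not learned."""
    return 0.6 * row.get("team_signal", 0.0) + 0.4 * row.get("traction_signal", 0.0)


def score(rows, threshold=0.5):
    """Stage 3: apply the rule first, then rank what is left by model score."""
    ranked = [(model_score(r), r["name"]) for r in rows.values() if passes_rules(r)]
    return [name for s, name in sorted(ranked, reverse=True) if s >= threshold]


# Made-up inputs: two overlapping name lists and one enrichment data set.
rows = identify([["Acme ", "Beta"], ["acme", "Gamma"]])
rows = enrich(rows, [{
    "acme": {"stage": "seed", "team_signal": 0.9, "traction_signal": 0.5},
    "beta": {"stage": "series_b", "team_signal": 0.9, "traction_signal": 0.9},
    "gamma": {"stage": "seed", "team_signal": 0.2, "traction_signal": 0.1},
}])
shortlist = score(rows)
```

The split also maps onto the buy-or-build answers: the inputs to `identify` and `enrich` are the data sets one might buy, while `passes_rules`, `model_score`, and the glue between them are the parts a fund would build in-house.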

Tackling Bias and Ethics

To wrap up the session, audience members asked how to confront bias and handle ethical issues. They wanted to know about the use of Natural Language Processing (NLP) and whether recommended startups came from outside investor networks.

Henrik says they not only add performance metrics, but also over-index on underrepresented sectors and try new non-CNN (convolutional neural network) approaches. Sarah upholds a mission and objective to fund diverse people, so they train the algorithm to look for those criteria. Eva indicates that one can de-bias the AI by eliminating gender, race, and ethnicity from the data so that the machine learns not to use them. She says there are mathematical ways to implement this, like inductive approaches to put expert knowledge into the machine. (But that would require another Clubhouse session to cover!)

She says ML is based on deductive reasoning; it is not inherently biased, but it is limited by the data fed into its learning. To make this more inclusive, we need to be the agents that will change this. By using plausible assumptions and inductive methods and not just ML, she says we can help tech make fairer decisions and long-term solutions. Her end goal is behavioral change, but she suggests using tech beforehand to accelerate that change, because its recommendations are less biased than humans would be.
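The simplest version of the idea Eva mentions, removing gender, race, and ethnicity before the machine sees the data, looks roughly like this. It is a sketch only: the field names are hypothetical and this is not a description of her company's actual pipeline.

```python
# Hypothetical field names for the attributes to withhold from the model.
PROTECTED = {"gender", "race", "ethnicity"}


def strip_protected(row, protected=PROTECTED):
    """Return a copy of one founder record without the protected attributes."""
    return {key: value for key, value in row.items() if key not in protected}


# Made-up record for illustration.
founder = {"gender": "f", "ethnicity": "x", "team_size": 3, "prior_exits": 1}
features = strip_protected(founder)
```

Dropping columns does not remove other fields that correlate with them, which is why proxies such as a founder’s alma mater still need attention.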

“To make this more inclusive, we need to be the agents that will change this.” — Eva-Valérie Gfrerer, MorphAI

Francesco posits that ‘ethics is a technical problem.’ With a lack of granular data points, investors use proxies for what they think is a good thing. An example mentioned was the tendency to invest in Stanford graduates in the US simply because of their alma mater. The bias is not because we are bad people, he says, but rather because humans use an approximation for data that computes well.

Overlapping Sourcing & Ecosystem Maturity

So if everyone uses the same data then is it all getting more competitive?

Andre says the data sources are pretty straightforward and suggests checking out his recent article indicating that VCs who use data have an edge today. Tomorrow, however, he expects that the focus will shift more towards an intelligent selection and that it will ultimately come down to VC brand, personal brand of the investor, fund size, and the personal relationship to the founder to win the best deals.

Francesco agrees and believes specialized funds will focus more on a niche. He expects that not all funds will like the same startups, either because of a different focus or because assessing the founding team is subjective. Sarah believes more funds can take a platform approach where experienced operators can add value (hiring, go-to-market strategies, sales partnerships), in addition to capital.

“The founders still have to choose you, as much as you are choosing them.” — Francesco Corea, Balderton Capital

Eva reminds us that even if VCs arrive at the same point, they are still missing the entrepreneur perspective. A VC still might not close the deal. It’s a two-way street, she says. Francesco adds, “They (the founders) still have to choose you, as much as you are choosing them.” Andre closed by simply saying that while these novel data-driven approaches make VCs more efficient and effective, it is still ‘humans investing in humans.’

Thanks to the speakers for sharing your experience and expertise with the audience. Stay tuned for future sessions, perhaps on data sources & screening.

Feel free to share or 👏🏾 if this resonated so more people see it. You are invited to keep up with Earlybird VC or me on Twitter for more events & insights.
