Data-Driven VC: how can data help source & screen startups?

Abel Samot
Published in Red River West
7 min read · Oct 6, 2022

There are more and more brilliant startups across the world, and at Red River West we are convinced that no VC operating in a wide geographical area like the EU can efficiently spot and track all of them without the right suite of tools. I mean, how could you know of a promising startup in Poland or Estonia with no intimate knowledge of these markets, little network there, and no tools to track them?

However, there is no “off-the-shelf” tool on the market that would meet such needs, so a few years ago, we decided to dedicate significant resources toward the development of a proprietary tech and data platform for internal use.

For the last 3 years, our tech team has been focused on building one of the most advanced data platforms for VCs in Europe.

In this article, we will speak about some of the key takeaways of this approach and try to give you an overview of what is possible to do and how.

I/ Data-Driven VC can mean very different things:

Data-Driven VC is the “art and science” of using data and tech to enhance VC performance.

… That’s not very clear, right? 😅

Well, that’s because funds have different strategies. They therefore prioritize specific use cases at different stages of the VC “funnel” and leverage data and tech accordingly.

Different stages of the VC funnel explained

Here are a few ways “data-driven” funds use data & tech:

  • Better connect their communities and advisors to their portfolio (i.e., the startups they have invested in).
  • Automate the financial reporting of their portfolio.
  • Perform startup Due Diligence thanks to complex machine learning algorithms.
  • Maintain a tailor-made CRM to handle all of their contacts and investment opportunities.
  • Enhance startup proactive sourcing with data science.
  • Screen startups automatically.
  • And much more…

Between two different VC funds, the needs and amount of information available can vary greatly depending on their strategy, investment stage, etc.

When building sourcing tools, for example, early-stage funds don’t have much data about the startups’ metrics yet and thus have to focus on the founders’ analysis. On the other hand, late-stage funds might have enough publicly available data to know a lot about an opportunity without even meeting the team.

Beyond the investment stage, each fund also has its own strategy in terms of geography, sector, and business model. That’s why this subject is so complex and why there is no “off-the-shelf” tool that fits every fund’s needs.

II/ How to use data for VC sourcing & screening?

Sourcing and screening are some of the most time-consuming and repetitive tasks for a VC. These tasks (often done by junior VCs) are crucial as they are at the beginning of the funnel and allow VCs to identify the startups they are going to invest in.

With a growing number of startups but also growing competition amongst VCs, it’s one of the most obvious parts of VC work that can be enhanced by data & tech.

Indeed, we believe that by leveraging data to spot the most promising startups and perform the first analysis, a VC fund could help its analysts be 10x more efficient at these tasks, allowing them to focus on the most promising opportunities and on supporting their portfolio.

In a world where competition for the best deals is fierce, algorithmic sourcing also allows a VC fund to be ahead of the pack by identifying exciting opportunities before they launch a formal fundraising process.

For funds operating after the seed round

Relying on our experience building these types of tools at Red River West, we compiled a 4-step approach that could be used by any fund operating after the Seed round to build a data strategy for sourcing and/or screening.

Whether you have a €50k or €1M budget per year, your approach could be much the same, even if the result won’t be!

How to partially automate sourcing & screening for Series A+ VCs?

  1. Get a list of startups within the scope of your VC fund from one or several data sources like Crunchbase or PitchBook (most of them have reliable APIs). You could then refresh this data on a regular basis and save the list in a database.
  2. Use multiple other data sources (social networks, news sites, GitHub, etc.) to enrich the data you have on each startup in your database.
  3. Develop smart algorithms (relying on Machine Learning or simpler rule-based ones) to identify the most promising startups thanks to the data you have collected.
  4. Integrate that with your favorite CRM in order to follow the opportunities. It could be a CRM custom-made for VCs like Edda (formerly Kushim) or Affinity, or a classic CRM like Salesforce or Airtable.
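
To make these 4 steps a bit more concrete, here is a minimal Python sketch of how they could fit together. It is only an illustration, not what we run at Red River West: the startup records, the field names (`headcount_growth_12m`, `github_stars`), and the scoring weights are all made up, and a real pipeline would pull step 1 and step 2 from actual data providers instead of hard-coded lists.

```python
# Step 1: startups in scope, as they might come back from a data provider.
# All records and field names below are hypothetical.
startups = [
    {"name": "Alpha", "country": "PL", "last_round": "Series A"},
    {"name": "Beta", "country": "EE", "last_round": "Seed"},
    {"name": "Gamma", "country": "US", "last_round": "Series A"},
]

# Step 2: enrichment signals collected from other sources, keyed by name.
signals = {
    "Alpha": {"headcount_growth_12m": 0.8, "github_stars": 400},
    "Beta": {"headcount_growth_12m": 0.2, "github_stars": 2500},
}


def enrich(startup, signals):
    """Return a copy of the startup merged with its signals (0 if missing)."""
    defaults = {"headcount_growth_12m": 0.0, "github_stars": 0}
    return {**startup, **defaults, **signals.get(startup["name"], {})}


def score(startup):
    """Step 3: simple rule-based score in [0, 100]; weights are assumptions."""
    growth = min(max(startup["headcount_growth_12m"], 0.0), 1.0) * 70
    traction = min(startup["github_stars"] / 1000, 1.0) * 30
    return round(growth + traction, 1)


def build_crm_rows(startups, signals, countries):
    """Step 4: keep in-scope startups and produce rows ready for a CRM import."""
    rows = []
    for startup in startups:
        if startup["country"] not in countries:
            continue  # outside the fund's geographical scope
        enriched = enrich(startup, signals)
        rows.append({
            "name": enriched["name"],
            "country": enriched["country"],
            "score": score(enriched),
        })
    # Highest score first, so analysts see the most promising startups on top.
    return sorted(rows, key=lambda row: row["score"], reverse=True)


rows = build_crm_rows(startups, signals, countries={"PL", "EE"})
for row in rows:
    print(row)
```

A rule-based score like this one is easy to explain to the investment team and easy to adjust, which makes it a reasonable starting point before moving to Machine Learning.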

Putting these 4 steps in place is not easy and you can always go deeper, but once done, you should already have a viable tool that could greatly help your teams.

The quality of the platform will depend on the variety, reliability, and quantity of the data points gathered for each startup but also on the way these data are leveraged to extract insights. The latter requires a deep understanding of an investor’s mindset and a company’s operating model. The more data points and startups you track, the more you need to focus on your data pipeline architecture and on the more complex services required to handle this data flow.

The next step once you have collected these data and created these scores could be to create your own platform in order not only to find the best opportunities but also to analyze them much quicker and much better.

Having a platform with all the articles, social networks, funding information, founders’ profiles, etc. gathered on a single page could make your analysts 5x more productive and allow them to get a grasp of a company at first sight.

Of course, none of that replaces human analysis. The purpose is really to augment it.

Beware: this approach might not work for Pre-Seed & Seed funds as data sources like Crunchbase & PitchBook mostly index startups that have already raised funds.

How could I do it if my fund invests only in Seed or Pre-Seed rounds?

Focus on people

In seed or pre-seed, the stakes are not the same as in Series A+, and the objectives of your data stack shouldn’t be either.

The primary goal of Data-Driven VCs operating at these stages is no longer to rank the most promising startups based on multiple data points. It is to spot future successful startups even before they exist.

But how do you do that? Well, the only thing you can do is focus on founders.

The very basic approach is to use LinkedIn Sales Navigator to create alerts and triggers when an interesting profile becomes a co-founder or puts “working on something new” in their bio.

It’s very simple: you just have to compile a list of schools, companies & experiences that you consider “good” and configure LinkedIn Sales Navigator to send you an alert when someone with a background in your list creates a company.
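
If you wanted to reproduce this logic on your own data rather than inside LinkedIn Sales Navigator, it could look like the small Python sketch below. Everything in it is an assumption made for illustration: the profile dictionaries, the placeholder school and company names, and the trigger phrases. It does not call any LinkedIn API.

```python
# Hypothetical watchlists: replace with the schools and companies you consider "good".
GOOD_SCHOOLS = {"School A"}
GOOD_COMPANIES = {"Company X"}
TRIGGER_PHRASES = ("co-founder", "cofounder", "working on something new")


def matches_background(profile):
    """True if the profile shares at least one school or company with the watchlists."""
    return bool(
        set(profile["schools"]) & GOOD_SCHOOLS
        or set(profile["companies"]) & GOOD_COMPANIES
    )


def has_trigger(profile):
    """True if the headline contains one of the trigger phrases (case-insensitive)."""
    headline = profile["headline"].lower()
    return any(phrase in headline for phrase in TRIGGER_PHRASES)


def find_alerts(profiles):
    """Names of people with a watched background whose headline shows a trigger."""
    return [
        p["name"] for p in profiles if matches_background(p) and has_trigger(p)
    ]


# Made-up profiles, in the shape a scraper or data provider might return.
profiles = [
    {"name": "A. Founder", "schools": ["School A"], "companies": ["Company Y"],
     "headline": "Co-founder at Stealth"},
    {"name": "B. Engineer", "schools": ["School A"], "companies": ["Company X"],
     "headline": "Senior Engineer at Company X"},
    {"name": "C. Builder", "schools": ["School Z"], "companies": ["Company X"],
     "headline": "Working on something new"},
    {"name": "D. Other", "schools": ["School Z"], "companies": ["Company Q"],
     "headline": "Co-founder at Foo"},
]

alerts = find_alerts(profiles)
print(alerts)
```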

The problem is that almost all of the best seed funds already do it. So if you don’t go further, you will arrive at exactly the same moment as they do, and for the most exciting opportunities, it might be too late.

So what could you do?

  1. Well, first you could link this database of people to the Official Bulletin of Civil and Commercial Announcements of each country you are operating in and merge this data into a CRM or database. It could allow you to know exactly when one of those talents creates a company and, as founders often create their company before announcing it on LinkedIn, you should arrive a little bit before other VCs and business angels.
  2. You could try to create alerts not just when a talent launches a business but at the moment they leave their former job. For example, if someone leaves Google and hasn’t announced a new position yet, there is a good chance this person is in the process of creating a startup. If you spot it at this exact moment, you will probably be one of the first to know!
  3. You could use and group data from multiple incubators & product platforms like www.producthunt.com and create alerts depending on it. You could even link that to your talent database!
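
The first two ideas above can be sketched in a few lines of Python. This is only a rough illustration under simplifying assumptions: the company filings and the employment snapshots are hypothetical in-memory structures (a real version would read them from each country's registry and from your own talent database), and people are matched on their normalized name only.

```python
import unicodedata


def normalize(name):
    """Lowercase, strip accents and extra spaces so names compare reliably."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().split())


def registry_matches(talents, filings):
    """Idea 1: (talent, company) pairs where a new filing's director is a tracked talent."""
    tracked = {normalize(t): t for t in talents}
    matches = []
    for filing in filings:
        key = normalize(filing["director"])
        if key in tracked:
            matches.append((tracked[key], filing["company"]))
    return matches


def departures(previous, current, watched_companies):
    """Idea 2: people who were at a watched company and now list no employer.

    `previous` and `current` map a person to their employer (None = no employer).
    People missing from `current` are assumed unchanged.
    """
    left = []
    for person, old_employer in previous.items():
        new_employer = current.get(person, old_employer)
        if old_employer in watched_companies and new_employer is None:
            left.append(person)
    return sorted(left)


# Made-up data for illustration.
talents = ["Zoé Martin", "Jan Kowalski"]
filings = [
    {"director": "ZOE  MARTIN", "company": "NewCo SAS"},
    {"director": "Someone Else", "company": "Other Ltd"},
]
previous = {"Jan Kowalski": "Google", "Ana Silva": "Acme"}
current = {"Jan Kowalski": None, "Ana Silva": None}

new_companies = registry_matches(talents, filings)
recent_leavers = departures(previous, current, watched_companies={"Google"})
print(new_companies)
print(recent_leavers)
```

Keep in mind that matching on names alone will produce false positives (homonyms), so in practice you would want extra checks such as city or date of birth when the registry provides them.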

With these approaches (even though they are very complicated to implement), you should be able to cover nearly all of the opportunities before most other funds do and thus significantly improve your deal flow.

I hope this article was useful! Now you should know a little bit more about data-driven VC and how to apply it to source and screen companies :)

In the next articles, we will cover the biggest challenges of building a data platform in a VC firm, and then expand on what we have built at Red River West!

Stay tuned 🤩
