How to Start the Journey to Becoming a Data-Driven VC

Lukasz Karwacki
Sunscrapers

--

Venture capital (VC) has traditionally been considered a people-centric business, where experienced finance professionals rely on their skills and networks to identify and select the best investment opportunities. The process was — and still is for many firms — manual, human-based, and not easily scalable.

Recently, a new trend has emerged: the data-driven VC. Forward-thinking VC firms are seeking competitive advantage through the use of advanced technologies such as data engineering and AI. Some investors even believe that the future will belong to “quant VCs”, fully automated firms driven by AI agents.

Let’s take a closer look at what becoming a data-driven VC entails and what the journey looks like.

What Does It Actually Mean to Be Data-Driven?

Being data-driven means relying on data — not just intuition — when making business decisions. Recent advancements in cloud storage and data science have dramatically increased the potential for machines and software to assist at various stages of the deal funnel, taking on significant workloads. Today, we can gather and analyze more data than ever before, which greatly enhances the potential for receiving timely, data-backed insights.

To be more specific, here are examples of how software can help VCs at different stages of the investment cycle:

Find

  • Source people and company data from multiple external and internal sources to identify the best investment opportunities as early as possible.
    Example: EQT Ventures’ Motherbrain
  • Match startups with investors, and vice versa.
    Example: NFX’s Signal
  • Monitor the market to track recent deals and transactions, allowing you to benchmark these against the data in your CRM.

Decide

  • Perform (semi-)automated due diligence to assess company performance, customer feedback, or market sentiment.
  • Benchmark a company’s performance against competitors or broader market segments.
    Example: Scale Venture Partners’ Scale Studio
  • Manage customer relationships (CRM) and streamline internal communication to aid in the decision-making process.

Win

  • Indirectly position the fund more favorably in the eyes of founders by:
    - Demonstrating a strong understanding of technology.
    - Highlighting the benefits of a data-driven platform.
    - Showcasing the fund’s success metrics.

Help

  • Provide platforms for portfolio companies to share knowledge and resources.
  • Offer programmatic support for recruitment or vendor management.
    Example: SignalFire’s BeaconAI

Exit

  • Manage and maintain communication with a database of potential buyers to facilitate exits.

The decision on which technology to invest in during the data-driven transformation will depend on a fund’s size, specialization, focus areas, and overall strategy.

What Type of Funds Can Benefit the Most from Becoming Data-Driven?

While nearly all funds can benefit from integrating more technology into their operations, a data-driven approach is especially promising for early-stage funds. At this stage, venture capital is largely a numbers game, with hundreds of data sources and millions of companies and founders available for analysis. Managing such a large volume of data requires a more sophisticated technological approach compared to, for instance, analyzing the performance of portfolio companies, which typically involves tens or hundreds of entities — not thousands or millions. (That said, financial performance analysis itself can also evolve in many interesting ways with the right technology.)

Ultimately, the impact of technology depends on a fund’s specific focus and how creatively it leverages data to gain a competitive edge.

Make vs. Buy: The Decision Process

Should you buy off-the-shelf software, or should you build a custom solution instead? The answer depends on your goals and available resources.

Buy

With around 500 SaaS tools on the market, there’s an abundance of options that can address most, if not all, of the needs VCs typically have — and often at a very reasonable price. This makes it a low-risk way to get started on your journey, allowing you to test several tools and further develop your data-driven vision. Many SaaS providers also offer educational content that can accelerate your learning curve, particularly around operating frameworks, best practices, and industry standards. Affinity and Harmonic are good examples of companies that provide this type of support.

The downside of using off-the-shelf products is that they offer the same functionality (and often the same signals or leads) to both you and your competitors. As a result, relying solely on these tools won’t give you a true competitive edge. You may also find that some functionalities are too shallow or inadequate for your specific needs. Additionally, third-party databases may have varying levels of data quality. When selecting a vendor, you’ll also need to consider factors like data ownership, software security, and compliance, as VCs deal with sensitive financial information.

Make

On the other hand, building a custom, proprietary solution offers a fully tailored product that aligns perfectly with your specific needs and goals. A well-designed custom tool can provide a significant competitive edge, as no one else will have access to the exact same solution. You have full control over the functionalities, integrations, and data sources, allowing you to create a product that is uniquely advantageous for your fund. Additionally, you can ensure that security protocols and compliance measures are aligned with the highly sensitive nature of venture capital data. As your fund evolves, the software can also be adapted or expanded to meet new demands, offering long-term flexibility.

However, developing custom software comes with certain challenges. First, it can be expensive and time-consuming to build. In addition, ongoing maintenance and support are critical. Unlike SaaS, where updates and improvements are managed by the vendor, you’ll be responsible for keeping your custom software up to date, bug-free, and secure. Lastly, you may end up reinventing the wheel by building solutions that already exist in more refined forms through third-party providers, which may not be the best use of resources.

With the benefits and drawbacks of both the make and buy approaches, the question becomes: which should you lean toward?

The Real Journey: Make and Buy

A balanced approach — combining both buy and make — often makes the most sense as you progress on your data-driven journey. Below is an outline of the typical steps funds take along the way. Keep in mind, this is a generalization — funds may implement only a selection of these steps, and their order may vary.

Step 1: The Basic Setup

Most funds start with a CRM system, such as a generic platform like Salesforce or HubSpot, or a VC-specific tool like Affinity. The next step is to establish an online knowledge base, using platforms like Google Drive or Notion. Finally, a communication tool such as Slack rounds out the basic setup.

Step 2: Internal Datasets and Simple Workflows

The journey becomes more exciting when you start feeding your basic toolset with data.

Most funds will have access to several internal datasets, such as:

  • People and companies within their area of interest,
  • Email outreach databases and statistics,
  • Lists of events (tech conferences, meetups), possibly including exhibitors and attendees,
  • Notes from investor meetings with startups,
  • Notes from internal meetings.

This provides plenty of data to structure and migrate into your CRM and knowledge base. At this stage, you may also consider setting up workflows and integrations that automate data movement within and between tools based on specific triggers or events.
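To make the idea of trigger-based workflows concrete, here is a minimal sketch in Python. It assumes a hypothetical event name (`meeting_note_created`) and uses in-memory dictionaries and lists as stand-ins for a real CRM and knowledge base; none of these names come from a specific product. In practice, the same pattern is usually configured in an automation tool or wired to the vendor's webhooks.

```python
from collections import defaultdict
from typing import Callable

# Hypothetical in-memory stand-ins for a CRM and a knowledge base.
crm: dict[str, dict] = {}
knowledge_base: list[dict] = []

# Registry mapping an event name to the handlers that react to it.
handlers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)


def on(event: str):
    """Register the decorated function as a handler for `event`."""
    def register(func: Callable[[dict], None]):
        handlers[event].append(func)
        return func
    return register


def emit(event: str, payload: dict) -> None:
    """Run every handler registered for `event`, in registration order."""
    for handler in handlers[event]:
        handler(payload)


@on("meeting_note_created")
def update_crm(payload: dict) -> None:
    # Create the company record if it is missing, then log the contact.
    record = crm.setdefault(
        payload["domain"],
        {"name": payload["company"], "last_contact": None, "notes": 0},
    )
    record["last_contact"] = payload["date"]
    record["notes"] += 1


@on("meeting_note_created")
def archive_note(payload: dict) -> None:
    # Keep the full note text in the knowledge base, keyed by company domain.
    knowledge_base.append({"domain": payload["domain"], "text": payload["text"]})


# One trigger updates both tools without manual copy-and-paste.
emit("meeting_note_created", {
    "company": "Acme",
    "domain": "acme.example",
    "date": "2024-05-02",
    "text": "Intro call with the founders.",
})
```

The point of the sketch is the shape, not the tooling: a single event fans out to several small, independent handlers, so adding a new automation means registering one more function rather than changing existing ones.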

Step 3: External Datasets and In-House Infrastructure

The next step involves enriching your internal data with external sources, such as:

  • People and company data from LinkedIn,
  • Funding data from Crunchbase,
  • Product reviews from G2,
  • Founders’ and employees’ activity on social media platforms.

To achieve this, you’ll need to invest in a variety of tools and solutions, including:

  • Licenses from external database providers,
  • API integrations via tools like Zapier or proprietary software,
  • Data scraping tools like Scrapy or Zyte, or custom scripts.

Integrating data from multiple internal and external sources will also require you to set up a complete data infrastructure, such as:

  • A data lake and/or data warehouse to store both structured and unstructured data,
  • Data pipelines to extract, transform, and load (ETL or ELT) data into your data lake or warehouse,
  • Scripts for data clean-up, categorization, and enrichment (e.g., merging duplicates, correcting errors).
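As an illustration of the clean-up scripts mentioned in the last item, the sketch below merges duplicate company records that arrive from different sources. It assumes, for the sake of the example, that the website domain is a good enough matching key and that records carry `name`, `website`, and `source` fields; the field names, the suffix list, and the sample records are all made up. Real entity resolution usually needs fuzzier matching than this.

```python
import re
from urllib.parse import urlparse

# Assumed list of legal suffixes to strip; extend it for your own data.
LEGAL_SUFFIXES = re.compile(r"\b(inc|llc|ltd|gmbh|corp)\.?$", re.IGNORECASE)


def normalize_domain(url: str) -> str:
    """Reduce a URL or bare domain to a lowercase host without 'www.'."""
    url = url.strip().lower()
    if "://" not in url:
        url = "//" + url  # lets urlparse treat a bare domain as a host
    return urlparse(url).netloc.removeprefix("www.")


def normalize_name(name: str) -> str:
    """Strip a trailing legal suffix, stray commas, and extra whitespace."""
    name = LEGAL_SUFFIXES.sub("", name.strip()).strip().rstrip(",")
    return " ".join(name.split())


def merge_duplicates(records: list[dict]) -> list[dict]:
    """Merge records sharing a domain; the first non-empty value wins."""
    merged: dict[str, dict] = {}
    for rec in records:
        key = normalize_domain(rec["website"])
        target = merged.setdefault(key, {"domain": key, "name": "", "sources": []})
        if not target["name"]:
            target["name"] = normalize_name(rec["name"])
        for field, value in rec.items():
            if field in ("name", "website", "source"):
                continue
            if value not in (None, "") and field not in target:
                target[field] = value
        target["sources"].append(rec["source"])
    return list(merged.values())


# Made-up sample records from two hypothetical internal sources.
records = [
    {"name": "Acme, Inc.", "website": "https://www.acme.example/about", "source": "crm", "hq": "Berlin"},
    {"name": "ACME", "website": "acme.example", "source": "events", "employees": 12},
    {"name": "Globex GmbH", "website": "http://globex.example", "source": "crm"},
]
companies = merge_duplicates(records)
```

Here the two Acme rows collapse into one record that keeps the fields from both sources and remembers where each came from, which is useful later when data quality differs between providers.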

The more data sources you have, and the larger the volume of data, the more complex your setup will become. It’s essential to carefully plan the architecture to ensure it meets both your needs and resources, resulting in a system that adds real value while remaining efficient and maintainable.
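At the simplest end of that spectrum, an extract-transform-load pipeline can be sketched in a few functions. The example below is only an assumption-laden illustration: it uses an inline CSV string in place of an external export, an in-memory SQLite database in place of a real warehouse, and invented column names and values. Production pipelines typically add scheduling, logging, and error handling on top of the same three steps.

```python
import csv
import io
import sqlite3

# Invented sample export standing in for an external data source.
RAW_CSV = """name,website,last_round_usd
Acme,https://acme.example,1500000
Globex,https://Globex.example,
Initech,https://initech.example,not disclosed
"""


def extract(raw: str) -> list[dict]:
    """Parse the raw CSV text into one dict per row."""
    return list(csv.DictReader(io.StringIO(raw)))


def transform(rows: list[dict]) -> list[tuple]:
    """Trim text fields, lowercase URLs, and turn bad amounts into None."""
    cleaned = []
    for row in rows:
        amount = row["last_round_usd"].strip()
        cleaned.append((
            row["name"].strip(),
            row["website"].strip().lower(),
            int(amount) if amount.isdigit() else None,
        ))
    return cleaned


def load(conn: sqlite3.Connection, rows: list[tuple]) -> None:
    """Upsert rows into the companies table, keyed by website."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS companies "
        "(name TEXT, website TEXT PRIMARY KEY, last_round_usd INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO companies VALUES (?, ?, ?)", rows)
    conn.commit()


# SQLite in memory is a stand-in for a real data warehouse.
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
```

Keying the table on the website and using `INSERT OR REPLACE` makes the load step safe to re-run, which matters once the pipeline is scheduled to run repeatedly against the same source.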

Step 4: Research and Data Science

Once your data infrastructure is in place (and only then), you can employ data analysts and data scientists to uncover non-obvious insights and trends. For example, you may analyze what successful companies and founders had in common in an attempt to identify future winners.

This type of research is non-linear, and there’s no guarantee of meaningful conclusions. However, when you do find valuable insights, it can provide a significant edge.
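A first pass at the "what did successful companies have in common" question is often a plain comparison of success rates across groups. The sketch below shows that calculation on a handful of invented records; the attribute names (`repeat_founder`, `technical_cofounder`) and the `succeeded` label are hypothetical, and the data exists only to make the code runnable.

```python
from collections import defaultdict

# Invented toy records; in practice these would be queried from the warehouse.
companies = [
    {"repeat_founder": True, "technical_cofounder": True, "succeeded": True},
    {"repeat_founder": True, "technical_cofounder": False, "succeeded": True},
    {"repeat_founder": False, "technical_cofounder": True, "succeeded": False},
    {"repeat_founder": False, "technical_cofounder": False, "succeeded": False},
    {"repeat_founder": True, "technical_cofounder": True, "succeeded": False},
    {"repeat_founder": False, "technical_cofounder": True, "succeeded": True},
]


def success_rate_by(attribute: str, rows: list[dict], outcome: str = "succeeded") -> dict:
    """Return the share of successful companies for each attribute value."""
    counts = defaultdict(lambda: [0, 0])  # value -> [successes, total]
    for row in rows:
        bucket = counts[row[attribute]]
        bucket[0] += int(row[outcome])
        bucket[1] += 1
    return {value: wins / total for value, (wins, total) in counts.items()}


for attribute in ("repeat_founder", "technical_cofounder"):
    print(attribute, success_rate_by(attribute, companies))
```

On six made-up rows the numbers mean nothing, and even on real data a difference in rates is a correlation, not proof of cause. That is part of why this stage is exploratory: a promising split like this is a hypothesis to test on more data, not a conclusion.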

Step 5: Apps and Dashboards

For data to be actionable, users need intuitive ways to access, manage, and visualize it. This leads to the creation of dashboards using business intelligence tools like Power BI or Tableau, or even custom web applications that serve as a single source of truth and the central interface for interacting with data.

Step 6: Planning Ahead

The largest and most innovative funds go beyond building their own data warehouses, machine learning models, and custom applications — they also invest in what I’d call “observing the tech space.” For example, I’ve seen a leading hedge fund partner with database developers to gain early access to their roadmaps and collaborate on implementation ideas. This level of investment is often reserved for the top-tier funds, but it demonstrates how far some are willing to go to stay ahead of the curve, finding and integrating new technologies before their competitors.

As we can see, software development can no longer be treated as a one-off project. It’s an ongoing investment in building, maintaining, and improving the tools we use. Typically, the most efficient and advantageous strategy for VC firms is a balanced, phased approach that combines off-the-shelf tools with proprietary software.

The Leadership Team

Do you need to set up an in-house tech team, or can you outsource development? What roles should you consider? How big should the team be?

A lot of these answers will depend on your attitude toward technology. If you consider technology to be a core or strategic pillar of your company (which is essential if you want to call yourself data-driven), you’ll need at least an in-house Product Owner.

A Product Owner doesn’t necessarily need to be highly technical; the role could even be a part-time responsibility of an existing team member, such as an Operating Partner, a Platform Lead, or a Product Manager. Regardless of the title, the person should possess the following qualities and attitudes:

  • Full ownership and responsibility for tech initiatives,
  • The ability to navigate, negotiate, and align various stakeholders: partners, investors, and the tech team,
  • The ability to set goals that align with the fund’s strategic objectives, prioritize initiatives, and make strategic decisions,
  • A hands-on approach to analysis, problem-solving, or testing when required.

The second key role, which might not be necessary at the outset but becomes important in the long term, is a VP of Engineering or Data Science. Close to the leadership team, the VP of Engineering can:

  • Participate actively in goal setting at the company level,
  • Educate non-technical executives on the options and consequences of technical decisions,
  • Lead the tech team while engaging in creative thinking and planning for future initiatives.

The Tech Team

Software development, much like venture capital, is still a people-driven business — at least for now. (Whether AI will eventually replace software developers and/or venture capitalists remains to be seen. As of today, AI primarily extends human capabilities rather than replacing them.)

Success, therefore, largely depends on the individual engineers involved, with less emphasis on the form of their involvement — whether they’re full-time or part-time, local or remote, in-house or contracted. Among VC firms, I’ve seen both large teams of 20–30 in-house engineers and data scientists, and smaller teams of 10–15 external, contracted engineers led by a VC’s operating partner. A hybrid approach, combining in-house and external engineers, is also quite common.

How you structure your tech team typically depends on business factors such as company culture, operating principles, cost, flexibility, ease of recruitment, and project specifics, rather than an assumption that in-house or local talent is inherently more skilled or effective.

For your first tech hire, it’s wise to consider an experienced, all-around engineer who can handle a variety of tasks. This preference for senior generalists can be observed across the entire tech industry, not just within VC firms. High salaries for engineers and the current economic climate have led many companies to optimize their spending by seeking versatile setups. As a result, junior developers are finding it difficult to land their first jobs: although their salaries are lower (often one-fourth those of senior engineers), their output may be significantly smaller in comparison.

It’s also important to note that experience with a particular technology (such as Python as a programming language, AWS as a cloud provider, or Databricks as a data warehouse) is more valuable than experience with a particular tool (such as Affinity as a CRM, People Data Labs as a data source, or Notion as a knowledge base). Technology expertise encompasses much broader know-how — such as programming philosophies, best practices, and familiarity with a range of frameworks, libraries, and services. Mastering a programming language or cloud platform can take months or even years of hands-on experience. In contrast, learning a specific tool typically boils down to understanding its API, which shouldn’t take a seasoned developer more than a week or two to master.

At present, the VC industry doesn’t face any highly specific technical challenges unique to the sector. Unlike trading, where quant developers have become a distinct class of programmers, the technical use cases in VC are well within the capabilities of technologies commonly used in other industries.

Conclusion

Becoming a data-driven VC is no longer a distant aspiration — it’s increasingly a necessity for funds that want to stay competitive and make smarter, more informed decisions. The journey to harnessing the power of data and technology is not a one-size-fits-all approach, but rather a phased evolution that depends on each firm’s specific goals, resources, and growth stage.

From establishing a basic tech stack to building custom infrastructures that integrate internal and external data, the process can be as complex or as streamlined as your business demands. The choice between buying off-the-shelf tools or building proprietary solutions isn’t binary; a combination of both, adapted to the unique needs of your fund, is often the best way forward.

Along the way, having the right leadership and tech talent — whether in-house or outsourced — will be key. As the landscape continues to evolve, investing in a versatile, experienced team and the right technology will not only improve efficiency but also provide a valuable edge in a competitive market.

Ultimately, software development and data-driven strategies aren’t just one-off projects — they are ongoing investments that will continue to shape the future of venture capital. By embracing both technological innovation and human expertise, forward-thinking funds can set themselves up for long-term success.

--

Lukasz Karwacki
Sunscrapers

Co-Founder & CEO at Sunscrapers | Helping companies win with software, data, and AI