Download the full article: https://ieeexplore.ieee.org/document/8821312
I. Setting the stage
If you have read some of my previous posts, you may know I am very bullish on data-driven funds. The rationale for my optimism is that I fundamentally believe that machine learning can bridge the asymmetric information gap between founders and investors, making both of their lives better and easier.
As part of my effort to advance the field a bit, I have been researching how AI has already been used in VC. I also wanted to play with data myself, and the following results are a first attempt to do so.
II. A bit of background
After the last financial crisis, interest rates fell sharply and venture capital suddenly became an attractive way to achieve high returns. However, in only a decade the market has moved so fast, become so mature and saturated, and seen so many empires created, that it is now difficult to obtain sustainable returns by investing in risky early-stage companies. In fact, capital is abundant nowadays and funds have been raised everywhere, and there is no scarcity of companies of every shape and size either.
For these reasons, investing has become incredibly competitive and it has never been harder to spot the needle in the haystack that would make you rich. Unfortunately, the toolbox investors currently have available is not robust enough to reduce their risk and help them manage uncertainty in a better way.
This is where machine learning can help.
III. Using machine learning in VC
Machine learning can indeed support VC investors in multiple ways:
i) Helping investors spot market gaps and general trends;
ii) Performing better portfolio management;
iii) Matching co-investors and deals;
iv) Obtaining intelligence on the competitive landscape;
v) Identifying potential acquirers;
vi) Creating more accurate pricing/valuation models.
In other words, it has the potential to make venture investors better and more informed, even in the post-investment phase where they need to help companies to grow.
There is still one use case I have not listed, which is the one we focused on: using data to find relatively unknown startups and understand their success potential in advance. Our ambitious aim, therefore, was to get a better idea of a company's likelihood of success without using balance sheet or other quantitative financial data (e.g., revenues).
Is this company going to be a blockbuster or a failure? This is the question.
IV. Previous signals and new success factors
I have already (partially) reviewed previous studies in which data have been shown to help identify signals that are relevant for assessing the success potential of a startup. Even though the list is quite comprehensive, each study usually looks at one single factor and a couple of different success scenarios (namely, acquisition and IPO).
In our work, we tried to have a more holistic view and use over 120,000 companies to spot signals not only for acquisitions and IPOs but also to compute the probability of raising a subsequent round of funding or shutting the startup down.
In the same fashion as backtesting, we took a time-aware approach: we analyzed companies that were no more than four years old by 2015 and tried to predict their success over the following three years. We also used more than a hundred variables as possible explanatory indicators of success, as well as five different models: Support Vector Machines (SVM); Decision Trees (DT); Random Forests (RF); Extremely Randomized Trees (ERT); and Gradient Tree Boosting (GTB).
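To make the time-aware idea concrete, here is a minimal sketch of how such a cohort could be selected and labeled. It is not our actual pipeline: the exact snapshot date (January 1, 2015), the `Company` record, the event names, and the rule of taking the first outcome inside the three-year window are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date

# Assumed dates: the article only says "by 2015" and "the following three years".
EARLIEST_FOUNDED = date(2011, 1, 1)  # no more than four years old at the snapshot
SNAPSHOT = date(2015, 1, 1)          # features may only use data up to this date
HORIZON_END = date(2018, 1, 1)       # outcomes are observed until this date


@dataclass
class Company:
    """Hypothetical record: a name, a founding date and dated events."""
    name: str
    founded: date
    events: list = field(default_factory=list)  # (date, kind) pairs


def in_cohort(company):
    """Keep only companies that already existed and were young at the snapshot."""
    return EARLIEST_FOUNDED <= company.founded <= SNAPSHOT


def label(company):
    """Return the first outcome inside the prediction window.

    Events on or before the snapshot are ignored here; they are history
    that a model may use as input, never as the target.
    """
    outcomes = [kind for when, kind in sorted(company.events)
                if SNAPSHOT < when <= HORIZON_END]
    return outcomes[0] if outcomes else "no_event"


companies = [
    Company("alpha", date(2012, 6, 1),
            [(date(2014, 3, 1), "funding_round"), (date(2016, 9, 1), "acquired")]),
    Company("beta", date(2009, 1, 1), [(date(2016, 1, 1), "ipo")]),      # too old
    Company("gamma", date(2013, 2, 1)),                                  # nothing happened
    Company("delta", date(2014, 5, 1), [(date(2019, 1, 1), "closed")]),  # after the window
]

dataset = {c.name: label(c) for c in companies if in_cohort(c)}
print(dataset)
```

The point of the split is that nothing dated after the snapshot can leak into the inputs, which is what makes the exercise comparable to a backtest.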
Among those algorithms, Random Forests and GTB seem to have the highest performance (up to 82%), especially for some classes, but it is always hard to select a single model that is superior in every instance. However, the good thing about those classes of algorithms is that we can deconstruct what happened behind the scenes and rank the features according to their importance (an example is shown below).
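For readers curious about what "importance" means here, tree-based models typically score a feature by how much splitting on it reduces class impurity. The sketch below is a deliberately simplified stand-in, not the ensemble computation itself: it ranks features by the Gini impurity decrease of the single best threshold split on each one. The feature names and the six toy rows are invented purely for illustration and are not taken from our dataset.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest impurity decrease achievable with one threshold on this feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    # Try every observed value except the largest as a "<= threshold" cut.
    for threshold in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - weighted)
    return best


def rank_features(rows, labels, names):
    """Sort feature names by their best single-split impurity decrease."""
    gains = {name: best_split_gain([row[i] for row in rows], labels)
             for i, name in enumerate(names)}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Toy data (invented): columns are funding_rounds, team_size, founded_month.
names = ["funding_rounds", "team_size", "founded_month"]
rows = [(3, 12, 1), (2, 5, 7), (4, 8, 3), (0, 10, 2), (1, 6, 9), (0, 9, 5)]
labels = ["funded", "funded", "funded", "closed", "closed", "closed"]

ranking = rank_features(rows, labels, names)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

In this toy set the number of funding rounds separates the two classes perfectly, so it comes out on top. A real ensemble averages such decreases over many splits and many trees, which is why its rankings are more stable than a single split would suggest.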
V. Concluding thoughts
If you work long enough in this space, you quickly realize that “automating venture capital” is harder than you think. Our ambition was not to do it, and having now worked with the data makes me think that fully outsourcing the work of a VC to a machine is a faraway, if not impossible, task. This, of course, does not imply that we cannot build useful tools and supports to improve investors’ work, and this is what a tech stack should be used for.
Furthermore, it is worth noting that better due diligence and decision processes can favor investors but do not solve all their problems. It is important to remember that this type of tool does not increase the investor’s ability to actually close the deal; it only augments the capacity to process information and assess companies in the absence of more traditional financial data. Whether the entrepreneur takes an investor’s money does not depend, in fact, on the ability of the VC to do her due diligence, but is rather driven by establishing a personal relationship and providing some value beyond the mere monetary contribution.
In other words, VC may be broken, but it is still standing on its feet.
The article first appeared on Forbes.