Beyond the Spreadsheet: Using Modern Data Science as Growth Investors
“Do we share the management’s vision?” This is one of the fundamental questions that due diligence tries to answer, and it drives every investment team’s decision. Founders play a big role here, as it all starts with them articulating a compelling story for the company to share with investors. But no matter how convincing (or convinced) managers are, investment teams need to run strategic due diligence to get comfortable with the vision they are buying into.
Traditionally, strategic due diligence has relied heavily on experts: individuals with extensive experience in the industry. But like everyone else, experts can fall prey to cognitive biases (see our newsletter on pros and cons), and as disruptors take center stage, decades of experience with an industry’s current codes might not be the best guide. But what is the alternative?
While experts’ views on an industry remain invaluable, we now live in a world where the use of data has spread to every corner of our lives and is making its way into private equity. Beyond the reluctance of some to see their ways of working evolve, the initial reaction is often “we have no data”, because internal company data is often the only type we think of. However, more data becomes publicly available every day. It comes from a myriad of sources, ranging from governments’ open data initiatives to social media APIs and website-scraping robots. There is a lot more out there for us to use than we believe; the crux is knowing where to look and having the right tools to exploit it.
Data science stacks (R, Python,…) enable us to gather large amounts of granular data and to run simple but powerful analyses on it to back up the management vision. One example is scraping all the profiles on a company’s platform and on its competitors’ platforms to validate that the company has indeed grown its user base faster than its competitors, and to find out since when and by how much. At Gaia, we used this to validate Welcome to the Jungle’s edge over competition before making our investment decision.
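To make this concrete, here is a minimal Python sketch of such a comparison. It assumes the scraped profiles expose a creation date; the platform names, the dates and the helper functions are made up for illustration and are not the analysis we actually ran.

```python
from collections import Counter
from datetime import date

# Hypothetical scraped data: one profile-creation date per profile.
# A real scrape would return thousands of rows per platform.
profiles = {
    "company": [date(2019, 1, 15), date(2019, 2, 3), date(2019, 2, 20),
                date(2019, 3, 5), date(2019, 3, 18), date(2019, 3, 30)],
    "competitor": [date(2019, 1, 10), date(2019, 1, 25), date(2019, 2, 14),
                   date(2019, 3, 22)],
}

def cumulative_by_month(dates):
    """Return {(year, month): cumulative profile count}, oldest month first."""
    per_month = Counter((d.year, d.month) for d in dates)
    total, out = 0, {}
    for month in sorted(per_month):
        total += per_month[month]
        out[month] = total
    return out

def growth_multiple(dates):
    """Cumulative profiles in the last observed month divided by the first."""
    counts = list(cumulative_by_month(dates).values())
    return counts[-1] / counts[0]

growth = {name: growth_multiple(dates) for name, dates in profiles.items()}
```

Plotting the output of `cumulative_by_month` for each platform shows since when the curves diverge, while `growth` summarizes by how much.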
Sometimes, analyzing mountains of data would traditionally require hours of manual work. That is when modern data science gives us access to the entire arsenal of machine-learning-powered tools. Let’s imagine you have gathered thousands of reviews for the company’s app on the App Store. What now? Would you skim through all of them manually to “get a sense” of what is going on? No, Natural Language Processing (NLP) tools can do it so much faster: by identifying recurring words (even misspelled!) and recurring topics (grouping words from a similar field), and even by identifying which topics appear in positive or negative sentences. With these tools, you can see what users like and dislike about the product and its competitors’. Does it match what management is saying? What about experts?
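As a toy illustration of the idea, the sketch below counts how often each topic is mentioned, tolerates small misspellings, and scores topics by the tone of the sentence they appear in. It uses only the Python standard library; the reviews, the topic list and the positive and negative word lists are invented for the example, and a real study would rely on a proper NLP library and far larger vocabularies.

```python
import re
from collections import Counter
from difflib import get_close_matches

# Hypothetical reviews; real input would be thousands of scraped reviews.
reviews = [
    "Love the search feature. The app crashes all the time.",
    "Great serach and great design. Notifications are annoying.",
    "Search is great. The app crashes on login.",
]

# Tiny illustrative vocabularies (assumptions, not a real lexicon).
TOPICS = ["search", "crashes", "design", "notifications", "login"]
POSITIVE = {"love", "great"}
NEGATIVE = {"crashes", "annoying"}

def normalize(word):
    """Map a word to a known topic, tolerating small misspellings."""
    match = get_close_matches(word, TOPICS, n=1, cutoff=0.8)
    return match[0] if match else None

topic_counts = Counter()     # number of sentences mentioning each topic
topic_sentiment = Counter()  # positive minus negative sentences, per topic

for review in reviews:
    for sentence in re.split(r"[.!?]+", review.lower()):
        words = re.findall(r"[a-z']+", sentence)
        # Sentence tone: count of positive words minus count of negative words.
        score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
        for topic in {normalize(w) for w in words} - {None}:
            topic_counts[topic] += 1
            topic_sentiment[topic] += (score > 0) - (score < 0)
```

Here `topic_counts.most_common()` ranks what users talk about most, and the sign of each entry in `topic_sentiment` hints at whether they like it. Scoring at sentence level is crude: a sentence mixing praise and complaints nets out to neutral, which is one reason dedicated NLP tools do better.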
But why stop there? Now that you have access to all the machine learning algorithms, why not use them to answer questions you didn’t even know you could ask? Let’s say you have gathered many metrics on all the users of a platform: number of items for sale, number of items sold, number of items bought, number of followers, number of accounts followed, number of comments, etc. You could analyze the average number of items bought or sold, but could you possibly find groups of users with similar behaviors? Clustering techniques to the rescue! A subset of unsupervised learning, they let you discover user segments directly from the data, with no a priori hypothesis. The resulting categories could include, for example:
- “buyers-only”: who buy, never sell, and use the “like” feature a lot
- “likers-only”: who like but never buy
- “sellers-only”: who never buy but sell often
Such categories might not shatter your intuition, and that’s a good thing! It means the business is functioning as intended. But let’s imagine a user who sold 5 items, liked 150 and follows 1,500 other users: which category should they belong to? Hard to decide! The whole purpose of machine learning algorithms is to define these thresholds for you, based on empirical evidence, and to give you a clear view of what makes up the user base and of whether the product roadmap caters to the needs of each group.
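For the curious, here is a minimal sketch of one such clustering technique, k-means, written in plain Python. The user metrics are invented, the choice of three clusters is an assumption, and the deterministic "farthest-first" seeding is a simplification chosen to keep the example reproducible; in practice one would typically use a library implementation and rescale the metrics first, so that a large-valued one (followers, say) does not dominate the distances.

```python
import math

# Hypothetical per-user metrics: (items bought, items sold, likes given).
users = [
    (12, 0, 30), (15, 1, 35), (10, 0, 28),
    (0, 0, 25), (1, 0, 32), (0, 1, 27),
    (0, 14, 2), (1, 18, 4), (0, 11, 1),
]

def farthest_first(points, k):
    """Deterministic seeding: start from the first point, then repeatedly
    add the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids

def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, k, max_iter=100):
    """Alternate between assigning users to centroids and re-averaging them."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

labels, centroids = kmeans(users, k=3)
```

Each centroid describes the "typical" user of a segment, and a hard-to-place user is simply assigned to the closest one with `nearest(new_user, centroids)`: the thresholds come from the data rather than from our intuition.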
We, at Gaia Capital Partners, are firm believers in what data can do to enable new businesses, but also to reinvent how we work as investors. Using data science in due diligence is a first step, which helps assess the visions founders share with us for their business. And we gladly share the insights which come from these novel techniques with managers as we partner for a new chapter of their adventure!