Analyzing YC’s Top 100 Companies list by Year (vintage)

I ran a quick analysis on the Top 100 data Y Combinator published recently. I scraped the list and parsed it into a CSV, along with some of YC’s historical data. Link:

Originally posted on my website.

  • I assumed “Rank” in this case implied valuation. I developed a “Vintage Rank” heuristic to determine the best vintages in terms of return on investment.
  • To look at YC’s overall investments and hit rates, I fetched data on their past investments that actually launched to the public. YC says there are over 1900 companies in their portfolio, but I could only get ~1600 from their public database, so roughly 15% of them apparently did not launch publicly.
  • Hit rate: the percentage of a vintage’s publicly launched companies that went on to make the top 100 list.
  • When I say best or worst vintage, note that these are still referencing some incredible companies. I am merely comparing the vintages against each other on ROI.
  • Of YC’s winners, the 2005, 2007 and 2009 vintages seem to be some of YC’s best. 2017 has high potential to become another great vintage.
  • Vintage with the most companies in the top 100: 2011, with a whopping 14% of investments making it.
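
The hit rate and a rank-based vintage comparison can be sketched in a few lines of Python. This is a minimal sketch, not the exact method behind the charts: the "Vintage Rank" formula is not spelled out here, so the code assumes a simple average of the top-100 ranks per vintage as a stand-in. The row layout, company names and counts are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical rows from a top-100 CSV: (company, vintage year, rank).
# Placeholder values for illustration only, not YC's real data.
top100 = [
    ("company_a", 2005, 10),
    ("company_b", 2009, 1),
    ("company_c", 2009, 2),
    ("company_d", 2011, 60),
    ("company_e", 2011, 80),
    ("company_f", 2011, 90),
]

# Hypothetical number of publicly launched companies per vintage.
launched = {2005: 10, 2009: 40, 2011: 100}


def vintage_stats(top100_rows, launched_per_vintage):
    """Return {vintage: (hit_rate, average_rank)}.

    hit_rate     = top-100 companies / publicly launched companies
    average_rank = mean top-100 rank (assumed stand-in for "Vintage Rank";
                   lower means higher implied valuation)
    """
    ranks = defaultdict(list)
    for _name, vintage, rank in top100_rows:
        ranks[vintage].append(rank)

    stats = {}
    for vintage, total in launched_per_vintage.items():
        hits = ranks.get(vintage, [])
        hit_rate = len(hits) / total if total else 0.0
        avg_rank = sum(hits) / len(hits) if hits else None
        stats[vintage] = (hit_rate, avg_rank)
    return stats


stats = vintage_stats(top100, launched)
for vintage in sorted(stats):
    hit_rate, avg_rank = stats[vintage]
    print(vintage, f"hit rate {hit_rate:.0%}", "avg rank", avg_rank)
```

With these placeholder numbers, a vintage can have the most top-100 companies and still the weakest average rank, which is the same tension that shows up between hit rate and valuation in the analysis below.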

Charts (more below, after analysis):

Key insights:

  • 2005: Early days of the fund. A big win with Reddit that likely set the stage for future years of YC. Per the Vintage Rank chart above, this was the best vintage ever. This raises the question: if this had not happened, would YC still be around?
  • 2006: The best hit rate of the fund’s early years, but these companies did not end up at the same unicorn valuations as Reddit. High hit rate, lower valuations. Companies: OMGPop, Scribd.
  • 2007: High valuations, high hit rate. A very good vintage, and another strong one from the early days of the fund. Companies: Dropbox, Twitch, Weebly.
  • 2008: One of the weakest vintages in spite of one great investment, Machine Zone. Lowish hit rate, average valuations. Machine Zone is still carrying this vintage with a $6B valuation.
  • 2009: Incredible vintage. Probably YC’s best, with one caveat: these companies have had more time than those in later vintages to develop their value. But if you assume inverse-power-law dynamics of technology markets, you can safely assume that it will stay that way. Companies include: Airbnb, Stripe, Mixpanel. They have created a large number of jobs, even when controlled for the lifespan of the company.
  • 2010: One of their weakest vintages in terms of jobs created, especially considering how long these companies have been in the market. However, in terms of relative valuation (rank), they sit somewhere around the median. Low hit rate, lowish valuation.
  • 2011, 2012 and 2013: Whopper years in terms of the number of companies in the top 100, but very low relative valuations. Very interesting vintages. 2012 in particular has 18 companies in the top 100. In terms of jobs created, they are about average. Caveat: they have had less time to evolve their businesses and have not yet hit the 10-year mark.
  • 2014: In terms of jobs created, 2014 is actually a really strong year (second only to 2016), even when controlled for the amount of time they’ve had to create those jobs. The reason is that this vintage includes Cruise and Flexport, two operationally intensive businesses that needed lots of humans. These companies also have some of the best valuations in the 2012–2017 range (see Average Rank chart).
  • 2016: 2016’s jobs created number is extremely high due to Rappi, another operationally intensive business*. Still too early to call the success of this cohort.
  • 2017: Very interesting in terms of rank and valuation. It has one big ranker, Brex, plus Faire, which ranks at around 51. Probably on track to become one of YC’s biggest vintages ever, maybe even beating 2009.

* I am not sure whether YC is including jobs created by the company prior to the investment being made. If so, this might inflate the number.
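
One simple way to control jobs created for time in market, as in the 2009, 2010 and 2014 notes above, is to divide total jobs by the number of years since the vintage. This is a rough sketch under my own assumptions, not necessarily how the charts were built: the reference year `AS_OF_YEAR` and the job counts below are hypothetical placeholders.

```python
# Hypothetical total jobs created per vintage; placeholder numbers,
# not figures from YC's list.
jobs_by_vintage = {2009: 12000, 2014: 9000, 2016: 8000}

AS_OF_YEAR = 2019  # assumed snapshot year, chosen only for illustration


def jobs_per_year(jobs_created, vintage, as_of_year=AS_OF_YEAR):
    """Jobs created per year in market: a crude lifespan normalization."""
    years_in_market = as_of_year - vintage
    if years_in_market <= 0:
        raise ValueError("vintage must be earlier than as_of_year")
    return jobs_created / years_in_market


normalized = {v: jobs_per_year(j, v) for v, j in jobs_by_vintage.items()}

# Print vintages from the highest to the lowest normalized job creation.
for vintage, rate in sorted(normalized.items(), key=lambda kv: kv[1], reverse=True):
    print(vintage, round(rate), "jobs per year")
```

A linear per-year rate is the simplest possible control. It ignores that hiring usually accelerates over a company's life, so it will still tend to flatter older vintages less than a growth-curve model would.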

Interesting questions to ask:

  • Clearly, 2005 was a great year and YC’s success with Reddit set the model. If that had not happened, would YC have gone on to enjoy the same success over many years? i.e., did Reddit and the early years carry the rest of the fund?
  • Has YC gotten better at predicting big wins? It is hard to say from aggregate data, since we’d have to run a cohort analysis. But the 2011–2014 vintages were really good in terms of predicting hits, a marked improvement over the fund’s early days.
  • Do jobs created in years 1, 2 and 3 after investment serve as a good proxy for valuation? If so, YC should be able to get pretty good at predicting which of their investments will generate serious returns.
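
The jobs-as-proxy question could be tested with a rank correlation between early job counts and top-100 rank. Below is a minimal sketch using Spearman correlation (my choice of technique, since rank is ordinal), written with the standard library only. The job counts and ranks are hypothetical placeholders, and per-company early-year job data is assumed to be available, which it may not be.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: jobs created by year 3, and top-100 rank (1 = best).
jobs_year3 = [100, 500, 2000, 9000]
top100_rank = [90, 40, 20, 3]

rho = spearman(jobs_year3, top100_rank)
print(f"Spearman rho: {rho:.2f}")
```

Because a lower rank number means a higher implied valuation, a strongly negative coefficient is what would support jobs as a proxy. Note that this only covers companies that made the top 100, so it says nothing about high-headcount companies that never made the list.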

More charts: