# Estimating the number of startups in Europe


Over the years of running GlassDollar, where we crunch a lot of startup data, I’ve had many investors and innovation managers ask me: “So, how many startups are there actually?” I believe the objective has always been to get a sense of how much we don’t know: whether the CRM contains a fifth of the market or a hundredth.

Why hasn’t this been answered yet? With millions of people employed in the space and billions invested in it every year, it’s fascinating that no-one can even ballpark this. I think the problem is that, even with access to *more* data, one still doesn’t know how much of the true number one is missing. To figure this out, one needs to think about the problem differently.

Therefore, my goal is to estimate this unknown and give everyone a clearer picture of how much they are missing. To answer this, I created a simple but insightful back-of-the-envelope model yielding: **123k startups in Europe.**

In this article, I’ll first tackle the ever-divisive question of how to define a startup, before constructing the 3-part model that provides our estimate.

# Defining a startup

I define a startup as follows:

- it’s a product-based company (no agencies or consultancies)
- its core business is enabled by technology (e.g. e-commerce is ok)
- it’s younger than 20 years

I recognize that this startup definition, like any, is not razor-sharp. However, it works for us because the technology-enabled and product-focused criteria separate small companies well, and small companies make up the vast bulk of the distribution. Whether an Airbnb is a startup or not doesn’t move our needle by much, so we can leave this debate to others.

# The back-of-the-envelope calculation

To arrive at a sensible estimate, we need to take the following into account:

- How many new startups are being founded every year?
- How does that number differ from 10 years ago? (The growth in the number of startups founded year on year)
- What’s the survival rate of startups over time?

Together, these make up the model:

# Step 1. Number of startups founded in Europe last year

My starting point is the German public company-registrar data.

Every month, approximately 10k companies register. Startupdetector reports that 2.2% of those are startups, a figure I can corroborate because at GlassDollar we classify this source, among others, ourselves.

1. Startups founded in Germany per year:

**10k companies** register in Germany every **month**, **2.2%** of which are startups, thus **220 startups per month** and **2,640 startups/year**.

2. Startups founded in the EU per year:

Based on Crunchbase data, German startups make up approximately 10% of European startups. That means we’ve got roughly **26,000 new European startups/year** (2,640 divided by 10% is 26,400, which I round down). For later, it’s important to note that we’ll assume this 10% share has stayed constant throughout the years.

3. Startups founded in the EU per year + pre-seed phase.

Not all startups make it to registration (we didn’t make GlassDollar a limited company for over a year at the start), so to include very early pre-seed companies I’d double this amount to **52,000 new European startups/year**. Doubling approximates the data in our own database tracking early-stage companies (from when they, for example, appear on Twitter to when they register). If you find another estimate more reasonable or consciously want to omit pre-seed companies, you can plug in a different number at the end of this article.

**Step 1** yields: **52k new startups in Europe every year.**
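The arithmetic of Step 1 can be written out as a short Python sketch. The inputs are the figures quoted above; the variable names are only illustrative. Note that the unrounded results are 26,400 and 52,800, which the article rounds down to 26k and 52k.

```python
# Step 1 as plain arithmetic. Inputs are the article's figures;
# variable names are illustrative, not from the original model.
registrations_per_month = 10_000  # German company registrations per month
startup_share = 0.022             # share of registrations that are startups (2.2%)
german_share_of_europe = 0.10     # Germany's assumed share of European startups
pre_seed_multiplier = 2           # doubling to include unregistered pre-seed companies

german_per_year = registrations_per_month * startup_share * 12
european_per_year = german_per_year / german_share_of_europe
with_pre_seed = european_per_year * pre_seed_multiplier

print(f"Germany:           {german_per_year:,.0f} startups/year")
print(f"Europe:            {european_per_year:,.0f} startups/year")
print(f"Europe + pre-seed: {with_pre_seed:,.0f} startups/year")
```

Swapping in a different pre-seed multiplier or German share here is the easiest way to see how strongly the starting point depends on those two assumptions.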

# Step 2. Growth in the number of startups founded

As a proxy for the growth in the yearly number of founded startups, we can use the yearly number of startups receiving first-time funding. According to Crunchbase, that number has grown by an average of **22% y/y** over the 20 years that we are considering.

For example, if 52k startups are being founded each year today, then 10 years ago, with 22% yearly growth, that would have been 8k.
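To make the back-casting concrete, here is a minimal sketch that assumes constant 22% compounding; the function name is hypothetical. Depending on whether "10 years ago" is counted as nine or ten compounding steps, it gives roughly 8.7k or 7.1k, the same order of magnitude as the figure quoted above.

```python
def founded_years_ago(founded_today: float, growth: float, years_ago: int) -> float:
    """Back-cast the yearly number of newly founded startups.

    Assumes the same y/y growth rate in every year, which is a simplification.
    """
    return founded_today / (1 + growth) ** years_ago


# Print a few vintages for the article's inputs (52k today, 22% growth).
for years_ago in (0, 5, 9, 10, 19):
    estimate = founded_years_ago(52_000, 0.22, years_ago)
    print(f"{years_ago:>2} years ago: {estimate:,.0f} startups founded")
```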

Using first-time funded startups as a proxy for overall startup growth means we are making two assumptions. Firstly, we assume that the relationship between the two has been stable. One could imagine this not holding, because over time companies started being funded much earlier by the likes of accelerators (which one can see in the data). Secondly, we assume that Crunchbase’s data coverage has been consistent over time, which can be questioned because its user base, and thus the volume of contributed data, has grown over time.

However, even if you only consider the growth of first-time Series A investments, you see similar growth over time (22.3% vs. 21.7%). Further, because y/y growth is fairly consistent over time (except in times of crisis), it is reasonable to assume 22% growth for the purpose of our simple model.

**Step 2** refines our estimate of the yearly number of founded startups by accounting for **22%** growth, as shown in the graph below.

# Step 3. Factoring in startup deaths

The final piece to plug into our model is an estimate of how quickly startups die, so that we know approximately how many have survived at the end of each year.

There’s little existing research on this survival curve. For small businesses generally, there is some data available from the US Bureau of Labor Statistics, which puts the share of small businesses surviving after 5 years at 47%. However, this likely underestimates the death rate of a **venture** startup. For such companies, there is some data on the matriculation rate between funding rounds from Mattermark and TechCrunch. Yet there’s likely a selection effect, as this data leaves out startups that never reached investment, and it’s also difficult to adapt to our yearly timescale, as one round does not necessarily equate to one year.

What is obvious is that survival over time does **not follow a linear trajectory**. This makes intuitive sense. A company is a lot more likely to fail at the start than later on, when it has evidently already built a more successful product (or told a good story).

Combining the data on small businesses and the matriculation rate, and cross-checking this with our own, I think the following inverse distribution makes sense: 58% of startups are left after the first year, 41% after the second, 22% after the 5th, and 12.5% after the 10th. Graphed, this looks as follows:

**Step 3** helps us account for the startup survival rate (following an inverse distribution) to estimate how many startups remain for each vintage year.
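The exact formula behind the survival curve is not spelled out, so the following is only a sketch. One simple inverse curve, 1 / (1 + k × age) with k = 0.7, comes within about a percentage point of all four quoted points. Both the functional form and the value of k are my assumption, fitted to those points, not a parameter stated in the article.

```python
def survival(age_years: float, k: float = 0.7) -> float:
    """Assumed share of a vintage still alive after `age_years`.

    The inverse form and k = 0.7 are guesses fitted to the quoted points
    (58%, 41%, 22%, 12.5%), not figures taken from the article.
    """
    return 1 / (1 + k * age_years)


# Compare the assumed curve with the ages quoted in the text.
for age in (1, 2, 5, 10):
    print(f"after year {age:>2}: {survival(age):.1%} remaining")
```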

# Putting it all together

Adding up all steps in our completed model:

Graphed:

If you want to use other estimates for the three steps, you can punch the equation into your own calculator, adapt it, and see what you get. I’ve tested how much a change in each input changes the output (its elasticity) and am confident that the real number is between **80–200k**. Our simple back-of-the-envelope model is surprisingly robust.
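For readers who would rather adapt code than a calculator, here is a minimal sketch of how the three steps could combine. It assumes 20 yearly vintages (ages 0 to 19), constant 22% growth, and the inverse survival curve 1 / (1 + 0.7 × age). That curve and the summation range are my assumptions, not a stated formula; with the article's inputs the sum comes out at roughly 123k.

```python
def surviving_startups(
    founded_today: float = 52_000,  # Step 1: new startups per year, incl. pre-seed
    growth: float = 0.22,           # Step 2: y/y growth in startups founded
    k: float = 0.7,                 # Step 3: steepness of the assumed survival curve
    horizon_years: int = 20,        # startups older than this are excluded
) -> float:
    """Sum the surviving startups over all vintages within the horizon."""
    total = 0.0
    for age in range(horizon_years):
        founded = founded_today / (1 + growth) ** age  # size of that vintage
        alive_share = 1 / (1 + k * age)                # assumed share still alive
        total += founded * alive_share
    return total


baseline = surviving_startups()
print(f"baseline estimate: {baseline:,.0f}")

# Vary one input at a time to see how sensitive the total is.
scenarios = {
    "no pre-seed doubling": {"founded_today": 26_000},
    "slower growth (15%)": {"growth": 0.15},
    "harsher survival (k=1.0)": {"k": 1.0},
}
for label, overrides in scenarios.items():
    print(f"{label}: {surviving_startups(**overrides):,.0f}")
```

Because the number founded today enters linearly, halving it halves the total; growth and survival act non-linearly, which is why varying them one at a time is a useful check.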

# Conclusion

As an investor or innovation manager, you could now think about the yearly startups in your market and apply this formula. For example, for an agnostic investor focussing exclusively on Germany, there are **12k** total startups out there. For an innovation manager interested in manufacturing startups (which make up 5% of startups according to Crunchbase data), there are approximately **6k** startups to consider **just in Europe.** How many of those have you met at events? I encourage you to look at your CRM so that you have at least a ballpark figure for how much you don’t know.

How many European startups are already on Crunchbase? After you filter out the dead startups, around ⅓ of the 123k startups should be listed within the 20-year horizon we considered. Naturally, the share is much lower for early-stage companies.
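The market slices above are simple shares of the headline estimate, so a CRM comparison can be sketched in a few lines. The function names and the CRM count of 400 are hypothetical and only there to illustrate the calculation.

```python
EUROPE_TOTAL = 123_000  # headline estimate from the model


def market_size(share: float, total: float = EUROPE_TOTAL) -> float:
    """Number of startups in a slice that makes up `share` of the total."""
    return share * total


def crm_coverage(startups_in_crm: int, market: float) -> float:
    """Fraction of the estimated market already present in a CRM."""
    return startups_in_crm / market


germany = market_size(0.10)            # Germany: about 10% of European startups
manufacturing = market_size(0.05)      # manufacturing: about 5% of startups
crunchbase_listed = market_size(1 / 3) # roughly a third listed on Crunchbase

print(f"Germany:              {germany:,.0f}")
print(f"Manufacturing:        {manufacturing:,.0f}")
print(f"Listed on Crunchbase: {crunchbase_listed:,.0f}")

# Hypothetical example: a CRM holding 400 manufacturing startups.
print(f"CRM coverage: {crm_coverage(400, manufacturing):.1%}")
```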

**I’m Jan Hoekman,** Co-Founder & CTO of GlassDollar, the startup scouting company.

We are building the most comprehensive & clever startup data engine yet. With it, we help corporations such as Daimler, Porsche, Miele, Rabobank and others to integrate promising startups into their value-chain.

*Invaluable feedback was provided by **Fabian Dudek**, **Diane Salimkhan** & **Ann Wegener**.*