THE HIGHEST STANDARD IN AI — AI GUILD #DATACAREER

There is no lack of data talent.

The Golden Age: Data Career in the 2020s (No 1).

Chris Armbruster
Jan 16, 2023

--

While we have no definitive numbers for the size of the talent pool, you can follow my argument by looking at three significant trends of the last ten years.

  1. Online learning platforms are now universally available for Python, SQL, Data Science, and more.
  2. Universities and research have changed, and many graduates (M.Sc. and Ph.D.) are fluent in Python.
  3. In-person training has expanded, too; for example, the German state is financing participation in 3-month bootcamps.

For 2020, the number of professionals worldwide was estimated at half a million. And while we have no actual numbers, it is realistic to assume that the talent pool is now larger than 500,000.

Yes, talent is distributed unequally. An online certificate does not make a data scientist (yet). Numeracy matters, and it takes a decade of education to build it. Probabilistic thinking is hard, and too many people don’t get it. Data illiteracy remains widespread. That said, let’s do three things together.

  • Discuss what “no lack of talent” means for you.
  • Make plausible estimates of the talent pool.
  • Discuss how you influence the odds and start your data career.

Talent is overqualified

I regularly work with data talents, and I have done so for five years. The talent is from all over the world and looking for a first or second role in European hotspots like Berlin, Amsterdam, or Barcelona.

Talent is fluent in Python, often with years of experience. Take academia, for example; there is the astronomer building data pipelines, the neuroscientist working with neural networks, the physicist running Monte Carlo simulations, or the economist well versed in data visualizations.

More systematically, I see three significant shifts.

  • The sheer amount of talent all over the world, as exemplified by the hundreds of thousands of certificates that have been awarded.
  • Top talent has 3+ years of experience with Python, Data Analytics tools, and Machine Learning algorithms.
  • Companies have made some progress with use cases in production but still struggle with their data and deployment infrastructure.

I know from hiring, working with talent, and talking to employed people that talent may be technically overqualified. But there is another side to this: Companies could be much more ambitious with their hiring and more aggressive with scaling teams to achieve more use cases in production.

The Professional Pyramid

My working hypothesis, for Europe at least, is that the development of the data economy lags behind the availability of talent and expertise. I think there has been more talent than opportunity since 2021.

More use cases are in production, but the data economy has not taken off in Europe, and this has two consequences:

  1. There is a lack of expertise and leadership, indicated by the observation that companies have a hard time filling director positions and retaining this type of leader.
  2. Good talent may need up to 12 months to secure a first position in the industry or startups.

The lack of leadership affects talent, because companies need competent leaders before they can scale teams and production.

The Talent Pool

The data profession is young. Therefore, the experience pyramid has a broad base and very few people at the top. Think of it like a country where most people are below 18 years old. The profession has existed for only about ten years (except for the few people in research before that).

Here is my reasoning for why the talent pool that emerged from 2020 to 2022 is now larger than half a million people, not even counting all the professionals in their first two years in industry or startups.

  • Online platforms report awarding hundreds of thousands of certificates (course completion). For example, Udacity reported 170,000 by 2020. There is a large variety of courses, but a significant number are data-related. So I would estimate that online certification has produced 200k+ talents from 2020 to 2022.
  • Universities have rapidly launched degrees in Data Analytics, Data Science, etc. For Germany alone, I counted 50+ degrees, including Bachelor’s degrees. To this, you need to add all the highly numerate M.Sc. and Ph.D. graduates in fields like Astronomy, Bioinformatics, Econometrics, Electrical Engineering, Neuroscience, and High-energy Physics, all now using Python tools. For Germany alone, I estimate 12–15k new talents per year, or up to 45k in 3 years. Worldwide, I put that number at 100k+ per year at least.
  • A smaller number comes from the in-person training done by companies and bootcamps. Because these programs are short, like the 3-month bootcamp, they increase the talent pool rapidly.
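
As a rough sketch, the arithmetic behind this estimate can be written out in a few lines of Python. The figures are the lower bounds stated above. In-person training is not quantified in this article, so it is set to zero as a placeholder assumption, and the function and variable names are purely illustrative.

```python
# Back-of-the-envelope lower bound for new data talent, 2020 to 2022.
# All inputs are the rough estimates from the text, not measured data.

YEARS = 3  # 2020, 2021, 2022

ONLINE_CERTIFICATES = 200_000            # "200k+" for the whole period
UNIVERSITY_WORLDWIDE_PER_YEAR = 100_000  # "100k+ per year at least"
IN_PERSON_TRAINING = 0                   # assumption: no figure given, so left out

GERMANY_PER_YEAR = (12_000, 15_000)      # "12-15k new talents per year"


def talent_pool_lower_bound(online, university_per_year, in_person, years=YEARS):
    """Sum the three sources; university output accumulates per year."""
    return online + university_per_year * years + in_person


total = talent_pool_lower_bound(
    ONLINE_CERTIFICATES, UNIVERSITY_WORLDWIDE_PER_YEAR, IN_PERSON_TRAINING
)
germany_range = tuple(per_year * YEARS for per_year in GERMANY_PER_YEAR)

print(f"Worldwide lower bound: {total:,}")
print(f"Germany, 3 years: {germany_range[0]:,} to {germany_range[1]:,}")
```

Because every input is a "plus" figure and in-person training is left out, the sum is a floor rather than a point estimate. Replace the placeholder with your own bootcamp estimate to see how the total moves.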

Companies must rapidly create opportunities at the level of Lead, Principal, and Director to make better use of excellent talent. Expertise and experience are requirements for professionals to lead bigger teams to scale use cases in production successfully.

Starting your data career

I am frequently asked whether more training is necessary or desirable. If you already have 2+ years of experience with Python and data, more training is unlikely to make any difference in starting a data career.

However, the following does make a difference.

  1. You are clear about which data role you are the best fit for, e.g., Data Analyst, Data Engineer, Data Scientist, Product Analyst, ML Engineer, Computer Vision Engineer, NLP Engineer, MLOps Engineer, and so on.
  2. You know which type of company you want to work for, e.g., corporate, startup, or consultancy.
  3. You have identified your preferred industry and are combining your data skills with domain expertise.

If you are still in training, the above translates into the suggestion to build an industry-specific portfolio. Instead of working with many different kinds of data, focus on data from a specific domain and develop your portfolio cumulatively.

Why?

At a time when there is more talent than opportunity, it pays off to shift the focus from technology to company use cases.

The AI Guild’s 2000+ Specialists

The AI Guild is Europe’s leading practitioner community in Data Analytics, Data Engineering, Data Science, Machine Learning, Deep Learning, NLP, Computer Vision, and MLOps.

Do you want to progress to Senior, Lead, and Director?

It takes 60 days to build your competency profile. You can find out more by booking a first conversation at https://www.datacareer.eu.

The AI Guild

Chris Armbruster
Fluent in Data

Director, 2400+ Data Analytics and Machine Learning specialists | Data Leader | Keynote Speaker | Use Cases in Production