Photo by Will Porada on Unsplash

Why you should not become a Data Scientist

Stefan Haas
Published in CodeX
4 min read · Mar 26, 2022

First and foremost, this is a very opinionated article in which I will reference my personal experiences from my short career drift towards Data Science.

Nowadays, Data Science is getting a lot of attention for various reasons: it has been called the “Sexiest Job of the 21st Century”, there is a broad misconception that AI and data-driven technologies will become predominant and take away lots of jobs due to automation, or maybe it is just the great pay.

But as always, do not follow the hype blindly as I did. Learn from my mistakes instead.

Data Science might not be your cup of tea

It takes a particular set of skills and interests to be a Data Scientist, and not everybody has them. Obviously, you need to enjoy Math and Statistics, because these are the foundations of any good data analysis. You need those technical skills, but also excellent social skills, because as a Data Scientist you will have to communicate your results to stakeholders.

Need-to-know skills:

  • Math
  • Statistics
  • Basic Programming Knowledge (Python, R)
  • Communication & Presenting Skills

As a Data Scientist, you will often find yourself doing research and investigating why X happened, or how to achieve Y. That is why you should be a person who prefers investigative work over implementing a solution to a given problem.

Data Science can be boring

The fun part of Data Science (for me) is building Machine Learning models to predict something. Those algorithms are extremely fascinating and take a very different approach to solving problems than traditional programming.

But building those models is only about 10% of the work a Data Scientist does, at least in my experience. The main part is wrangling, normalizing, transforming and aggregating the data that is fed into those models, which likely means writing a lot of SQL queries or something similar and executing query after query. Since the amount of data is usually pretty big, the queries take a long time to run.

I have personally waited anywhere from minutes to hours for my code to execute. It was not rare that I wrote a simple query, waited a few minutes for it to run, and then repeated that over and over again. This was not nearly as challenging as I had initially expected.
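For readers who have not seen this kind of work, the query-after-query routine described above looks roughly like the sketch below. It is only an illustration: the `orders` table, its columns, and the sample rows are made up, and it uses Python's built-in `sqlite3` module with an in-memory database instead of a real Data Warehouse, so it runs instantly rather than for minutes or hours.

```python
import sqlite3

# Hypothetical raw data: order rows with inconsistent country spellings
# and one missing amount, the kind of mess that has to be cleaned first.
RAW_ORDERS = [
    (1, " Austria", 120.0),
    (2, "austria ", 80.0),
    (3, "GERMANY", None),
    (4, "Germany", 200.0),
    (5, "germany", 100.0),
]


def revenue_by_country(rows):
    """Normalize and aggregate raw rows, returning (country, orders, revenue)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (id INTEGER, country TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
        query = """
            SELECT LOWER(TRIM(country)) AS country_clean,  -- normalize spelling
                   COUNT(*)             AS order_count,
                   SUM(amount)          AS revenue
            FROM orders
            WHERE amount IS NOT NULL                        -- drop incomplete rows
            GROUP BY country_clean
            ORDER BY country_clean
        """
        return conn.execute(query).fetchall()
    finally:
        conn.close()


result = revenue_by_country(RAW_ORDERS)
for country, order_count, revenue in result:
    print(country, order_count, revenue)
```

On real data, each small change to such a query means running it again and waiting for the result, which is where most of the time goes.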

Data Science != Artificial Intelligence

Many young Data Scientists cannot wait to start their first job and create super efficient Machine Learning models, maybe even do Deep Learning, only to realize that the work of a Data Scientist can vary a lot. Some Data Scientists may actually just do Deep Learning and heavy research, but many others will just do SQL, Excel and very basic statistical models like linear regression. Most Data Scientists do not build their own Machine Learning models from scratch, but rather use pre-built implementations from libraries like scikit-learn.

Data Science is very niche

Data Analysis is heavily dependent on the quality and amount of data. Since data is not part of the core business of most companies, many of them invest neither in a Data Warehouse nor in a Data Scientist, because it is just a nice addition to the business, not a necessity. Even the companies that invest in their data and hire Data Scientists do not hire many, again because it is not the core business. You will often find companies with many Software Engineers, System Engineers, …, but just a few Data Scientists.

Data Science requires a high level of education

Since Data Science relies on Math and Statistics, a higher education is usually necessary to get a job. A Master's degree is almost a prerequisite: 49% of current Data Scientists have a Master's and 28% even hold a PhD. Only 19% have a Bachelor's degree.

Conclusion

In conclusion, the reality differs heavily from the general expectations. Even though the pay is often good, the entry barriers are enormous, and the job market is currently oversaturated because a lot of people want to get into Data Science.

If you see yourself enjoying investigating causes and making predictions more than implementing solutions, and you have or are looking to get a higher-level education, then go for it. Data Science is definitely not for everyone, but it might just be the right thing for you.

If you do NOT see yourself enjoying investigating causes and making predictions more than implementing solutions, and you neither have nor are looking to get a higher-level education, then DO NOT go for it. If you prefer implementing solutions, a career in Software Engineering might fit you better.

Stefan Haas
Writer for CodeX

Senior Frontend Engineer @Blockpit | Microsoft MVP | Nx Champion | https://stefanhaas.dev