Can 3 years make a Data Scientist outdated?

Filipe Pacheco
Sep 23, 2023

A brief review of my path

My first contact with the AI & ML landscape was 8 years ago, in 2015, during college. From the first day I saw the list of classes I needed to take to earn my bachelor’s degree in Mechatronics Engineering, one caught my attention: Introduction to Artificial Intelligence. Of the 58 classes in my undergraduate program, this is the one that has defined and guided my entire career so far.

At that time, most of the things that exist today, and the subjects that are hyped nowadays, either didn’t exist or weren’t talked about, at least not in a college class. I finished my bachelor’s in 2016 with knowledge of Artificial Neural Networks (ANN), how to train them, best practices for using them, Fuzzy Systems, and a broad view of Artificial Intelligence. By then I had already fallen in love with the Travelling Salesman Problem (TSP), and my final project applied Fuzzy Systems to the controller of a chemical process.

My first book about AI. Image available in Link

Between 2017 and 2020, I kept my interest in AI & ML alive: I completed a specialization in Control and Automation with an application of ANN, and started a Master’s Degree in Mechatronics. Though I haven’t finished it, I studied Reinforcement Learning applications in the chemical industry. While in academia, I had not heard of many of the technologies that are hyped in the Data & Analytics (D&A) world right now, such as MLOps and LLMs.

The importance of something changes over time

I started working as a Data Scientist (DS) in 2021, and since I cannot run away from what I am, I went looking for data to base my “speech” on. The image below shows the search interest for Data Science on Google Trends. When I started my undergraduate degree, the interest was 1 on Google Trends’ relative scale; in other words, I had no idea that I would become a Data Scientist. Between 2017 and 2020, the interest rose from 27 to 52, almost doubling. This worldwide interest led me to change my career from a possible job as an automation engineer in a chemical company to a DS in a manufacturing company.

Interest over time in Data science, source Google Trends

In 2021, I deployed my first model in industry. However, I soon realized that maintaining it was not as easy as I had thought when I was in academia. I usually say that training ML models is for hobbyists; professionals must put them in production. If you have tried to deploy something before, you know that there is a large gap between training a model and running it in production: maintaining data quality in the ingestion pipelines, avoiding data and model drift, and so on. Late in the last decade, a research field and a term were created to cope with this enormous task: MLOps.

The interest in MLOps

Since I was hired in 2021, the search interest for Data Science has quadrupled in less than 3 years and appears to have stabilized at its peak. In 2022, I heard about MLOps for the first time, and earlier this year I decided to start studying it. I took training sessions and read the documentation available on Databricks, which I use daily for every task related to D&A.

Interest over time in MLOps, source Google Trends

Another way I especially like to learn is through reading. A few months ago, after receiving US$ 100 for being on the team that won an internal AWS DeepRacer hackathon (which I wasn’t expecting), I decided to invest the money in a book to complement my knowledge and identify any gaps I might have. The book I chose was “Practical MLOps”, written by Noah Gift and Alfredo Deza.

Practical MLOps (English Edition), English-language eBooks on Amazon.com.br

After reading the book and realizing that the “importance of things” was changing, my logical next question was, “What else is changing?” I ran a poll in a public LinkedIn group of which I am a member, Analytics and Data Science Career.

Link to the poll: Post | Feed | LinkedIn

As you can see in the image above, I asked other people in the field which technology will dominate D&A in the next few years. Almost 50% are betting on LLM technology, and more than 75% are betting on generative AI tools and technologies becoming even more prominent in the coming years. Because of that, I ran a new search on Google Trends.

The interest in Large Language Models (LLM)

It is impressive how dramatically the search interest for this topic has changed in the last 3 years, from 1 to 100 in this short period. Just sixty days ago, my knowledge extended only to ChatGPT and how to use it daily to increase my productivity while coding. On LinkedIn, every day I saw posts about how to fine-tune an LLM, about LLMOps (already), and about Hugging Face, which is really worth a look. So I began to ask myself, “Am I already out of date with less than three years in this role?”

Interest over time in LLM, source Google Trends

To be or not to be

To answer this question for myself (and you may be asking yourself the same thing), I needed to evaluate many things from several perspectives before settling on a yes or a no. For example:

  • Are my current skills at a desirable level from the perspective of my boss and my boss’s boss?
  • Do I need to learn other tools and techniques to continue growing in my career, or do I want to specialize in the ones that I currently have?
  • Are my current skills still relevant for the market?
  • Is my skill set good enough from the perspective of senior technical leadership?
  • Last but not least, can I deliver value to the business with my current skill set?

After all these evaluations, multiple 1:1 sessions, polls on LinkedIn, research on the internet, and so on, I concluded that it is better to be safe than sorry. Even though I do not always feel up-to-date, others may not see it that way. I decided to start my upskilling and new learning to avoid becoming “obsolete” in the upcoming years.

Conclusion

So I divided my upskilling and new learning into three phases. The first was to gain foundational knowledge of LLMs, which I have already accomplished through three Databricks Academy training sessions. The second is to upskill in ML on AWS, which I am currently doing, and I have uploaded my AWS code for SageMaker to my GitHub repository. The third is to become a Cloud Practitioner in AWS, Azure, and GCP, which I will start soon after completing the second phase.

In the end, I feel glad to have started my new learning and upskilling before problems arose in delivering solutions in the company where I work.

When I start my adventures as a Cloud Practitioner, I will post what I learn here on Medium. If you are interested in hearing about a DS learning Cloud Infrastructure and so on, feel free to follow me and stay up to date with the news ;).

Filipe Pacheco

Senior Data Scientist | AI, ML & LLM Developer | MLOps | Databricks & AWS Practitioner