INTRODUCING OUR DATA SCIENCE ROCK STARS

Anya Tonne
Published in Cmotions
5 min read · Mar 4, 2022

NO DATA SCIENTIST IS THE SAME! — part 1

This article is part of our series about how different types of data scientists build similar models differently. No two people are the same, and neither are any two data scientists. On top of that, the circumstances under which a data challenge needs to be handled change constantly. For these reasons, different approaches can and will be used to complete the task at hand. In our series we will explore the four different approaches of our data scientists: Meta Oric, Aki Razzi, Andy Stand, and Eqaan Librium. They are tasked with building a model to predict whether employees of a company, STARDATAPEPS, will look for a new job or not. In this blog we get to know our team. From this introduction you can already imagine that their approaches will be quite different.

Most people who never work with data don’t actually know what data scientists like us do. My grandma still believes that I fix laptops and hack the government as part of my job. We know that there is not one type of data scientist: we all do what we do with data differently. The explorer in us wanted to find out how different types of data scientists would deal with the same questions. And of course, we would like to take you with us on this journey! So here they are. Welcome to the stage: Meta, Aki, Andy, and Eqaan. They work as data scientists in an innovation team of a large company called STARDATAPEPS. The firm loves experimenting with data and staying ahead of data trends. In the articles to follow, we’ll see how they deal with the same challenges in their work, but let’s introduce our rock star team first.

Meta Oric: ‘It’s all about Meteoric speed’

Meta is always super busy. She has an impressive portfolio consisting of a number of cool-sounding projects where she uses the latest top-notch models. She uses state-of-the-art models with little to no modification and implements everything within a relatively short timeframe. She never spends too long on any of her projects and prefers ensemble techniques that do not require a lot of data preparation or model selection. XGBoost and LightGBM are her best friends, even though Meta cannot actually explain to others, especially non-statisticians, what is going on inside her predictive model. Her scripts are always well written and ready to be reused in the next project.

Aki Razzi: ‘Accuracy is what truly matters’

Aki has won multiple Kaggle competitions, because her models achieve the highest possible performance. Time and resources do not matter that much to her. Hail the almighty accuracy, precision and recall. She does not care whether a technique is easy to explain or not. She is no stranger to ensemble models or to very convoluted feature engineering techniques, as long as they get her to near-perfect performance.
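For readers who don’t work with Aki’s favourite metrics every day, here is a minimal sketch of how accuracy, precision and recall could be computed for a yes/no prediction such as "will this employee look for a new job?". The function name `classification_metrics` and the labels below are made up purely for illustration; they are not taken from the dataset used in this series.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # flagged, but wrong
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly not flagged

    accuracy = (tp + tn) / len(pairs)                   # share of all predictions that are right
    precision = tp / (tp + fp) if (tp + fp) else 0.0    # of those flagged, how many were right
    recall = tp / (tp + fn) if (tp + fn) else 0.0       # of the true positives, how many were found
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical example: 1 = employee is looking for a new job, 0 = is not.
y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Accuracy alone can look flattering when only a small share of employees is actually looking for a job, which is one reason why someone like Aki would track precision and recall as well.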

Andy Stand: ‘Understand is what we do’

Andy does not believe that black-box models are the future. He often finds himself in a meeting where he needs to explain to his colleagues how his model works and how the predictions are generated. He is happy to sacrifice a bit of accuracy in order to achieve the clearest and most understandable solution. He makes sure that his planning allows for extensive feature engineering. As a result, all of his features can be understood and explained to others. Simple regressions and decision trees are the most used tools in his toolbox.

Eqaan Librium: ‘Equilibrium and work-life balance’

Eqaan strives for balance. He likes to prepare his data himself and to have control over the preparation process, although he does not mind having some ‘black-box’ elements. He wants to be able to explain his models and data manipulations to others, but also seeks to have as accurate a model as possible. This approach does cost him quite a bit of time.

Task at hand: Predict which employees might leave

Can you identify with one of our protagonists more than with the others? All of us certainly can, though we often wonder what it is like to be in the shoes of another data scientist. Well, now is your chance. Meta, Aki, Andy, and Eqaan just heard of a new initiative, and immediately decided to sign up.

Their company decided to tap into the land of HR analytics. The idea is to predict whether their employees will look for a new job. Since data science is crucial for their day-to-day business, every employee is highly valued. Being able to predict who is looking for a job might help the company reduce staff turnover and thus the loss of knowledge. The company therefore acquired an HR analytics dataset to see if they can build a model that predicts whether an employee is likely to change jobs or not.

The data science journey of Meta, Aki, Andy and Eqaan

In the following articles, we’ll first get to know the data we have available a bit better. After that, we’ll see how the approaches of our data science rock stars differ from each other. What choices do they make and how does that impact the predictive model they build? Finally, we have a team meeting with our data scientists and briefly discuss the strengths and weaknesses of their different approaches. This ‘retrospective’ enables them to learn from each other. We hope you’ll get something from this journey as well. We certainly did!

Anya Tonne is a Data Science and Advanced Analytics Consultant at Cmotions.