What’s up with Data Science?

Harjot Pahwa
Published in Analytics Vidhya
6 min read · Jan 31, 2021

What I learned about data science through my career

Data science is a job whose requirements are not well defined. The same role can demand very different things depending on the work you’re assigned. Sometimes you have to pull up an Excel sheet to present some ad-hoc analysis; other times you spend days engineering a proper solution to help your client achieve their business goals, or maintaining your production pipeline. So what are these mysterious wizards doing when they work their sorcery on data? Who are data scientists?

Let’s take a step back from our ‘normal’ state and observe: observe how we interact with the world, how we communicate with each other, how we make sense of our surroundings. Let’s say you’re talking to your best friend over the phone and telling her about your experiences in a new city you’ve just moved to. Your story may affect her outlook on the city, and your recommendations may affect her preferences. She gets a rough idea of what to expect if she plans to visit you there.

Replace the ‘experience’ in the above example with ‘data’ and voila, you’re on the path to becoming a data scientist. Data science is an ever-evolving domain with so much to learn at every step, but if I were to define a data scientist in one line, I would say that:

Data scientists are storytellers with a data-driven problem-solving approach and strong analytical skills.

The Data Science Kaleidoscope

Data science is an interdisciplinary field that brings together a lot of different domains under one umbrella. During my career, I have identified 4 major components that make you a good data scientist:

1. Computer Science

Who am I kidding? Computers are everywhere. You might know very well how to operate one and use some basic applications, but to process data, engineer solutions, automate tasks, visualise findings, and deploy and tune a model, you need some programming skills, an understanding of cloud technology, production processes, databases and runtimes, and a never-ending willingness to learn new and useful technology. The two most in-demand languages are Python and R. It’s good to have a strong command of one of them and familiarity with the other. You will most likely encounter projects with enormous volumes of data and/or real-time data, so learning how to deal with such data is also an essential skill. Depending upon your job, most of your data science work will happen on the cloud due to heavy computational requirements, which means you’re better off learning how to use the popular cloud services (AWS, Azure and GCP); but don’t worry too much about this one, you’ll learn most of it while using them in your job :)

2. Mathematics

Isn’t maths beautiful? Its omnipresence, its structure. Its usage might not be so apparent to a beginner, but as you begin exploring algorithms, optimising models, tuning their performance, or even understanding data and presenting the results, it’s MATHS. It’s the language of data, logic and reason. It explains patterns, helps create useful models, and drives business decisions. Analytical ability and maths are intertwined. A good understanding of statistics, probability and the maths behind algorithms amplifies your data science career.

3. Data Analysis

If maths is the language of data, then data analysis becomes communication with data. Data will talk to you if you know how to listen and what to listen for. Comprehending what real-world data is indicating is challenging, even with all the advanced algorithms and insane volumes of data. Even with all the right modelling, if you’re not able to comprehend the signals, it’s useless. Let’s say I tell you that the % of defective lamps produced by a company has increased by 300% for the current month. That’s definitely not music to your ears if you happen to be part of the quality assurance department of that company. But wait, let me provide you with some more numbers; let’s say the company produces 100,000 lamps per month, out of which only 5 used to turn out defective. So when I say the % of defective lamps has increased by 300%, I’m saying that defective lamps went from 5 to 20, an absolute increase of just 15. What a relief, isn’t it?
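The lamp arithmetic above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `relative_vs_absolute` and its parameters are hypothetical, and it assumes monthly production stays constant at 100,000 lamps, so a 300% rise in the defect percentage is also a 300% rise in the defect count.

```python
def relative_vs_absolute(baseline_defects, pct_increase, total_units):
    """Turn a relative increase in defects into absolute numbers.

    Hypothetical helper for illustration; assumes total_units is the
    same before and after the increase.
    """
    # A 300% increase means the new value is 4x the baseline (1 + 300/100).
    new_defects = baseline_defects * (1 + pct_increase / 100)
    return {
        "new_defects": new_defects,
        "absolute_increase": new_defects - baseline_defects,
        # Defect rates expressed as a percentage of total production.
        "baseline_rate_pct": 100 * baseline_defects / total_units,
        "new_rate_pct": 100 * new_defects / total_units,
    }


# Numbers from the lamp example: 5 defective lamps out of 100,000, up 300%.
summary = relative_vs_absolute(baseline_defects=5, pct_increase=300, total_units=100_000)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Putting the relative change next to the absolute count and the overall defect rate is what makes the scary-sounding 300% easy to put in perspective.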

4. Storytelling

If you were paying attention to the introduction, you may have anticipated this component. We are storytellers; to be a good data scientist, I believe it’s important to be able to tell an engaging and apt story. If data analysis is communication with data, then storytelling is communication with people, because in the end it’s about the people. We make business decisions for the people, create predictive models for the people, provide recommendations to the people. You need to be a good presenter; your models and analysis must be explainable to the people concerned. You are bridging the gap between data and people by solving their problems.

My Experience as a Data Scientist

I hope by now you have a better understanding of the importance and role of a data scientist. At the time of writing this blog, I have completed almost one year of my career as a data scientist. So, I thought it would be a good idea to share some snippets of my journey so far :)

I started my career in the first week of February as an intern. It was wonderful to work on new and interesting problems, all while learning from seasoned professionals. But a few weeks in, WFH became a necessity and my room became the office. 4 months in, I joined as a full-time employee and started working on a very interesting project for a major tech giant. This project was vastly different from the projects I worked on during my internship, and it took me a long time to get the hang of it. After finding my footing, I developed and experimented with new models, did a lot of ad-hoc analysis, coordinated with cross-office teams, worked on adding new features to the models, updated our models every month, debugged code, presented results to the client, worked on new algorithms, etc.

In a nutshell, that’s my journey so far in the field. It’s a rich experience that I’m grateful to have had, and I see great potential in the data science domain. Yes, automation will definitely impact this industry as well, but that’s a pretty distant future, I suppose. As I mentioned above, we act as a bridge between data and humans.

Well, if this blog inspired you to become a data scientist, I’d recommend you read this blog by Jonny Brooks, just for fun… If you’re still inspired to be a data scientist, do let me know and make sure to follow me on LinkedIn as I will soon be putting together some resources and ways you can become a data scientist yourself. Cheers!

Harjot Pahwa

AI Engineer | Integrating AI into businesses and everyday workflows | Mentor