Growing your career as a Data Scientist

Gopi Krishna Nuti
Published in Analytics Vidhya
4 min read · Aug 16, 2021

Nowadays, juniors keep asking me two standard questions. (I guess I am getting old.) If they are not in ML: “How do I become a Data Scientist?” If they already are: “How do I grow in my career as a Data Scientist?” I will leave the first question for another day.

If you are interested in the second question, here is my take on it.

ML, as a career, comprises three aspects.

  1. Maths Algorithms
  2. Domain Knowledge
  3. Software Engineering

If you want to enhance your value to your organization and make yourself indispensable, you should focus on at least one of the three.

Maths Algorithms

Keep learning the algorithms that are out there. Don’t limit yourself to the ones you learnt in your coursework (Udemy, college or elsewhere). What you learnt there is the bare minimum starting point. Keep pace with newer algorithms. Learning the mathematical foundations of the algorithms builds a strong knowledge base and improves your ability to do original research.

If you are not into original research but still want to grow in this area, I recommend you focus on Machine Learning interpretability. Deep Learning is a black box. Statistical models are equally hard for a non-mathematician (read: Product Managers, Sales teams etc.) to understand. Focusing on interpretability adds great value. Instead of saying “this is the model, take it or leave it”, you can actually explain the model and satisfy their curiosity. Add a few soft skills and this makes you the “go-to” person when discussing ML projects, status and road maps.

This is quite a vast field. Invest time in ML interpretability, Mathematics, Data augmentation and other such aspects. The rewards are rich, and this knowledge makes for a formidable weapon in your repertoire.
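To make the interpretability point a little more concrete, here is a minimal sketch of one common technique, permutation importance: shuffle one feature at a time and see how much the model’s error grows. The `toy_model` function, the feature names and the synthetic data are all hypothetical stand-ins chosen for illustration; a real project would plug in its own trained model and held-out data.

```python
import random


def toy_model(row):
    # Hypothetical stand-in for a trained model: it leans heavily on
    # feature 0, slightly on feature 1, and ignores feature 2.
    return 3.0 * row[0] + 0.5 * row[1]


def mean_squared_error(model, rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_features, seed=0):
    """Increase in error when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between feature j and the target
        shuffled_rows = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(mean_squared_error(model, shuffled_rows, targets) - baseline)
    return importances


# Synthetic data (an assumption for this sketch, not real project data).
data_rng = random.Random(42)
rows = [[data_rng.uniform(0, 1) for _ in range(3)] for _ in range(200)]
targets = [toy_model(r) for r in rows]

importances = permutation_importance(toy_model, rows, targets, n_features=3)
for name, score in zip(["feature_0", "feature_1", "feature_2"], importances):
    print(f"{name}: {score:.3f}")
```

A ranking like this is something you can show a Product Manager: “the model mostly looks at feature_0, and feature_2 does not matter at all” is a far easier conversation than a table of weights.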

Domain Knowledge

This is the second area, and it pays rich returns on the knowledge gained. A Data Scientist’s contribution to a project increases with domain knowledge. IT in general, and ML in particular, are fields where the technology fades into the background and the application of technology takes precedence. With many other technologies, users can be expected to change their behaviour to accommodate technical limitations to a greater degree than in IT. ML is even more demanding in this respect. So domain knowledge becomes indispensable for a successful Data Scientist. If you don’t understand how the end user uses the data, you cannot make a meaningful impact on the solution beyond randomly trying out one algorithm after another. In such a scenario, data ceases to be a treasure trove and becomes a mere sequence of bytes. Every stakeholder will come to you with their own take on the solution, and you will end up doing clerical work for them. Believe me, that is not fun.

When learning a domain, do not hesitate to get into details like labelling, or to go through the training programs meant for beginners in that domain. Be patient and sow the seed of learning. It will take time for your learning to bear fruit. But when it does, the fruit will be juicy and sweet.

Software Engineering

This one is the easiest to crack. Data Scientists often make the mistake of focusing on the algorithm without worrying about the bigger picture of turning it into a solution. I once knew a stakeholder who was obsessed with achieving the best possible accuracy for his classification problem. The resulting model was unwieldy and could not be turned into a deployable solution. And when a deployment was finally cobbled together, the Return on Investment (ROI) did not justify the cost. You don’t want to be that guy, do you?

So, be familiar with the engineering aspects of the solution as well. Understand how CI/CD/CT work. You need not be an expert. There are architects who will guide you on this. But if you put on a blank face in response to questions like “what is the latency?” or “how are you handling exceptions?”, you will cut a sorry figure. At the end of the day, when all is said and done, ML is only a tiny piece of your overall solution. It could be the most crucial piece. The entire solution could have been designed around the algorithm you built. But that still does not remove the need for you to be cognizant of good coding practices and of software engineering aspects in general.
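As a small illustration of what being ready for the latency and exception questions can look like, here is a minimal sketch that wraps a prediction call so that it reports how long the call took and does not crash on bad input. The `predict` function, its three-feature contract and the `fallback` parameter are hypothetical names invented for this example, not part of any particular library.

```python
import time


def predict(features):
    # Hypothetical stand-in for a real model's inference call.
    if len(features) != 3:
        raise ValueError(f"expected 3 features, got {len(features)}")
    return sum(features) / len(features)


def timed_predict(features, fallback=None):
    """Return (prediction, latency_ms, error) without letting bad input crash the caller."""
    start = time.perf_counter()
    try:
        prediction, error = predict(features), None
    except (ValueError, TypeError) as exc:
        # Report the problem to the caller instead of failing silently or crashing.
        prediction, error = fallback, str(exc)
    latency_ms = (time.perf_counter() - start) * 1000.0
    return prediction, latency_ms, error


prediction, latency_ms, error = timed_predict([1.0, 2.0, 3.0])
print(f"prediction={prediction}, latency={latency_ms:.3f} ms, error={error}")

prediction, latency_ms, error = timed_predict([1.0])
print(f"prediction={prediction}, latency={latency_ms:.3f} ms, error={error}")
```

Even a wrapper this simple gives you a number for the latency question and a clear story for what happens when the input is malformed.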

Conclusion

Each of these three areas is important for a Data Scientist, for its own reasons. I generally recommend being good at any two of the three, but you should be good at at least one of them. If you don’t focus on any of them, you will end up being a pseudo-data scientist and will be replaced by AutoML or drag-n-drop tools that are maturing real fast. The next fresher out of college can and will replace you. If you don’t want that to happen, then roll up your sleeves and get your hands dirty in (at least) one of these three areas.

I will quote Robert Jordan’s Wheel of Time here. “Being Aes Sedai means you are truly ready to learn”. Being a Data Scientist means you are truly ready to learn. Embark on that journey with gusto. If you come across other ways of increasing your worth as a Data Scientist, please do share them at ngopikrishna.public@gmail.com
