My Data Science and Machine Learning Journey at 42

Saurabh Saha
7 min read · Aug 1, 2022


When I decided to enrol for an online master’s at Purdue, I had no idea what it would take. But before that, let me give a brief account of who I am. I am a 42-year-old dude who doesn’t look that age but doesn’t look young either. I am pretty much what Britney would have tried to explain if she had ever sung a male version of “I’m Not a Girl, Not Yet a Woman”. Anyway, before you start thinking about Britney, let me take you straight back to what I have done. I believe my LinkedIn does a pretty decent job of explaining who I am. I mean, I spent months making it look decent, so I’ll just provide a brief summary here. I have spent close to two decades in the technology industry in various roles around tech and product. I have also done a bunch of inorganic activities which most people wouldn’t ideally do, but then what’s life without a risk?

The reason I enrolled for a course in AI was purely that I was curious. There you go. I spilled the last dark secret I had in my closet. Well, will Schrödinger’s cat now die, stay alive, or do both?

To give a better explanation: although I have been in technology for a long time, I never really had the chance to explore what AI is. I have read boatloads of articles, blogs and books on AI and machine learning, and grew up on science fiction like ‘Doctor Who’, ‘Fringe’, ‘Her’ and ‘Iron Man’, yet I felt I did not really know what AI is beyond what everyone knows. I didn’t want to be the guy in the stands waving at every technological innovation in AI instead of being at centre stage. I wanted to understand what machine learning is, what deep learning is, how one sifts through large datasets and how AI can be used to solve problems. Reading about OpenAI’s GPT-3 and GPT-4 writing text and code on their own felt like reading about a miraculous feat, and for a curious bloke like me it would be super interesting to understand the nuts and bolts of how AlphaGo beat Lee Sedol at a game that he had mastered over several decades. It’s hard to believe that at an age where people are planning their retirement or doing executive MBAs to grab the next big leadership role in a tech company, I was just in the middle of satiating my curiosity.

Once I was confident I wanted to do it, in spite of having left the whole business of writing code behind for close to 7 years (which practically makes you a freshman), I went ahead and enrolled. I enrolled, reluctantly, sometime in July 2021 for a one-year degree at Purdue’s online PGP program at Simplilearn. My classes started immediately. The program was divided into 7 courses that one needed to take to crack it. I got through the first and second courses easily, since I had spent a decade as a programmer and they were basic programming courses.

Then we moved on to Data Science with Python. This was a difficult one, at least from my standpoint, because it relied on a lot of statistical concepts that I had long forgotten. On top of that, it relied on libraries like NumPy and pandas that I had never even heard of. There were 8 or 9 such libraries, with thousands of functions between them. I would attend classes and try to understand as much as I could, but believe me, it was a complete struggle. So the first time around, in spite of attending all the weekend classes and trying to do the assignments by myself, I was struggling to understand what on Earth was happening. Running pandas or NumPy over a dataset is the easy part. The difficult part in data science is figuring out whether you’re doing the right thing. You could come up with an answer, but you wouldn’t really know if the answer was correct. More so, the kind of confidence a programmer has in a technology he knows well was completely missing from the picture. I also felt that the trainer was more interested in giving folks code to execute than in actually explaining how to solve a specific category of problem. Anyway, it didn’t really matter, because after the classes ended I just couldn’t complete the assignment.
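To make that point about not knowing whether an answer is right a little more concrete, here is a minimal sketch. The tiny dataset and its column names are made up purely for illustration and are not from the course. It gets an answer from pandas in one line and then cross-checks it by an independent route in NumPy, which is roughly the habit I was missing.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset, invented only for this illustration.
df = pd.DataFrame({
    "city": ["Pune", "Pune", "Delhi", "Delhi", "Delhi"],
    "sales": [100.0, 200.0, 50.0, np.nan, 150.0],
})

# The "easy part": one line of pandas produces an answer.
mean_by_city = df.groupby("city")["sales"].mean()

# The harder part: checking that answer by a second, independent route.
# pandas skips missing values when averaging, so the NumPy check has to
# skip them too (nanmean); a plain np.mean would return nan instead.
delhi_sales = df.loc[df["city"] == "Delhi", "sales"].to_numpy()
manual_delhi_mean = np.nanmean(delhi_sales)
naive_delhi_mean = np.mean(delhi_sales)

# The two routes agree only once missing values are handled the same way.
assert np.isclose(mean_by_city["Delhi"], manual_delhi_mean)
print(mean_by_city)
print("naive NumPy mean for Delhi:", naive_delhi_mean)
```

The missing value is the interesting bit here: both libraries return a number-shaped answer without complaint, and only the cross-check shows that they treat the gap differently.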

Then I opted for another session of the same course. Simplilearn was kind enough to provide me with one, and this too wasn’t much different: the trainer would teach something and I wouldn’t be able to solve the problems. I tried every trick in my book to solve data science problems, but I couldn’t. Just like before, a month passed, I still couldn’t complete the assessment problems, and I gave up. I was exactly where I had been earlier. I was a little sad that in spite of having been a programmer for over a decade, I wasn’t able to understand what was going on. It kinda breaks you, and I was pretty much in that zone. Every day I had thoughts that I was a dud. It seemed my technology career was some kind of sad joke. If I could not crack the DS assessments twice in a row, there had to be something wrong with me. Had I grown dumb in 7 years? I mean, how could a guy who worked as a programmer for 10 years not be able to crack simple DS problems? The next few weeks went by in introspection and contemplation. I read a bunch of interviews with people who had struggled their way through data science until they built enough confidence to crack problems. What was missing? I mean, I was attending classes and following everything the instructors were telling us.

That’s when I decided to take another shot at it. I rebooked the data science course for the third time. Now I would do the problems in class along with the instructors and ask any question I had about the concept at hand, be it statistics, machine learning or plain Python. Once the instructor gave me the answers, I would implement the solutions immediately. That helped clear a few doubts in my mind. I started seeking out videos, blogs, articles and courses on DataCamp or Sharp Sight. I also started working through problems on Kaggle and sharing my code publicly for review, and people would come back and tell me a bunch of exciting things. I also found that Stack Overflow is a paradise for developers. Any time you have a doubt and post your query, people are kind enough to help you out. I also realised a few things about how one can understand data science or machine learning (which was incidentally my next course).

The thing is, Artificial Intelligence actually has two components. The first is a bunch of algorithms that help systems learn things and then be intelligent enough to make decisions or predictions on their own. The second is extremely large datasets, which might be unstructured and have to be transformed into a format that the said algorithms can accept. Most of the work in AI is data wrangling, and once that is done, one can feed the datasets to the algorithms of machine learning, deep learning or neural networks. I am yet to take courses on deep learning and neural networks, so forgive me if some of my statements sound incorrect. Now, apart from practising problems using pandas, NumPy, scikit-learn, Matplotlib or any other library, one needs a decent understanding of these libraries. I took this course by Jose Portilla, which helped me better understand these libraries and the functions and objects they contain. Along with it, it is also important to understand some of the advanced concepts in Python, for which I took this course, which helped me understand how I can use some of the Python statements for data wrangling or data visualisation. Once I was confident enough, I moved on to using these libraries and programming constructs to solve problems. Today I wouldn’t say I am a pro, but at least I have a fair understanding of what data science contains and how the field of machine learning is changing business forever.
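The two components described above, wrangling first and then an algorithm, can be sketched in a few lines. This is only an illustration under my own assumptions: the data is invented, the column names are hypothetical, and a simple linear regression from scikit-learn stands in for whatever algorithm a real problem would call for.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical raw data, invented for illustration: numbers stored as
# text and one missing entry, which is typical of an unwrangled dataset.
raw = pd.DataFrame({
    "experience_years": ["1", "2", "3", None, "5"],
    "salary": [30.0, 40.0, 50.0, 60.0, 70.0],
})

# Step 1: data wrangling. Convert the text column to numbers and drop
# the row whose feature value is missing.
clean = raw.copy()
clean["experience_years"] = pd.to_numeric(clean["experience_years"])
clean = clean.dropna(subset=["experience_years"])

# Step 2: hand the cleaned data to a learning algorithm.
model = LinearRegression()
model.fit(clean[["experience_years"]], clean["salary"])

# Step 3: use the fitted model to predict for an unseen input.
new_input = pd.DataFrame({"experience_years": [4.0]})
prediction = model.predict(new_input)[0]
print(f"rows kept after wrangling: {len(clean)}")
print(f"predicted salary for 4 years: {prediction:.1f}")
```

Even in this toy version, two of the three steps are about getting the data into shape, which matches the feeling that most of the effort goes into wrangling.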

I am three courses down and about to take the fourth one, on Deep Learning using TensorFlow and Keras. I am not sure how I will perform there, but I am optimistic I will get somewhere now that I have a fair understanding of the fundamentals. From my own experience, though, I can sense that a lot of people my age would go through massive frustration in the beginning, because there is a lot of statistics, linear algebra and programming that one needs to understand to be able to solve customer problems. BTW, I believe the problems I solved do not have the same level of complexity as real-world problems, so I am yet to learn how to tackle those. Frankly, most of these algorithms, when run on massive datasets, need multicore processors and frameworks like Spark, which I have no idea about. I did attend one lecture on Hadoop, but it wasn’t taught in depth as a later course would cover it. So I have probably covered some 30% of AI by now.

More than anything, what I learnt from this experience is that you should pursue the things you are scared of pursuing, because doing so expands your world view and your knowledge base by several notches. ‘Never give up’ is a scientific method to overcome the path of least resistance and attain your own Nirvana. I hope my learnings so far help some folks walking the same path, and feel free to connect if you have any queries. Would love to help in any way possible. Cheers :)

Saurabh Saha

Thinker, Philosopher, Technologist, Product Leader and still a human :)