How I went from totally uninterested to passionate about data science.
OK, so where I come from, people don’t choose what they want to be until they are 18. This always caused me difficulties: each year, all I wanted was to finish the school year and enjoy my 3 months of summer. I didn’t figure out what I wanted to be until I was 21, and this is the story of how I got there.
This story is a bit long because it goes back to my time in middle school. So here is the short version of phase 1:
I hated math; I couldn’t solve anything in it. A good tutor helped me pass middle school with flying colors, and he told me I had potential in math. This changed my life, because at the time I was more interested in history than in math, and that mattered a lot back then. I still didn’t consider myself good enough, but at least I was now confident I could do the math.
This is where we get to phase 2 of my journey. I passed middle school and high school with not-so-flying colors. At this point, of course, engineering was the best place to go, right? This is where the second life-changing thing happened: Computer Science was never an option in my family until my brother landed his first job in 2013. He was the first to go into CS in a family of physicians and engineers.
I had my fears about engineering because of the excessive math and physics, but CS was another thing. It had less to do with math and everything to do with patterns and logic, things that, while abstract, seemed more real than any physics drawing ever had. And that is what I went to college to study.
Throughout the next 2 years of my life in college, submerged in algorithms, data structures, objects and binaries, I still couldn’t figure out what I wanted. My college may have been excellent on the theoretical side of things, but it was never good on the business side or at practical work, meaning that I had to discover everything usable in the job market on my own, which was hard without proper mentorship. I managed to get a grip on these, and I now know some relevant Android and a lot of web development, but my skills so far had not given me real motivation. Sure, I knew how to use these tools, but there was still no passion.
This is when I watched this little gem for the first time ever.
CGP Grey was my first ever YouTube subscription, and it is probably the best decision I have made since going online. While I was going through the college’s AI course and learning that the Naive bla bla is used for classification, I thought it was just another strange, useless theoretical algorithm I would never use. That was until I saw CGP Grey strip away the linear algebra and simply explain what classification means and how bots use it to improve the performance of YouTube’s algorithm. This video caught my attention.
I never realized how casually this concept of Machine Learning had come to be used until I attended the graduation project presentations for the year 2018, where almost everyone used Machine Learning in some capacity or another: 3 projects detecting cancer, one making an excellent OCR system, another discovering brain wave patterns. A concept that I had perceived as complicated and meticulous turned out to be casually used by everyone to get a passing grade. THIS BLEW MY MIND!!!!!
So 3 months later it was my turn to begin my graduation project. During these 3 months I turned to Udacity for help, and following the trend, I went to study Machine Learning with Sebastian Thrun.
I took on this effort as just a way to pass the graduation project. I did find machine learning interesting, and while the statistics and math behind it are indeed difficult, conceptually it was a breeze.
But when I searched for graduation project (GP) ideas, all the professors refused to supervise them. Last year’s ML debacle had made them stricter about how a project should work: no more simplistic workflow pipelines. Whatever you make has to be extremely creative or complicated, and not a simple so-called classifier. So the Android dermatology detector was no longer a viable idea.
I would like to hold a moment of silence for the craziest idea I had:
— Sentiment analysis through voice pattern recognition using image processing.
Don’t ask how. It was too crazy to explain.
When I explained this idea to my eventual supervisor, she told me to ignore image processing to begin with and maybe look into the ideas presented by the supervisors themselves. And so I chose her idea:
Machine Learning for detection of adverse drug reactions through social media.
This project was my real gateway to data science. I had never heard of the term until then, and afterward I began to receive ads like this.
Searching for a data set for this project must have triggered the YouTube algorithm for ads. It was indeed a spot-on conclusion by YouTube.
The reason I was never motivated for anything until this point is that I never had a motivating project that made me work on something. One thing I was surprised to find everybody loved was Java, because of two big projects we made early in college. Even I was more comfortable using it than anything else. But “Java” is not a role, a job title or even a science; it is just a tool.
My GP was my Java for Data Science. I had to crawl data, clean it and understand it, and find which statistical points matter and which don’t. I spent about a month just making sure the dataset was usable and understandable. I created ways to extract the data I needed, such as age and gender, from raw data. All of that took me around two months. AND SO FAR I HADN’T EVEN TOUCHED MACHINE LEARNING. The ML part only took me two weeks to finish. It was at this point that I realized that machine learning was only one small part of a bigger picture: Data Science.
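For illustration only, here is a minimal sketch of what pulling age and gender out of raw post text could look like. The regular expression, the keyword lists and the function name `extract_age_gender` are all assumptions made for this example, not the actual code from the project, and a real pipeline would need far more careful rules.

```python
import re

# Hypothetical pattern (an assumption for this sketch): matches phrases
# such as "34 year old", "34 years old", "52yo" or "52 y/o".
AGE_PATTERN = re.compile(
    r"\b(\d{1,3})\s*(?:years?\s*old|yo|y/o)\b", re.IGNORECASE
)

# Hypothetical keyword lists (also an assumption); real posts are messier.
GENDER_WORDS = {
    "female": {"female", "woman", "girl", "she", "her"},
    "male": {"male", "man", "boy", "he", "his"},
}


def extract_age_gender(text):
    """Guess (age, gender) from free text; either value may be None."""
    age_match = AGE_PATTERN.search(text)
    age = int(age_match.group(1)) if age_match else None
    # Discard implausible ages rather than trusting the raw number.
    if age is not None and not (0 < age < 120):
        age = None

    # Compare whole lowercase words so "woman" does not count as "man".
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    matches = [g for g, words in GENDER_WORDS.items() if tokens & words]
    # Only report a gender when exactly one group matched.
    gender = matches[0] if len(matches) == 1 else None
    return age, gender


if __name__ == "__main__":
    # Made-up sample posts, not real data.
    samples = [
        "I am a 34 year old woman and this drug gave me headaches",
        "My husband, 52yo, stopped taking it",
        "No details here",
    ]
    for post in samples:
        print(extract_age_gender(post), "<-", post)
```

Even a toy version like this shows why the cleaning stage eats so much time: every rule has edge cases, and posts that mention more than one person simply come back as unknown.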
I got an A+ in this project.
At this point I became convinced that data science is the way to go. It is deeply related to other interesting fields like cloud computing and big data. It is used in both industry and academia, so I can work in either if I wish. The search to understand and properly use data was both the most challenging and the most fun thing I have ever done in my life. And even better, it is a small lake next to AI, machine learning and deep learning, yet so integral to how they work that I cannot properly learn any of them without properly understanding data science.
And this was the problem.
When I began to seriously learn data science, moving from my naive pre-graduate approach to a more professional one, I couldn’t find proper guides. I was just told: “Learn math, statistics, machine learning”. I am a very focused learner, and I don’t like to move from one course given by some university to another course given by another university, to a third course given by some big company like IBM or Amazon. I knew it would be a very hard track to follow, and that I would have to unfocus my learning method, look into content that isn’t organized into a program, and basically just collect puzzle pieces from here and there. That is, until I found the Bertelsmann Scholarship.
I remember writing some passionate speech about why I should be considered for the scholarship. I never thought I would actually be accepted, but I was. At first, I was very excited and I wanted to finish it quickly. But life happened. I got stuck in other matters.
I am now one month away from passing phase 1 and I am only 40% into the material (65% actually, although I skipped through most of SQL, so I don’t count it). Wanna know the real challenge? It will be trying to finish, in less than a month, a course that should take 4 months.
And this is where my story stops. It is not finished; it is still one month away from one more chapter in my life. You think I can pass to phase 2? Maybe. I will only find out when I actually finish the content, and I hope that life won’t stop me again. The passion I had 3 months ago has stalled but never burned out, and I will finish what I have started.
Have a great day
Update: I passed :D