Winter 2018 @ Aspiring Minds’ Research Lab
I finished my 7th-semester college exams by the end of November 2018 and was looking to work on something interesting. I had an internship offer from Aspiring Minds’ Research Lab in Gurugram for the role of Intern, Research and Development, which I had applied for a couple of months earlier. I can easily say that it was the best choice: not only did I get to work on Machine Learning (my college-long dream :) ), but I also met some of the brightest minds there. I chose to intern at Aspiring Minds’ Research Lab because I had heard quite a lot about their Research & Development (R&D) team. I went to Gurugram on 1st December, as my two-month internship was to start on 3rd December (3rd December to 2nd February).
Aspiring Minds is a global leader in job skills assessments and credentialing. They envision a merit-driven talent ecosystem enabled by efficient job-skills matching and by reliable, intelligent assessments. Powered by machine learning, AI, psychometry and statistics, their state-of-the-art assessment tools are used by companies across a wide variety of industries to recruit the right people, develop requisite skills benchmarks, and assess workforce health.
Aspiring Minds’ Research Lab is truly a heaven for people passionate about Machine Learning. Not only have they created a stir in the Research fraternity, but Research Papers published by them garner citations from all around the globe.
Aspiring Minds has been doing Machine Learning and Artificial Intelligence for 11 years now, long before it came into vogue. They solved original problems using AI rather than copying the West, achieving several world firsts. Their assessment solution AUTOMATA, launched in 2013, is the world’s first ML-based coding assessment. In 2016, they also created the world’s first automated motor skills test. Apart from this, they launched ml-india.org, the first effort to audit India’s ML activity.
The first day was spent mostly on paperwork, followed by an introduction to the members of the R&D team. There, I met my mentor, Abhishek Unnam (2 years my senior from DTU :) ), and Rohit Takhar, who is the senior-most member of R&D at Aspiring Minds’ Research Lab. I also met my other teammates: Manish, Abhishek and Aditya. It was the first time that I was going to work on a major project involving Machine Learning. I was briefed about the tools that the organisation used (Jupyter Notebook being one of the many indispensable tools :) ) and was given some background on the project that I was going to be involved in. Needless to say, I was excited to get my hands dirty :)
My work basically revolved around Aspiring Minds’ flagship product: WriteX.
WriteX helps in assessing the overall quality of a candidate’s written sample. The sample could be an essay on a topic that the candidate has been asked to write about, and WriteX automatically grades the candidate on the quality of the write-up.
It takes into consideration all the major factors that contribute to a great write-up, such as grammar, usage, mechanics, style, organization and development. WriteX is capable of automatically identifying and reporting multiple grammar errors, as well as typographical and spelling errors. It uses NLP to evaluate the content score, structure and grammar of the text written by the candidate.
There are separate tests for various job roles, and one of these tests evaluates an e-mail written by the candidate. Up until then, the evaluation of an e-mail was based on the correctness of the Subject, Heading, Salutation and Valediction. WriteX was used alongside to evaluate the content quality of the e-mail.
But our assessment was missing a key factor: the flow of semantics was not being considered. Let me explain exactly what I mean by the term “flow of semantics”. Imagine a situation where the candidate needs to write an e-mail pitching an idea. In such a scenario, the candidate should clearly refrain from using sentences that express apologies, sentences that voice a complaint, and so on. At the same time, sentences expressing apologies definitely belong in a scenario where a customer needs to be pacified, and sentences voicing a complaint belong in an e-mail complaining about a faulty product. Thus, a better e-mail is one that uses the right kind of sentences in the appropriate places.
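To make the idea concrete, here is a toy sketch in Python. It is only an illustration of the concept and not the model I built: the intent labels, keyword lists, scenario names and the `flow_report` helper are all made up for this example, and a real system would use a learned classifier rather than keyword matching.

```python
# Toy illustration only: the keyword lists, intent labels and scenario rules
# below are hypothetical, not the model built during the internship.
INTENT_KEYWORDS = {
    "apology": ("sorry", "apologize", "apologise", "regret"),
    "complaint": ("faulty", "disappointed", "unacceptable", "defective"),
    "proposal": ("propose", "suggest", "idea", "recommend"),
}

# For each e-mail scenario: intents that fit, and intents that should be avoided.
SCENARIOS = {
    "pitch_idea": {"expected": {"proposal"}, "discouraged": {"apology", "complaint"}},
    "pacify_customer": {"expected": {"apology"}, "discouraged": {"complaint"}},
    "report_faulty_product": {"expected": {"complaint"}, "discouraged": set()},
}


def sentence_intents(sentence):
    """Return the set of intents whose keywords appear in the sentence."""
    text = sentence.lower()
    return {intent for intent, words in INTENT_KEYWORDS.items()
            if any(word in text for word in words)}


def flow_report(sentences, scenario):
    """Count sentences that carry expected or discouraged intents for a scenario."""
    rules = SCENARIOS[scenario]
    report = {"expected": 0, "discouraged": 0}
    for sentence in sentences:
        intents = sentence_intents(sentence)
        if intents & rules["expected"]:
            report["expected"] += 1
        if intents & rules["discouraged"]:
            report["discouraged"] += 1
    return report


email = [
    "I would like to propose a new onboarding workflow.",
    "I am sorry for writing at such short notice.",
    "Please let me know a convenient time to discuss it.",
]
print(flow_report(email, "pitch_idea"))
```

The same apologetic sentence counts against the e-mail in the pitch scenario but in its favour in the `pacify_customer` scenario, which is exactly the point: whether a sentence is appropriate depends on the context it is used in.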
My work was to design the entire system using Machine Learning (i.e., to create an ML model) for evaluating the flow of semantics in an e-mail as well. We named it “evaluation of flow of semantics in an e-mail”. Without going deeper into the nitty-gritty of what I did and how I did it (I will leave that for another post), I can gladly say that my model is now being used in tandem with WriteX to evaluate e-mails.
I used several models and techniques for my work: Word2Vec, Bag-of-Words, TF-IDF, n-grams and a Normalized Count Vectorizer, to name a few. I had to carefully clean and analyze the dataset and do a good deal of R&D to build this system. All thanks to my mentors, Abhishek Unnam and Rohit Takhar, for helping me design a system that evaluates candidates better.
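For readers new to these terms, here is a minimal pure-Python sketch of three of them: n-grams, Bag-of-Words counts and TF-IDF weights. It is not the code from the project. The function names and sample sentences are made up for illustration, tokenization is a plain whitespace split, and the TF-IDF formula is one common textbook variant (libraries often add smoothing).

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (joined with spaces) from a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(text, max_n=2):
    """Count unigrams up to max_n-grams in a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    return counts


def tf_idf(documents, max_n=2):
    """Return one {term: weight} dict per document.

    Uses tf = count / total terms in the document and idf = log(N / df),
    one common textbook variant; libraries often add smoothing.
    """
    bags = [bag_of_words(doc, max_n) for doc in documents]
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())
    n_docs = len(documents)
    weights = []
    for bag in bags:
        total = sum(bag.values())
        weights.append({term: (count / total) * math.log(n_docs / doc_freq[term])
                        for term, count in bag.items()})
    return weights


# Made-up sample sentences, not data from the project.
docs = [
    "we are sorry for the delay",
    "we propose a new plan",
    "the delay is unacceptable",
]
weights = tf_idf(docs)
# Terms that occur in every document would get weight 0; rarer terms score higher.
top_terms = sorted(weights[1], key=weights[1].get, reverse=True)[:3]
print(top_terms)
```

The `count / total` part on its own is a normalized count; multiplying by the `idf` factor is what pushes down words that appear in most documents.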
What did I get to learn during these 2 months?
The learning was more in terms of the skills that I acquired and the techniques that I learnt, so I imbibed a great deal each day, during each part of the process, be it Data pre-processing, Feature Engineering or Modelling.
Let me say upfront that I fully realize that Machine Learning, today, is an interdisciplinary field within the Mathematical and Computational Sciences. Nevertheless, I came to the conclusion that Machine Learning, for me, is more Mathematics (in general) and Statistics (in particular) than Computer Science. In layman’s terms, our ultimate objective is to map an input (x) to an output (y) using some function (f), and our entire work revolves around finding such a function f. [f(x) -> y]
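As a tiny worked example of "finding f", the sketch below fits a straight line to a handful of (x, y) pairs using ordinary least squares. The data points are made up, and a straight line is just the simplest possible choice of f; the `fit_line` name is mine, picked for illustration.

```python
def fit_line(xs, ys):
    """Fit f(x) = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# Made-up training pairs that roughly follow y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

f = fit_line(xs, ys)
print(f(5.0))  # prediction for an unseen input
```

Everything here is statistics (means, variance, covariance), which is a small taste of why the field felt more like Mathematics than Computer Science to me.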
Machine Learning originated in the 1950s within statistics, which is a branch of mathematics, and many of the most commonly used algorithms extend statistical models. However, the field has since grown to include more Computer Science/AI algorithms and makes more extensive use of CS. Graph-based algorithms and Reinforcement Learning, which builds on Dynamic Programming (as first described by Bellman), are a few examples of CS algorithms in active use.
Now, let me discuss some key takeaways from the project:
Problems based on Text Processing tend to be much more complicated, as was evident from our dataset.
There is no one-algorithm-fits-all solution, and the best one can do is to try to cover as many edge cases as possible.
I learnt the A-Z of Text Processing. I am pretty sure that after this internship I will be able to take up any project on Natural Language Processing, spam review detection for example, and carry it out without any sort of hand-holding. The possibilities are endless.
Code modularity is a property that must be introduced into a project as early as possible. Things tend to get cluttered and quickly get out of control if such software development practices are not followed.
What other things did I realize?
PERSEVERANCE: Perhaps the most important attribute of a successful researcher is perseverance. Several times I faced a situation where it looked like the performance had saturated and the model could not be improved any further, but with proper guidance, I learnt that one needs to persevere and keep researching and experimenting to eventually sort things out and improve the ML model.
My college life is about to END :( I need to enjoy it as much as I can before I enter the industry.
My Review of Aspiring Minds
As the saying often attributed to Confucius goes, “If you are the smartest person in the room, then you are in the wrong room”. Being the only intern at Aspiring Minds during those 2 months, I was surrounded by people who had been working in the industry for quite some time. As a result, everyone was ahead of me, both in experience and in industry knowledge. Needless to say, I tried my best to imbibe as much of it as I could. Parv and Sarthak taught me a lot about the software development practices that ought to be followed, how to dive into a large codebase, and so on. During the course of my internship, I faced several challenges while handling the datasets, and Abhishek, Aditya and Manish were always ready to guide me and explain the intricacies of pandas DataFrames. These are a few of many such instances that I can recollect off the top of my head.
The office atmosphere was perfect, with the right kind of work-life balance. It felt as if I was part of a big family rather than an employee of a company. Every event, be it a birthday, a work anniversary or someone’s last working day, was celebrated with a cake-cutting and a get-together.
I definitely feel lucky that I got a chance to work with such an awesome company.
It was a great experience interning at Aspiring Minds’ Research Lab. I highly recommend it to anyone looking to work with a great team. Thanks to all the people I met there (Rohit Takhar, Abhishek Unnam, Manish Rathi, Abhishek Singhania, Aditya Gupta, Parv Jain, Sarthak Garg), I learnt a hell of a lot in that short internship. I got a look into the life of a researcher and the kind of perseverance required to carry out research, all thanks to the team who helped me climb those steep hills throughout the journey. And last but not least, thanks to Toshika Chauhan, Abhishek Unnam, Arpit Jain, Abhishek Pandit and all my teammates for the last-minute farewell!
PS: Rohit had to leave for Honolulu, Hawaii to present his research paper at the AAAI-19 conference, and was missed at my farewell party :(