Albertus Kelvin, Turning a High School Dream into Reality

Published in Nurture.AI, Feb 22, 2018

Albertus implemented two NIPS 2017 papers: “An Automated System for Essay Scoring of Online Exams in Arabic based on Stemming Techniques and Levenshtein Edit Operations” and “Plagiarism Detection on Electronic Text based Assignments using Vector Space Model”. He is a winner of the Global NIPS Paper Implementation Challenge. See his code implementations [1][2].

Tell us a little about yourself?

I’m an undergraduate student majoring in Computer Science at Bandung Institute of Technology (ITB), Indonesia. I currently work as an assistant at the Graphics and Artificial Intelligence Laboratory, ITB. My research interests center on Natural Language Processing (NLP). I’m also interested in combining NLP with other fields, such as Program Synthesis. I had the opportunity to work on this combination during my summer research internship at the National Institute of Technology, Gifu College, Japan.

In my spare time, I love reading all kinds of books, but I always have a place for mathematics and algorithm books, since they help enhance my abstraction skills and creativity. I also love learning about new technologies from the internet. Mostly, I learn by reading research papers related to my interests. I find this activity useful because it gives me insights into methodologies I had never thought of before. I then apply the methodology by implementing the paper. Besides science and technology, I am also interested in education, the environment, health, and human rights.

How did you get started in AI research?

I started getting really interested in NLP when I was in high school. While participating in a programming contest, I thought it would be cool if I could build a system that receives any programming problem as input and generates the source code as output. It then became my biggest dream, and I kept trying to find more information about the engineering behind such a system. When the day came to choose a topic for my research internship, I jumped on Automatic Programming quite naturally. An Automatic Programming system generates relevant source code in a given programming language based on the user’s intention. In other words, programmers only need to provide the problem specification in natural language, and the system generates the desired source code.

I started searching for relevant research papers on this particular subject. I found a paper discussing the feasibility of an automatic end-to-end source code generation system. I learned that there were several main problems in building such a system, such as how the system understands the user’s intention in order to generate the desired source code, and how to ensure that the user receives correct source code representing the solution to the problem statement. The goal for my internship was to simplify things: I developed an automatic programming system by combining NLP and Deep Learning. More precisely, I used several NLP techniques to understand the user’s intention, and implemented Long Short-Term Memory (LSTM) networks and a few of their variants to generate the source code.

What are you most passionate about in AI?

Related to my research work on the combination of NLP and Automatic Programming, my primary concern lies in improving programmers’ productivity. In the world of software engineering, an automatic programming system has been a dream for many, as its automation would allow programmers to focus on more proactive, creative and strategic activities.

Automatic programming is widely believed to be a necessary component of any intelligent system. NLP methods would enable us humans to interact with machines (robots, computers, drones, etc.) in much the same way we do with our human friends. To have such natural interaction, machines would need to understand the human intentions that reside in written or spoken statements, and to do so, machines would need to be taught to understand the many aspects of human language. For instance, machines need to resolve the ambiguities contained in human language in order to get the right meaning. One interesting solution is the question-answering technique, whereby machines ask humans further questions in order to find the missing information.

All in all, I am passionate about solving NLP problems, as well as seeking powerful combinations of NLP and other fields.

Can you give us an overview of your implementation in the Challenge?

The first paper, “Plagiarism Detection on Electronic Text based Assignments using Vector Space Model”, was about developing an effective plagiarism detection tool for text-based assignments by comparing unigram, bigram, and trigram vector space models under the cosine and Jaccard similarity measures. To implement the first paper, I used Python (v. 2.7), scikit-learn, and NLTK.
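To make the comparison described above concrete, here is a minimal sketch of n-gram vector space similarity. It is an illustration under stated assumptions, not Albertus’s implementation: it uses only the Python standard library instead of scikit-learn and NLTK, tokenises by whitespace, and the function names (`ngrams`, `cosine_similarity`, `jaccard_similarity`) are hypothetical.

```python
import math
from collections import Counter


def ngrams(text, n):
    """Return the word n-grams (as tuples) of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def cosine_similarity(a, b, n=1):
    """Cosine of the angle between the n-gram count vectors of two texts."""
    va, vb = Counter(ngrams(a, n)), Counter(ngrams(b, n))
    dot = sum(count * vb[gram] for gram, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a text too short to contain any n-gram
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b, n=1):
    """Shared distinct n-grams divided by all distinct n-grams of the two texts."""
    sa, sb = set(ngrams(a, n)), set(ngrams(b, n))
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    original = "the cat sat on the mat"
    submission = "the cat sat on a mat"
    for n in (1, 2, 3):  # unigram, bigram, trigram
        print(n, cosine_similarity(original, submission, n),
              jaccard_similarity(original, submission, n))
```

Longer n-grams are stricter: two texts that share most of their words can still share few trigrams, which is why comparing the three models side by side is informative.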

The second paper, “An Automated System for Essay Scoring of Online Exams in Arabic based on Stemming Techniques and Levenshtein Edit Operations”, was about developing an automated essay scoring system for online exams in Arabic, based on stemming techniques and Levenshtein edit operations. I implemented it in Python (v. 2.7).
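For readers unfamiliar with Levenshtein edit operations, the sketch below computes the edit distance between two strings with the standard dynamic-programming recurrence. It is only an illustration: the length-normalised score in `normalized_similarity` is an assumption made for this example and is not the similarity formula from the paper, and the Arabic stemming step is omitted.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def normalized_similarity(a, b):
    """Hypothetical score in [0, 1]: 1 minus the distance over the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
    print(normalized_similarity("kitten", "sitting"))
```

Keeping only the previous row of the table reduces memory use from the product of the two lengths to the length of the second string.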

Were there any challenges while implementing your selected paper?

Fortunately, there was no specific challenge while implementing the first research paper on plagiarism detection. The methodology was clear and provided the necessary formulas and definitions. I could easily write the code based on the sequence of steps listed in the methodology.

There was a little confusion while implementing the second paper on automated essay scoring. It concerned the formula given for calculating the similarity score between two texts. The original formula kept giving zero as the final result. Based on my analysis, the problem lay in the numerator of the similarity score formula: the resulting score always fell into one condition that forced the final result to zero (let’s call it the zero condition). To overcome this issue, I amended the similarity score formula so that the resulting value would not always fall into the zero condition. With the similarity score back on the right track, the final results looked plausible.

What’s next for you in your work?

There are a lot of interesting applications powered by NLP. Since I’m still curious about the future offered by Automatic Programming, I would like to explore this subject further. My next possible target would be to develop a system that can recognise a sequence of steps (an algorithm) and generate the corresponding source code. The algorithm would be stated through speech, so the system would also use Automatic Speech Recognition (ASR) technology to accomplish the task.

Last but not least, I would like to venture into the world of Computer Vision. I would like to develop a system that receives a design sketch (in the form of an image) and generates the corresponding HTML code. In fact, it’s very similar to the automated front-end development system developed by Uizard Technologies. By leveraging the capabilities of computer vision, designers can focus more on creating beautiful websites without worrying about programming the front-end code.

Albertus is a Computer Science student at Bandung Institute of Technology, Indonesia. To keep up to date with Albertus, check out his GitHub and LinkedIn.

This is a feature on a winner of the Global NIPS Paper Implementation Challenge. You can read other winners’ features here. If you enjoyed this series and would like to see more content like this, drop us a comment or an email at info@nurture.ai