From Ballerina to AI Researcher: Part I

The beginning of my journey as an OpenAI Scholar

Sophia Aryan
BuzzRobot
4 min read · Jun 9, 2018


Last year, I published the article “From Ballerina to AI writer,” in which I described how I embraced the technical side of AI without a technical background. Out of love and passion for AI, I educated myself, built a neural net classifier, and did projects in deep RL.

Recently, I became a participant in the OpenAI Scholars program (OpenAI is a non-profit that gathers top AI researchers with the mission of ensuring that AI is safe and benefits humanity). Every week for the next three months, I’ll publish blog posts sharing my story of transformation: from a person who dedicated 15 years to professional dancing, then wrote about tech and AI, to someone who actually conducts AI research.

Finding your true calling — the key component of happiness

My primary goal with the series of blog posts “From Ballerina to AI researcher” is to show that it’s never too late to embrace a new field, start over, and find your true calling. Finding work you love is one of the most important components of happiness: something you do every day and invest your time in to grow; something that makes you feel fulfilled and gives you energy; something that is a refuge for your soul.

Great things never come easy; we have to fight to make them happen. But you can’t fight for something you don’t believe in, especially if you don’t feel it’s truly important for you and for humanity. Finding that thing is a real challenge. I feel lucky that I found my true passion: AI. To me, the technology itself and the AI community (researchers, scientists, people who dedicate their lives to building the most powerful technology of all time, with the mission to benefit humanity and make it safe for us) are a great source of energy.

[Image credit: www.martineauarts.com]

The structure of the blog post series

Today, I’m giving an overview of what I’m going to cover in my “From Ballerina to AI Researcher” series.

I’ll dedicate the blog posts I write during the OpenAI Scholars program to several aspects of AI technology, covering the areas that concern me most: AI and automation, bias in ML, dual use of AI, and so on.

I’ll also include insights into what I’m working on right now (the final technical project will be finished by the end of August and will be open-sourced).

I feel very lucky to have Alec Radford, an experienced researcher, as my mentor, guiding me in the NLP and NLU research areas.

First week of my scholarship

I’ve dedicated my first week in the program to learning about the Transformer architecture, which performs much better on sequential data than RNNs and LSTMs.

The novelty of the architecture is its multi-head self-attention mechanism. According to the original paper, experiments with the Transformer on two machine translation tasks showed the model to be superior in quality while being more parallelizable and requiring significantly less time to train.
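
For reference, the core operation inside each attention head is the scaled dot-product attention defined in the original paper, where the queries Q, keys K, and values V are learned linear projections of the input and d_k is the key dimension:

Attention(Q, K, V) = softmax(Q·Kᵀ / √d_k) · V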

More concretely, an RNN takes a sequence as input and processes it word by word, which is a huge obstacle to parallelizing the computation (and so makes models slower to train). Moreover, if sequences are too long, the model tends to forget the content of distant positions in the sequence, or to mix it up with the content of later positions; this is the fundamental problem of dealing with sequential data. The Transformer architecture mitigates this problem thanks to its multi-head self-attention mechanism, which connects every pair of positions directly.
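
To make the contrast concrete, below is a minimal self-attention sketch in PyTorch; it’s my own illustration under simplified assumptions (a single head, no masking, made-up sizes), not code from the paper or from tensor2tensor. The whole sequence is processed with a few matrix multiplications, with no token-by-token recurrence:

import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    # x holds the whole sequence at once: (batch, seq_len, d_model).
    # An RNN would instead consume it one token at a time.
    q = x @ w_q  # queries
    k = x @ w_k  # keys
    v = x @ w_v  # values
    d_k = q.size(-1)
    # One matmul relates every position to every other position,
    # so distant tokens are one step apart, not seq_len steps apart.
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5
    weights = F.softmax(scores, dim=-1)
    return weights @ v

# Toy usage: 2 sequences of 5 tokens each, d_model = 8.
d_model = 8
x = torch.randn(2, 5, d_model)
w_q, w_k, w_v = (torch.randn(d_model, d_model) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # torch.Size([2, 5, 8])

Multi-head attention simply runs several such operations in parallel, each with its own learned projections, and concatenates the results.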

I dug into RNN and LSTM models to catch up on the background. To that end, I found Andrew Ng’s Deep Learning course, along with the original papers, extremely useful. To develop intuition about the Transformer, I went through the following resources: a video by Łukasz Kaiser of Google Brain, one of the model’s creators; a blog post with very well elaborated content about the model; the tensor2tensor code; and a PyTorch implementation of the paper, which let me “feel” the difference between the TensorFlow and PyTorch frameworks.

Overall, my goal within the program is to develop a deep comprehension of the NLU research area, its challenges and current state of the art, and to formulate and test hypotheses that tackle the most important problems in the field.

I’ll share more on what I’m working on in my future articles. Meanwhile, if you have questions/feedback, please leave a comment.

If you want to learn more about me, here are my Facebook and Twitter accounts.

I’d appreciate your feedback on my posts, for example which topics interest you most and deserve further coverage.
