My HackHarvard Experience (AKA How I Learned to Stop Worrying and Dive Head-First into Deep Learning)
I’m currently a rising junior at SRM University, an institution not many people living outside India have heard of. Exactly a year ago, I became the first person to join a lab started by Anshuman Pandey and Aditthya Ramakrishnan. They had just returned from spending a semester at MIT (Massachusetts, not Madras or Manipal xD) and wanted to start a lab inspired by the place where they interned. Since then, I have spent a semester at the University of Wisconsin-Madison, interned at the Informatics Skunkworks lab, where I built neural networks to classify the behavior of steel in pressure vessels, built prototypes using bleeding-edge hardware and software, and, most recently, taken charge of this amazing lab as Aditthya and Anshuman leave college.
In my sophomore year, I took Andrew Ng’s machine learning course, which got me really interested in the field. I am currently an undergraduate researcher at Next Tech Lab.
What does the lab do?
Next Tech Lab is a crazy place. To get inside the lab, you’ll need an RFID card (or probably a member’s severed thumb if you’re using the fingerprint scanner) to get through FREAK: a patent-pending electromagnetic door lock (using face recognition among other unlocking mechanisms) invented by the IoT & AI (Minsky) Labs over a weekend. Then you’ve got to use your phone or laptop to turn on the lights. Most of the time you find people hacking away through the night, zipping through the corridor on a hoverboard, having intense discussions while scribbling multicolor chalk equations, or gathered around to listen to a weekly talk by guests from every sphere of tech imaginable. In between all the brainstorming and skateboarding, people jam with their instruments. With over 100 members, fusion is integral: you can, for instance, see people playing electric guitar over a mridangam backbeat, or beatboxing through an amp alongside freestyle rap verses. Blown capacitors, psychedelic posters, Arduino boards, large cardboard prize cheques, medals, and, most importantly, people taking naps: this lab is full of smart and quirky people with intense pop-culture leanings and, above all, a common love for science and engineering.
Finally, about the experience and HackHarvard
My friend Rohit was a special student at MIT, and I was a visiting student at the University of Wisconsin-Madison. We got accepted to HackHarvard 2016, where we built this amazing hack: Deep Sense.
Over the last few decades, thanks to advances in technology, sharing media has become much simpler. People communicate with each other not only through text but also through pictures, videos, and audio. We humans tend to extract as much information as we can from any given source, but sometimes even we miss the small details. To put it another way, we keep finding new ways to express and interpret our thoughts and feelings.
We strongly believe that Artificial Intelligence will help us automate this, and so Deep Sense was created! It was built in under 36 hours at Harvard University in 2016.
Using the power of today’s cognitive services, we extract information from an image: the emotions and objects present. Using that data, multiple trained models generate music and text that work together alongside the image. These models are trained using state-of-the-art algorithms like Recurrent Neural Networks and Restricted Boltzmann Machines.
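For the curious, here is roughly what the image-analysis step looks like: a minimal sketch assuming a generic HTTP emotion-recognition endpoint. The URL, header name, and JSON schema below are placeholders, not the exact service we called.

```python
# A minimal sketch of the image-analysis step, assuming an HTTP cognitive-services
# endpoint that accepts raw image bytes and returns per-emotion scores as JSON.
# The URL, key, header name, and response schema are placeholders.
import requests

ENDPOINT = "https://api.example-cognitive-service.com/emotion/recognize"  # placeholder
API_KEY = "YOUR_API_KEY"                                                   # placeholder

def detect_emotion(image_path):
    """Send an image to the emotion-recognition endpoint and return the top emotion."""
    with open(image_path, "rb") as f:
        image_bytes = f.read()
    response = requests.post(
        ENDPOINT,
        headers={
            "Ocp-Apim-Subscription-Key": API_KEY,      # header name assumed
            "Content-Type": "application/octet-stream",
        },
        data=image_bytes,
    )
    response.raise_for_status()
    scores = response.json()[0]["scores"]   # e.g. {"happiness": 0.9, "sadness": 0.01, ...}
    return max(scores, key=scores.get)      # emotion with the highest score
```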
The recurrent network is trained on the works of eminent poets, Shakespeare and Robert Frost to name a few. We trained a multilayer LSTM at the character level, which tends to be more precise, and used a sliding window of 40 characters.
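A minimal Keras sketch of this character-level model is below. The layer sizes and hyperparameters are illustrative, not the exact values we used at the hackathon; only the 40-character window comes from the write-up above.

```python
# Character-level poetry model: stacked LSTMs that predict the next character
# from the previous 40. Sizes here are illustrative.
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

WINDOW = 40  # sliding window of 40 characters

def build_char_lstm(vocab_size):
    """Multilayer LSTM over one-hot character windows."""
    model = Sequential()
    model.add(LSTM(128, return_sequences=True, input_shape=(WINDOW, vocab_size)))
    model.add(LSTM(128))
    model.add(Dense(vocab_size, activation="softmax"))
    model.compile(loss="categorical_crossentropy", optimizer="adam")
    return model

def make_windows(text, char_to_idx):
    """Slice the corpus into overlapping 40-character windows and next-char targets."""
    vocab_size = len(char_to_idx)
    X = np.zeros((len(text) - WINDOW, WINDOW, vocab_size), dtype=np.bool_)
    y = np.zeros((len(text) - WINDOW, vocab_size), dtype=np.bool_)
    for i in range(len(text) - WINDOW):
        for t, ch in enumerate(text[i:i + WINDOW]):
            X[i, t, char_to_idx[ch]] = 1
        y[i, char_to_idx[text[i + WINDOW]]] = 1
    return X, y
```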
We also created a deep, stacked Restricted Boltzmann Machine for monophonic music generation pertaining to different emotions: happy, sad, angry, and fearful. The Adam optimizer seemed to work nicely in this case. Restricted Boltzmann Machines are a kind of generative neural network that learns to reconstruct its input. (Yes, an encoder-decoder model could also have been used here.)
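To make the idea concrete, here is a minimal NumPy sketch of a single Bernoulli RBM trained with one-step contrastive divergence (CD-1). Our hackathon version stacked several of these and used Adam; this sketch uses a plain gradient step to keep it short, and the visible layer would correspond to a binary piano-roll slice.

```python
# Bernoulli RBM with CD-1 training, written with plain NumPy.
import numpy as np

class RBM:
    def __init__(self, n_visible, n_hidden, lr=0.01):
        self.W = np.random.normal(0, 0.01, (n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible bias
        self.b_h = np.zeros(n_hidden)    # hidden bias
        self.lr = lr

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def _sample(self, p):
        return (np.random.rand(*p.shape) < p).astype(np.float64)

    def cd1_step(self, v0):
        """One contrastive-divergence update on a batch of binary vectors v0."""
        # Positive phase: hidden probabilities given the data.
        p_h0 = self._sigmoid(v0 @ self.W + self.b_h)
        h0 = self._sample(p_h0)
        # Negative phase: reconstruct the visibles, then re-infer the hiddens.
        p_v1 = self._sigmoid(h0 @ self.W.T + self.b_v)
        p_h1 = self._sigmoid(p_v1 @ self.W + self.b_h)
        # Approximate gradient of the log-likelihood and update the parameters.
        self.W += self.lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / len(v0)
        self.b_v += self.lr * (v0 - p_v1).mean(axis=0)
        self.b_h += self.lr * (p_h0 - p_h1).mean(axis=0)
        return np.mean((v0 - p_v1) ** 2)  # reconstruction error for monitoring
```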
Thus, using this architecture, we were able to describe an image with a poem and music in an artist’s style.
The backend (containing the neural nets) was written in Python using Keras and TensorFlow.
Finally, the frontend was built in Swift by our other teammates, Ahmed and Fabian. The Swift frontend and the Python backend were integrated through a Flask REST API.
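Here is a rough sketch of how that integration could look: a single Flask endpoint that accepts an image upload from the Swift client and returns the generated poem and music. The route name and the helper functions (detect_emotion, generate_poem, generate_music) are placeholders standing in for the model code, not our exact endpoints.

```python
# Minimal Flask endpoint wiring the Swift frontend to the Python models.
from flask import Flask, request, jsonify

app = Flask(__name__)

# Placeholder stubs standing in for the model code sketched above.
def detect_emotion(path):    # cognitive-services call
    return "happiness"

def generate_poem(emotion):  # sample from the char-level LSTM
    return "a generated poem"

def generate_music(emotion): # sample from the RBM and write a MIDI file
    return "/tmp/output.mid"

@app.route("/analyze", methods=["POST"])
def analyze():
    image = request.files["image"]           # image uploaded by the Swift client
    image.save("/tmp/upload.jpg")
    emotion = detect_emotion("/tmp/upload.jpg")
    poem = generate_poem(emotion)
    music_path = generate_music(emotion)
    return jsonify({"emotion": emotion, "poem": poem, "music": music_path})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```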
Article by Fenil Doshi, Founding Member and Researcher in Minsky Lab at Next Tech.