Deep Feelings about Deep Learning

So I want to build Artificial Emotional Intelligence (AEI), and I have already written about a possible application: treating mental health problems. Even big players like Apple Inc. are trying to build AEI (for some obscure reason). The obvious first step when you want to build something is to study and do research.

As much as I tried not to fall for the hype that Deep Learning has recently gained, I could not resist exploring its promises. Let me quickly explain. In order to build real AEI, I wanted to start with the component that can understand our words. This belongs to the fields of Natural Language Processing (NLP) and Computational Linguistics (CL). Building powerful and useful NLP/CL systems is extremely challenging. It took me nearly 3 years to build a system that can guess your emotions from what you write, and its accuracy is far from perfect. The reason is that such systems are traditionally built using manually defined rules, features, and algorithms tailored to specific tasks.
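
To give a feeling for what "manually defined rules and features" means, here is a minimal sketch in Python. It is not the system I built; the word lists, the negation rule, and the names `EMOTION_LEXICON`, `NEGATIONS`, and `guess_emotion` are all made up for illustration.

```python
import re

# Hypothetical hand-written lexicon: every word here was picked by a person.
EMOTION_LEXICON = {
    "joy": {"happy", "glad", "love", "great"},
    "sadness": {"sad", "lonely", "cry", "miss"},
    "anger": {"angry", "hate", "furious", "annoyed"},
}

# Hypothetical handcrafted rule: these words cancel the word that follows them.
NEGATIONS = {"not", "never", "no"}


def guess_emotion(text):
    """Return the emotion with the most lexicon hits, or 'neutral' if none."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = {emotion: 0 for emotion in EMOTION_LEXICON}
    for i, token in enumerate(tokens):
        # Skip a word when the previous token is a negation.
        if i > 0 and tokens[i - 1] in NEGATIONS:
            continue
        for emotion, words in EMOTION_LEXICON.items():
            if token in words:
                scores[emotion] += 1
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "neutral"


print(guess_emotion("I am so happy and glad today"))
print(guess_emotion("I am not happy"))
```

Every new word, idiom, or exception means another rule written by hand, which is a big part of why this kind of system takes so long to build and still makes mistakes.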

Deep Learning, on the other hand, promises to replace handcrafted features with efficient algorithms able to “learn” the features automatically from input data, saving you all the hard work. So yeah! When you think about it, it makes sense to want to give it a try. And so I did. First I studied the basics of Artificial Neural Networks using the awesome Coursera Machine Learning Course. Then, to complement that knowledge, I read this great online book and watched these fantastic video tutorials. All that taught me to play with toy Deep Networks using code written entirely by me. When I was ready, I jumped to TensorFlow, a full-fledged Deep Learning software library, and followed its tutorials to train Deep Networks to classify handwritten characters. My reaction? A rush of elation followed by a bit of disappointment.
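
To make the idea of “learning” from input data a bit more concrete, here is a minimal sketch of the smallest possible case: a single artificial neuron trained with the classic perceptron rule to reproduce the logical AND function. This is not a deep network, and it is not the toy code or the TensorFlow tutorial mentioned above; the names `train_perceptron`, `predict`, and `AND_SAMPLES` are only illustrative.

```python
def train_perceptron(samples, epochs=10, lr=1):
    """Learn two weights and a bias for one neuron using the perceptron rule."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for inputs, target in samples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            prediction = 1 if activation > 0 else 0
            # The error is 0 when the guess is right, so nothing changes then.
            error = target - prediction
            weights = [w + lr * error * x for w, x in zip(weights, inputs)]
            bias += lr * error
    return weights, bias


def predict(weights, bias, inputs):
    """Fire (return 1) when the weighted sum plus the bias is positive."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


# Training data: the truth table of logical AND.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_SAMPLES)
for inputs, target in AND_SAMPLES:
    print(inputs, "->", predict(weights, bias, inputs), "expected", target)
```

Nobody tells the neuron which weights to use; it only sees examples and nudges its numbers after each mistake. Deep networks stack many such units in layers and rely on more sophisticated training algorithms, but the spirit is the same.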

Don’t get me wrong, Deep Learning is awesome. There is mathematical proof (the universal approximation theorem) that, in theory, a large enough neural network can approximate any continuous function as closely as you like. The handwritten character classification tutorial, although simple, hints at that. Yet there is something about Deep Learning that leaves a sour taste in the mouth. During my previous research project, I always felt I was in control, and in most cases I could justify why things worked. With Deep Learning, it all felt like magic. Beyond the valid mathematical intuition, you can’t really understand what’s going on inside the black box that is the trained network. Moreover, even the state-of-the-art systems were constructed empirically, by testing different network architectures until the best performer was found, with little clue as to why it performs better.

So yes, Deep Learning can solve complex problems, and yes, it can save time and effort, but without a clear understanding of what is going on inside, it can lead to a lot of frustration along the way. As soon as I move past the tutorials and into developing the first part of my AEI, I will post more about my feelings towards Deep Learning.

Note: I am neither a native English speaker nor a professional writer. I use English as a universal language to tell and document this journey I am about to embark on. I apologize if I am in any way killing the beauty of this language with my grammar and spelling errors.