Translation, Deep Neural Networks and Limitations

As machine learning becomes more deeply embedded in our lives, we are beginning to see its potential. It is already an incredibly powerful tool for image recognition and classification, and it excels at fitting functions to data, with examples throughout daily life: spam filters, for instance, would not exist without it. The biggest weakness of neural networks, however, is their inability to grasp context and nuance.

One of the field's biggest goals is a reliable translator. It sounds simple in theory: teach a machine to map each word in one language to a word in another. Problem solved. Unfortunately, our languages are not that simple. Context, nuance, and idioms are abundant, and there is no simple rule for handling them; each tends to be a case-by-case affair. This does not even begin to cover the problem of speech recognition and translation, which is a whole different animal: speech is so varied that it is incredibly difficult to differentiate accents and intonation.
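A toy sketch makes the word-mapping problem concrete. Assuming a tiny French-to-English lexicon (the entries below are illustrative, not a real dictionary), a word-for-word "translator" handles literal sentences fine but mangles idioms:

```python
# Naive word-for-word "translation": a toy French -> English lexicon.
# The entries are illustrative assumptions, not a real bilingual dictionary.
lexicon = {
    "il": "it",
    "pleut": "rains",
    "des": "some",
    "cordes": "ropes",
}

def word_for_word(sentence):
    """Map each word through the lexicon; bracket anything unknown."""
    return " ".join(lexicon.get(w, f"[{w}]") for w in sentence.lower().split())

# "Il pleut des cordes" is an idiom meaning "it's raining heavily"
# (roughly "it's raining cats and dogs"), but mapping word by word gives:
result = word_for_word("Il pleut des cordes")
# result == "it rains some ropes" -- each word is "correct", the meaning is lost.
```

No per-word rule fixes this; the idiom has to be recognized as a unit, which is exactly the case-by-case context problem described above.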

Google Translate is a staple for many college students who translate small phrases so they can feel intellectual ordering burritos at an “authentic” restaurant. Skype made huge waves when it introduced real-time translation, and there is a myriad of research directed at translation. Unfortunately, all of these solutions are far from infallible. Consider the vast number of YouTube videos in which someone translates a Taylor Swift song into Chinese and back: the results are hilarious, but they demonstrate the weakness of AI translation. Skype Translate is a notoriously buggy system that sounds like Siri; it frequently fails to translate accurately or to recognize where a pause occurs[1]. The most “ground-breaking” research on active speech translation is reportedly occurring at the University of Montreal in collaboration with Google[2], but even its creators acknowledge that a perfect system is far away.

The good news is that there is a lot of unexplored territory in this area. In fact, Google's “ground-breaking” research essentially shows that SVMs (Support Vector Machines) can be used to aid translators. SVMs are known to be inferior to other methods of outlier detection, which suggests this area has not been heavily explored. So many engineering “tricks” exist that could make the system better! The remarkable thing is that this article was published only last month, so there is clearly a pile of work that could be done, patented, and used to improve Babel.
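For readers unfamiliar with the margin-based classifiers mentioned above, here is a minimal sketch of the core idea behind a linear SVM: sub-gradient descent on the hinge loss, in pure Python on 2D points. This is a toy illustration of the technique, not anything resembling a translation pipeline; all function names and data are made up for the example.

```python
# Toy linear SVM: minimize hinge loss + L2 penalty by sub-gradient descent.
# Illustrative only; real systems would use an optimized library and far
# richer features than 2D points.

def train_linear_svm(points, labels, lr=0.01, lam=0.01, epochs=200):
    """points: list of (x1, x2); labels: +1 or -1. Returns weights and bias."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            margin = y * (w[0] * x1 + w[1] * x2 + b)
            if margin < 1:
                # Point is inside the margin (or misclassified): hinge term active.
                w[0] += lr * (y * x1 - lam * w[0])
                w[1] += lr * (y * x2 - lam * w[1])
                b += lr * y
            else:
                # Only the regularizer contributes to the sub-gradient.
                w[0] -= lr * lam * w[0]
                w[1] -= lr * lam * w[1]
    return w, b

def predict(w, b, point):
    """Sign of the decision function: which side of the hyperplane we are on."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b >= 0 else -1

# Two well-separated clusters, e.g. "typical" vs. "outlier-like" samples.
points = [(1, 1), (2, 1), (1, 2), (8, 8), (9, 8), (8, 9)]
labels = [-1, -1, -1, 1, 1, 1]
w, b = train_linear_svm(points, labels)
```

The engineering "tricks" the article alludes to mostly live around this core: kernel choices, feature design, and regularization schedules, none of which the sketch attempts.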

The fact that translators exist and can be improved with machine learning is good news! What we (as engineers) need now is to explore novel solutions to this challenge and to get access to APIs that we can modify and improve.