Paper Review 2: The Octopus Approach to the Alexa Competition: A Deep Ensemble-based Socialbot

Fatih Cagatay Akyon
Published in NLP Chatbot Survey · Oct 18, 2018

In this post, the paper “The Octopus Approach to the Alexa Competition: A Deep Ensemble-based Socialbot” is summarized.

Link to paper: http://alexaprize.s3.amazonaws.com/2017/technical-article/mila.pdf

Serban, Iulian V., et al. "The Octopus Approach to the Alexa Competition: A Deep Ensemble-based Socialbot." Alexa Prize Proceedings, 2017.

MILABOT wins 2nd prize at NIPS. (Retrieved from https://twitter.com/dendisuhubdy)

This paper presents MILABOT, a deep reinforcement learning chatbot that relies mostly on deep learning techniques to understand input and generate responses. It reached the semi-finals of the 2017 Amazon Alexa Prize competition. Besides the neural networks that constitute the main contribution, the system also contains natural language generation and retrieval models, template-based models, and some rule-based components. A minimal sketch of how such an ensemble can produce and select responses is given below.
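To make the flow concrete, here is a small, hypothetical Python sketch of an ensemble-based response pipeline. The names (`Candidate`, `select_response`, the stand-in models, the random scoring policy) are my own illustration, not the paper's code; the real system uses 22 response models and a learned selection policy.

```python
from dataclasses import dataclass
from typing import Callable, List
import random

@dataclass
class Candidate:
    text: str         # candidate response text
    model_name: str   # which response model produced it
    priority: bool    # whether the model flags this as a priority response

# Each response model maps the dialogue history to a candidate response.
# In MILABOT these include retrieval, template-based, rule-based and neural
# generation models; here they are just stand-in callables.
ResponseModel = Callable[[List[str]], Candidate]

def select_response(history: List[str],
                    models: List[ResponseModel],
                    score: Callable[[List[str], Candidate], float]) -> str:
    """Generate candidates from every model, then pick one.

    If any candidate is marked as a priority response it is returned
    directly; otherwise the selection policy's score decides.
    """
    candidates = [model(history) for model in models]

    priority = [c for c in candidates if c.priority]
    if priority:
        return priority[0].text

    best = max(candidates, key=lambda c: score(history, c))
    return best.text

# --- toy usage with two stand-in models and a placeholder scoring policy ---
def echo_model(history: List[str]) -> Candidate:
    return Candidate(f"You said: {history[-1]}", "echo", priority=False)

def greeting_model(history: List[str]) -> Candidate:
    is_greeting = history[-1].lower() in {"hi", "hello"}
    return Candidate("Hello! What would you like to talk about?",
                     "greeting", priority=is_greeting)

if __name__ == "__main__":
    policy = lambda history, cand: random.random()  # stands in for the learned policy
    print(select_response(["hi"], [echo_model, greeting_model], policy))
```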

The authors highlight the chatbot's rating of 3.15 out of 5, compared with an average of 2.92 for the other systems, and its higher number of turns per conversation as evidence that it generalizes better. Chatbot systems have evolved over the decades, and the authors name ELIZA and its successors to illustrate this. To cope with the complexity of human language, they say their core approach is built on machine learning: apart from a few rules, every system component in MILABOT is optimized with machine learning, and the system also learns from every conversation it takes part in.

In this pipeline, 22 response models generate candidate responses from the dialogue history and other features. If any candidate is marked as a priority response, the system returns it directly; otherwise a model selection policy, trained with several methods including supervised learning and reinforcement learning, chooses among the candidates. After this phase, A/B testing experiments, which compare different versions of the policy to see which performs better, were run on MILABOT. Different experiments yielded different results, but the Q-learning AMT (Amazon Mechanical Turk) policy performed best and reached the 3.15 rating.

In conclusion, the authors say they developed a new set of deep learning models for natural language processing that improved their success rates. Claiming that their system is more interactive and engaging than previous ones, they explain their contribution to the field in this manner. Overall, this recent paper stands as another addition to the literature.
