Meet Meena: A Truly Chatty Chatbot

Synced
Jan 30

AI-powered chatbots have been widely adopted by enterprises seeking to streamline their customer service, improve productivity and boost revenue. On e-commerce platforms, chatbots can direct customers to recommended products, track orders, explain how to print a return shipping label and so on.

Such chatbots, however, do not do so well with off-target or tangential talk, for example when asked to comment on the latest art or fashion trends. “Meena,” Google AI’s new generative chatbot, has a thing or two to say about that.

Meena is one of a new breed of open-domain chatbots designed to converse on any topic, and its free and natural conversational abilities are closing the gap with human performance.

Conversations between Meena and humans

Introduced in Google AI’s recent paper Towards a Human-like Open-Domain Chatbot, Meena’s main architecture is a seq2seq model with the Evolved Transformer. It was trained on 341GB of text (40B words) mined and filtered from public domain social media conversations. Compared with OpenAI’s language model GPT-2, Meena is 1.7x bigger in model capacity and was trained on 8.5x more data.
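The comparison with GPT-2 can be checked with simple arithmetic. The sketch below assumes the publicly reported figures for the largest GPT-2 model (1.5B parameters, about 40GB of training text), which are not stated in this article, and uses the 2.6B parameter count of the best Meena model.

```python
# Meena figures as reported in the article.
meena_params = 2.6e9      # parameters of the best Meena model
meena_data_gb = 341       # GB of training text

# Assumption: figures for the largest GPT-2 model, taken from OpenAI's
# public reporting rather than from this article.
gpt2_params = 1.5e9
gpt2_data_gb = 40

capacity_ratio = meena_params / gpt2_params
data_ratio = meena_data_gb / gpt2_data_gb

print(f"Model capacity: {capacity_ratio:.1f}x")  # Model capacity: 1.7x
print(f"Training data:  {data_ratio:.1f}x")      # Training data:  8.5x
```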

According to the research team, the best Meena model has 2.6B parameters and achieves a test perplexity of 10.2 based on a vocabulary of 8K BPE subwords.
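For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-probability a model assigns to each correct next token, so lower is better. The following is a minimal sketch of that definition, not the research team's evaluation code; the token probabilities are made up for illustration.

```python
import math


def perplexity(token_probs):
    """Return exp(mean negative log-probability) over the given tokens."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


# Hypothetical probabilities assigned to each correct next token.
uniform_over_8 = [1 / 8] * 4            # guessing uniformly among 8 options
confident = [0.9, 0.8, 0.95, 0.85]      # mostly sure of the right token

print(perplexity(uniform_over_8))  # about 8: as uncertain as an 8-way guess
print(perplexity(confident))       # close to 1: rarely surprised
```

Read this way, a perplexity of 10.2 means the model is, on average, about as uncertain as if it were choosing uniformly among roughly ten subword tokens at each step.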

Interactive SSA vs Perplexity. Each point is a different Meena model version.
Meena Sensibleness and Specificity Average (SSA) compared with humans, Mitsuku, Cleverbot, XiaoIce, and DialoGPT.

To evaluate Meena’s performance, the researchers proposed a simple human evaluation metric called Sensibleness and Specificity Average (SSA), which captures two fundamental aspects of humanlike conversation: making sense and being specific. The results suggest that the full version of Meena (with a filtering mechanism and tuned decoding) scores 79 percent SSA, a full 23 percentage points higher in absolute SSA than existing SOTA chatbots such as Mitsuku, Cleverbot, XiaoIce, and DialoGPT.
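As a rough illustration of how such a score is put together, the sketch below assumes each chatbot response receives two yes/no labels from a human rater, and that a response judged not sensible is never counted as specific. The `Label` class and the sample labels are hypothetical and are not the paper's data or code.

```python
from dataclasses import dataclass


@dataclass
class Label:
    sensible: bool   # does the response make sense in context?
    specific: bool   # is it specific to the context, not generic?


def ssa(labels):
    """Return (sensibleness, specificity, SSA) as fractions in [0, 1]."""
    if not labels:
        raise ValueError("need at least one labeled response")
    n = len(labels)
    sensibleness = sum(l.sensible for l in labels) / n
    # Assumption: a response that is not sensible cannot be specific.
    specificity = sum(l.sensible and l.specific for l in labels) / n
    return sensibleness, specificity, (sensibleness + specificity) / 2


# Hypothetical ratings for four responses.
ratings = [
    Label(sensible=True, specific=True),
    Label(sensible=True, specific=False),   # e.g. a generic "I don't know"
    Label(sensible=False, specific=False),
    Label(sensible=True, specific=True),
]

print(ssa(ratings))  # (0.75, 0.5, 0.625)
```

Averaging the two rates penalizes a bot that stays sensible only by giving vague, generic answers.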

Meena is also closing in on humans, whose average SSA score is 86 percent. In a surprising finding, the researchers observed a strong correlation between SSA and perplexity — an automatic metric available to any neural seq2seq model. The experiments demonstrated that the better Meena fit its training data, the more sensible and specific its responses became.
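One way to quantify such a relationship is the Pearson correlation coefficient between the perplexity and SSA scores of different model versions. The sketch below uses invented numbers purely to show the calculation; they are not the paper's measurements.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical (perplexity, SSA %) pairs for five model versions.
perplexities = [18.0, 15.5, 13.0, 11.5, 10.5]
ssa_scores = [50.0, 56.0, 63.0, 68.0, 72.0]

r = pearson_r(perplexities, ssa_scores)
print(f"r = {r:.2f}")  # strongly negative: lower perplexity, higher SSA
```

A value of r near -1 would indicate that SSA rises almost linearly as perplexity falls.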

Researchers admit that weaknesses remain in their methodology — for example the static evaluation dataset is too restricted to capture all aspects and nuances of human conversation.

In future studies the researchers will explore broadening the metric for humanlikeness, while continuing to optimize sensibleness by lowering test set perplexity through improved algorithms, architectures, data and compute. They will also consider other attributes such as personality and factuality. Model safety and bias are additional key focus areas.

The paper Towards a Human-like Open-Domain Chatbot is on arXiv. Sample conversations with Meena are on GitHub.


Author: Yuqing Li | Editor: Michael Sarazen

