Neural Conversational Models

Hiroki Imabayashi

Published in

Academication

5 min read · Feb 25, 2017

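【One-line summary】

Proposes a way to build a conversational model that learns directly from pairs of consecutive utterances, with no feature tuning and no hand-written dialogue rules.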

【Authors】
Oriol Vinyals, Quoc Le

【Affiliation】

Google

【URL】

https://arxiv.org/abs/1506.05869

【Abstract】

Conversational modeling is an important task in natural language understanding and machine intelligence. Although previous approaches exist, they are often restricted to specific domains (e.g., booking an airline ticket) and require hand-crafted rules. In this paper, we present a simple approach for this task which uses the recently proposed sequence to sequence framework. Our model converses by predicting the next sentence given the previous sentence or sentences in a conversation. The strength of our model is that it can be trained end-to-end and thus requires much fewer hand-crafted rules. We find that this straightforward model can generate simple conversations given a large conversational training dataset. Our preliminary results suggest that, despite optimizing the wrong objective function, the model is able to converse well. It is able extract knowledge from both a domain specific dataset, and from a large, noisy, and general domain dataset of movie subtitles. On a domain-specific IT helpdesk dataset, the model can find a solution to a technical problem via conversations. On a noisy open-domain movie transcript dataset, the model can perform simple forms of common sense reasoning. As expected, we also find that the lack of consistency is a common failure mode of our model.

【Abstract (translation)】

【What is it?】

Applies sequence to sequence (seq2seq: a model that maps an arbitrary-length input sequence to an arbitrary-length output sequence) to the dialogue problem.
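As a rough illustration of what "arbitrary-length input to arbitrary-length output" means, the usual seq2seq formulation factorizes the probability of the reply one token at a time. The symbols here are assumptions introduced for this note, not notation from the post: x_1..x_T are the tokens of the previous utterance and y_1..y_T' are the tokens of the reply.

```latex
p(y_1, \dots, y_{T'} \mid x_1, \dots, x_T)
  = \prod_{t=1}^{T'} p\left(y_t \mid x_1, \dots, x_T,\; y_1, \dots, y_{t-1}\right)
```

Because T and T' are independent, the input and the reply do not need to have the same length.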

Training uses utterance-response pairs: an utterance A with the reply B that the other speaker gave to A, then B with the reply C given to B, and so on.
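The pairing described above can be sketched in a few lines. This is a minimal sketch, not the authors' preprocessing: the function name `make_pairs` and the sample conversation are hypothetical, and it assumes a conversation is already available as an ordered list of utterances.

```python
from typing import List, Tuple


def make_pairs(utterances: List[str]) -> List[Tuple[str, str]]:
    """Turn one conversation into (utterance, reply) training pairs.

    Every utterance except the last is paired with the one that follows it,
    so [A, B, C] becomes [(A, B), (B, C)].
    """
    return list(zip(utterances, utterances[1:]))


# Hypothetical conversation, used only for illustration.
conversation = ["hello", "hi", "how are you?", "i'm fine"]
for source, target in make_pairs(conversation):
    print(f"input: {source!r} -> target: {target!r}")
```

Each pair then serves as one training example: the first element is the input sequence and the second is the sequence the model should predict.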

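【What is impressive compared with prior work?】

The prior work, a DNN that takes an arbitrary-length sequence as input and produces an arbitrary-length sequence as output (Sutskever, Vinyals, and Le, “Sequence to Sequence Learning with Neural Networks”, NIPS 2014), made arbitrary-length sequence input and output possible, which DNNs had not achieved before, and reached the highest accuracy in machine translation. This paper builds a conversational model without hand-crafting dialogue rules or features.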

【What is the key to the technique?】

By training on utterance-response pairs, the conversational model is built without designing dialogue rules or features.

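【How was it validated?】

Two datasets are used:
① IT troubleshooting corpus (a typical conversation is about 400 words; 30M tokens for training)
② OpenSubtitles corpus (training: 62M sentences; validation: 26M sentences)

The model is checked by actually generating responses.

The choice of which word to produce (utter) is evaluated with perplexity. Perplexity is lower than that of an n-gram model, which serves as the quantitative evidence that the model is better.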
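For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-probability that a model assigns to each correct next word, so lower is better. The sketch below shows the arithmetic only. The function name `perplexity` and all probability values are made up for illustration and are not results from the paper.

```python
import math
from typing import Sequence


def perplexity(token_probs: Sequence[float]) -> float:
    """Perplexity = exp(mean negative log-probability of the correct tokens)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


# Hypothetical probabilities each model assigns to the correct next word
# at three positions of a held-out reply (NOT numbers from the paper).
seq2seq_probs = [0.25, 0.5, 0.25]
ngram_probs = [0.1, 0.2, 0.1]

print("seq2seq perplexity:", perplexity(seq2seq_probs))
print("n-gram perplexity:", perplexity(ngram_probs))
```

A model that assigns higher probability to the words that actually occur gets the lower perplexity, which is the sense in which the comparison with the n-gram baseline counts as evidence.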

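【Any discussion?】

Some generated replies contradict each other (see below).

H: what is your job?
Machine: i’m a lawyer.
Human: what do you do?
Machine: i’m a doctor.

The strength is generality across datasets. The weak points are answer accuracy and handling of context. Still, using this work as a base and tuning it, for example by adding a certain number of rules, looks likely to yield a model whose replies feel somewhat less unnatural.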

【Which paper to read next?】

【Method details】

【Related links】

Someone's implementation (GitHub): https://github.com/Conchylicultor/DeepQA
