Benchmarking Natural Language Understanding Systems: Google, Facebook, Microsoft, Amazon, and Snips

Alice Coucke
Jun 2, 2017
End-to-end pipeline for voice interfaces.

An open-source benchmark

7 intents covered in this benchmark.

It’s all about filling slots

{
  "intent": "GetWeather",
  "slots": {
    "datetime": "2017-06-04T14:00:00-05:00",
    "location": "Coney Island"
  }
}
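As a rough illustration of how an application might consume an output like the one above, here is a minimal Python sketch. The `ParsedQuery` class and the `parse_nlu_output` function are hypothetical names introduced for this example, not part of any provider's API, and the sketch assumes the slots arrive as a JSON object mapping slot names to values.

```python
import json
from dataclasses import dataclass
from typing import Dict


@dataclass
class ParsedQuery:
    """Hypothetical container for an NLU result: one intent plus named slots."""
    intent: str
    slots: Dict[str, str]


def parse_nlu_output(raw: str) -> ParsedQuery:
    """Decode a JSON string with an "intent" key and an optional "slots" object."""
    data = json.loads(raw)
    # Assumption: a missing "slots" key means no slot was extracted.
    return ParsedQuery(intent=data["intent"], slots=dict(data.get("slots", {})))


# Same content as the example in the article, written as valid JSON.
raw_output = """
{
  "intent": "GetWeather",
  "slots": {
    "datetime": "2017-06-04T14:00:00-05:00",
    "location": "Coney Island"
  }
}
"""

result = parse_nlu_output(raw_output)
print(result.intent)
print(result.slots["location"])
```

Keeping the intent and the slots in one typed object lets downstream code branch on `result.intent` and then read only the slots that intent is expected to carry.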

How are Google, Microsoft, Facebook, Amazon, and Snips doing at this problem?

F1-score (per intent, and averaged) for the different providers trained on 70 queries / intent.
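For readers unfamiliar with the metric, the sketch below shows one way a per-intent F1-score and its average could be computed from counts of true positives, false positives, and false negatives. It is not the benchmark's actual evaluation code: the function names are hypothetical, the average is assumed to be an unweighted mean over intents, and the counts are invented purely for illustration.

```python
from typing import Dict, Tuple


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall; defined as 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def per_intent_and_average(
    counts: Dict[str, Tuple[int, int, int]]
) -> Tuple[Dict[str, float], float]:
    """Return the F1-score of each intent and their unweighted mean (an assumption)."""
    scores = {intent: f1_score(*c) for intent, c in counts.items()}
    average = sum(scores.values()) / len(scores)
    return scores, average


# Invented (tp, fp, fn) counts for illustration only; these are NOT benchmark results.
# "OtherIntent" is a hypothetical placeholder name.
toy_counts = {
    "GetWeather": (8, 2, 2),
    "OtherIntent": (5, 5, 0),
}

scores, average = per_intent_and_average(toy_counts)
for intent, score in scores.items():
    print(f"{intent}: {score:.3f}")
print(f"average: {average:.3f}")
```

An unweighted mean treats every intent equally regardless of how many test queries it has; a benchmark could equally weight by query count, which would give a different averaged figure.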

A truly reliable AI

F1-score (per intent, and averaged) for the Snips NLU engine trained on 2000 queries / intent.

Snips Blog

This publication features the articles written by the Snips team, fellows, and friends. Snips started as an AI lab in 2013, and now builds Private-by-Design, decentralized, open source voice assistants.

Written by Alice Coucke, senior ML scientist @snips - #PhD in statistical physics from @ENS_ulm
