Yandex Translate improved quality (February 2018)

Konstantin Savenkov
Intento
Feb 19, 2018

In September 2017, Yandex announced its new hybrid engine for machine translation. Since then, it has been gradually deploying new models for more and more language pairs in its web interface and for selected API partners. As of February 2018, the engine is available for eight language pairs, including four from our benchmark (en-ru, ru-en, en-tr, and tr-en).

We evaluated the new engine on those four pairs using the WMT17 dataset and found an amazing 5–10% improvement, which feels like a big leap forward for languages as complex as Russian and Turkish. For three of the four pairs we evaluated, Yandex is currently the best cloud MT service out there. At a moderate $15 per million characters, it is likely a good option to consider for all pairs supported by the new engine.
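To put the quoted price in perspective, here is a minimal sketch of a cost estimate at $15 per million characters. It assumes strictly linear pricing with no free tier, volume discount, or minimum charge, none of which is confirmed by this post; the function name `estimate_cost` is ours, purely for illustration.

```python
from decimal import Decimal

# Price quoted in this post: $15 per million characters.
# Assumption: pricing is strictly linear (no free tier, discounts, or minimums).
PRICE_PER_MILLION_CHARS = Decimal("15")


def estimate_cost(char_count: int,
                  price_per_million: Decimal = PRICE_PER_MILLION_CHARS) -> Decimal:
    """Estimate the cost in USD of translating `char_count` characters."""
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    cost = Decimal(char_count) * price_per_million / Decimal(1_000_000)
    # Round to whole cents for display.
    return cost.quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Hypothetical monthly volume of 2.5 million characters.
    print(estimate_cost(2_500_000))  # 37.50
```

Swapping in another vendor's per-million price as `price_per_million` gives a quick side-by-side comparison for the same volume.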

We should note that the en-ru and en-tr datasets in the WMT17 corpus we use in our benchmark were provided by Yandex, so there may be some bias. However, our evaluations on customer en-ru and ru-en datasets also indicate improved quality. Consider this when optimizing your MT vendor portfolio.
