New Real-Time Translation Algorithm that Strikes Perfect Balance Between Delay and Quality

ETRI Journal Editorial Office
Published in ETRI Journal
Nov 16, 2021

New model for simultaneous neural machine translation that outperforms existing ones

Real-time translation algorithms are challenging to develop because they must learn to correctly segment incomplete sentences in order to produce partial translations. In a recent study, ETRI scientists proposed an innovative reinforcement learning-based model with an attention mechanism that jointly learns how and when to translate incoming words. Their approach, which can strike any desired balance between translation quality and delay, will pave the way to better neural machine translation systems for real-time tasks.

Globalization and the technological developments that enabled it have radically changed the world over the past few decades, allowing exchanges and interactions between people of disparate cultures. However, a truly deep understanding between people is only possible when language barriers are overcome. Fortunately, machine learning, in the form of neural machine translation (NMT) systems, can help automate the arduous task of translation and break down these barriers.

Most NMT systems are designed to operate offline; that is, they can only translate complete sentences. Their inherent delay makes them unsuitable for real-time translation tasks, in which the source speech has to be translated on the fly and partial translations need to be produced for yet-unfinished sentences. The main challenge for real-time (simultaneous) NMT algorithms is learning how to properly segment the incoming partial sentence and “align” these segments with tentative pieces of the translated output in the target language.
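To make this concrete, here is a minimal, purely illustrative sketch of the incremental read/write loop that simultaneous translation follows. The `policy` and `decode_step` callables are hypothetical stand-ins for the learned components described below, not the paper's actual interface:

```python
from typing import Callable, List

def simultaneous_translate(
    source: List[str],
    policy: Callable[[List[str], List[str]], bool],      # hypothetical: True -> WRITE now
    decode_step: Callable[[List[str], List[str]], str],  # hypothetical: next target word
) -> List[str]:
    """READ/WRITE loop: consume source words one at a time and emit target
    words as soon as the policy judges the read prefix to be sufficient."""
    read: List[str] = []
    target: List[str] = []
    i = 0
    while True:
        if i < len(source) and not policy(read, target):
            read.append(source[i])            # READ one more source word
            i += 1
        else:
            word = decode_step(read, target)  # WRITE the next target word
            if word == "<eos>":               # decoder signals completion
                break
            target.append(word)
    return target

# Example: a fixed "wait-2" policy that always stays two words behind the
# source, standing in for the learned segmentation decisions described below.
wait2 = lambda read, target: len(read) - len(target) >= 2
```

A fixed rule like `wait2` ignores the content of the sentence; the appeal of the ETRI approach is that the decision of when to write is learned rather than hard-coded.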

At the Electronics and Telecommunications Research Institute (ETRI), Korea, a team of researchers has been actively working on this challenge. In their latest study, which was published in ETRI Journal, the team reported a novel real-time translation algorithm that, unlike existing ones, can jointly learn how and when to translate incoming words so as to strike a good balance between translation quality and latency.

Their model is based on reinforcement learning and adopts an attention mechanism with two novel additions. The first is a “segment” module, which dynamically detects the boundary positions of input words by considering the partially detected current input word, the previous target word, and the previous boundary position. This allows the model to parse the incoming sentence into the elements that will be translated into the target language. A second module — the “alignment” module — computes the correspondence between the input and output words.
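The following PyTorch sketch suggests one way these two modules could look. The class names, layer shapes, and scoring functions here are assumptions made for illustration, not the paper's exact architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SegmentModule(nn.Module):
    """Predicts a boundary probability from the current input word's state,
    the previous target word's state, and the previous boundary decision."""
    def __init__(self, dim: int):
        super().__init__()
        self.scorer = nn.Linear(2 * dim + 1, 1)

    def forward(self, src_state, prev_tgt_state, prev_boundary):
        # src_state, prev_tgt_state: (dim,); prev_boundary: (1,)
        feats = torch.cat([src_state, prev_tgt_state, prev_boundary], dim=-1)
        return torch.sigmoid(self.scorer(feats))  # P(segment boundary here)

class AlignmentModule(nn.Module):
    """Soft attention over the source states read so far: weighs how much
    each available source word contributes to the next target word."""
    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Linear(dim, dim)

    def forward(self, tgt_state, src_states):
        # tgt_state: (dim,); src_states: (num_read, dim)
        scores = src_states @ self.query(tgt_state).unsqueeze(-1)  # (num_read, 1)
        weights = F.softmax(scores, dim=0)
        return (weights * src_states).sum(dim=0)  # context vector, (dim,)
```

Note that the alignment module attends only over the source states read so far, which is what distinguishes simultaneous attention from the full-sentence attention used in offline NMT.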

Another important part of the algorithm’s design is its reward function, which plays a part in the learning process by shaping the model’s parameters to obtain a desired response. In this case, the reward function has a discount factor that lowers the score according to the delay, or how long the model took to provide a translation. “When we set a large weight for the delay discount factor, the model tends to translate quickly even if the translation could be rather inaccurate. This configuration allows the segment module to control the tradeoff between translation quality and latency, adding versatility to our approach,” explains Dr. YoHan Lee, lead author of the study.
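As a rough illustration of this tradeoff, consider a reward of the following shape. The additive penalty and the weight `w` are one plausible form of a delay discount, assumed for the example rather than taken from the paper:

```python
def delayed_reward(quality: float, delay: float, w: float) -> float:
    """quality: translation-quality score (e.g., a BLEU-like value in [0, 1]);
    delay: how long the model waited before emitting its translation;
    w: delay discount weight trading quality against latency."""
    return quality - w * delay

# A larger w favors fast-but-rougher output; w = 0 ignores latency entirely,
# matching the offline configuration discussed in the results below.
print(delayed_reward(0.80, delay=4.0, w=0.00))  # 0.80: latency ignored
print(delayed_reward(0.80, delay=4.0, w=0.05))  # 0.60: penalized for waiting
```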

Experimental results showed that the researchers’ novel approach achieved higher translation quality than other simultaneous NMT models. Surprisingly, the model also achieved lower latency than its counterparts even when the delay discount factor was set to zero, a configuration equivalent to offline translation. “Even when not considering the latency, it seems our model can find a proper segment point as soon as partial input words are sufficient to generate a target word,” explains Lee.

The results of this study pave the way to better simultaneous NMT implementations, which are useful in various situations where real-time translation is needed (such as video conference systems, streamed media, and in-ear interpreter devices). Let us hope machine translation keeps evolving so that we can all effortlessly overcome language barriers.

Reference

Title of original paper: Simultaneous Neural Machine Translation With a Reinforced Attention Mechanism

DOI: 10.4218/etrij.2020-0358

Authors: YoHan Lee, JongHun Shin, and YoungKil Kim

Affiliation: Language Intelligence Research Section, Electronics and Telecommunications Research Institute

About Dr. YoHan Lee

YoHan Lee has been with the Electronics and Telecommunications Research Institute (ETRI) in Daejeon, Korea, since 2017. Before joining ETRI, he received BS and MS degrees in electrical engineering from Korea University, Seoul, in 2015 and 2017, respectively. He is currently a researcher in the Language Intelligence Research Section at ETRI. His group has developed machine translation systems for real-time speech translation and document translation, as well as dialogue systems for information services and language tutoring.

Media contact:

carep@etri.re.kr (YoHan Lee)


About ETRI Journal

ETRI Journal is an international, peer-reviewed, multidisciplinary journal edited by the Electronics and Telecommunications Research Institute (ETRI), Republic of Korea.