Machine Translation Redefined with DeMT™: No-Human-in-the-Loop

TAUS
TAUS — the language data network
3 min read · Apr 12, 2022

In a global environment where cyberspace is treated as an extension of the public realm and citizens have become netizens, the abundance of content can be overwhelming. The internet makes the world’s information readily accessible, but unless you speak every language, most of that information has to be translated before you can understand it. In agile, high-stakes business environments, this means a great deal of effort goes into making sense of foreign-language information with very little time available. In many instances, the sheer volume of information makes human translation infeasible or not cost-effective. Free machine translation tools on the internet are often the first place people turn to overcome this problem. However, such tools come with their own problems: low-quality translations, particularly for specialized domains, and the risk of exposing sensitive non-public data.

As an alternative to free MT tools, you could turn to professionals who build MT engines tailored to the content you’d like to access in a language you understand. That comes with its own challenges: getting access to sufficient MT training data, particularly in a narrowly defined domain; selecting the right algorithm or MT engine; and either employing an internal MT team or outsourcing the service to manage the process and evaluate the outcome. As a result, this route becomes expensive and far from real-time.

Language industry visionary and CEO of TAUS, Jaap van der Meer, explains in his article Data-Enhanced Machine Translation: “The post-editing model, or ‘human-in-the-loop’ as it is more neatly referred to, is in a way a half-baked solution, a compromise. These compromises stop us from seeing the bigger picture and bigger opportunities,” and asks “What if these traditions and principles no longer count under the new economics?”

The answer to this question is DeMT™ (Data-Enhanced Machine Translation), a premium quality level of real-time translation in which the selected MT engine is trained in a few clicks using readily available domain-specific datasets. The expected quality improvement scores (see Understanding BLEU Scores) are shared beforehand, which eliminates the human evaluation step.
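
To make the idea of a quality improvement score concrete, here is a minimal sketch of how a BLEU gain between two MT outputs could be measured. It is not TAUS's evaluation pipeline, which the article does not describe. It is a simplified, unsmoothed, single-reference BLEU written from the standard definition, and the sentences and the names `bleu`, `baseline`, and `customized` are made up for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU with one reference, scaled to 0-100.

    Uses clipped n-gram precision, a geometric mean over n = 1..max_n,
    and the standard brevity penalty. No smoothing is applied, so a
    candidate with zero matches at any n-gram order scores 0.
    """
    cand, ref = candidate.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty punishes candidates shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return 100 * bp * math.exp(sum(log_precisions) / max_n)


# Made-up sentences, for illustration only (not real engine output).
reference = "the patient should take one tablet twice a day"
baseline = "the patient should take a tablet two times a day"
customized = "the patient should take one tablet twice daily"

baseline_score = bleu(baseline, reference)
customized_score = bleu(customized, reference)
print(f"baseline:    {baseline_score:.1f}")
print(f"customized:  {customized_score:.1f}")
print(f"improvement: {customized_score - baseline_score:+.1f} BLEU")
```

In practice BLEU is computed over a whole test set rather than a single sentence, and usually with an established tool, so scores from a toy function like this one are only useful for seeing how the comparison works.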

Here is what contemporary, no-human-in-the-loop machine translation looks like: users upload the content file to be translated, select the datasets to be used in the real-time improvement of the chosen MT engine, and receive the translated content in their chosen language right in their inbox. This process is called DeMT™. It is developed by TAUS and enhanced by carefully curated and evaluated TAUS datasets, using the engines provided by the most reputable MT providers. If you cannot find datasets for the narrow domain or language pair you are looking for, TAUS can generate tailored training datasets through its global communities of data contributors.

DeMT™ gives everyone real-time access to premium-quality MT output trained on domain-specific data, in a few simple clicks, without the hassle of outsourcing, privacy concerns, or the challenge of accessing and processing training data and training an MT engine. According to Forbes, 80% of an AI developer’s time is spent on data preparation. With DeMT™, users don’t have to think about data cleaning, anonymization, domain classification, or bias management.

DeMT™ simply redefines what machine translation stands for by completely eliminating the human-in-the-loop, fully automating the process, and removing all fears of not knowing how to deal with data.

Start exploring DeMT™ now and contact us for a custom or scaled-up solution.
