The translation industry’s dirty little secret
Post-editing, in a way, is the dirty little secret of the translation industry. Everybody is doing it, but no one wants to talk about it.
It was amusing to read in a recent report by Common Sense Advisory* that “machine translation (MT) is one of the most controversial topics faced by LSPs”. I mean, what is so controversial about it?
The truth is that most translators are using machine translation and doing some sort of post-editing. Clients too understand the benefits of post-editing and are developing and deploying solutions based on machine translation systems. Language service providers trail along.
The translation industry demands greater efficiency and lower costs. Post-editing seems to be one of the most efficient means for delivering on such requests. Contrary to what many may be inclined to think, professional translators have been among the first to adopt machine translation and to start doing post-editing.
When you talk to translators in public forums, conferences or schools, they generally complain about machine translation, its low quality and the fact that it corrupts the traditional work of professional translators. Yet, I have the feeling that such views do not fully represent what translators actually do.
At Translated, we develop two tools that provide machine translations to professional translators. One is the MyMemory plugin for SDL Trados Studio: almost 15 thousand translators use it to get matches from the public translation memory and free machine translation for their jobs. The other one is MateCat, a CAT tool which also provides machine translation. It’s used by over 20 thousand professional translators.
Whenever the services are interrupted, even for just a few minutes, we are flooded with support requests. Translators respond quickly to such interruptions. They are so used to translating with the aid of suggestions from machine translation systems that they feel they cannot meet their deadlines without it.
Post-editing services today
Post-editing machine translation is gaining ground as one of the most common ways of doing translations. Half of the respondents who participated in the above-mentioned research by Common Sense Advisory provide post-editing services.
The percentage is especially high in Latin America and Asia, where 71% and 70% of the respondents, respectively, offer post-editing among their services. North America (49%), Europe (45%) and Africa (33%) lag behind. Not enough data has been collected for Oceania.
The translation industry as we know it today was born out of the needs of the large IT companies of the 80s. Information Technology as a sector has always led innovation in our industry and this is also evident when looking at the adoption of machine translation.
Software and consumer electronics are the areas where post-editing is prevalent. Other technical sectors also report a strong use of this type of service. Recently, an increase in the usage of post-editing has been registered even in areas traditionally considered too “creative” for MT to be useful: tourism, travel, education, e-learning.
While it’s probably too early to start considering post-editing solutions to translate marketing documents, I wouldn’t disregard it altogether.
Companies large and small offer post-editing services. Large language service providers tend to have a more structured approach based on years of experience with post-editing, technical expertise to run and optimise machine translation systems and technology to allow translators to perform the service effectively.
However, it is often the smaller translation companies that try to find a valuable niche in the translation market by developing solutions based on machine translation.
The technology needed to provide post-editing services is easily accessible. So much so that even translators with no technical expertise whatsoever can use machine translation in their everyday work.
A common misconception is that machine translation and post-editing are only suited for large, long-running projects. While it may be true that for large projects the client is often able to provide a specialised MT engine trained on their data, it must also be noted that all barriers to using machine translation even for small and casual translation projects have long disappeared.
Machine translation in CAT tools
Most CAT tools integrate one or multiple machine translation engines. As mentioned above, in SDL Trados Studio you can get machine translation suggestions using our MyMemory plugin, but it’s not the only solution: plugins exist for other MT engines and SDL themselves integrated their MT system in Trados Studio as an extra service.
Cloud-based translation platforms such as MateCat, Memsource, SmartCat and Lilt are especially well suited to post-editing. On the one hand, they can be easily integrated with online MT systems using server-to-server connectors. On the other, such tools tend to have a more modern approach and provide a set of additional features to effectively carry out and collect data on post-editing jobs.
MateCat is one such example. Among other things, it integrates multiple machine translation engines so the user can choose the most suitable one depending on language or type of translation.
You can use Microsoft Translator Hub to get a machine translation engine trained on your translation memories. If you translate among closely related languages, you’ll probably get the best results using Apertium from Prompsit. If you specialise in Slavic languages, you can pick Yandex.Translate or Tilde.
Measuring post-editing efficiency
One of the key features that we included in MateCat from day one is the ability to measure how useful machine translation suggestions are for the translators. For every translation job carried out in MateCat, you collect statistical information on post-editing effort and time-to-edit:
- Post-editing effort, the average percentage of word changes applied by the translators to the suggestions provided by the CAT tool.
- Time-to-edit, the average translation speed of the translators.
Analysing that data gives you a good insight into how much machine translation is actually affecting your productivity, positively or negatively. You’ll soon learn what average post-editing effort you can expect for a specific language pair or how much time you’ll spend on your next job.
At Translated, we’ve learnt quite a bit by looking at aggregated data on post-editing effort and time-to-edit for millions of words we translated in MateCat. We can estimate how long it will take to complete a job, when a post-editing job has been completed accurately and what a fair rate for post-editing is.
It is rather interesting to compare our data to the perceptions of other translation practitioners. One of the best, if not the best, news outlets for the translation industry is Slator. They recently published the results of an informal poll they conducted with their readers. One of the questions was about the expected human translation speed per hour in 5 years from now. Approximately 20% of the respondents answered 1,300 words, but almost half indicated a more conservative range of up to 700 words per hour.
700 words per hour is actually the norm nowadays in many sectors. There’s no need to wait for 2022. It would be pretty difficult to run a profitable freelancing business with fewer than 500–700 words per hour. For most common European language pairs, the average rate per word is around €0.05. At that rate, 600 to 700 words per hour yield €30–35, so to make a decent hourly income a translator needs to reach (and exceed) the 700 words per hour threshold.
MateCat users get a good feeling of this while translating. At the bottom of the screen they see stats on how much they are translating per hour. That is, how much money they are making.
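The arithmetic behind that threshold is simple enough to sketch in a few lines of Python. The function names below are illustrative and not part of MateCat; the €0.05 rate and the speeds are the figures quoted above, and the result is gross income before any costs.

```python
def hourly_earnings(words_per_hour: float, rate_per_word: float) -> float:
    """Gross earnings per hour at a given speed and per-word rate."""
    return words_per_hour * rate_per_word


def words_per_hour_needed(target_hourly: float, rate_per_word: float) -> float:
    """Translation speed required to reach a target hourly income."""
    if rate_per_word <= 0:
        raise ValueError("rate_per_word must be positive")
    return target_hourly / rate_per_word


if __name__ == "__main__":
    rate = 0.05  # EUR per word, the average rate quoted in the text
    for speed in (500, 700, 1300):
        print(f"{speed} words/hour -> EUR {hourly_earnings(speed, rate):.2f}/hour")
    for target in (30, 35):
        needed = words_per_hour_needed(target, rate)
        print(f"EUR {target}/hour needs {needed:.0f} words/hour")
```

At €0.05 per word, 500 words per hour comes to €25, which is why anything below the 500–700 range quickly becomes hard to sustain.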
Stats on time-to-edit are not the most interesting data that we collect from MateCat. Post-editing effort is a much more fascinating performance indicator. We used to collect it even before MateCat, even though it wasn’t as easy as today.
Post-editing effort represents the number of edits translators need to make on any given match from the translation memory (TM) or suggestions from the MT. It is a key indicator used to measure the quality of the aid provided to the translator and, in the end, how useful translation memories and machine translation actually are.
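To make the idea concrete, here is a minimal sketch of how such a figure could be computed. It is not MateCat's actual formula, which this article does not spell out: it assumes that effort is the word-level edit distance between the suggestion and the final translation, divided by the length of the longer of the two, and that a job-level figure is the total of edits over the total of words. All function names are hypothetical.

```python
def word_edit_distance(a: list[str], b: list[str]) -> int:
    """Levenshtein distance counted in words (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def post_editing_effort(suggestion: str, final: str) -> float:
    """Percentage of words changed between a suggestion and the final text."""
    s, f = suggestion.split(), final.split()
    longest = max(len(s), len(f))
    if longest == 0:
        return 0.0
    return 100.0 * word_edit_distance(s, f) / longest


def job_effort(pairs: list[tuple[str, str]]) -> float:
    """Word-weighted effort over all (suggestion, final) segments of a job."""
    edits = sum(word_edit_distance(s.split(), f.split()) for s, f in pairs)
    words = sum(max(len(s.split()), len(f.split())) for s, f in pairs)
    return 100.0 * edits / words if words else 0.0


if __name__ == "__main__":
    suggestion = "the cat sat on the mat"
    final = "the dog sat on a mat"
    print(f"{post_editing_effort(suggestion, final):.1f}% of words changed")
```

Whitespace tokenisation is the simplest possible choice and will not suit languages written without spaces; a production system would need a proper tokeniser.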
Back in 2008, the average post-editing effort, measured across different language pairs and domains, was around 45%. Nowadays, it has gone down to approximately 27%.
While this progression may seem impressive, it appears in an entirely different light if we consider one more factor: the post-editing effort of professional translators editing 100% matches from translation memories, i.e. human translations, is 11%.
Of course, when editing 100% matches, translators tend to make stylistic improvements to the translation. On the other hand, when editing machine translation, it’s more likely that they are editing actual errors (e.g. terminology issues). However, having an average post-editing effort for MT suggestions of 11% can be seen as the point where we’re getting human-like artificial translations.
Over the past 10 years, post-editing effort has been decreasing by around 2 percentage points per year.
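As a rough illustration, the figures above can be turned into a naive linear extrapolation. The sketch below assumes nine years between 2008 and the time of writing (2017 is inferred from the dates mentioned in this article, not stated outright) and assumes the decline stays linear, which is a simplification rather than a forecast. The function names are made up for the example.

```python
def yearly_drop(start_effort: float, end_effort: float, years: float) -> float:
    """Average decrease in post-editing effort, in percentage points per year."""
    if years <= 0:
        raise ValueError("years must be positive")
    return (start_effort - end_effort) / years


def years_to_reach(current_effort: float, target_effort: float,
                   drop_per_year: float) -> float:
    """Years needed to reach a target effort if the linear trend holds."""
    if drop_per_year <= 0:
        raise ValueError("drop_per_year must be positive")
    return max(0.0, (current_effort - target_effort) / drop_per_year)


if __name__ == "__main__":
    # 45% in 2008 and 27% today, assuming 9 years in between
    drop = yearly_drop(45.0, 27.0, 9)
    # 11% is the effort measured on 100% matches, i.e. human translations
    remaining = years_to_reach(27.0, 11.0, drop)
    print(f"{drop:.1f} points per year, {remaining:.0f} years to reach 11%")
```

Under these assumptions, the gap between 27% and 11% would close in about eight years. Any change in the pace of improvement would shift that figure accordingly.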
Since the beginning of the century, the most widely used technology for machine translation has been Phrase-Based MT. Over the past couple of years, however, the sector has seen the emergence of two approaches that promise much faster improvements: Adaptive Machine Translation and Neural Machine Translation.
Adaptive MT improves on the phrase-based model thanks to the ability to also adapt on-the-fly to the job being translated. You upload a translation memory to the MT engine and the system adapts immediately to the type of content you are translating. As you go on with the translation, the MT engine learns from your corrections and adapts to your style and terminology.
Neural Machine Translation uses deep learning and neural networks that emulate the workings of the human brain to let the machine learn how to translate effectively. Neural MT promises to revolutionise the field of machine translation and, even though it’s been around for just three years (the first scientific papers on the subject were published in 2014), it is already showing great potential.
What effects these new techniques will have on the post-editing effort is still to be seen. However, it is probable that machine translation will continue progressing and post-editing effort decreasing at a much faster pace than we’ve seen so far.
Game over. Thank you for playing.
Will this mean game over for the translation industry as we know it? Unlikely.
That 11% goal is attainable in what will probably feel like a very short time. It will not, however, affect all languages and domains. And even if it does, what type of edits will it require? We’re only measuring productivity and not linguistic quality or translation accuracy with this method.
Nevertheless, machine translation is significantly changing the way we do business in the translation industry. Very likely, many companies and freelance translators will go belly up in this brave new world.
However, for those that adapt and master these new technologies there is a world of opportunities out there. New markets, new services, new clients. There’s very little to gain for the luddites. For all the others, these are exciting times to be providing language services.
* Arle R. Lommel and Donald A. DePalma, Post-Editing Goes Mainstream, Common Sense Advisory, June 2016.