A Guide to Collaborative Translation Workflows

There are many reasons why one may want to translate a document from one language into another, yet I have not found a tool that makes translating and reviewing straightforward. Some tools meant for i18n are not really suitable for translating much more than menus and user interfaces. Recently at OpenCon2017, Felix Nartey introduced me to the Translate extension for MediaWiki: check out this screencast that both demonstrates how the extension works for Wikipedia entries and gives excellent advice for how to work with translations in general. But to use the extension for translating your own work, you would have to host and maintain a MediaWiki instance, especially in order to collaborate with others.

How, then, should you go about translating a document or, more importantly, asking someone to translate your work into their language?

Before the MediaWiki solution was available, Amanda Alvarez and I had come up with a similar translation workflow in 2015 that included an approval mechanism and a way to review translations and comment on them. We used Google Docs! I showed some friends at OpenCon2017 a modified version of this workflow and people seemed to like it. A few days ago, Naomi Penfold and I were talking about what I discussed at OpenCon and she suggested I document it more thoroughly. So here it is! I appreciate any comments or feedback you may have.

Overview of the steps involved

  • Paste your original text into a new Google Docs file, and choose Tools followed by Translate document… to generate a machine-translated copy of the text in the target language.
  • For each paragraph (or heading or bullet point) of the translated text, add a comment with the original text from the source language. The shortcut for adding comments in Google Docs is Command+Option+m on macOS or Ctrl+Alt+m on Linux or Windows.
[Animation: start with your original text in Google Docs and machine-translate it from the source language to the target language. Then take the original text and add each paragraph as a comment to the relevant paragraph of the machine-translated text.]
  • Change from Editing mode to Suggesting mode and tweak the machine-translated text. If you open your document to comments, anyone with a link can submit suggestions. [Note that opening your document to the public means anyone can mark comments as resolved.]
  • Discuss the translation for each paragraph in the comments for that paragraph.
[Figure: anyone with comment rights can propose suggestions to the machine-translated text.]
  • “Resolve” the comment when you’re satisfied that the paragraph is ready. Ideally, two native speakers of the language will agree on the text before marking the comment as resolved.
  • Re-open the comment if further discussion is needed.
  • Once all comments are resolved, the document can be considered translated!
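
The preparation in the first two steps can be partly scripted before you open Google Docs. What follows is a minimal sketch, assuming both the source text and the machine-translated draft are available as plain text with blank lines between paragraphs. The function names `split_paragraphs` and `pair_paragraphs` are hypothetical helpers of my own, not part of Google Docs or any library. The sketch pairs each source paragraph with the draft paragraph in the same position, so you can paste the comments in order, and it stops if the paragraph counts differ, which may mean the machine translation merged or split a paragraph.

```python
import re


def split_paragraphs(text):
    """Split plain text into paragraphs separated by one or more blank lines."""
    blocks = re.split(r"\n\s*\n", text.strip())
    return [block.strip() for block in blocks if block.strip()]


def pair_paragraphs(source_text, translated_text):
    """Pair each source paragraph with the translated paragraph at the same position.

    Raises ValueError if the two texts do not have the same number of paragraphs.
    """
    source = split_paragraphs(source_text)
    translated = split_paragraphs(translated_text)
    if len(source) != len(translated):
        raise ValueError(
            f"paragraph count mismatch: {len(source)} source vs {len(translated)} translated"
        )
    return list(zip(source, translated))


if __name__ == "__main__":
    # Hypothetical sample texts, for illustration only.
    source = "Hello, world.\n\nThis is the second paragraph."
    translated = "Hallo, Welt.\n\nDies ist der zweite Absatz."
    for number, (original, draft) in enumerate(pair_paragraphs(source, translated), start=1):
        print(f"Paragraph {number}")
        print(f"  draft:   {draft}")
        print(f"  comment: {original}")
```

The printed list is only a checklist: adding the comments themselves is still done by hand in Google Docs, as described in the steps above.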

How to use this workflow

Translation is a difficult and time-consuming process, so why not reduce the burden by taking advantage of machine-translation tools like Google Translate? Now, Wikipedians note the problems with such an approach, stating that “an unedited machine translation, left as a Wikipedia article, is worse than nothing”. But it serves as a starting point that you can tweak and improve, and with this workflow we aren’t leaving the machine-translated text as is. Of course, Google Translate will spit out hideous and/or hilarious text for some languages, so it’s not always best to go with it.

Using a machine-translated text as a draft is particularly helpful if you’re asking someone to translate a document into a language they speak: doing so word-by-word from scratch can be tedious! Do the grunt work yourself and prepare the document (with appropriate comments) for them to work with.

Things to remember

  • One document per target language. Try to avoid using a single document to handle multiple translations. It’s best to create a separate document for each target language; in fact, if you use the translation option built into Google Docs, it’ll give you a separate file anyway.
  • A boilerplate / front matter at the start is helpful. Use this to list objectives of the translation, the style guide and conventions to follow (e.g. British English vs American English), an FAQ, and a list of approved translators/reviewers.
  • Identify approved reviewers. And give them full edit rights to the document. This is particularly useful if you open comments/edits to anyone with a link, since everyone then knows who can veto comment resolutions etc.
  • One comment per paragraph. Or heading or bullet point. To me, it makes most sense to treat each paragraph as an atomic unit and capture the essence it contains when translating it, rather than working on sentence-by-sentence translations.
  • Switch to Suggesting mode from Editing mode in Google Docs when modifying the target-language text. Do this even if you have full edit rights to the document. It helps to track the changes as you go along.
  • Optional: combine this workflow with GitHub issues and project boards. Use these tools to assign reviewers, track progress and integrate/deploy/share the final translations. You can also share your document in multiple languages using GitBook.

Some suggested use cases

  • Textbooks, guides and tutorials: Use this workflow to translate (with permissions, where needed!) documentation into your native language. Be careful when translating technical documents, as in some instances you may need to retain a term in the original language. It might help to maintain a list of such terms for find-and-replace convenience.
  • Poetry or fiction: This is trickier as admittedly automated translations will be particularly terrible here. You might want to skip the machine-translation step and translate each line manually. You can still use the rest of the workflow to review translations and improve them. Of course, check that the original works are openly licensed or that you have the rights to translate them.
  • Subtitles: Make your videos more accessible by soliciting translations into several languages. This can be useful for videos on YouTube as well as for standalone video files in the .mkv format. Make sure you provide clear guidelines for how to split subtitles by timestamps.
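
The list of terms to retain, mentioned under textbooks, guides and tutorials, can also drive a quick check. Here is a minimal sketch, assuming you keep the glossary as a plain list of terms and compare one source paragraph with its translated counterpart at a time. The function name `missing_terms` and the sample glossary are hypothetical. It reports terms that appear in the source paragraph but no longer appear in the translation, which are candidates for a reviewer to look at rather than definite errors.

```python
import re


def missing_terms(source_paragraph, translated_paragraph, glossary):
    """Return glossary terms found in the source but absent from the translation.

    Matching is case-insensitive and on whole words, so "Git" does not match "GitHub".
    """
    def contains(text, term):
        # Require that the term is not surrounded by other word characters.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        return re.search(pattern, text, flags=re.IGNORECASE) is not None

    return [
        term
        for term in glossary
        if contains(source_paragraph, term) and not contains(translated_paragraph, term)
    ]


if __name__ == "__main__":
    # Hypothetical glossary and paragraphs, for illustration only.
    glossary = ["pull request", "commit"]
    source = "Open a pull request after each commit."
    translated = "Abre una solicitud tras cada commit."
    print(missing_terms(source, translated, glossary))
```

A term flagged this way may have been translated deliberately, so treat the output as a prompt for discussion in the paragraph's comment thread.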

What do you think of this workflow? How can we improve it? Where do you think it might fail? Please leave comments below or ping me on Twitter.

Bonus track: a LaTeX template…

…for reviewing translations using PDFs, for those who prefer working offline with pen and paper. Ideally, though, you would only do so for reviewing the (near-)final proof. Here’s how I would do it:

The left column has the text in the source language, while the right column has the target language with line numbers. The wider margin next to the target language is suitable for scribbling notes. And you can relay comments by referencing the page and line numbers.

Here’s the LaTeX template I prepared (well, it’s really a Markdown file with YAML front matter and embedded LaTeX) to generate the above PDF, which I think is suitable for print-based review of translations:

To generate a PDF from your Markdown file, simply install pandoc and run:

$ pandoc -o translation-review.pdf translation-review.md

I’m no LaTeX expert, so if you have suggestions for how to improve this template, I would love to hear from you. You can leave comments below the GitHub Gist as well, if you prefer.

Thanks for reading!