What is lost in translation?
“Comment es tu faisant?” — the direct word-for-word translation into French of ‘How are you doing?’. Say this in France and not a soul will understand you, for the simple reason that literal word-by-word translation just does not work. If you’ve ever been laughed at, scorned or simply ignored in another country after translating word for word into that country’s language, you are not alone. For many adolescents with a couple of years’ schooling in a language, this is a very common approach to translation, and unfortunately a mostly fruitless one. Good translation requires certain leaps of the imagination, so that the translated text carries the flavour or overall meaning of the original without trying to match its syntax, or even its vocabulary, exactly. The longer the text and the more figurative the language, the harder this problem becomes, to the extent that translating a novel is nearly as great an art as writing the novel itself. Thankfully, machines are getting better at precisely this art every day. In this article I will examine how whole-sentence meaning can be captured, and how translation machines put that information to work.
State of the Art
If you’ve been on a trip recently and used Google Translate, you might have noticed that it doesn’t translate literally; in fact the translations can sometimes seem very far removed from the original [EXAMPLE]. The route to the contemporary approach to machine translation is much the same as in many other machine learning tasks. Previously it was thought that the best approach would be a machine equipped with a highly detailed description of the two languages’ grammars and a codification of all their possible linguistic rules. However, not only was this likely to result in awkward literal translations like the one above, it also raised mapping problems, i.e. how do we map the behaviour of the grammar of one language onto that of the other? Then there’s the problem that trying to describe the rules that form meaningful sentences is itself a bottomless pit; there are always new phenomena to explain. So instead of codification, empirical methods have taken prominence: we take pairs of translated phrases from a bilingual phrase corpus, train a model on some of them and then test its validity on the rest. This will be a process familiar to anyone who has done any machine learning, or even certain regression analyses. Google’s current incarnation of its machine translation is what they call “Neural Machine Translation”, in which the machine is trained on these pairs of sentences and, rather than trying to understand each word on its own, tries to extract the overall semantic value of the sentence and find something with a comparable value in the target language.
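The train-and-hold-out routine described above can be sketched in a few lines. The sentence pairs below are invented for illustration; a real corpus contains millions of them.

```python
import random

# Toy parallel corpus of (English, French) sentence pairs.
# These five pairs are illustrative only.
pairs = [
    ("How are you doing?", "Comment ça va ?"),
    ("Thank you very much.", "Merci beaucoup."),
    ("Where is the station?", "Où est la gare ?"),
    ("I love this city.", "J'adore cette ville."),
    ("See you tomorrow.", "À demain."),
]

random.seed(0)
random.shuffle(pairs)

# Train on most of the pairs, then test the model's validity
# on the held-out remainder it has never seen.
split = int(len(pairs) * 0.8)
train, test = pairs[:split], pairs[split:]

print(len(train), len(test))  # 4 1
```

The held-out set is the whole point: a model that merely memorises its training pairs will do badly on sentences it has never seen, which is exactly what the test split measures.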
What do you mean?
This to my mind raises two important questions about the information the model is trained upon:
- Who decides what a phrase means?
- How far abstracted can the meaning of the whole phrase get from the meaning of its individual parts?
The first question has a simple answer: there are already plenty of corpora of the necessary pairs of phrases in a source language and their target languages. Most of the legwork behind these pairings is crowdsourced and collaborative; for example the Tatoeba project, in which speakers of one language are asked to give translations of sentences in other languages, or to evaluate translations that already exist. What is important to note about the project is that the sets of phrases are not necessarily pairs: there can be many equivalents for one sentence within a single language, and one phrase may have many distinct translations in another language. The Tatoeba project currently has over seven million translations, which is certainly far more than could be produced by a couple of linguists with a clipboard. Google has its own equivalent to this, which is strangely like Duolingo without any of the actual learning. With only a knowledge of already well-documented languages like French and Spanish, the Google Translate Community app will only let me translate new phrases rather than evaluate existing ones, and some of the translations that I’m presented with are very tricky indeed. Take the following as an example:

In Spanish, when it comes to body parts among other things, possessive pronouns are not used; the definite article is used instead. So instead of your heart in your hands, the Spanish equivalent translates as the heart in the hands. This presents a problem: out of context, we don’t know whose heart and whose hands. It could be with their heart in your hands or with my heart in my hands. My Spanish friend Carlos agreed that this was incredibly ambiguous. Clearly there are multiple translations of the above phrase, but then how do we go about deciding which contexts make these translations relevant, and who gets to decide what is relevant? One limitation of such crowdsourcing is that it is a very specific type of person who produces and evaluates the translations: namely an internet-literate, multilingual person with the time and enthusiasm to contribute to projects like this. This undoubtedly has an effect on the kinds of translations that make it into the corpus upon which Google’s tower of Babel is built, whether the language is more formal, gives less weight to dialectal interpretations, or otherwise. Ostensibly the routine of translating and then validating translations circumvents this, but then again it’s the same people who translate and validate. When it comes to ambiguity, how do we know we’re not all thinking the same thing?
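The many-to-many shape of such a corpus can be sketched as a mapping from one phrase to a set of equally stored translations. The entries below are invented for illustration, echoing the heart-in-hands ambiguity above.

```python
# A crowdsourced corpus is many-to-many, not a list of neat pairs:
# one source phrase can carry several stored translations, each valid
# in some context. These example entries are illustrative only.
corpus = {
    "con el corazón en las manos": {
        "with their heart in your hands",
        "with my heart in my hands",
    },
    "¿Qué tal?": {
        "How are you?",
        "How's it going?",
    },
}

# Every stored variant is equally "correct"; choosing between them
# needs context that the corpus entry alone does not carry.
print(len(corpus["con el corazón en las manos"]))  # 2
```

Nothing in the data structure itself says which variant to pick, which is precisely why the question of who decides relevance matters.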
Ambiguity is not only the preserve of syntactic quirks like Spanish’s use of the definite article. Metaphor and figurative language, which we all use every day, present particular challenges when constructing a corpus. This brings us to the second question. The same example above is also a metaphor, and I could just as easily have translated the semantic content of the metaphor instead, with something like being emotionally vulnerable, or even gone so far as to find an equivalent metaphor in English, such as with one’s heart on one’s sleeve (I realise that even these two translations differ slightly in their meaning). How far abstracted (or not) do translations have to be from the original language in order to be valid in a given language? This is a question that will dog linguists, but thankfully it does not have to dog the folks at Google. Google’s latest machine is revolutionary: whereas previously it was thought that a middleman language would have to be used between two languages (no surprise, it has always been English), now there is no middleman; the machine has its own representation of meaning between two languages, presumably a massive series of embeddings and word vectors. Google Translate has its own language. Perhaps this does not sound very impressive, but consider that you can get sensible translations between Japanese and Korean on Google Translate, even though the model has not been trained on that language pair. Instead, the abstracted layer of meaning learned from translating those two languages into others like English has been sufficient to translate directly from Japanese into Korean. This is incredibly impressive when one considers how much less data will have to be gathered for languages that are harder to translate or are even dying out. Is this without its setbacks? No.
For example, it seems as though Translate has trouble deciding what to do with idioms: some of them get translated, but others don’t; I still don’t know how to say “Never look a gift horse in the mouth” in Spanish. One thing is for sure: Google Translate is a gift horse.
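The zero-shot idea described above can be caricatured with a toy shared “meaning space”: every sentence, whatever its language, gets a vector, and translation becomes nearest-neighbour search in the target language. The vectors here are invented by hand purely for illustration; a real system learns them from millions of sentence pairs.

```python
import math

# Toy shared representation: (language, sentence) -> meaning vector.
# The vectors are hand-picked so that equivalent sentences land close
# together; a real model learns this geometry from data.
embeddings = {
    ("en", "Good morning"): [0.90, 0.10, 0.00],
    ("en", "Thank you"):    [0.10, 0.90, 0.00],
    ("ja", "おはよう"):      [0.88, 0.12, 0.05],
    ("ja", "ありがとう"):    [0.12, 0.91, 0.02],
    ("ko", "좋은 아침"):     [0.92, 0.09, 0.03],
    ("ko", "감사합니다"):    [0.09, 0.88, 0.04],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def translate(sentence, src, tgt):
    # Nearest neighbour in the shared space, restricted to the target
    # language -- no pivot through English required.
    query = embeddings[(src, sentence)]
    candidates = [(s, v) for (lang, s), v in embeddings.items() if lang == tgt]
    return max(candidates, key=lambda sv: cosine(query, sv[1]))[0]

print(translate("おはよう", "ja", "ko"))  # → 좋은 아침
```

Japanese and Korean never appear as a pair anywhere in this table; the translation falls out of both languages sharing the same space, which is the essence of the zero-shot result.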
