Suppose you are one of two intelligent human beings who don’t share a common language but need to communicate immediately, on the spot. What are your options?
I’d appreciate anyone’s engagement with me on this, but here’s what I’ve got:
- You serendipitously find and recruit a passer-by who speaks both your native Thai and the other’s native Finnish, is willing and able to interpret, and happens to know any necessary topic-specific words for your conversation in each language.
- You stare at each other awkwardly, resort to frustrated hand gestures, and try to get through it as fast as possible.
- You speak at each other in your own languages anyway, hoping that the other will somehow magically understand if only you speak louder.
- You pull out your phone and cross your fingers that whatever app you have stays true to your intended meaning and doesn’t threaten them instead of saying good morning.
Let me be a linguist for a second and coin¹ a name for this very specific machine translation use case: the On-the-spot Translation Scenario (OTS). (Also, how beautiful is the doubly applied initialism here?)
No one to help
These kinds of apps are the product of many decades of research in the field of Machine Translation (MT), a sub-field of natural language processing, and the fact that they work at all is a huge and impressive achievement. Most of the time, these systems need to translate language A into language B and vice versa without any help from humans, either because there are no humans to help or because it would be infeasible to involve them in the process.
But for an OTS, there are humans standing by who
- possess topic-specific knowledge and
- are presumably motivated to help get their messages across accurately.
When professional human translators take on or are assigned a translation project, there is usually a correlation between the content’s genre and the translator’s areas of expertise. Through one tool, leading researchers in MT have enabled translators to guide the output of a translation system as it makes guesses about target language wording.² Drawing on their expertise, translators in this workflow can produce high-quality translations. When monolingual non-translators were put in this position, they were shown to perform on par with some professional bilinguals if they were familiar with the topic or domain.³ In many OTSs, I imagine both parties would be experts in their own domain, not to mention in what they are trying to say. Think of two businesspeople who specialize in a niche market.
In an OTS, I for one would be willing to spend a few extra seconds explaining which sense of an ambiguous word I intend if it results in a translation that is more faithful to my actual input. On the receiving side, I would also be happy to guide a system as it crafts translations of my partner’s input using my own knowledge about our conversational domain. Am I alone in this? Think of scenarios with higher stakes than a t-shirt transaction on the street, where you’d likely want really accurate translation (such as interacting with a law enforcement official in a nation other than your own).
No way to help
My guess is that everyday people in On-the-spot Translation Scenarios would willingly take on the extra cost of one or two more taps per dialogue turn if it ensured accurate representations of their messages. But current systems don’t allow this kind of user involvement, and I think we should craft ways to enable it.
Imagine if we could interact with the machine in the middle to help it do a better job. What if we thought about the OTS as a three-party, collaborative event where both humans and the MT engine work together, leveraging their distinct skill sets to achieve real communication and flow of information in real time? Would you use this kind of system? What if each interaction was optional? What if you saw a confidence-in-translation bar growing with each annotation you made?
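To make the confidence bar idea a little more concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not a real MT system: the names `AmbiguousWord`, `DialogueTurn`, and `confidence_bar` are hypothetical, the example senses are made up for illustration, and the “confidence” is simply the fraction of flagged words the speaker has chosen to disambiguate. A real engine would presumably derive its confidence from the model itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AmbiguousWord:
    """A source word with several candidate senses (toy, hand-written data)."""
    word: str
    senses: List[str]
    chosen: Optional[str] = None  # stays None unless the speaker annotates


@dataclass
class DialogueTurn:
    """One utterance plus the words a (hypothetical) engine flagged as ambiguous."""
    text: str
    ambiguous: List[AmbiguousWord] = field(default_factory=list)

    def annotate(self, word: str, sense: str) -> None:
        """Record the speaker's optional sense choice for one flagged word."""
        for item in self.ambiguous:
            if item.word == word:
                if sense not in item.senses:
                    raise ValueError(f"unknown sense {sense!r} for {word!r}")
                item.chosen = sense
                return
        raise KeyError(word)

    def confidence(self) -> float:
        """Assumed toy score: share of flagged words the speaker has resolved."""
        if not self.ambiguous:
            return 1.0
        resolved = sum(1 for item in self.ambiguous if item.chosen is not None)
        return resolved / len(self.ambiguous)


def confidence_bar(score: float, width: int = 10) -> str:
    """Render a score between 0 and 1 as a fixed-width text bar."""
    filled = round(score * width)
    return "[" + "#" * filled + "-" * (width - filled) + "]"


# Every annotation is optional; the bar simply grows as the speaker helps out.
turn = DialogueTurn(
    "I need to get to the bank before the match.",
    [
        AmbiguousWord("bank", ["financial institution", "riverside"]),
        AmbiguousWord("match", ["sports game", "fire-lighting stick"]),
    ],
)
print(confidence_bar(turn.confidence()))
turn.annotate("bank", "financial institution")
print(confidence_bar(turn.confidence()))
turn.annotate("match", "sports game")
print(confidence_bar(turn.confidence()))
```

The point of the sketch is only the interaction pattern: the speaker can skip every annotation and still get a translation, while each tap they do make is reflected back to them right away.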
If you can’t tell already, I am very excited about this: crafting methodologies that let everyday people interact with MT. In practice, this would look like a patchwork of existing one-off technologies combined with novel ways of bringing the human into the loop. I’ll be writing more about these ideas, in particular about the specific technologies involved, their effect on translation output quality against a number of metrics, and their potential implications for user trust in automatic translation.
I am writing for a general audience here, so naturally I’ve simplified things, but this topic is something that I study at every level.⁴ So, if you want to nerd out with me about the technical details, please get in touch.
¹ As far as I can tell, this scenario doesn’t already have a term to describe it specifically. Speech-to-speech translation is a term used widely to refer to a technology that targets this scenario, but I am saying that this use-case itself merits a label.
³ Koehn, Philipp. “Enabling Monolingual Translators: Post-Editing vs. Options.” In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 537–545. Los Angeles, California: Association for Computational Linguistics, 2010. http://www.aclweb.org/anthology/N10-1078.
⁴ I’m especially interested in the potential the OTS presents for incremental/online learning, where behind-the-scenes machine learning could enable the minimal amount of user prompting in a clean UI.