Can my computer understand me?

Nendou Riki
5 min read · Jul 9, 2019


What is machine translation?

Merriam-Webster: “automatic translation from one language to another”

Machine Translation in Ancient Times

How was machine translation done in a time before machines were created? Obviously, it was not. There were interpreters who dealt with spoken language, and translators who handled written text. Translators and interpreters could often speak many languages, but what happened when one of them met a person who shared none of those languages?

Well, not really much of anything.

Communication with much of the world was impossible, and this lack of communication led to many conflicts.

Today, around 20% of people are bilingual. But what happens when they, or someone who speaks only one language, meet a person who cannot speak any of those languages? Nowadays, things are much easier. We have software that allows us to communicate with each other through speech or writing.

Man vs. Machine

Human interpreters and translators have gained thousands of hours of experience in the languages they have mastered. Because of their familiarity with something that is innately human, they are able to make precise and accurate translations between two or more parties. Machine translation, although it has come a long way, still often makes critical errors. However, machine translation has many advantages. A computer can learn how to translate from Swahili into Quechua, even if no existing translations between the two languages are available.

Tokens

The most primitive machine translation tools translated sentences one token at a time. In English, a token is an individual word, whereas in Chinese, a token is an individual character, and so on. The translator would receive each token one by one and output the other language’s version of it, without accounting for grammar or any other complications.

Example:

我去年跟我的朋友一起去了中国 (Mandarin Chinese)

The token-for-token translation doesn’t make much sense:

“I go year with I of friends together went (past tense token) middle kingdom.”

That’s a load of nonsense, right?

A regular translation would be:

“Last year I went to China together with my friend”
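To make the token-for-token idea concrete, here is a minimal sketch in Python. The glossary is a hypothetical, hand-written lookup table with one English gloss per Chinese character, so its output differs slightly from the gloss above (for example, 朋友 comes out as two separate tokens). It is only an illustration, not how any real translation tool was built.

```python
# Hypothetical toy glossary: one English gloss per Chinese character.
# A real system would need a far larger dictionary than this.
GLOSSARY = {
    "我": "I",
    "去": "go",
    "年": "year",
    "跟": "with",
    "的": "of",
    "朋": "friend",
    "友": "friend",
    "一": "one",
    "起": "rise",
    "了": "(completed)",
    "中": "middle",
    "国": "kingdom",
}


def translate_token_by_token(sentence: str) -> str:
    """Replace each character with its gloss, ignoring grammar and word order."""
    # Unknown characters become "?" so the gaps stay visible.
    return " ".join(GLOSSARY.get(char, "?") for char in sentence)


print(translate_token_by_token("我去年跟我的朋友一起去了中国"))
```

Because every character is looked up in isolation, nothing in this function can reorder words or notice that 去年 together means “last year”.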

It should be clear why this method isn’t great. Many times, the words that we say derive their meaning from other words, through humor, emotion, references to earlier parts of the conversation, and complex grammar structures. For example, how would a computer deal with a marker like 了, which exists in Chinese to denote a completed action, when in English we put the entire verb into the past tense? How would a computer deal with English connecting words like “that”, which have no direct translation in Chinese? A better system was needed, so other methods were developed.

Evolution of the Technology

After a long time, the method changed. Translation moved towards an intermediate-language approach, using vectors. A vector is a list of numbers that a computer can use to capture some of the meaning of the input language.

Think of it as a special language that only computers can understand. Basically, the computer uses a lot of data that it has gathered about a language from various corpora and other sources to translate the input into this computer language. From there, the vector is translated into the other language, with no need for data that directly connects the two languages.

Think of it like a radio. A song is recorded and turned into a sound file. Then a radio station takes that song and transmits it as a radio signal. Finally, a radio receives the signal and turns it back into audio.
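The encode-then-decode idea can be sketched in a few lines of Python. Everything here is a made-up illustration: the two-number vectors are invented by hand, and the names `encode`, `decode`, and `translate` are hypothetical. Real systems learn vectors with hundreds of dimensions from large amounts of text.

```python
import math

# Hypothetical hand-made "meaning" vectors. Words with similar meanings
# sit close together, no matter which language they belong to.
ENGLISH = {"cat": (1.0, 0.0), "dog": (0.0, 1.0), "water": (1.0, 1.0)}
CHINESE = {"猫": (0.9, 0.1), "狗": (0.1, 0.9), "水": (0.9, 1.1)}


def encode(word, vocabulary):
    """Turn a word into its vector (the 'computer language')."""
    return vocabulary[word]


def decode(vector, vocabulary):
    """Pick the word whose vector is closest to the given vector."""
    return min(vocabulary, key=lambda word: math.dist(vocabulary[word], vector))


def translate(word):
    """English -> vector -> Chinese, with no English-Chinese dictionary."""
    return decode(encode(word, ENGLISH), CHINESE)


for english_word in ENGLISH:
    print(english_word, "->", translate(english_word))
```

Note that the two vocabularies are never linked to each other directly. They only share the same vector space, which plays the role of the radio signal in the analogy above.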

This form of translation works very well, but there are still mistakes. Complex grammar structures and context are still difficult to interpret. So what can we do?

The Future

Machine learning is the future for translation as well as for many other parts of computer science. Machine learning, and deep learning in particular, relies on very large amounts of data and on a lot of feedback that is used to adjust the program’s parameters.

For example, a bot was created that could play Dota 2, the successor to a video game called Defense of the Ancients. There are many objectives in this multiplayer game, but the most important ones are killing enemy players and destroying their base. A powerful computer was able to run thousands of simulations simultaneously, and the bots started by entering random commands. The programmers asked the bots to maximize their points (received by killing players and winning the overall game), and the bots steadily became more efficient at completing these objectives. Now, one of these DOTA bots can defeat even the pro players. The computer isn’t thinking and controlling its on-screen character like a human would, but it knows which command sequences are the most efficient for achieving a victory.

The deep learning method for translation works in essentially the same way, except that the simulation earns its “points” by producing good translations. The feedback can be given by translators, or by comparisons to corpus data. This is really the best way for machine translation to continue to evolve.
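As a rough sketch of how “points” could come from corpus data, the Python snippet below scores a candidate translation by how many of its words also appear in a reference translation. The function name `overlap_score` and the scoring rule are assumptions made for illustration. Metrics used in practice are more sophisticated, and a real training loop would use such scores to adjust the network rather than just print them.

```python
from collections import Counter


def overlap_score(candidate: str, reference: str) -> float:
    """Fraction of candidate words that also occur in the reference."""
    candidate_words = Counter(candidate.lower().split())
    reference_words = Counter(reference.lower().split())
    total = sum(candidate_words.values())
    if total == 0:
        return 0.0
    # Counter intersection keeps each word at its smaller count, so a
    # repeated word is only rewarded as often as the reference uses it.
    matched = sum((candidate_words & reference_words).values())
    return matched / total


reference = "Last year I went to China together with my friend"
candidates = [
    "I go year with I of friends together went middle kingdom",
    "Last year I went to China with my friend",
]

# The candidate with the most "points" is the one the system should prefer.
best = max(candidates, key=lambda text: overlap_score(text, reference))
for text in candidates:
    print(round(overlap_score(text, reference), 2), text)
```

With this reference, the fluent candidate outscores the token-for-token one, which is the kind of signal a learning system can be pushed towards.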

If you think about it, we are able to speak our native language because of repetition and the large variety of scenarios that we have learned to interpret. When we learn a new language, we must hear hundreds or even thousands of conversations in order to figure out its grammar.

Our brains are essentially computers in that way, albeit much more powerful ones.

It is fitting, then, that the programs that learn from all this data are called neural networks.

Some day, machines will be capable of giving near-perfect translations from any language into any other, but the human interpreter will always be needed to give the translation the warmth and personality that only a living, breathing being can provide.
