Mr. Translator
Published in Mr. Translator

What are the approaches of machine translation?

Machine translation methods can be divided into rule-based machine translation (RBMT) and corpus-based machine translation (CBMT). The latter can be subdivided, according to the modeling method used, into example-based machine translation (EBMT), statistical machine translation (SMT) and neural machine translation (NMT).
Below, I will introduce the development background and main technical features of each method.

0. Summary of various machine translation approaches

The rule-based method is simple and easy to implement, but it relies entirely on linguistic knowledge summarized by hand and so faces the difficult problem of knowledge acquisition.
The corpus-based method, by contrast, learns translation knowledge automatically from bilingual parallel texts, so the cost of developing a new language pair is much lower than with the rule-based method.
Like the rule-based method, the example-based method works with coarse-grained knowledge and cannot capture context-sensitive constraints, so it tends to copy translation results mechanically. Its performance is clearly lower than that of statistical and neural machine translation.
In terms of maintainability and extensibility, a rule-based system depends on hand-written knowledge and is hard to maintain and extend, while the corpus-based method is data-driven and easy to maintain and extend.
However, when bilingual corpus resources are scarce or hard to obtain, the corpus-based approach becomes ineffective and the rule-based approach performs much better.

Apart from neural machine translation, all of these methods decompose translation into many sub-processes, including the selection of translation granularity, the matching and selection of translation transformation rules, the adjustment of word order in the translation, and decoding over the search space.
Some sub-processes sit upstream in the translation process and cannot be optimized directly against the final objective.
In addition, these methods, statistical methods included, can only use local context during translation, such as lexical and syntactic information within the basic translation unit, so they lack a global view of the sentence and the fluency of the translation suffers.
What these methods have in common is that translation is a white-box operation: each sub-process or sub-module can be controlled, updated and optimized individually, and external knowledge (such as dictionary resources, syntactic and semantic structure, or knowledge bases) is easy to integrate into the translation process to improve quality.
The neural network-based method turns translation into a black-box operation whose internal sub-process details are hard to inspect from the outside, but it optimizes the whole mapping from the source language side to the target language side directly. This reduces the information loss and misalignment caused by the hand-offs between sub-processes.
It also makes better use of global context during translation, including the entire source sentence and the part of the target translation generated so far, which makes the translation more fluent.
The neural network-based method is computationally expensive, involving a large number of floating-point and multi-dimensional vector operations, so model training usually requires graphics processing units (GPUs) or machine learning chips (such as the TPU developed by Google).
With the spread of the Internet and the rapid development of hardware, both the scale of corpus collection and the available computing power have grown.
Research and development of machine translation systems has moved from rule-based methods to corpus-based methods, and from statistical machine translation to neural machine translation.
At present, several major online multilingual translation systems, such as Microsoft Translator, Google Translate and Baidu Translate, have adopted neural machine translation.

1. Rule-based machine translation method.

From the 1950s, when the problem of machine translation was formally proposed, until the 1990s, the mainstream approach to machine translation was rule-based.
A rule-based machine translation system relies on manually compiled bilingual dictionaries and on translation transformation rules at various levels summarized by experts, which together form a translation knowledge base.
During translation, the computer uses the dictionaries and translation rules to decode each input source language sentence into a target language sentence.
The general process of rule-based machine translation can be divided into three stages: analysis, transformation and generation.
The analysis stage parses the source language sentence into a tree structure representation by means of word segmentation, part-of-speech analysis, lexical analysis and syntactic rule analysis.

In the transformation stage, the tree structure representation of the source language sentence is transformed into a tree structure representation of the target language by inserting and deleting words and phrases, mapping part-of-speech representations and reordering sentence constituents.

Specifically, the prepositional phrases and adverbs are moved from the left subtree to the right subtree, the auxiliary word “的” is deleted, and all source language part-of-speech representations are mapped to the corresponding target language part-of-speech representations.
The final generation stage turns the tree structure representation of the target language into a target language sentence. In this process, the source information in the tree structure is converted into target information according to the dictionary and other translation knowledge, and tense and voice are adjusted according to the grammar of the target language so that the translation conforms to its expression habits.
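The three stages can be sketched in a few lines of Python. This is only a toy illustration, not a real RBMT system: the Spanish-to-English lexicon, the single reordering rule and the function names (analyze, transform, generate) are all assumptions made up for the example, and the analysis step groups words by part of speech far more crudely than a real parser would. Tense and voice adjustment is left out.

```python
# Toy lexicon (hypothetical): source word -> (part of speech, target word).
LEXICON = {
    "el": ("DET", "the"),
    "gato": ("NOUN", "cat"),
    "negro": ("ADJ", "black"),
    "duerme": ("VERB", "sleeps"),
}


def analyze(sentence):
    """Analysis: tokenize, tag each word, and build a shallow tree S -> NP VP."""
    tagged = [(LEXICON[word][0], word) for word in sentence.lower().split()]
    noun_phrase = [leaf for leaf in tagged if leaf[0] in {"DET", "NOUN", "ADJ"}]
    verb_phrase = [leaf for leaf in tagged if leaf[0] == "VERB"]
    return ("S", [("NP", noun_phrase), ("VP", verb_phrase)])


def transform(tree):
    """Transformation: apply one reordering rule, NOUN ADJ -> ADJ NOUN, inside NPs."""
    label, children = tree
    new_children = []
    for child_label, leaves in children:
        leaves = list(leaves)
        if child_label == "NP":
            i = 0
            while i < len(leaves) - 1:
                if leaves[i][0] == "NOUN" and leaves[i + 1][0] == "ADJ":
                    leaves[i], leaves[i + 1] = leaves[i + 1], leaves[i]
                    i += 2
                else:
                    i += 1
        new_children.append((child_label, leaves))
    return (label, new_children)


def generate(tree):
    """Generation: read the leaves left to right and look up each target word."""
    _, children = tree
    words = [LEXICON[source][1] for _, leaves in children for _, source in leaves]
    return " ".join(words)


def translate(sentence):
    return generate(transform(analyze(sentence)))


print(translate("el gato negro duerme"))  # the black cat sleeps
```

A word missing from the lexicon raises a KeyError here, which is a small-scale picture of the coverage problem that hand-built dictionaries and rules run into.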

Rule-based machine translation methods take small-scale data or linguists’ own language intuition as the source of translation knowledge.
Their advantage is that they do not rely on a large-scale corpus, so a translation system for a resource-poor language pair can be set up quickly.
In addition, the dictionaries and rule sets are usually small, so the demands on computer performance are low.
The disadvantage is that rules are described at a coarse granularity, which leads to stiff, mechanical and low-quality sentence translations.
Collecting bilingual dictionaries and summarizing translation rules is a complex process: the quality and scale of the rules depend on the knowledge and experience of linguists, and the labor cost is high.
The coverage of translation rules is also low, and the rule base is expensive to maintain.
Because language itself is ambiguous, once a certain number of translation rules have accumulated it becomes difficult for manually compiled rules to handle ambiguity correctly, and new rules start to conflict with, or prove incompatible with, existing ones.

With the development of computer technology, large text corpora that are useful for translation have become available, and they are valuable resources for machine translation research.
However, it is difficult for rule-based systems to make effective use of these resources.
If a system could learn translation knowledge automatically from previous translation examples, it would be much easier to update automatically, and its robustness and translation accuracy would improve.
Translation methods based on large-scale real texts have therefore gradually emerged and replaced rule-based systems.
The rule-based translation method is not useless, however. It can be brought in to make up for the shortcomings of other machine translation methods: besides serving resource-poor language pairs, it can also handle low-frequency words and special sentence patterns.

2. Example-based machine translation method.

The example-based machine translation method was proposed by the Japanese scholar Professor Makoto Nagao in the 1980s (Nagao, 1984).
In this method, a translation knowledge base, or example base, is first constructed automatically from bilingual parallel text; monolingual semantic dictionaries and bilingual dictionaries are then introduced, and source language sentences are translated by analogy.

The following figure is a structure diagram of an example-based machine translation method.
The basic idea is to divide the source language sentence into phrase fragments that have been seen in translation examples. Then, by analogy, examples similar to each fragment to be translated (possibly several) are retrieved from the example base using sentence similarity (such as string-based, word-based or vector space model-based matching) and semantic dictionaries, and the fragments are matched against them to obtain similar example sentences.
A series of replacement operations (sometimes including deletion and insertion) is then applied to obtain the translation of each phrase fragment, and finally all the fragments are assembled into the target sentence through recombination and adjustment.
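The retrieve-then-replace idea can be sketched as follows. This is a minimal sketch under strong assumptions: the example base and dictionary are invented for illustration, similarity is plain word-level matching via Python's difflib.SequenceMatcher, only one-for-one word replacement is handled (no insertion or deletion), and the whole sentence is treated as a single fragment.

```python
from difflib import SequenceMatcher

# Hypothetical example base: (source tokens, target tokens) pairs.
EXAMPLES = [
    ("i eat apples".split(), "yo como manzanas".split()),
    ("she reads books".split(), "ella lee libros".split()),
]

# Hypothetical bilingual dictionary used for the replacement step.
DICTIONARY = {
    "apples": "manzanas",
    "bread": "pan",
    "books": "libros",
    "magazines": "revistas",
}


def similarity(tokens_a, tokens_b):
    """Word-based similarity between two token lists, in the range [0, 1]."""
    return SequenceMatcher(None, tokens_a, tokens_b).ratio()


def translate(sentence):
    source = sentence.lower().split()
    # Retrieval: pick the stored example whose source side is most similar.
    example_source, example_target = max(
        EXAMPLES, key=lambda example: similarity(source, example[0])
    )
    target = list(example_target)
    # Replacement: wherever the input differs from the example by an equal-length
    # run of words, swap the corresponding dictionary translations in the target.
    matcher = SequenceMatcher(None, example_source, source)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "replace" or (i2 - i1) != (j2 - j1):
            continue
        for old, new in zip(example_source[i1:i2], source[j1:j2]):
            if old in DICTIONARY and new in DICTIONARY:
                old_target, new_target = DICTIONARY[old], DICTIONARY[new]
                target = [new_target if word == old_target else word for word in target]
    return " ".join(target)


print(translate("I eat bread"))  # yo como pan
```

The sketch works well only because the input is nearly identical to a stored example, which mirrors the point that the method depends on finding highly similar examples.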

As can be seen, the example-based machine translation method needs no complex linguistic analysis of sentences and can use an existing translation example base directly.
Its first disadvantage is that there is no whole-sentence optimization method or mathematical optimization model to guide the selection of examples and find the optimal translation.
Secondly, the method matches examples at the sentence level, which is coarse-grained, and cannot make good use of phrase and context information for fine-grained matching. As a result, the existing translation example base cannot be fully exploited to extend the sentence coverage of example matching (for instance, examples with low overall similarity can still supply translations of some words or phrases).
This method therefore shows its advantages only when highly similar examples can be found.

3. Statistical machine translation methods.

In the early 1990s, IBM researchers pioneered a corpus-based statistical machine translation method (Brown et al., 1992; Brown et al., 1993).
Unlike the example-based method, the statistical machine translation method automatically learns translation knowledge between fine-grained phrase fragments of the source and target languages.
During translation, the statistical machine translation method first divides the source language sentence into phrase fragments.
A phrase translation table learned from a bilingual corpus is then used to convert each source phrase fragment into a suitable target phrase fragment.
At the same time, the target phrase fragments are put into a suitable order, and finally a complete target translation is generated.
The statistical machine translation method models the translation of the whole source language sentence mathematically, as a probability model.
Each phrase fragment conversion and each order adjustment in the translation process has a corresponding probability.
Different ways of segmenting the source sentence into phrases, different conversion results for each fragment and different orderings of the target fragments together form a huge search space.
The statistical machine translation method finds the highest-probability path through this search space, and the target language sentence produced by the corresponding operations is the final optimal translation.
The advantage of the statistical machine translation method is that it introduces a mathematical model that optimizes the translation objective and steers each operation in the translation process toward the optimal translation.
It no longer relies on manual compilation of translation rules and can automatically learn fine-grained phrase-level translation knowledge.
In addition, this method is clearly better than rule-based and example-based machine translation in robustness and extensibility. It can resolve language ambiguity using probability values, a translation system can be built quickly from an existing corpus, and translation performance can be improved automatically by adding training data.
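The search for the highest-probability path can be illustrated with a very small decoder. The sketch below is a simplification under stated assumptions: the phrase table and its probabilities are invented, target phrases are kept in source order (no reordering model), and there is no language model, so the score of a path is just the product of its phrase translation probabilities. Dynamic programming over source positions then picks the best segmentation and the best translation of each phrase.

```python
import math

# Hypothetical phrase table: source phrase -> list of (target phrase, probability).
PHRASE_TABLE = {
    ("el",): [("the", 0.9)],
    ("gato",): [("cat", 0.8), ("tomcat", 0.2)],
    ("el", "gato"): [("the cat", 0.7)],
    ("duerme",): [("sleeps", 0.6), ("is sleeping", 0.4)],
}


def decode(sentence, max_phrase_len=3):
    """Return the most probable monotone translation and its probability."""
    source = tuple(sentence.lower().split())
    n = len(source)
    # best[i] holds (log-probability, target phrases) for the best way to
    # translate the first i source words.
    best = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for end in range(1, n + 1):
        for start in range(max(0, end - max_phrase_len), end):
            prev_score, prev_output = best[start]
            if prev_score == -math.inf:
                continue  # the prefix source[:start] cannot be covered
            for target, prob in PHRASE_TABLE.get(source[start:end], []):
                score = prev_score + math.log(prob)
                if score > best[end][0]:
                    best[end] = (score, prev_output + [target])
    score, phrases = best[n]
    if score == -math.inf:
        raise ValueError("no segmentation of the sentence is covered by the phrase table")
    return " ".join(phrases), math.exp(score)


translation, probability = decode("el gato duerme")
print(translation, round(probability, 3))  # the cat sleeps 0.432
```

Here the word-by-word path (0.9 × 0.8 × 0.6 = 0.432) narrowly beats the path through the two-word phrase "el gato" (0.7 × 0.6 = 0.42), which shows how the probability values, rather than hand-written rules, decide between competing segmentations.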

Knowledge guidance of Statistical Machine Translation

4. The method of neural machine translation.

The neural machine translation method uses a neural network to transform a source language sentence directly into a target language sentence. Specifically, an encoder transforms the source language sentence into a vector, which forms a distributed representation of the sentence. Based on this vector representation, a decoder then generates the target word sequence step by step until the whole target language sentence has been produced.
The characteristic of this method is that the whole translation process is a single end-to-end computation. Because it consists of vector-based numerical calculation, however, the intermediate results are difficult to analyze from a linguistic perspective.
The advantage of neural machine translation is that it can make full use of the contextual information in the sentence to produce high-quality translations, in particular highly fluent ones.
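The encoder-decoder data flow can be shown with a deliberately tiny sketch in pure Python. Everything in it is an assumption for illustration: the vocabularies, the vector size, the averaging encoder and the one-line recurrent update are made up, and the weights are random and untrained, so the output words are arbitrary. What the sketch shows is only the shape of the computation: a sentence becomes a vector, and the decoder emits one word per step until it produces an end-of-sentence token or hits a length limit.

```python
import math
import random

random.seed(0)  # fixed seed so the (untrained) toy model is reproducible
DIM = 8

SRC_VOCAB = ["el", "gato", "duerme"]
TGT_VOCAB = ["<s>", "</s>", "the", "cat", "sleeps"]


def random_vector():
    return [random.uniform(-1.0, 1.0) for _ in range(DIM)]


# Untrained, randomly initialised parameters (a real system learns these).
SRC_EMBEDDINGS = {word: random_vector() for word in SRC_VOCAB}
TGT_EMBEDDINGS = {word: random_vector() for word in TGT_VOCAB}
OUTPUT_WEIGHTS = {word: random_vector() for word in TGT_VOCAB}


def encode(tokens):
    """Encoder: average the source word vectors into one sentence vector."""
    vectors = [SRC_EMBEDDINGS[token] for token in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def decode(context, max_len=10):
    """Decoder: greedily generate target words one step at a time."""
    state = list(context)
    previous = "<s>"
    output = []
    for _ in range(max_len):
        # Update the hidden state from the old state and the previous target word.
        state = [math.tanh(s + e) for s, e in zip(state, TGT_EMBEDDINGS[previous])]
        # Score every candidate word by a dot product with the state.
        scores = {
            word: sum(a * b for a, b in zip(state, OUTPUT_WEIGHTS[word]))
            for word in TGT_VOCAB
            if word != "<s>"
        }
        previous = max(scores, key=scores.get)
        if previous == "</s>":
            break
        output.append(previous)
    return output


sentence_vector = encode("el gato duerme".split())
print(len(sentence_vector), decode(sentence_vector))
```

None of the intermediate values (the sentence vector, the hidden state, the scores) corresponds to a dictionary entry or a grammar rule, which is why the process is hard to analyze from a linguistic perspective.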
