Advances in NLP in 2017 (part II)

Unsupervised Machine Translation

Auto-Encoder (left); Encoder-Decoder (right)
  • BLEU: the BLEU score measures how many words and n-grams (sequences of n consecutive words) in a translation also appear in a reference translation. The most commonly used variant is BLEU-4, which considers unigrams (single words), bigrams, trigrams and 4-grams. It also applies a brevity penalty to translations that are too short.
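To make the BLEU definition above more concrete, here is a minimal sketch of a sentence-level BLEU-4 against a single reference. It assumes whitespace-tokenized input, uniform weights over the four n-gram orders, and no smoothing; the function names `ngrams` and `bleu` are illustrative and not taken from any library. Published BLEU numbers are normally computed at corpus level with standard tooling, so treat this only as an illustration of the idea.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, without smoothing (a sketch)."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        # Clipped overlap: a candidate n-gram is credited at most as many
        # times as it occurs in the reference.
        overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
        total = sum(cand.values())
        if overlap == 0 or total == 0:
            # Without smoothing, one empty n-gram order zeroes the score.
            return 0.0
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty: translations shorter than the reference are penalized.
    c, r = len(candidate), len(reference)
    bp = 1.0 if c >= r else math.exp(1 - r / c)

    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat".split()
print(bleu("the cat sat on the mat".split(), reference))  # perfect match
print(bleu("the cat sat on".split(), reference))          # correct but too short
```

In the second call every n-gram of the candidate appears in the reference, so the score is lowered only by the brevity penalty.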
Two Auto-Encoders and Crossover
  • GAN: Generative Adversarial Network. The idea behind a GAN can be summed up as “a network plays against itself and tries to deceive itself”. A GAN has three main components: the Generator, which produces representations of some input that resemble the ground truth as closely as possible; the Golden Source, which supplies the ground truth; and the Discriminator, which tries to tell whether its input comes from the Generator or the Golden Source. We “punish” the Generator when the Discriminator identifies the source correctly and, vice versa, “punish” the Discriminator when it cannot, so both the Generator and the Discriminator improve during training.
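The “punishment” in the GAN description above is usually expressed as a pair of loss functions. The sketch below is one common formulation, assumed here for illustration and not necessarily the one used in the papers discussed: binary cross-entropy for the Discriminator and the so-called non-saturating loss for the Generator. The inputs `d_real` and `d_fake` are hypothetical lists of Discriminator outputs, each a probability strictly between 0 and 1 that a sample came from the Golden Source.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy: real samples should score 1, generated ones 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating loss: the Generator wants its samples to score 1."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# The Discriminator is not fooled: generated samples get a low score.
print(discriminator_loss([0.9, 0.8], [0.1, 0.2]), generator_loss([0.1, 0.2]))

# The Discriminator is fooled: generated samples get a high score.
print(discriminator_loss([0.9, 0.8], [0.9, 0.8]), generator_loss([0.9, 0.8]))
```

In the first case the Generator's loss is high and the Discriminator's is low; in the second the roles are reversed. Training alternates gradient steps on the two losses.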
Two Vector Spaces Are Pulled on Each Other

Controllable Text Generation

Simple Recurrent Unit

SRU equations
SRU testing results





AI Research at Deep Learning lab @ MIPT

Valentin Malykh


