Advances in NLP in 2017 (part II)

Unsupervised Machine Translation

Auto-Encoder (left); Encoder-Decoder (right)
  • BLEU: the BLEU score measures how many words and n-grams (sequences of n consecutive words) in a translation also occur in a reference translation. The most commonly used version is BLEU-4, which combines the precision of unigrams, bigrams, trigrams and 4-grams. It also applies a brevity penalty to translations that are shorter than the reference.
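To make the BLEU definition concrete, here is a minimal sketch of a sentence-level BLEU-4 against a single reference. It assumes pre-tokenized input, no smoothing, and uniform weights over the four n-gram orders; the function names `ngrams` and `bleu` are illustrative and not taken from any particular library.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, without smoothing (a sketch)."""
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clipped overlap: a candidate n-gram is credited at most as many
        # times as it occurs in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            # Without smoothing, one n-gram order with no match gives a score of 0.
            return 0.0
        log_precision_sum += math.log(overlap / total)
    # Brevity penalty: only candidates shorter than the reference are penalized.
    if len(candidate) >= len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(reference) / len(candidate))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return brevity_penalty * math.exp(log_precision_sum / max_n)


reference = "the cat sat on the mat".split()
print(bleu(reference, reference))
print(bleu("the cat sat on".split(), reference))
```

In the second call every n-gram of the candidate appears in the reference, so the score is lowered only by the brevity penalty. Reported BLEU scores are usually computed over a whole corpus and often with several references, which this sketch does not cover.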
Two Auto-Encoders and Crossover
  • GAN: the Generative Adversarial Network. The idea of a GAN can be summed up as “a network plays against itself and tries to deceive itself”. A GAN has three main components: the Generator, which produces representations of some input that resemble the ground truth as closely as possible; the Golden Source, which supplies the ground truth; and the Discriminator, which tries to tell whether its input comes from the Generator or from the Golden Source. We “punish” the Generator when the Discriminator identifies the source correctly and, vice versa, we “punish” the Discriminator when it does not, so both the Generator and the Discriminator improve during training.
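The “punishment” described for GANs can be sketched as two loss functions over the Discriminator's outputs. This is only an illustration under assumptions: the Discriminator is taken to output the probability that its input came from the Golden Source, the Generator uses the common non-saturating loss, and the names `discriminator_loss` and `generator_loss` are hypothetical. No networks are trained here.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Cross-entropy loss for the Discriminator.

    d_real: Discriminator outputs on Golden Source samples, each in (0, 1).
    d_fake: Discriminator outputs on Generator samples, each in (0, 1).
    The loss is high when real samples score low or generated samples score high.
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating Generator loss (an assumed, commonly used variant).

    The loss is high when the Discriminator recognizes generated samples as fake.
    """
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# Case 1: the Discriminator tells the sources apart, so the Generator is punished.
sharp_real, sharp_fake = [0.9, 0.8], [0.1, 0.2]
# Case 2: the Discriminator is fooled, so the Discriminator is punished.
fooled_real, fooled_fake = [0.9, 0.8], [0.9, 0.8]

print(discriminator_loss(sharp_real, sharp_fake), generator_loss(sharp_fake))
print(discriminator_loss(fooled_real, fooled_fake), generator_loss(fooled_fake))
```

In a real training loop each loss would be minimized in alternation with respect to its own network's parameters, which is what makes both sides improve over time.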
Two Vector Spaces Are Pulled Toward Each Other

Controllable Text Generation

Simple Recurrent Unit

SRU equations
SRU testing results

Conclusion

AI Research at Deep Learning lab @ MIPT

Valentin Malykh