NIPS 2017 Overview

Taras Sereda
Published in BuzzRobot
Dec 17, 2017

This year's NIPS was a huge, educational event with a spirit of exploration and alchemy, and it was also really crowded.

Registration line, after first tutorial!
Quantum computer

First of all, here is a repo that collects the available slides, videos, and code from NIPS 2017 events: https://github.com/hindupuravinash/nips2017/blob/master/README.md#tutorials

Tutorials

Deep Learning: Practice and Trends. I attended the "trends" part, and what I saw impressed me. First of all, I wouldn't consider this a tutorial for beginners (which is what the abstract said about the event). It was well-structured, product-oriented material for deep learning practitioners. The main trends are: domain alignment/adaptation, graph-based neural networks, program induction, and others. Each trend was analyzed from the perspective of I/O, model architecture, and losses. Oriol Vinyals also described the major meta-learning approaches:

Video: https://www.youtube.com/watch?v=YJnddoa8sHk

Engineering and Reverse-Engineering Intelligence Using Probabilistic Programs, Program Induction, and Deep Learning.

Probabilistic programming is great, though underused right now, maybe due to its complexity. It gives you a way to describe a problem in a probabilistic fashion, which is a natural representation of the real world. You can encode uncertainty in your parameters, model inputs, and outputs, and get a set of possible execution traces of the program.

This field is actively evolving, and a plethora of tools is available at your disposal. I think that ideas from the probabilistic programming field could help a lot in tackling the issue of adversarial examples.
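As a minimal illustration (my own toy example in plain Python, not from the tutorial, and not tied to any particular probabilistic programming framework), a probabilistic program samples its uncertain parameters, so repeated runs produce a set of execution traces, and inference can be as simple as filtering traces that match an observation:

```python
import random

random.seed(0)

def coin_program():
    """A tiny probabilistic program: sample an uncertain coin bias,
    then flip the coin ten times."""
    bias = random.betavariate(2, 2)        # latent parameter with a prior
    flips = [random.random() < bias for _ in range(10)]
    return bias, flips

# Running the program many times yields a set of possible execution traces.
traces = [coin_program() for _ in range(1000)]

# Crude Monte Carlo inference: condition on observing at least 8 heads,
# keeping only the traces consistent with that observation.
posterior_biases = [bias for bias, flips in traces if sum(flips) >= 8]
```

Real systems replace this brute-force filtering with far more efficient inference engines, which is exactly where a fast "BLAS for Monte Carlo" runtime would help.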

Future directions of the field include:

  • integration with Deep Learning
  • building fast runtime for Monte Carlo inference (“BLAS” for Monte Carlo)

Powering the next 100 years. Invited talk by John Platt.

Inspiring ideas and analysis about the importance of fusion as a source of energy. The speaker also provided a link to an interactive tool where you can play with different proportions of energy sources and their combinations to satisfy the constraints. https://google.github.io/energystrategies

A lot needs to be done to make fusion successful. First of all, it should produce more energy than it consumes; the process should be scalable; and, most importantly for the end user, it should be affordable. The estimate is that by 2020 some of these prerequisites should be fulfilled.

Highlighted posters.

  • Disentangled representations
  • Paired-data image translation
  • Planning as an alternative to the attention mechanism
  • Hierarchical embeddings in hyperbolic space
  • Attention Is All You Need

Workshops

Machine Learning for Audio Signal Processing (ML4Audio)

A great workshop for audio researchers.

Acoustic word embeddings. The goal is to build a representation in some high-dimensional space where similar spoken words lie at small distances from each other. This problem is much more complex than in the text domain, since the variety in how people speak a single word is huge! The corresponding waveform differs from speaker to speaker, and it also depends on prosody and timbre. LSTMs are used to generate a fixed-length vector representation of a spoken word, and contrastive loss techniques are used to model the embedding space. This approach is also multi-view, meaning that character and acoustic representations are modeled jointly.

paper: https://arxiv.org/abs/1611.04496

code: https://github.com/opheadacheh/Multi-view-neural-acoustic-words-embeddings
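To make the contrastive idea concrete, here is a rough NumPy sketch of a triplet-style margin loss (my own simplification, not the authors' exact objective or code): embeddings of the same spoken word are pulled together, while embeddings of different words are pushed at least a margin apart.

```python
import numpy as np

def contrastive_margin_loss(anchor, positive, negative, margin=0.5):
    """Triplet-style contrastive loss: pull embeddings of the same
    spoken word together, push different words at least `margin` apart."""
    d_pos = np.linalg.norm(anchor - positive)   # same word, different speaker
    d_neg = np.linalg.norm(anchor - negative)   # different word
    return max(0.0, margin + d_pos - d_neg)

# Toy fixed-length "embeddings" (in practice, the final state of an
# LSTM run over the word's acoustic frames).
cat_speaker1 = np.array([1.0, 0.0])
cat_speaker2 = np.array([0.9, 0.1])
dog          = np.array([0.0, 1.0])

# Same word from two speakers is already closer than a different word,
# so the hinge is inactive and the loss is zero.
loss = contrastive_margin_loss(cat_speaker1, cat_speaker2, dog)
```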

Researchers from Google showed modifications on top of the Tacotron TTS model that allow it to capture the style of the speaker. The main idea is to learn a "style atomic vocabulary" and use a linear combination of "style atoms" to form a style vector that conditions the generation of the next timestep of the waveform. The conditioning is implemented via an attention mechanism. A fully connected layer decides in what ratio text attention and style attention should be mixed, by simply predicting a scalar weight for the style vectors.

Samples are available: https://google.github.io/tacotron/publications/uncovering_latent_style_factors_for_expressive_speech_synthesis/index.html
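The "linear combination of style atoms via attention" step can be sketched in a few lines of NumPy (shapes and names here are hypothetical, chosen only to illustrate the mechanism, not taken from the paper):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Hypothetical learned "style atomic vocabulary": 4 style atoms,
# each an 8-dimensional embedding.
rng = np.random.default_rng(0)
style_atoms = rng.standard_normal((4, 8))

def style_vector(query):
    """Attention over the style vocabulary: softmax scores give the
    mixing weights, and the style vector is the resulting linear
    combination of style atoms."""
    scores = softmax(style_atoms @ query)   # (4,) attention weights, sum to 1
    return scores @ style_atoms             # (8,) style conditioning vector

query = rng.standard_normal(8)              # e.g. derived from the decoder state
style = style_vector(query)
```

In the full model, this style vector would then be mixed with the text attention context before predicting the next waveform timestep.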

Machine Learning for Creativity and Design. I'll show only a few posters.

Learning Disentangled Features: from Perception to Control

This workshop covers an important topic on the way towards controllable generative models. A year ago, only InfoGANs had been presented (to my knowledge); now many more ideas have been developed. One of them is the β-VAE, where the KL loss term is weighted in order to constrain the information bottleneck to be disentangled, passing through only the information that encodes the factors of variation in the data.

The field is quite new, and there is no common way of measuring the level of disentanglement. Researchers from DeepMind proposed a modification of the β-VAE as well as a new disentanglement metric.
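The weighted-KL idea behind the β-VAE objective is simple enough to sketch directly (a minimal NumPy illustration of the loss, assuming a diagonal Gaussian posterior and a standard normal prior; this is not DeepMind's implementation):

```python
import numpy as np

def gaussian_kl(mu, logvar):
    """KL divergence between the diagonal Gaussian N(mu, exp(logvar))
    and the standard normal prior N(0, I)."""
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

def beta_vae_loss(recon_error, mu, logvar, beta=4.0):
    """beta-VAE objective: reconstruction error plus a beta-weighted
    KL term. beta > 1 tightens the information bottleneck, encouraging
    disentangled latent factors; beta = 1 recovers the plain VAE."""
    return recon_error + beta * gaussian_kl(mu, logvar)

# When the posterior already matches the prior, the KL term vanishes
# and only the reconstruction error remains.
mu, logvar = np.zeros(3), np.zeros(3)
loss = beta_vae_loss(recon_error=1.25, mu=mu, logvar=logvar)
```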

All in all, this year's NIPS was great, with an awesome party at the end. Cheers!
