Podcast: Top AI Breakthroughs with Ian Goodfellow & Richard Mallah (FLI)

An incredibly useful summary of major 2016 advancements that explains underlying concepts and future directions

Jacob Younan
AI From Scratch
5 min read · Mar 1, 2017

--

Wow. This may be the best content I’ve come across yet that both built my foundational understanding and got me excited. Before I go any further, thank you Future of Life Institute (FLI) for putting this together!

Why is this so useful?

  • Clearly articulates why breakthroughs occurred without assuming you have significant prior understanding of the topic areas
  • Delivered by recognized experts who were measured in their enthusiasm and honest about limitations and concerns
  • Referenced research papers along the way that speak to the moment in time when each breakthrough occurred and allow you to go deeper

What Does It Cover?

Start to 16:30: The origination of neural nets and the barriers to progress from the 1950s through to 2006. They then discuss progress from three-layer neural nets to nets with thousands of layers, with particularly rapid advances from 2012 to 2015.

16:30–21:22: DeepMind’s AlphaGo explained, with particular focus on why Go is such a challenging game. A system that evaluates every possible move or scenario across the 361 intersections of the 19×19 board would be infeasible in real time. Instead, networks are designed to analyze the state of a game at a given moment, estimate the probability of winning, and quickly assess a small set of promising moves (and subsequent moves) that optimize the chances of winning. Playing against itself over thousands of iterations led to its superhuman skill.
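The shape of that idea can be sketched in a few lines. Everything below is a toy stand-in: the “policy” and “value” functions are arbitrary deterministic scorers, not trained networks, and the “game” is just string-building; only the structure (rank candidate moves, keep a few, score the resulting states) mirrors the approach described:

```python
import random

def policy_priors(state, legal_moves):
    """Toy 'policy network': assign each legal move a prior probability."""
    rng = random.Random(hash((state, tuple(legal_moves))) & 0xFFFF)
    weights = [rng.random() for _ in legal_moves]
    total = sum(weights)
    return {m: w / total for m, w in zip(legal_moves, weights)}

def value_estimate(state):
    """Toy 'value network': score a state (stand-in for win probability)."""
    return (hash(state) & 0xFF) / 255.0

def choose_move(state, legal_moves, apply_move, top_k=3):
    """Instead of searching every move, keep only the top_k by policy prior,
    then pick the one whose resulting state gets the best value estimate."""
    priors = policy_priors(state, legal_moves)
    candidates = sorted(legal_moves, key=lambda m: priors[m], reverse=True)[:top_k]
    return max(candidates, key=lambda m: value_estimate(apply_move(state, m)))

# Toy "game": states are strings, a move appends a character.
moves = list("abcdefghij")
best = choose_move("start", moves, lambda s, m: s + m)
print(best)
```

The point is the pruning: of ten legal moves, only three states are ever evaluated, which is what makes the approach tractable where exhaustive search is not.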

21:22–25:34: AlphaGo playing itself to learn transitions into the concept of Generative Adversarial Networks (GANs), where they outline the roles of two networks (adversaries): the generator and the discriminator. The generator creates outputs (e.g. an image) while the discriminator attempts to determine the probability that the output came from the training set. The goal is for the generator to essentially fool the discriminator. Through enough trial and error, the generator produces incrementally more convincing outputs as it implicitly learns the discriminator’s criteria.

25:34–31:30: A dive into conditional generative models, quickly described as models that generate outputs based on an input that acts as high-level instructions or constraints. The examples provided (I’m not sure all are conditional GANs) include DeepMind’s WaveNet, which creates human-like voices from text input (though apparently quite slow today), and the creation of images from text inputs. In 2014 it was a big deal to generate a caption from an image; now generating photo-realistic images from a caption is a reality thanks to a model called StackGAN. This blew my mind.
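The “conditional” part is mechanically simple, and a one-layer sketch makes it concrete. Here the condition is a one-hot class label concatenated with the noise vector before the generator’s first layer; all names and sizes are illustrative, not from any particular paper:

```python
import numpy as np

rng = np.random.default_rng(0)
noise_dim, n_classes, out_dim = 8, 3, 16
W = rng.normal(size=(noise_dim + n_classes, out_dim)) * 0.1  # one linear "layer"

def generate(label):
    """Same network, steered by an extra conditioning input."""
    z = rng.normal(size=noise_dim)               # random noise
    cond = np.zeros(n_classes)                   # one-hot condition
    cond[label] = 1
    return np.tanh(np.concatenate([z, cond]) @ W)

sample = generate(label=2)
print(sample.shape)  # (16,)
```

In a text-to-image model like the ones discussed, the one-hot label is replaced by an embedding of the caption, but the principle is the same: the generator’s output is a function of both the noise and the instruction.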

Videos aren’t far off either, with first steps already being taken here. Generative models are also being taken into pharmaceuticals, with teams like the one at InSilico looking to create new molecules for drug discovery at a more efficient rate than the industry standard.

31:30–34:30: A quick reference to automated machine learning (AutoML), a concept whose application saw a sharp increase in 2016, with researchers using machine learning to improve neural network architectures. Essentially ‘learning to learn’. OpenAI and the Google Brain team have had particular success using reinforcement learning in this domain in 2016. To me, this is reminiscent of concepts described by Nick Bostrom and others as recursive self-improvement.
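The outer loop of ‘learning to learn’ can be shown in miniature. This sketch searches over a toy “architecture” (just the hidden width of a random-feature regressor) by training each candidate and keeping the best; the work described in the podcast uses a reinforcement-learned controller to propose architectures, whereas plain random search stands in here only to show the loop:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (200, 1))
y = np.sin(3 * X[:, 0])                           # toy regression target

def fit_and_score(width):
    """Train a random-feature model of the given hidden width; return training MSE."""
    W = rng.normal(size=(1, width))
    b = rng.normal(size=width)
    H = np.tanh(X @ W + b)                        # random hidden features
    coef, *_ = np.linalg.lstsq(H, y, rcond=None)  # fit the output layer
    return float(np.mean((H @ coef - y) ** 2))

# The "controller": propose candidate architectures, evaluate, keep the best.
best_width, best_mse = None, float("inf")
for _ in range(20):
    width = int(rng.integers(2, 64))
    mse = fit_and_score(width)
    if mse < best_mse:
        best_width, best_mse = width, mse

print(best_width, round(best_mse, 4))
```

The expensive part in practice is that each proposal requires a full training run, which is why making the controller smarter than random search matters.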

34:30–38:00: The three discuss Google’s Neural Machine Translation (NMT) next. I’ve linked to a helpful NYT article that walks through the story in greater detail here. What’s missing from the Times story, and may be more exciting, is the concept of an ‘interlingua’: the data Google’s models create that ties concepts and meaning together when translating one language into another. I think of it as a language-independent structure that represents the meaning of words and phrases. What this lets Google do is translate language pairs, say Arabic and Japanese, that don’t have great training data sets, a gap that would otherwise limit the languages NMT could support. This Google Research post summarizing it is fascinating. The idea of non-language-specific representations is really exciting and reminds me of Cortical’s Semantic Fingerprinting.

Data visualization of Google’s interlingua
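A crude analogy for the pivot trick (my own toy, with a hypothetical two-word vocabulary; the real interlingua is a learned vector space, not a dictionary): if every language can be encoded into and decoded from a shared concept space, an Arabic-to-Japanese translation works even though no Arabic–Japanese pairs were ever seen directly:

```python
# Encoders: (language, word) -> shared, language-independent concept.
to_concept = {
    ("en", "water"): "CONCEPT_WATER", ("ar", "ماء"): "CONCEPT_WATER",
    ("en", "book"):  "CONCEPT_BOOK",  ("ar", "كتاب"): "CONCEPT_BOOK",
}
# Decoders: (language, concept) -> word in that language.
from_concept = {
    ("ja", "CONCEPT_WATER"): "水",
    ("ja", "CONCEPT_BOOK"):  "本",
}

def translate(word, src, tgt):
    concept = to_concept[(src, word)]      # encode into the shared space
    return from_concept[(tgt, concept)]    # decode into the target language

print(translate("كتاب", src="ar", tgt="ja"))  # → 本
```

No entry maps Arabic to Japanese directly; the shared middle layer is what makes the unseen pair work, which is the property the podcast highlights.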

38:00–41:15: Next, Ian and Richard speak to how a broad conceptual framework like an interlingua applies in other spaces: essentially, training a model that can learn general rules for applying its behaviors in situations that are similar, but not identical, to its training situations (e.g. unusual language pairs). Ian highlights OpenAI’s Universe project as an example: a model trained on specific games in a given genre, say racing, can then apply its learned behaviors to racing games it’s never played. He also mentions ambitions to take this into learning how to use internet browsers.

41:15–45:00: They then tackle a number of other notable topics from 2016, including the advancement of machine learning security, where researchers have begun to identify ways of tricking certain systems, such as wearing specially colored glasses that can fool facial recognition systems. This is viewed as a positive, as organizations look to close loopholes that could become problematic vulnerabilities.
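One well-known form of this tricking is the fast gradient sign method (FGSM), which Goodfellow co-introduced: nudge an input a small step in the direction that most increases the model’s loss. The model and numbers below are toy stand-ins (a fixed two-weight logistic classifier), but the perturbation rule is the real one:

```python
import numpy as np

w, c = np.array([2.0, -1.0]), 0.5         # a fixed logistic-regression "model"

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def loss(x, label):
    """Cross-entropy loss of the model's prediction against the true label."""
    p = sigmoid(w @ x + c)
    return -np.log(p if label == 1 else 1 - p)

x = np.array([1.0, 0.5])                   # an input correctly classified as 1
grad_x = (sigmoid(w @ x + c) - 1) * w      # d loss / d x for label 1
x_adv = x + 0.3 * np.sign(grad_x)          # FGSM step with epsilon = 0.3

print(float(loss(x, 1)), float(loss(x_adv, 1)))  # adversarial loss is larger
```

The perturbation is small and structured rather than random, which is why such attacks can survive being printed onto physical objects like glasses.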

45:00–50:50: Ian and Richard then speak to 2017 as the year we’re likely to see continued emphasis on unsupervised learning, with the goal of removing the limits created by the need for massive labeled data sets. The thought is this will considerably widen applications for machine learning, democratize it for those without critical data sets, and create a clearer path to general intelligence and problem-solving ability. There’s also a discussion about the need to begin using ML techniques to identify fake news and other fake content (images, video, voice), which will be more easily created by the advances in GANs discussed earlier in the conversation.

50:50–End: The podcast closes with future areas of concern and excitement. Two call-outs on concerns are the impact of automation on income inequality and the speed at which we’re closing in on general intelligence. They don’t discuss ongoing work on solutions in either area.

The future positives center on potential leaps in healthcare and the removal of drudgery from people’s everyday lives, particularly at work.
