The Wild Week in AI — Self-Driving car kills pedestrian; Random Search vs. Deep RL; MCTS Tutorial; And more;

Original from the newsletter at https://www.getrevue.co/profile/wildml/issues/the-wild-week-in-ai-self-driving-car-kills-pedestrian-random-search-vs-deep-rl-mcts-tutorial-and-more-100531

If you enjoy the newsletter, please consider sharing it on Twitter, Facebook, etc! Really appreciate the support :)


News

Self-driving Uber kills Arizona pedestrian in a fatal crash

www.theguardian.com

Tempe police said the Uber car was in autonomous mode at the time of the crash and that the vehicle hit a woman who later died at a hospital. This is the first fatal self-driving car crash involving a pedestrian.

SambaNova Systems raises $56M for AI hardware

techcrunch.com

The startup was founded by two Stanford professors, Kunle Olukotun and Chris Ré, and is led by former Oracle SVP of development Rodrigo Liang. Olukotun and Liang wouldn’t go into the specifics of the architecture, but they are looking to redo the operational hardware to optimize for the AI-centric frameworks that have become increasingly popular in fields like image and speech recognition.

Release: TensorFlow 1.7.0 RC1 with Eager Execution

github.com

The Eager Execution mode is now moving out of the contrib package and into the core of TensorFlow. Other changes include easier computation of custom gradients, a graphical TensorFlow debugger, and SQLite datasets.

Skyline AI raises $3M

techcrunch.com

Skyline AI is an Israeli startup that uses machine learning to help real estate investors identify promising properties. It announced that it has raised $3 million in seed funding from Sequoia Capital.

Posts, Articles, Tutorials

Random Search vs. Model-Free RL

www.argmin.net

Can simple random search outperform reinforcement learning algorithms on benchmark problems such as the MuJoCo tasks? According to this post, the answer is yes. Also take a look at the corresponding paper.

Monte Carlo Tree Search Beginner's Guide

int8.io

An in-depth introduction to Monte Carlo Tree Search (MCTS), which is used in many board game agents, including chess engines and AlphaGo. Its purpose is to choose the most promising next move given the current game state.
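
To make the idea concrete, here is a rough sketch of the four usual MCTS phases (selection, expansion, simulation, backpropagation) on a toy game. The game (single-pile Nim, take 1 to 3 stones, whoever takes the last stone wins), the `Node` class, and the exploration constant 1.4 are all illustrative assumptions, not taken from the linked tutorial.

```python
import math
import random


class Node:
    def __init__(self, pile, parent=None, move=None):
        self.pile = pile      # stones left for the player about to move
        self.parent = parent
        self.move = move      # stones taken to reach this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= pile]
        self.visits = 0
        self.wins = 0.0       # wins for the player who moved INTO this node


def uct_child(node, c=1.4):
    # UCT score: average win rate plus an exploration bonus (c is an assumed constant).
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def random_playout(pile, rng):
    """Return True if the player to move at `pile` wins under uniformly random play."""
    turns = 0
    while pile > 0:
        pile -= rng.choice([m for m in (1, 2, 3) if m <= pile])
        turns += 1
    # Whoever takes the last stone wins; an odd turn count means the first mover did.
    return turns % 2 == 1


def mcts(pile, n_iters=2000, seed=0):
    rng = random.Random(seed)
    root = Node(pile)
    for _ in range(n_iters):
        node = root
        # 1. Selection: descend while the node is fully expanded and not terminal.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one previously untried move as a new child.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.pile - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        just_moved_wins = not random_playout(node.pile, rng)
        # 4. Backpropagation: flip the winner's perspective at every level.
        while node is not None:
            node.visits += 1
            if just_moved_wins:
                node.wins += 1
            just_moved_wins = not just_moved_wins
            node = node.parent
    # Recommend the most-visited move at the root.
    return max(root.children, key=lambda ch: ch.visits).move
```

Calling `mcts(5)` returns the number of stones to take from a pile of five. In this game the winning strategy is to leave the opponent a multiple of four, so the search should concentrate its visits on taking one stone.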

The Machine Learning Reproducibility Crisis

petewarden.com

Reproducibility is hard. In machine learning, we’re still in the dark ages when it comes to tracking changes and rebuilding models. This post lays out some of the challenges and how we might approach them.

Understanding Deep Learning through Neuron Deletion

deepmind.com

A group of researchers from DeepMind measured the performance impact of damaging the network by deleting individual neurons as well as groups of neurons. They found that interpretable neurons are no more important than confusing neurons with difficult-to-interpret activity, and that networks which correctly classify unseen images are more resilient to neuron deletion than networks which can only classify images they have seen before.
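
The measurement itself can be sketched in a few lines. This is only a toy illustration: the two-input network, its hand-set weights (`HIDDEN_W`, `OUTPUT_W`), the four data points, and the choice to "delete" a unit by clamping its activation to zero are all assumptions made for the example, not details from the DeepMind post.

```python
def relu(x):
    return max(0.0, x)


# Hypothetical hand-set weights: one pair of input weights per hidden unit.
HIDDEN_W = [(1.0, -1.0), (-1.0, 1.0), (0.5, 0.5)]
OUTPUT_W = [1.0, -1.0, 0.0]

# Toy task: label is 1 when the first input exceeds the second.
DATA = [((2.0, 1.0), 1), ((1.0, 2.0), 0), ((3.0, 0.0), 1), ((0.0, 3.0), 0)]


def predict(x, deleted=()):
    hidden = [relu(w0 * x[0] + w1 * x[1]) for w0, w1 in HIDDEN_W]
    for i in deleted:
        hidden[i] = 0.0  # assumed form of deletion: clamp the activation to zero
    score = sum(w * h for w, h in zip(OUTPUT_W, hidden))
    return 1 if score > 0 else 0


def accuracy(data, deleted=()):
    return sum(predict(x, deleted) == y for x, y in data) / len(data)


def ablation_report(data):
    """Map each hidden unit index to the accuracy lost when that unit is deleted."""
    base = accuracy(data)
    return {i: base - accuracy(data, deleted=(i,)) for i in range(len(HIDDEN_W))}
```

Passing a tuple of several indices as `deleted` gives the group-deletion variant. Running `ablation_report` on training data and on held-out data separately would be one way to compare how resilient a network is in each setting.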

Code, Projects & Data

RL-Adventure: PyTorch Deep Q Learning tutorials

github.com

A collection of Deep Q Learning algorithms implemented in PyTorch as Jupyter notebooks, with clean and readable code. This repository is a good starting point for understanding the differences between the various algorithms.

How to train a neural coreference model

medium.com

This blog post walks you through how a coreference resolution system works and how to train it using the CoNLL 2012 dataset. The full code is available on GitHub.

Stochastic Weight Averaging in PyTorch

github.com

This repository contains a PyTorch implementation of the Stochastic Weight Averaging (SWA) training method for DNNs from the paper Averaging Weights Leads to Wider Optima and Better Generalization.
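
The core averaging step is simple enough to sketch without any framework. The version below is a minimal illustration that treats a model as a flat list of floats; the function names are made up for this example and are not the repository's API, and it leaves out everything else a real training loop needs (when to take snapshots, the learning rate schedule, refreshing batch normalization statistics).

```python
def update_swa(swa_weights, new_weights, n_averaged):
    """Fold one more weight snapshot into the running mean of n_averaged snapshots."""
    return [
        (s * n_averaged + w) / (n_averaged + 1)
        for s, w in zip(swa_weights, new_weights)
    ]


def average_snapshots(snapshots):
    """Equal-weight average of a sequence of weight snapshots, computed incrementally."""
    swa = list(snapshots[0])
    for n, weights in enumerate(snapshots[1:], start=1):
        swa = update_swa(swa, weights, n)
    return swa
```

The incremental form means only one extra copy of the weights has to be kept in memory, rather than every snapshot.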

LabNotebook: Monitor machine learning experiments

github.com

LabNotebook is a pure Python tool that allows you to monitor, record, save, and query all of your machine learning experiments. The library looks promising but is still in alpha.

Highlighted Research Papers

[1803.07055] Simple random search provides a competitive approach to reinforcement learning

arxiv.org

The authors introduce a random search method for training static, linear policies for continuous control problems, matching state-of-the-art sample efficiency on the benchmark MuJoCo locomotion tasks. The search algorithm is at least 15 times more efficient than the fastest competing model-free methods on these benchmarks.
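
For intuition, here is a minimal sketch of plain random search over a parameter vector. It is not the authors' exact algorithm: it omits the refinements the paper adds on top of the basic method, and `rollout_reward` is a hypothetical stand-in for the return of a simulated episode (here just a quadratic with a known optimum). In the paper's setting, `theta` would hold the weights of a linear policy.

```python
import random


def rollout_reward(theta, target):
    """Toy stand-in for an episode return: higher is better, best at theta == target."""
    return -sum((t - g) ** 2 for t, g in zip(theta, target))


def basic_random_search(target, n_iters=300, n_dirs=8, step=0.05, noise=0.1, seed=0):
    rng = random.Random(seed)
    dim = len(target)
    theta = [0.0] * dim
    for _ in range(n_iters):
        update = [0.0] * dim
        for _ in range(n_dirs):
            # Sample a random direction and evaluate the reward on both sides of theta.
            delta = [rng.gauss(0.0, 1.0) for _ in range(dim)]
            plus = [t + noise * d for t, d in zip(theta, delta)]
            minus = [t - noise * d for t, d in zip(theta, delta)]
            diff = rollout_reward(plus, target) - rollout_reward(minus, target)
            # Weight each direction by how much better its "plus" side did.
            for i in range(dim):
                update[i] += diff * delta[i]
        # Move along the averaged, reward-weighted directions.
        theta = [t + step / n_dirs * u for t, u in zip(theta, update)]
    return theta
```

No gradients of the reward are computed anywhere; the update uses only reward differences between perturbed parameter vectors. The step size, noise scale, and number of directions above are arbitrary values chosen for the toy problem.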

[1803.08240] An Analysis of Neural Language Modeling at Multiple Scales

arxiv.org

Many of the leading approaches in language modeling introduce novel, complex, and specialized architectures. The authors take existing state-of-the-art word-level language models based on LSTMs and QRNNs and extend them to both larger vocabularies and character-level granularity. When properly tuned, LSTMs and QRNNs achieve state-of-the-art results on character-level (Penn Treebank, enwik8) and word-level (WikiText-103) datasets, respectively. Results are obtained in only 12 hours (WikiText-103) to 2 days (enwik8) using a single modern GPU.

[1803.03835] Kickstarting Deep Reinforcement Learning

arxiv.org

The paper presents a method for using previously trained ‘teacher’ agents to kickstart the training of a new ‘student’ agent. The authors show that on a computationally intensive multi-task benchmark (DMLab-30), kickstarted training improves the data efficiency of new agents, allowing for faster iteration. The same kickstarting pipeline can allow a single student agent to leverage multiple ‘expert’ teachers that specialize in individual tasks. In this setting, the kickstarted agent matches the performance of an agent trained from scratch in almost 10x fewer steps and surpasses its final performance by 42 percent.