#1: EfficientZero, Learning Informative Rewards, MuJoCo Becomes Free, RL Lecture Series

Enes Bilgin
Published in RL Agent · Sent as a Newsletter · 3 min read · Nov 3, 2021

The Inaugural Issue of the Reinforcement Learning Newsletter

Reinforcement Learning is one of the hottest areas in AI/ML. There is explosive growth in the number of research papers published, application stories released, open-source libraries created, and technology investments made related to RL. As a result, it is easy to miss important developments and get lost in the noise. That is why we are excited to curate, distill, and share the most important developments in RL with you via this bi-weekly newsletter. Subscribe to stay up to date and spread the word in your network.

Enjoy!

Number of RL papers published at the NeurIPS conference up to 2019. The data was originally curated by Katja Hofmann for her NeurIPS 2019 presentation, and the plot is from the book “Mastering Reinforcement Learning with Python”. Interest in RL has continued to grow since then in both academia and industry.

EfficientZero: Mastering Atari Games with Limited Data

In an impressive recent paper, researchers from Tsinghua University, UC Berkeley, and Shanghai Qi Zhi Institute were able to significantly exceed human performance on the Atari 100k benchmark with only two hours of real-time game experience, a first in the RL literature. They call their model EfficientZero.

Photo by Senad Palic on Unsplash

The authors also highlight that “EfficientZero’s performance is also close to DQN’s performance at 200 million frames while we consume 500 times less data.” The algorithm takes a model-based approach and combines it with Monte Carlo Tree Search on image observations.
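The "two hours" and "500 times less data" figures can be sanity-checked with some quick arithmetic. The snippet below is a back-of-the-envelope sketch, not something taken from the paper's code. It assumes the usual Atari conventions, which the text above does not state: 100k agent steps, a frame skip of 4, and 60 frames per second of real-time play.

```python
# Back-of-the-envelope check of the Atari 100k data budget.
# Assumed conventions (not stated in the newsletter itself):
AGENT_STEPS = 100_000      # environment steps allowed by the Atari 100k benchmark
FRAME_SKIP = 4             # game frames per agent step (assumed standard setting)
FPS = 60                   # Atari frames per second of real-time play (assumed)
DQN_FRAMES = 200_000_000   # DQN data budget quoted by the authors

frames = AGENT_STEPS * FRAME_SKIP   # total game frames consumed
hours = frames / FPS / 3600         # equivalent real-time game play in hours
data_ratio = DQN_FRAMES / frames    # how much more data DQN consumes

print(f"Frames used: {frames:,}")
print(f"Real-time experience: {hours:.2f} hours")
print(f"DQN uses {data_ratio:.0f}x more frames")
```

Under these assumptions, the budget works out to 400,000 frames, or just under two hours of play, which lines up with both numbers quoted above.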

MURAL: Making RL Tractable by Learning More Informative Reward Functions

Reward function design is one of the most significant challenges in RL and is closely related to the exploration problem. In recent work, UC Berkeley researchers propose a “method for learning uncertainty-aware rewards for RL”, or MURAL.

MURAL diagram

Rather than relying on standard classifiers, MURAL trains uncertainty-aware classifiers via conditional normalized maximum likelihood (CNML) and uses them to define a class of tractable RL problems.
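To give a feel for the conditional normalized maximum likelihood (CNML) idea, here is a minimal sketch. It is not the authors' implementation. It assumes a tiny logistic regression classifier that is naively refit from scratch for every candidate label, toy 2-D "states", and hypothetical helper names (`fit_logistic`, `predict_success`, `cnml_success_prob`). For a query state, the classifier is refit once with the query labeled as a failure and once with it labeled as a success, and the two resulting likelihoods are normalized.

```python
import numpy as np

def fit_logistic(X, y, steps=500, lr=0.5, l2=1e-2):
    """Fit L2-regularized logistic regression (bias folded in) by gradient descent."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append a constant bias feature
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))       # predicted success probabilities
        w -= lr * (Xb.T @ (p - y) / len(y) + l2 * w)
    return w

def predict_success(w, x):
    """Probability that state x is a success under weights w."""
    return 1.0 / (1.0 + np.exp(-(np.append(x, 1.0) @ w)))

def cnml_success_prob(X, y, x_query):
    """CNML estimate of success: refit once per candidate label, then normalize."""
    likelihoods = []
    for label in (0.0, 1.0):
        # Pretend the query state carries this label and refit on the augmented data.
        w = fit_logistic(np.vstack([X, x_query]), np.append(y, label))
        p1 = predict_success(w, x_query)
        # Likelihood the refit model assigns to the label we pretended was true.
        likelihoods.append(p1 if label == 1.0 else 1.0 - p1)
    return likelihoods[1] / (likelihoods[0] + likelihoods[1])

# Toy data (hypothetical): failures on the left, successes on the right.
X = np.array([[-2.0, 0.0], [-1.5, 0.0], [-1.0, 0.0],
              [1.0, 0.0], [1.5, 0.0], [2.0, 0.0]])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

familiar = cnml_success_prob(X, y, np.array([1.5, 0.0]))   # looks like a success state
novel = cnml_success_prob(X, y, np.array([0.0, 5.0]))      # unlike anything seen so far
print(f"familiar success-like state: {familiar:.2f}")
print(f"novel state: {novel:.2f}")
```

In this toy setup, a state that resembles known successes gets a probability well above one half, while a state unlike any training example lands near one half because the classifier can fit either label equally well. That kind of uncertainty-aware output is what makes such a classifier usable as a reward signal that also encourages exploration.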

MuJoCo Becomes Free with DeepMind Acquisition

DeepMind recently announced its acquisition of the famous simulation software MuJoCo, a go-to package for physics-related RL research and applications. With that, MuJoCo, which used to cost at least $3000 per license, becomes freely available to everyone, which is great news for us all. DeepMind is also open-sourcing the MuJoCo code at its new GitHub location.

Chelsea Finn on Meta Learning and Model-Based RL

Episode 13 of The Gradient Podcast features Stanford Professor Chelsea Finn, a pioneer in meta-learning, who talks about her academic story and recent research related to model-based RL.

Reinforcement Learning Lecture Series 2021 by DeepMind and UCL

DeepMind and University College London (UCL) have released a comprehensive, graduate-level introductory course on modern reinforcement learning. The course consists of 13 lectures, ranging from Markov Decision Process basics to a practical Rainbow DQN implementation.

If you have found this newsletter useful, consider subscribing, following us on Twitter, and sharing it with your network. If you are interested in contributing stories, reach out to us at editor@rlagent.pub.

Enes Bilgin
RL Agent

Deep RL @ Microsoft Autonomous Systems | Author of therlbook.com | Advisor @ CSU Engineering Leadership Program