#5: GPT-3 Gets Better with RL, Hugging Face & Stable-baselines3, Meet Evolution Gym, Offline RL’s Tailwinds

Enes Bilgin

Published in

RL Agent

3 min read · Feb 2, 2022

OpenAI Releases InstructGPT, Enabling GPT-3 to Follow Instructions

OpenAI has fine-tuned GPT-3 using reinforcement learning from human feedback to make it better at following instructions, and the results are impressive! The new model, called InstructGPT, is surprisingly good at understanding the intent in user prompts and generating effective answers.

An example InstructGPT response in comparison to GPT-3. Source: OpenAI Blog
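
For readers curious what learning from human feedback can look like in code, here is a minimal sketch of the pairwise preference loss commonly used to train a reward model from human comparisons. It is only an illustration under simplifying assumptions, not OpenAI's implementation: the function name `preference_loss` and the plain scalar rewards are hypothetical, and a real setup would compute the rewards with a neural network over full responses.

```python
import math


def preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Pairwise ranking loss: -log(sigmoid(r_preferred - r_rejected)).

    Hypothetical helper for illustration. The loss is small when the
    response a human preferred gets the higher reward, and large otherwise.
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(margin)) equals log(1 + exp(-margin)); branch on the
    # sign of the margin so exp() never receives a large positive input.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# The reward model agrees with the human ranking: low loss.
agree = preference_loss(reward_preferred=2.0, reward_rejected=0.0)
# The reward model disagrees with the human ranking: higher loss.
disagree = preference_loss(reward_preferred=0.0, reward_rejected=2.0)
print(agree < disagree)
```

The trained reward model then provides the reward signal that the RL step optimizes against.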

Hugging Face Integrates Stable-Baselines3 into the Hugging Face Hub

Hugging Face, best known for its NLP library, is moving into RL by integrating Stable-Baselines3 into its Hub. Stable-Baselines3 is a well-known RL package containing PyTorch implementations of widely used deep RL algorithms, and it improves upon OpenAI’s Baselines. The move comes after Hugging Face announced its first ML-Agents environment, Snowball Fight.

Meet Evolution Gym for Soft Robots

Evolution Gym is the first large-scale benchmark for co-optimizing the design and control of soft robots, a challenging undertaking in robotics and control. The benchmark set includes tasks such as walking, object manipulation, climbing, and locomotion. Evolution Gym was first introduced at NeurIPS 2021, and the authors have recently open-sourced the implementation.

An Example Evolution Gym Environment. Source: Evolution Gym GitHub Repo.

Offline RL’s Tailwinds

The interest in Offline RL continues with new publications. Here are some papers you should take a look at:

The Number of RL Resources Ticks Up

As RL gains popularity, more and more resources become available. Here is a quick curated list of some recent ones.

Other Recent Publications You Might Find Interesting

Call for Papers

Academic Positions

If you have found this newsletter useful, consider subscribing on Medium and LinkedIn, following us on Twitter, and sharing it with your network. If you are interested in contributing stories or have academic positions to feature, reach out to us at editor@rlagent.pub.
