#5: GPT-3 Gets Better with RL, Hugging Face & Stable-baselines3, Meet Evolution Gym, Offline RL’s Tailwinds

Enes Bilgin
Published in RL Agent
3 min read · Feb 2, 2022

OpenAI Releases InstructGPT, Enabling GPT-3 to Follow Instructions

OpenAI has fine-tuned GPT-3 using reinforcement learning from human feedback to make it better at following instructions, and the results are impressive! The new model, called InstructGPT, is surprisingly good at understanding the intent in user prompts and generating effective answers.

An example InstructGPT response in comparison to GPT-3. Source: OpenAI Blog
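To give a feel for what "human feedback" means in practice, here is a minimal sketch of the pairwise preference loss used to train a reward model, as described in the InstructGPT paper rather than in this newsletter. A labeler picks the better of two responses, and the reward model is pushed to score the chosen one higher. The function names and the toy scores below are illustrative assumptions, not OpenAI code.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Loss for one human comparison: -log(sigmoid(r_chosen - r_rejected))."""
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals softplus(-m); this form avoids overflow for large |m|.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def mean_preference_loss(pairs):
    """Average the loss over (reward_chosen, reward_rejected) score pairs."""
    return sum(preference_loss(c, r) for c, r in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical reward-model scores for three labeled comparisons.
    toy_pairs = [(1.2, 0.3), (0.1, 0.4), (2.0, -1.0)]
    print(mean_preference_loss(toy_pairs))
```

The loss is small when the model already ranks the preferred response higher and grows when it disagrees with the labeler. In the full pipeline, the trained reward model then provides the reward signal for an RL fine-tuning step.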

Hugging Face Integrates Stable-Baselines3 into the Hugging Face Hub

Hugging Face, popular for its NLP libraries, is moving into RL by integrating Stable-Baselines3 into its Hub. Stable-Baselines3 is a well-known RL package containing PyTorch implementations of widely used deep RL algorithms, and it improves upon OpenAI’s Baselines. The move comes after Hugging Face announced its first ML-Agents environment, Snowball Fight.
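As a rough sketch of what the integration looks like from the user's side, the snippet below downloads a saved agent from the Hub and loads it with Stable-Baselines3. It assumes the `huggingface_sb3` and `stable_baselines3` packages are installed; the helper name `load_ppo_from_hub` is ours, and the repository and file names in the usage comment are assumptions that may not match what is currently hosted.

```python
def load_ppo_from_hub(repo_id: str, filename: str):
    """Download a saved Stable-Baselines3 PPO checkpoint from the Hub and load it."""
    # Imported lazily so this file can be imported without the optional packages.
    from huggingface_sb3 import load_from_hub
    from stable_baselines3 import PPO

    # load_from_hub returns the local path of the downloaded checkpoint file.
    checkpoint_path = load_from_hub(repo_id=repo_id, filename=filename)
    return PPO.load(checkpoint_path)


# Example usage (requires network access; repo and file names are assumed):
# model = load_ppo_from_hub("sb3/demo-hf-CartPole-v1", "ppo-CartPole-v1.zip")
```

The checkpoint must have been trained with the same algorithm class you load it with, so a DQN or A2C agent would need the corresponding Stable-Baselines3 class instead of `PPO`.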

Meet Evolution Gym for Soft Robots

Evolution Gym is the first large-scale benchmark for co-optimizing the design and control of soft robots, a challenging undertaking in robotics and control. The benchmark set includes tasks such as walking, object manipulation, climbing, and locomotion. Evolution Gym was first introduced at NeurIPS 2021, and the authors have recently open-sourced the implementation.

An Example Evolution Gym Environment. Source: Evolution Gym GitHub Repo.
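To make "co-optimizing design and control" concrete, here is a toy sketch of the two-level loop: an outer evolutionary search over robot bodies and an inner search over a controller for each body. It does not use the Evolution Gym package. The voxel encoding, the `toy_reward` function (a stand-in for a physics rollout), and the single-parameter "controller" are all assumptions made for illustration.

```python
import random

# Illustrative encoding: 0 empty, 1 rigid, 2 soft, 3 and 4 two kinds of actuator.
VOXEL_TYPES = [0, 1, 2, 3, 4]


def random_design(rng, size=5):
    """Sample a size x size grid of voxel types."""
    return [[rng.choice(VOXEL_TYPES) for _ in range(size)] for _ in range(size)]


def mutate(design, rng, rate=0.1):
    """Resample each voxel with probability `rate`."""
    return [[rng.choice(VOXEL_TYPES) if rng.random() < rate else v for v in row]
            for row in design]


def toy_reward(design, gain):
    """Made-up score, NOT a simulator: actuators help, body mass penalizes large gains."""
    actuators = sum(v in (3, 4) for row in design for v in row)
    body = sum(v != 0 for row in design for v in row)
    return actuators * gain - 0.1 * body * gain ** 2


def optimize_controller(design, rng, trials=20):
    """Inner loop: random search over one control parameter for a fixed body."""
    return max(toy_reward(design, rng.uniform(0.0, 2.0)) for _ in range(trials))


def co_optimize(generations=10, population=8, survivors=2, seed=0):
    """Outer loop: keep the best bodies, refill the population with mutated copies."""
    rng = random.Random(seed)
    designs = [random_design(rng) for _ in range(population)]
    history = []
    for _ in range(generations):
        scored = sorted(((optimize_controller(d, rng), d) for d in designs),
                        key=lambda pair: pair[0], reverse=True)
        history.append(scored[0][0])
        elite = [d for _, d in scored[:survivors]]
        designs = elite + [mutate(rng.choice(elite), rng)
                           for _ in range(population - survivors)]
    return elite[0], history


if __name__ == "__main__":
    best_design, history = co_optimize()
    print(history)
```

In a real benchmark run, the inner loop would be a full RL training run in a physics simulator, which is why evaluating each candidate body is expensive and why co-design is considered hard.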

Offline RL’s Tailwinds

The interest in Offline RL continues with new publications. Here are some papers you should take a look at:

Photo by Mika Baumeister on Unsplash

If you have found this newsletter useful, consider subscribing on Medium and LinkedIn, following us on Twitter, and sharing it with your network. If you are interested in contributing stories or have academic positions to feature, reach out to us at editor@rlagent.pub.

Enes Bilgin
RL Agent

Deep RL @ Microsoft Autonomous Systems | Author of therlbook.com | Advisor @ CSU Engineering Leadership Program