#5: GPT-3 Gets Better with RL, Hugging Face & Stable-baselines3, Meet Evolution Gym, Offline RL’s Tailwinds
OpenAI Releases InstructGPT, Enabling GPT-3 to Follow Instructions
OpenAI has fine-tuned GPT-3 using reinforcement learning from human feedback to make it better at following instructions, and the results are impressive! The new model, called InstructGPT, is surprisingly good at understanding the intent in user prompts and generating effective answers.
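For readers curious about what "human feedback" looks like in practice: one ingredient described in the InstructGPT paper is a reward model trained on human comparisons between pairs of model outputs, using a pairwise loss of the form -log(sigmoid(r_preferred - r_rejected)). The snippet below is only a minimal sketch of that loss on plain numbers. The function names and the example scores are hypothetical, and it is not OpenAI's implementation.

```python
import math


def pairwise_preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Return -log(sigmoid(r_preferred - r_rejected)) for one comparison.

    Hypothetical helper for illustration; the rewards stand in for scalar
    scores that a reward model would assign to two candidate answers.
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mean_preference_loss(pairs):
    """Average the pairwise loss over (preferred, rejected) score pairs."""
    return sum(pairwise_preference_loss(w, l) for w, l in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up scores: each tuple is (score of the answer a labeler preferred,
    # score of the answer the labeler rejected).
    example_scores = [(2.0, 0.5), (0.1, 0.3), (1.2, 1.2)]
    print(mean_preference_loss(example_scores))
```

The loss shrinks when the preferred answer is scored well above the rejected one and grows when the ordering is reversed, which is what pushes a reward model toward agreeing with human rankings.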
Hugging Face Integrates Stable-Baselines3 into the Hugging Face Hub
Hugging Face, best known for its NLP libraries, is moving into RL by integrating Stable-Baselines3 into its Hub. Stable-Baselines3 is a well-known RL package that provides PyTorch implementations of widely used deep RL algorithms and improves upon OpenAI’s Baselines. The move comes after Hugging Face announced its first ML-Agents environment, Snowball Fight.
Meet Evolution Gym for Soft Robots
Evolution Gym is the first large-scale benchmark for co-optimizing the design and control of soft robots, a challenging undertaking in robotics and control. The benchmark set includes tasks such as walking, object manipulation, climbing, and locomotion. Evolution Gym was first introduced at NeurIPS 2021, and the authors have recently open-sourced the implementation.
Offline RL’s Tailwinds
Interest in offline RL continues, with new publications appearing. Here are some papers worth a look:
RL Resources on the Rise
As RL gains popularity, more and more resources are becoming available. Here is a quick curated list of some recent ones.
Other Recent Publications You Might Find Interesting
- Automated Reinforcement Learning (AutoRL): A Survey and Open Problems
- Finding General Equilibria in Many-Agent Economic Simulations Using Deep Reinforcement Learning
- Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble
- Approximate reinforcement learning to control beaconing congestion in distributed networks
- Reinforcement Learning Methods in Public Health
- Amazon Ads reinforcement learning research
- Modeling Complex Networks Based on Deep Reinforcement Learning
- UC Berkeley Researchers Introduce the Unsupervised Reinforcement Learning Benchmark (URLB)
- Deep Reinforcement Learning-Based Workload Scheduling for Edge Computing
- Run Time Assured Reinforcement Learning for Safe Satellite Docking
- Reinforcement Learning as a fine-tuning paradigm
Academic Positions
- PostDoc/Ph.D. in Deep RL and Social Dilemmas in ‘the Metaverse’, The Interdisciplinary Center, Herzliya (Reichman University), Israel by Prof. Doron Friedman
- 2 Ph.D. positions in Reinforcement Learning at TU Darmstadt with the Intelligent Autonomous Systems of Prof. Jan Peters
- Research fellow in Epistemic Artificial Intelligence
If you have found this newsletter useful, consider subscribing on Medium and LinkedIn, following us on Twitter, and sharing it with your network. If you are interested in contributing stories or have academic positions to feature, reach out to us at editor@rlagent.pub.