Robot Learning Restarting

Published in IntelligentUnit, Jul 5, 2018

After more than a year of exploring Deep Reinforcement Learning, Meta Learning, and Few-Shot Learning, I have finally decided to focus on one specific domain: Robot Learning. This is my very first blog post on Robot Learning on Medium, and it is in fact my study roadmap. Since I do not have many followers on Medium, I am willing to share my detailed thinking and study process here. I hope to make more friends on Medium.

The Goal

To develop a robust vision-based robot manipulation system that can perform a variety of real-world daily or industrial manipulation tasks.

Key intelligent features of the desired system:

1) Able to grasp/manipulate unseen objects
2) Able to learn new manipulation tasks from few-shot human demonstrations
3) Able to perform multiple tasks via language grounding
4) Able to memorize and to learn continually

1. Master Key RL Algorithms for Robot Learning:

1) TRPO, PPO
2) DDPG, D4PG, TD3
3) Soft Q-Learning, SAC

Understand all the details and reimplement all of the above algorithms in PyTorch or TensorFlow.
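As a warm-up for the reimplementation work, here is a minimal sketch of the clipped surrogate objective from PPO, written in plain Python so that it runs without PyTorch or TensorFlow. The function name `ppo_clip_objective` and the default `clip_eps=0.2` are illustrative assumptions, not part of any library.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. All names here are hypothetical.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s).
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, the objective stops growing once the ratio exceeds `1 + clip_eps`, so there is no incentive to move the policy further in a single update. In a PyTorch or TensorFlow version the same arithmetic would be done on tensors, and the negated value would serve as the loss.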

Paper List:

[1] Schulman, John, et al. “Trust region policy optimization.” _International Conference on Machine Learning_. 2015.

[2] Schulman, John, et al. “Proximal policy optimization algorithms.” _arXiv preprint arXiv:1707.06347_ (2017).

[3] Lillicrap, Timothy P., et al. “Continuous control with deep reinforcement learning.” _arXiv preprint arXiv:1509.02971_ (2015).

[4] Horgan, Dan, et al. “Distributed prioritized experience replay.” _arXiv preprint arXiv:1803.00933_ (2018).

[5] Barth-Maron, Gabriel, et al. “Distributed Distributional Deterministic Policy Gradients.” _arXiv preprint arXiv:1804.08617_ (2018).

[6] Fujimoto, Scott, Herke van Hoof, and Dave Meger. “Addressing Function Approximation Error in Actor-Critic Methods.” _arXiv preprint arXiv:1802.09477_ (2018).

[7] Haarnoja, Tuomas, et al. “Reinforcement learning with deep energy-based policies.” _arXiv preprint arXiv:1702.08165_ (2017).

[8] Haarnoja, Tuomas, et al. “Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor.” _arXiv preprint arXiv:1801.01290_ (2018).
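To make one idea from this list concrete: TD3 [6] builds its critic target from the smaller of two target-critic estimates, evaluated at a target action that has been smoothed with clipped noise. Below is a rough sketch of that target for a single transition with a scalar action. The callables `target_actor`, `target_q1`, and `target_q2`, as well as all default values, are hypothetical stand-ins for trained networks and tuned hyperparameters.

```python
import random


def clip(x, low, high):
    """Clamp x to the interval [low, high]."""
    return max(low, min(x, high))


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0,
               rng=random):
    """Critic target for one transition (scalar action, illustrative only)."""
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_state) + noise,
                       -max_action, max_action)
    # Clipped double-Q: use the smaller of the two target critics.
    q_next = min(target_q1(next_state, next_action),
                 target_q2(next_state, next_action))
    # Do not bootstrap past a terminal state.
    return reward + gamma * (0.0 if done else 1.0) * q_next
```

Taking the minimum of the two critics is what counters the overestimation that the paper title calls function approximation error. Both critics are then regressed toward this same target.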

2. Follow the Latest Work on Robot Learning, including Meta Learning, Imitation Learning and Reinforcement Learning:

1) Imitation Learning

[1] GAIL

[2] Third Person Imitation Learning

[3] Robust Imitation of Diverse Behaviors

…

2) Meta Learning

[1] Diversity is all you need

[2] Unsupervised Meta-Learning for Reinforcement Learning

3. Focus on the domain of Robot Grasping:

[1] Dex-Net 3.0: Computing Robust Robot Suction Grasp Targets using a New Analytic Model and Deep Learning

[2] QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation

[3] Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods

[4] Sim2Real View Invariant Visual Servoing by Recurrent Control

Reimplement the experiments in a simulated environment, such as a robot arm simulated in MuJoCo or PyBullet.

Written by Flood Sung