Resources for Deep Reinforcement Learning

13 min readSep 16, 2018

Last updated: December 28, 2018

This is a collection of resources for deep reinforcement learning, including the following sections: Books, Surveys and Reports, Courses, Tutorials and Talks, Conferences, Journals and Workshops, Blogs, and, Benchmarks and Testbeds. This blog is very long, with lots of resources. See Table of Contents.

This blog is based on Deep Reinforcement Learning: An Overview. These resources are about reinforcement learning core elements, important mechanisms, and applications, as in the overview, also include topics for deep learning, reinforcement learning, machine learning, and, AI. I compile this blog to complement the above book draft, for flexible updates.

If pick three study materials:

David Silver, Reinforcement Learning, 2015. Slides. Video.
Sergey Levine, UC Berkeley CS 294: Deep Reinforcement Learning
Sutton, R. S. and Barto, A. G. (2018). Reinforcement Learning: An Introduction (2nd Edition). MIT Press.

Two new ones came out recently:

Pick three survey papers:

LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature, 521:436–444.
Jordan, M. I. and Mitchell, T. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245):255–260.
Littman, M. L. (2015). Reinforcement learning improves behaviour from evaluative feedback. Nature, 521:445–451.

There are excellent invited talks, tutorials, workshops in recent conferences, like NIPS, ICML, ICLR, ACL, CVPR, AAAI, IJCAI, etc. Many of them are not included here.

Books

Reinforcement Learning:

Sutton, R. S. and Barto, A. G. (2018). Reinforcement Learning: An Introduction (2nd Edition). MIT Press. The definitive and intuitive reinforcement learning book. Accompanying Lectures, Python code.
Szepesvári, C. (2010). Algorithms for Reinforcement Learning. Morgan & Claypool.
Bertsekas, D. P. (2019). Reinforcement Learning and Optimal Control (draft). Athena Scientific.
Bertsekas, D. P. (2012). Dynamic programming and optimal control (Vol. II, 4th Edition: Approximate Dynamic Programming). Athena Scientific.
Bertsekas, D. P. and Tsitsiklis, J. N. (1996). Neuro-Dynamic Programming. Athena Scientific.
Powell, W. B. (2011). Approximate Dynamic Programming: Solving the curses of dimensionality (2nd Edition). John Wiley and Sons.
Wiering, M. and van Otterlo, M., editors (2012). Reinforcement Learning: State-of-the-Art. Springer.
Puterman, M. L. (2005). Markov decision processes : discrete stochastic dynamic programming. Wiley-Interscience.
Lattimore, T. and Szepesvári, C. (2018). Bandit Algorithms. Cambridge University Press.

Deep Learning

Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press.

Machine Learning

Bishop, C. (2011). Pattern Recognition and Machine Learning. Springer.
Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer.
Murphy, K. P. (2012). Machine Learning: A Probabilistic Perspective. The MIT Press.
Zhou, Z.-H. (2016). Machine Learning (in Chinese). Tsinghua University Press, Beijing, China.
Mitchell, T. (1997). Machine Learning. McGraw Hill.
James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer.
Kuhn, M. and Johnson, K. (2013). Applied Predictive Modeling. Springer.
Provost, F. and Fawcett, T. (2013). Data Science for Business. O’Reilly.
Simeone, O. (2017). A Brief Introduction to Machine Learning for Engineers. ArXiv.
Vapnik, V. N. (1998). Statistical Learning Theory. Wiley.
Haykin, S. (2008). Neural Networks and Learning Machines (third edition). Prentice Hall.

Causality

Pearl, J. (2009). Causality. Cambridge University Press.
Pearl, J., Glymour, M., and Jewell, N. P. (2016). Causal Inference in Statistics: A Primer. Wiley.
Pearl, J. and Mackenzie, D. (2018). The Book of Why: The New Science of Cause and Effect. Basic Books.
Peters, J., Janzing, D., and Schölkopf, B. (2017). Elements of Causal Inference: Foundations and Learning Algorithms. MIT Press.

Natural Language Processing (NLP)

Jurafsky, D. and Martin, J. H. (2017). Speech and Language Processing (3rd ed. draft). Prentice Hall.
Goldberg, Y. (2017). Neural Network Methods for Natural Language Processing. Morgan & Claypool.
Deng, L. and Liu, Y., editors (2018). Deep Learning in Natural Language Processing. Springer.

Semi-supervised Learning

Zhu, X. and Goldberg, A. B. (2009). Introduction to semi-supervised learning. Morgan & Claypool.

Learning to learn

Hutter, F., Kotthoff, L., and Vanschoren, J., editors (2018). Automatic Machine Learning: Methods, Systems, Challenges. Springer. In press, available at http://automl.org/book.
Chen, Z. and Liu, B. (2016). Lifelong Machine Learning. Morgan & Claypool.

Game Theory

Leyton-Brown, K. and Shoham, Y. (2008). Essentials of Game Theory: A Concise, Multidisciplinary Introduction. Morgan & Claypool.

Finance

Hull, J. C., Options, Futures and Other Derivatives, Prentice Hall.

Transportation

Bazzan, A. L. and Klügl, F. (2014). Introduction to Intelligent Systems in Traffic and Transportation. Morgan & Claypool

Artificial Intelligence

Russell, S. and Norvig, P. (2009). Artificial Intelligence: A Modern Approach (3rd edition). Pearson.

Go to Table of Contents

Surveys and Reports

Reinforcement Learning

Littman, M. L. (2015). Reinforcement learning improves behaviour from evaluative feedback. Nature, 521:445–451.
Kaelbling, L. P., Littman, M. L., and Moore, A. (1996). Reinforcement learning: A survey. JAIR, 4:237–285.
Li, Y. (2017). Deep Reinforcement Learning: An Overview. ArXiv.
Levine, S. (2018). Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review. ArXiv.
Recht, B. (2018). A Tour of Reinforcement Learning: The View from Continuous Control. ArXiv.
Geramifard, A., Walsh, T. J., Tellex, S., Chowdhary, G., Roy, N., and How, J. P. (2013). A tutorial on linear function approximators for dynamic programming and reinforcement learning. Foundations and Trends® in Machine Learning, 6(4):375–451.
Grondman, I., Busoniu, L., Lopes, G. A., and Babuška, R. (2012). A survey of actor-critic reinforcement learning: Standard and natural policy gradients. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 42(6):1291–1307.
Roijers, D. M., Vamplew, P., Whiteson, S., and Dazeley, R. (2013). A survey of multi-objective sequential decision-making. JAIR, 48:67–113.

Deep Learning

LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature, 521:436–444.
Poggio, T., Mhaskar, H., Rosasco, L., Miranda, B., and Liao, Q. (2017). Why and when can deep-but not shallow-networks avoid the curse of dimensionality: a review. International Journal of Automation and Computing, 14(5):503–519.
Bengio, Y., Courville, A., and Vincent, P. (2013). Representation learning: A review and new perspectives. TPAMI, 35(8):1798–1828.
Bengio, Y. (2009). Learning deep architectures for AI. Foundations and trends®in Machine Learning, 2(1):1–127.
Deng, L. and Dong, Y. (2014). Deep learning: Methods and applications. Foundations and Trends® in Signal Processing, 7(3–4):197–387.
Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61:85–117.
Wang, H. and Raj, B. (2017). On the Origin of Deep Learning. ArXiv.
Sze, V., Chen, Y.-H., Yang, T.-J., and Emer, J. (2017). Efficient Processing of Deep Neural Networks: A Tutorial and Survey. ArXiv.

Machine Learning

Jordan, M. I. and Mitchell, T. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245):255–260.
Domingos, P. (2012). A few useful things to know about machine learning. Communications of the ACM, 55(10):78–87.
Bottou, L., Curtis, F. E., and Nocedal, J. (2018). Optimization methods for large-scale machine learning. SIAM Review, 60(2):223–311.
Ng, A. (2018). Machine Learning Yearning (draft). deeplearning.ai.
Zinkevich, M. (2017). Rules of Machine Learning: Best Practices for ML Engineering.
Andrieu, C., de Freitas, N., Doucet, A., and Jordan, M. I. (2003). An introduction to MCMC for machine learning. Machine Learning, 50(1–2):5–43.

Causality

Pearl, J. (2018). The seven pillars of causal reasoning with reflections on machine learning. UCLA Technical Report R-481.
Guo, R., Cheng, L., Li, J., Hahn, P. R., and Liu, H. (2018). A Survey of Learning Causality with Data: Problems and Methods. ArXiv e-prints.

Graph Neural Networks

Battaglia, P. W., Hamrick, J. B., Bapst, V., et al. (2018). Relational inductive biases, deep learning, and graph networks. ArXiv.
Zhang, Z., Cui, P., and Zhu, W. (2018c). Deep learning on graphs: A survey. ArXiv.
Zhou, J., Cui, G., Zhang, Z., Yang, C., Liu, Z., and Sun, M. (2018a). Graph neural networks: A review of methods and applications. ArXiv.

Exploration

Li, L. (2012). Sample complexity bounds of exploration. In Wiering, M. and van Otterlo, M., editors, Reinforcement Learning: State-of-the-Art, pages 175–204. Springer-Verlag Berlin Heidelberg.

Transfer Learning

Taylor, M. E. and Stone, P. (2009). Transfer learning for reinforcement learning domains: A survey. JMLR, 10:1633–1685.
Pan, S. J. and Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10):1345–1359.
Weiss, K., Khoshgoftaar, T. M., and Wang, D. (2016). A survey of transfer learning. Journal of Big Data, 3(9).

Multi-task Learning

Zhang, Y., , and Yang, Q. (2018). An overview of multi-task learning. National Science Review, 5:30–43.
Ruder, S. (2017). An Overview of Multi-Task Learning in Deep Neural Networks. ArXiv.

Neural Architecture Search

Elsken, T., Hendrik Metzen, J., and Hutter, F. (2018). Neural Architecture Search: A Survey. ArXiv.

Learning to Learn

Chelsea Finn, Learning to Learn with Gradients, PhD thesis, 2018
Vanschoren, J. (2018). Meta-learning: A survey. ArXiv.

Successor Representation

Gershman, S. J. (2018). The successor representation: Its computational logic and neural substrates. Journal of Neuroscience, 38(33):7193–7200.

Bayesian RL

Ghavamzadeh, M., Mannor, S., Pineau, J., and Tamar, A. (2015). Bayesian reinforcement learning: a survey. Foundations and Trends in Machine Learning, 8(5–6):359–483.

Monte Carlo tree search (MCTS)

Browne, C., Powley, E., Whitehouse, D., Lucas, S., Cowling, P. I., Rohlfshagen, P., Tavener, S., Perez, D., Samothrakis, S., and Colton, S. (2012). A survey of Monte Carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games, 4(1):1–43.
Gelly, S., Schoenauer, M., Sebag, M., Teytaud, O., Kocsis, L., Silver, D., and Szepesvári, C. (2012). The grand challenge of computer go: Monte carlo tree search and extensions. Communications of the ACM, 55(3):106–113.

Attention and Memory

Olah, C. and Carter, S. (2016). Attention and augmented recurrent neural networks. Distill.
Denny Britz, Attention and Memory in Deep Learning and NLP

Intrinsic Motivation

Barto, A. (2013). Intrinsic motivation and reinforcement learning. In Baldassarre, G. and Mirolli, M., editors, Intrinsically Motivated Learning in Natural and Artificial Systems. Springer, Berlin, Heidelberg.
Schmidhuber, J. (2010). Formal theory of creativity, fun, and intrinsic motivation (1990–2010). IEEE Transactions on Autonomous Mental Development, 2(3):230–247.
Oudeyer, P.-Y. and Kaplan, F. (2007). What is intrinsic motivation? a typology of computational approaches. Frontiers in neurorobotics, 1(6).

Evolution Strategy

Hansen, N. (2016). The CMA Evolution Strategy: A Tutorial. ArXiv.

Robotics

Kober, J., Bagnell, J. A., and Peters, J. (2013). Reinforcement learning in robotics: A survey. International Journal of Robotics Research, 32(11):1238–1278.
Deisenroth, M. P., Neumann, G., and Peters, J. (2013). A survey on policy search for robotics. Foundations and Trend in Robotics, 2:1–142.
Argall, B. D., Chernova, S., Veloso, M., and Browning, B. (2009). A survey of robot learning from demonstration.Robotics and Autonomous Systems, 57(5):469–483.

Natural Language Processing (NLP)

Hirschberg, J. and Manning, C. D. (2015). Advances in natural language processing. Science, 349(6245):261–266.
Cho, K. (2015). Natural Language Understanding with Distributed Representation. ArXiv.
Young, T., Hazarika, D., Poria, S., and Cambria, E. (2017). Recent Trends in Deep Learning Based Natural Language Processing. ArXiv.

Dialogue Systems

Hinton, G., Deng, L., Yu, D., Dahl, G. E., rahman Mohamed, A., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T. N., , and Kingsbury, B. (2012). Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Processing Magazine, 82.
Deng, L. and Li, X. (2013). Machine learning paradigms for speech recognition: An overview. IEEE Transac- tions on Audio, Speech, and Language Processing, 21(5):1060–1089.
Gao, J., Galley, M., and Li, L. (2018). Neural approaches to Conversational AI. Foundations and Trends in Information Retrieval. To appear.
He, X. and Deng, L. (2013). Speech-centric information processing: An optimization-oriented approach. Proceedings of the IEEE | Vol. 101, №5, May 2013, 101(5):1116–1135.
Young, S., Gašić, M., Thomson, B., and Williams, J. D. (2013). POMDP-based statistical spoken dialogue systems: a review. Proceedings of IEEE, 101(5):1160–1179.

Computer Vision

Zhang, Q. and Zhu, S.-C. (2018). Visual interpretability for deep learning: a survey. Frontiers of Information Technology & Electronic Engineering, 19(1):27–39.
Bohg, J., Hausman, K., Sankaran, B., Brock, O., Kragic, D., Schaal, S., and Sukhatme, G. S. (2017). Interactive perception: Leveraging action in perception and perception in action. IEEE Transactions on Robotics, 33(6):1273–1291.

Recommender System

Zhang, S., Yao, L., Sun, A., and Tay, Y. (2017). Deep Learning based Recommender System: A Survey and New Perspectives. ArXiv e-prints.

Healthcare

Chakraborty, B. and Murphy, S. A. (2014). Dynamic treatment regimes. Annual Review of Statistics and Its Application, 1:447–464.

Energy

Anderson, R. N., Boulanger, A., Powell, W. B., and Scott, W. (2011). Adaptive stochastic control for the smart grid. Proceedings of the IEEE, 99(6):1098–1115.

Collection of Applications

Yuxi Li, Reinforcement Learning Applications
Satinder Singh, Successes of Reinforcement Learning
Csaba Szepesvári, RLApplications.bib

AI Safety

Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., and Mané, D. (2016). Concrete Problems in AI Safety. ArXiv.
Garcìa, J. and Fernàndez, F. (2015). A comprehensive survey on safe reinforcement learning. JMLR, 16:1437–1480.

Go to Table of Contents

Courses

Reinforcement Learning

David Silver, Reinforcement Learning, 2015. Slides. Video.
Sergey Levine, UC Berkeley CS 294: Deep Reinforcement Learning
Richard Sutton, Reinforcement Learning, 2016.
Katerina Fragkiadaki, Ruslan Satakhutdinov, Deep Reinforcement Learning and Control, Spring 2017
Emma Brunskill, CS234: Reinforcement Learning
Charles Isbell, Michael Littman and Chris Pryby, Udacity: Reinforcement Learning
Emo Todorov, Intelligent control through learning and optimization
OpenAI Spinning Up in Deep RL
Deep Reinforcement Learning Hands-On

Deep Learning

Andrew Ng and Kian Katanforoosh, Stanford CS230: Deep Learning
Andrew Ng, Deep Learning Specialization
Jeremy Howard, Practical Deep Learning For Coders
Nando de Freitas, Deep Learning Lectures
David Donoho, Hatef Monajemi, and Vardan Papyan, Stanford STATS 385, Theories of Deep Learning

Machine Learning

Andrew Ng, Machine Learning

Robotics

Pieter Abbeel, Advanced Robotics, Fall 2015
Abdeslam Boularias, Robot Learning Seminar
MIT 6.S094: Deep Learning for Self-Driving Cars

Computer Vision

Fei-Fei Li, Justin Johnson, and Serena Yeung, CS231n: Convolutional Neural Networks for Visual Recognition

NLP

Richard Socher, CS224d: Deep Learning for Natural Language Processing
Brendan Shillingford, Yannis Assael, Chris Dyer, Oxford Deep NLP 2017 course

Healthcare

David Sontag, Machine Learning for Healthcare

UC Berkeley CS188 Intro to AI
Andrew Critch and Stuart Russell, UC Berkeley CS 294–149: Safety and Control for Artificial General Intelligence

Go to Table of Contents

Tutorials and Talks

Reinforcement Learning

Rich Sutton, Introduction to Reinforcement Learning with Function Approximation
Rich Sutton, Temporal Difference Learning
Andrew Barto, A history of reinforcement learning
Deep Reinforcement Learning, David Silver, Pieter Abbeel, Sergey Levine and Chelsea Finn
David Silver, Principles of Deep RL
Benjamin Recht, Optimization Perspectives on Learning to Control
John Schulman, The Nuts and Bolts of Deep Reinforcement Learning Research
Joelle Pineau, Introduction to Reinforcement Learning
Deep Learning and Reinforcement Learning Summer School, 2018, 2017
Deep Learning Summer School, 2016, 2015
Yisong Yue and Hoang M. Le, Imitation Learning, ICML 2018 Tutorial

Deep Learning

Andrew Ng, Nuts and Bolts of Building Applications using Deep Learning
Christopher Manning and Russ Salakhutdinov, Introductory Overview Lecture The Deep Learning Revolution, JSM 2108 Tutorial
Sanjeev Arora, ICML 2018 Tutorial on Toward Theoretical Understanding of Deep Learning
Generative adversarial networks (GANs), NIPS 2018 (Arxiv), CVPR 2018
Simons Institute Interactive Learning Workshop, 2017
Simons Institute Representation Learning Workshop, 2017
Simons Institute Computational Challenges in Machine Learning Workshop, 2017
Yann LeCun, Learning world models: The next step towards AI.
Yoshua Bengio, From deep learning of disentangled representations to higher-level cognition
Joshua Tenenbaum, Building machines that learn & think like people
Michael I. Jordan, SysML 2018: Perspectives and Challenges

Robotics

Pieter Abbeel, Deep learning for robotics, NIPS 2017 Invited Talk (slides, Dec 2018)

Computer Vision

Jitendra Malik, IJCAI 2018 Research Excellence Award talk
Nick Rhinehart, Paul Vernaza, and Kris Kitan, Inverse reinforcement learning for computer vision, CVPR 2018 Tutorial

NLP

Jianfeng Gao, Michel Galley, and Lihong Li, Neural approaches to Conversational AI. ACL 2018 Tutorial.
William Wang, Jiwei Li, and Xiaodong He, Deep reinforcement learning for NLP. ACL 2018 Tutorial.

Fiance & Economics

Sendhil Mullainathan, Machine Learning and Prediction in Economics and Finance, AFA 2017 Lecture

Healthcare

Yan Liu and Jimeng Sun, Deep Learning Models for Health Care — Challenges and Solutions, ICML 2017 Tutorial
Deep Reinforcement Learning for Medical Imaging

Education

Curtis G. Northcutt, Artificial Intelligence in Online Education

Security

Dawn Song, AI and security: Lessons, challenges and future directions

Transportation

Deep Reinforcement Learning with Applications in Transportation, AAAI 2019 Tutorial

Go to Table of Contents

Conferences, Journals and Workshops

NIPS: Neural Information Processing Systems
ICML: International Conference on Machine Learning
ICLR: International Conference on Learning Representation
RLDM: Multidisciplinary Conference on Reinforcement Learning and Decision Making
EWRL: European Workshop on Reinforcement Learning
Deep Reinforcement Learning Workshop, NIPS 2018, 2017 (Symposium), 2016, 2015; IJCAI 2016
AAAI, IJCAI, ACL, EMNLP, NAACL, CVPR, ICCV, ECCV, ICRA, IROS, RSS, SIGDIAL, KDD, SIGIR, WWW, etc.
AI Frontiers Conference
JMLR, MLJ, AIJ, JAIR, TPAMI, etc
Nature Machine Intelligence, Science Robotics
Nature May 2015, Science July 2015, survey papers on machine learning/AI
Science, July 7, 2017 issue, The Cyberscientist, a special issue about AI
http://distill.pub

Go to Table of Contents

Blogs

Deepmind Blog，DeepMind Safety Research
Google Research Blog
The Google Brain Team — Looking Back on 2017(1,2), 2016
Berkeley AI Research Blog
OpenAI Blog, Spinning Up in Deep RL
Facebook AI Research (FAIR) Blog
http://rodneybrooks.com/blog/
Bandit algorithms
David Abel, notes: ICML 2018, AAAI 2018, NIPS 2017
Denny Britz, AI and Deep Learning in 2017 — A Year in Review
Denny Britz, Learning Reinforcement Learning (with Code, Exercises and Solutions)
Andrej Karpathy, Deep Reinforcement Learning: Pong from Pixels
Lilian Weng, A (Long) Peek into Reinforcement Learning
Alexander Irpan, Deep Reinforcement Learning Doesn’t Work Yet (Note: The title is wrong.)
Matthew Rahtz, Lessons Learned Reproducing a Deep Reinforcement Learning Paper
Junling Hu, Reinforcement learning explained — learning to act based on long-term payoffs
Li Deng, How deep reinforcement learning can help chatbots
Deep Learning
Reinforcement Learning

Go to Table of Contents

Benchmarks and Testbeds

I list some RL testbeds in the following. Common testbeds for general RL algorithms are Atari games, e.g., in the Arcade Learning Environment (ALE), for discrete control, and simulated robots, e.g. using MuJoCo in OpenAI Gym, for continuous control.

The Arcade Learning Environment (ALE) is a framework composed of Atari 2600 games to develop and evaluate AI agents.
OpenAI Gym is a toolkit for the development of RL algorithms, consisting of environments, e.g., Atari games and simulated robots, and a site for the comparison and reproduction of results. OpenAI Gym has the following environments: algorithmic, Atari, xox2d, classic control, MuJoCo, robotics, and, toy text.
MuJoCo, Multi-Joint dynamics with Contact, a physics engine.
DeepMind Control Suite
DeepMind Lab, DeepMind first-person 3D game platform
Deepmind PySC2 — StarCraft II Learning Environment
Dopamine, a Tensorflow-based RL framework from Google AI
TRFL: Reinforcement Learning Building Blocks
David Churchill, CommandCenter: StarCraft 2 AI Bot
ELF, an extensive, lightweight and flexible platform for RL research,
ELF OpenGo: A Reimplementation of AlphaGoZero/AlphaZero using ELF.
FAIR TorchCraft is a library for Real-Time Strategy (RTS) games such as StarCraft: Brood War.
FAIR Detectron, for computer vision.
Ray RLlib: A Composable and Scalable Reinforcement Learning Library
ParlAI is a framework for dialogue research, implemented in Python, open-sourced by Facebook.
Natural language decathlon (decaNLP), an NLP benchmark suitable for multitask, transfer, and continual learning.
Project Malmo, from Microsoft, is an AI research and experimentation platform built on top of Minecraft.
Twitter open-sources torch-twrl, a framework for RL development.
ViZDoom is a Doom-based AI research platform for visual RL.
Baidu Apollo Project, self-driving open-source
TORCS is a car racing simulator.
CoQA, a large-scale dataset for building conversational QA systems
WebNav Challenge for Wikipedia links navigation
Psychlab: A Psychology Laboratory for Deep RL Agents
RLGlue is a language-independent software for RL experiments.
RLPy is a value-function-based reinforcement learning framework for education and research.

Go to Table of Contents

Resources for Deep Reinforcement Learning

Table of Contents

Books

Surveys and Reports

Courses

Tutorials and Talks

Conferences, Journals and Workshops

Blogs

Benchmarks and Testbeds

Written by Yuxi Li