Last updated: December 28, 2018
This is a collection of resources for deep reinforcement learning, organized into the following sections: Books; Surveys and Reports; Courses; Tutorials and Talks; Conferences, Journals and Workshops; Blogs; and Benchmarks and Testbeds. This blog is long, with many resources, so see the Table of Contents.
This blog is based on Deep Reinforcement Learning: An Overview. These resources cover reinforcement learning core elements, important mechanisms, and applications, as in the overview, and also include topics in deep learning, reinforcement learning, machine learning, and AI. I compiled this blog to complement the book draft above, since a blog allows flexible updates.
If I had to pick three study materials:
- David Silver, Reinforcement Learning, 2015. Slides. Video.
- Sergey Levine, UC Berkeley CS 294: Deep Reinforcement Learning
- Sutton, R. S. and Barto, A. G. (2018). Reinforcement Learning: An Introduction (2nd Edition). MIT Press.
Two new ones came out recently.
If I had to pick three survey papers:
- LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature, 521:436–444.
- Jordan, M. I. and Mitchell, T. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245):255–260.
- Littman, M. L. (2015). Reinforcement learning improves behaviour from evaluative feedback. Nature, 521:445–451.
There are excellent invited talks, tutorials, and workshops at recent conferences, like NIPS, ICML, ICLR, ACL, CVPR, AAAI, and IJCAI. Many of them are not included here.
Books
Reinforcement Learning:
- Sutton, R. S. and Barto, A. G. (2018). Reinforcement Learning: An Introduction (2nd Edition). MIT Press. The definitive and intuitive reinforcement learning book. Accompanying Lectures, Python code.
- Szepesvári, C. (2010). Algorithms for Reinforcement Learning. Morgan & Claypool.
- Bertsekas, D. P. (2019). Reinforcement Learning and Optimal Control (draft). Athena Scientific.
- Bertsekas, D. P. (2012). Dynamic programming and optimal control (Vol. II, 4th Edition: Approximate Dynamic Programming). Athena Scientific.
- Bertsekas, D. P. and Tsitsiklis, J. N. (1996). Neuro-Dynamic Programming. Athena Scientific.
- Powell, W. B. (2011). Approximate Dynamic Programming: Solving the curses of dimensionality (2nd Edition). John Wiley and Sons.
- Wiering, M. and van Otterlo, M., editors (2012). Reinforcement Learning: State-of-the-Art. Springer.
- Puterman, M. L. (2005). Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley-Interscience.
- Lattimore, T. and Szepesvári, C. (2018). Bandit Algorithms. Cambridge University Press.
Deep Learning
- Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press.
Machine Learning
- Bishop, C. (2006). Pattern Recognition and Machine Learning. Springer.
- Hastie, T., Tibshirani, R., and Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer.
- Murphy, K. P. (2012). Machine Learning: A Probabilistic Perspective. The MIT Press.
- Zhou, Z.-H. (2016). Machine Learning (in Chinese). Tsinghua University Press, Beijing, China.
- Mitchell, T. (1997). Machine Learning. McGraw Hill.
- James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer.
- Kuhn, M. and Johnson, K. (2013). Applied Predictive Modeling. Springer.
- Provost, F. and Fawcett, T. (2013). Data Science for Business. O’Reilly.
- Simeone, O. (2017). A Brief Introduction to Machine Learning for Engineers. ArXiv.
- Vapnik, V. N. (1998). Statistical Learning Theory. Wiley.
- Haykin, S. (2008). Neural Networks and Learning Machines (third edition). Prentice Hall.
Causality
- Pearl, J. (2009). Causality. Cambridge University Press.
- Pearl, J., Glymour, M., and Jewell, N. P. (2016). Causal Inference in Statistics: A Primer. Wiley.
- Pearl, J. and Mackenzie, D. (2018). The Book of Why: The New Science of Cause and Effect. Basic Books.
- Peters, J., Janzing, D., and Schölkopf, B. (2017). Elements of Causal Inference: Foundations and Learning Algorithms. MIT Press.
Natural Language Processing (NLP)
- Jurafsky, D. and Martin, J. H. (2017). Speech and Language Processing (3rd ed. draft). Prentice Hall.
- Goldberg, Y. (2017). Neural Network Methods for Natural Language Processing. Morgan & Claypool.
- Deng, L. and Liu, Y., editors (2018). Deep Learning in Natural Language Processing. Springer.
Semi-supervised Learning
- Zhu, X. and Goldberg, A. B. (2009). Introduction to semi-supervised learning. Morgan & Claypool.
Learning to learn
- Hutter, F., Kotthoff, L., and Vanschoren, J., editors (2018). Automatic Machine Learning: Methods, Systems, Challenges. Springer. In press, available at http://automl.org/book.
- Chen, Z. and Liu, B. (2016). Lifelong Machine Learning. Morgan & Claypool.
Game Theory
- Leyton-Brown, K. and Shoham, Y. (2008). Essentials of Game Theory: A Concise, Multidisciplinary Introduction. Morgan & Claypool.
Finance
- Hull, J. C. Options, Futures, and Other Derivatives. Prentice Hall.
Transportation
- Bazzan, A. L. and Klügl, F. (2014). Introduction to Intelligent Systems in Traffic and Transportation. Morgan & Claypool.
Artificial Intelligence
- Russell, S. and Norvig, P. (2009). Artificial Intelligence: A Modern Approach (3rd edition). Pearson.
Go to Table of Contents
Surveys and Reports
Reinforcement Learning
- Littman, M. L. (2015). Reinforcement learning improves behaviour from evaluative feedback. Nature, 521:445–451.
- Kaelbling, L. P., Littman, M. L., and Moore, A. (1996). Reinforcement learning: A survey. JAIR, 4:237–285.
- Li, Y. (2017). Deep Reinforcement Learning: An Overview. ArXiv.
- Levine, S. (2018). Reinforcement Learning and Control as Probabilistic Inference: Tutorial and Review. ArXiv.
- Recht, B. (2018). A Tour of Reinforcement Learning: The View from Continuous Control. ArXiv.
- Geramifard, A., Walsh, T. J., Tellex, S., Chowdhary, G., Roy, N., and How, J. P. (2013). A tutorial on linear function approximators for dynamic programming and reinforcement learning. Foundations and Trends® in Machine Learning, 6(4):375–451.
- Grondman, I., Busoniu, L., Lopes, G. A., and Babuška, R. (2012). A survey of actor-critic reinforcement learning: Standard and natural policy gradients. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 42(6):1291–1307.
- Roijers, D. M., Vamplew, P., Whiteson, S., and Dazeley, R. (2013). A survey of multi-objective sequential decision-making. JAIR, 48:67–113.
Deep Learning
- LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature, 521:436–444.
- Poggio, T., Mhaskar, H., Rosasco, L., Miranda, B., and Liao, Q. (2017). Why and when can deep-but not shallow-networks avoid the curse of dimensionality: a review. International Journal of Automation and Computing, 14(5):503–519.
- Bengio, Y., Courville, A., and Vincent, P. (2013). Representation learning: A review and new perspectives. TPAMI, 35(8):1798–1828.
- Bengio, Y. (2009). Learning deep architectures for AI. Foundations and Trends® in Machine Learning, 2(1):1–127.
- Deng, L. and Dong, Y. (2014). Deep learning: Methods and applications. Foundations and Trends® in Signal Processing, 7(3–4):197–387.
- Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61:85–117.
- Wang, H. and Raj, B. (2017). On the Origin of Deep Learning. ArXiv.
- Sze, V., Chen, Y.-H., Yang, T.-J., and Emer, J. (2017). Efficient Processing of Deep Neural Networks: A Tutorial and Survey. ArXiv.
Machine Learning
- Jordan, M. I. and Mitchell, T. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245):255–260.
- Domingos, P. (2012). A few useful things to know about machine learning. Communications of the ACM, 55(10):78–87.
- Bottou, L., Curtis, F. E., and Nocedal, J. (2018). Optimization methods for large-scale machine learning. SIAM Review, 60(2):223–311.
- Ng, A. (2018). Machine Learning Yearning (draft). deeplearning.ai.
- Zinkevich, M. (2017). Rules of Machine Learning: Best Practices for ML Engineering.
- Andrieu, C., de Freitas, N., Doucet, A., and Jordan, M. I. (2003). An introduction to MCMC for machine learning. Machine Learning, 50(1–2):5–43.
Causality
- Pearl, J. (2018). The seven pillars of causal reasoning with reflections on machine learning. UCLA Technical Report R-481.
- Guo, R., Cheng, L., Li, J., Hahn, P. R., and Liu, H. (2018). A Survey of Learning Causality with Data: Problems and Methods. ArXiv e-prints.
Graph Neural Networks
- Battaglia, P. W., Hamrick, J. B., Bapst, V., et al. (2018). Relational inductive biases, deep learning, and graph networks. ArXiv.
- Zhang, Z., Cui, P., and Zhu, W. (2018c). Deep learning on graphs: A survey. ArXiv.
- Zhou, J., Cui, G., Zhang, Z., Yang, C., Liu, Z., and Sun, M. (2018a). Graph neural networks: A review of methods and applications. ArXiv.
Exploration
- Li, L. (2012). Sample complexity bounds of exploration. In Wiering, M. and van Otterlo, M., editors, Reinforcement Learning: State-of-the-Art, pages 175–204. Springer-Verlag Berlin Heidelberg.
Transfer Learning
- Taylor, M. E. and Stone, P. (2009). Transfer learning for reinforcement learning domains: A survey. JMLR, 10:1633–1685.
- Pan, S. J. and Yang, Q. (2010). A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10):1345–1359.
- Weiss, K., Khoshgoftaar, T. M., and Wang, D. (2016). A survey of transfer learning. Journal of Big Data, 3(9).
Multi-task Learning
- Zhang, Y. and Yang, Q. (2018). An overview of multi-task learning. National Science Review, 5:30–43.
- Ruder, S. (2017). An Overview of Multi-Task Learning in Deep Neural Networks. ArXiv.
Neural Architecture Search
- Elsken, T., Hendrik Metzen, J., and Hutter, F. (2018). Neural Architecture Search: A Survey. ArXiv.
Learning to Learn
- Chelsea Finn, Learning to Learn with Gradients, PhD thesis, 2018
- Vanschoren, J. (2018). Meta-learning: A survey. ArXiv.
Successor Representation
- Gershman, S. J. (2018). The successor representation: Its computational logic and neural substrates. Journal of Neuroscience, 38(33):7193–7200.
Bayesian RL
- Ghavamzadeh, M., Mannor, S., Pineau, J., and Tamar, A. (2015). Bayesian reinforcement learning: a survey. Foundations and Trends in Machine Learning, 8(5–6):359–483.
Monte Carlo tree search (MCTS)
- Browne, C., Powley, E., Whitehouse, D., Lucas, S., Cowling, P. I., Rohlfshagen, P., Tavener, S., Perez, D., Samothrakis, S., and Colton, S. (2012). A survey of Monte Carlo tree search methods. IEEE Transactions on Computational Intelligence and AI in Games, 4(1):1–43.
- Gelly, S., Schoenauer, M., Sebag, M., Teytaud, O., Kocsis, L., Silver, D., and Szepesvári, C. (2012). The grand challenge of computer go: Monte carlo tree search and extensions. Communications of the ACM, 55(3):106–113.
Attention and Memory
- Olah, C. and Carter, S. (2016). Attention and augmented recurrent neural networks. Distill.
- Denny Britz, Attention and Memory in Deep Learning and NLP
Intrinsic Motivation
- Barto, A. (2013). Intrinsic motivation and reinforcement learning. In Baldassarre, G. and Mirolli, M., editors, Intrinsically Motivated Learning in Natural and Artificial Systems. Springer, Berlin, Heidelberg.
- Schmidhuber, J. (2010). Formal theory of creativity, fun, and intrinsic motivation (1990–2010). IEEE Transactions on Autonomous Mental Development, 2(3):230–247.
- Oudeyer, P.-Y. and Kaplan, F. (2007). What is intrinsic motivation? a typology of computational approaches. Frontiers in neurorobotics, 1(6).
Evolution Strategy
- Hansen, N. (2016). The CMA Evolution Strategy: A Tutorial. ArXiv.
Robotics
- Kober, J., Bagnell, J. A., and Peters, J. (2013). Reinforcement learning in robotics: A survey. International Journal of Robotics Research, 32(11):1238–1278.
- Deisenroth, M. P., Neumann, G., and Peters, J. (2013). A survey on policy search for robotics. Foundations and Trends® in Robotics, 2:1–142.
- Argall, B. D., Chernova, S., Veloso, M., and Browning, B. (2009). A survey of robot learning from demonstration. Robotics and Autonomous Systems, 57(5):469–483.
Natural Language Processing (NLP)
- Hirschberg, J. and Manning, C. D. (2015). Advances in natural language processing. Science, 349(6245):261–266.
- Cho, K. (2015). Natural Language Understanding with Distributed Representation. ArXiv.
- Young, T., Hazarika, D., Poria, S., and Cambria, E. (2017). Recent Trends in Deep Learning Based Natural Language Processing. ArXiv.
Dialogue Systems
- Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A.-r., Jaitly, N., Senior, A., Vanhoucke, V., Nguyen, P., Sainath, T. N., and Kingsbury, B. (2012). Deep neural networks for acoustic modeling in speech recognition. IEEE Signal Processing Magazine, 29(6):82–97.
- Deng, L. and Li, X. (2013). Machine learning paradigms for speech recognition: An overview. IEEE Transactions on Audio, Speech, and Language Processing, 21(5):1060–1089.
- Gao, J., Galley, M., and Li, L. (2018). Neural approaches to Conversational AI. Foundations and Trends in Information Retrieval. To appear.
- He, X. and Deng, L. (2013). Speech-centric information processing: An optimization-oriented approach. Proceedings of the IEEE, 101(5):1116–1135.
- Young, S., Gašić, M., Thomson, B., and Williams, J. D. (2013). POMDP-based statistical spoken dialogue systems: a review. Proceedings of IEEE, 101(5):1160–1179.
Computer Vision
- Zhang, Q. and Zhu, S.-C. (2018). Visual interpretability for deep learning: a survey. Frontiers of Information Technology & Electronic Engineering, 19(1):27–39.
- Bohg, J., Hausman, K., Sankaran, B., Brock, O., Kragic, D., Schaal, S., and Sukhatme, G. S. (2017). Interactive perception: Leveraging action in perception and perception in action. IEEE Transactions on Robotics, 33(6):1273–1291.
Recommender System
- Zhang, S., Yao, L., Sun, A., and Tay, Y. (2017). Deep Learning based Recommender System: A Survey and New Perspectives. ArXiv e-prints.
Healthcare
- Chakraborty, B. and Murphy, S. A. (2014). Dynamic treatment regimes. Annual Review of Statistics and Its Application, 1:447–464.
Energy
- Anderson, R. N., Boulanger, A., Powell, W. B., and Scott, W. (2011). Adaptive stochastic control for the smart grid. Proceedings of the IEEE, 99(6):1098–1115.
Collection of Applications
- Yuxi Li, Reinforcement Learning Applications
- Satinder Singh, Successes of Reinforcement Learning
- Csaba Szepesvári, RLApplications.bib
AI Safety
- Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., and Mané, D. (2016). Concrete Problems in AI Safety. ArXiv.
- Garcìa, J. and Fernàndez, F. (2015). A comprehensive survey on safe reinforcement learning. JMLR, 16:1437–1480.
Go to Table of Contents
Courses
Reinforcement Learning
- David Silver, Reinforcement Learning, 2015. Slides. Video.
- Sergey Levine, UC Berkeley CS 294: Deep Reinforcement Learning
- Richard Sutton, Reinforcement Learning, 2016.
- Katerina Fragkiadaki, Ruslan Salakhutdinov, Deep Reinforcement Learning and Control, Spring 2017
- Emma Brunskill, CS234: Reinforcement Learning
- Charles Isbell, Michael Littman and Chris Pryby, Udacity: Reinforcement Learning
- Emo Todorov, Intelligent control through learning and optimization
- OpenAI Spinning Up in Deep RL
- Deep Reinforcement Learning Hands-On
Deep Learning
- Andrew Ng and Kian Katanforoosh, Stanford CS230: Deep Learning
- Andrew Ng, Deep Learning Specialization
- Jeremy Howard, Practical Deep Learning For Coders
- Nando de Freitas, Deep Learning Lectures
- David Donoho, Hatef Monajemi, and Vardan Papyan, Stanford STATS 385, Theories of Deep Learning
Machine Learning
- Andrew Ng, Machine Learning
Robotics
- Pieter Abbeel, Advanced Robotics, Fall 2015
- Abdeslam Boularias, Robot Learning Seminar
- MIT 6.S094: Deep Learning for Self-Driving Cars
Computer Vision
- Fei-Fei Li, Justin Johnson, and Serena Yeung, CS231n: Convolutional Neural Networks for Visual Recognition
NLP
- Richard Socher, CS224d: Deep Learning for Natural Language Processing
- Brendan Shillingford, Yannis Assael, Chris Dyer, Oxford Deep NLP 2017 course
Healthcare
- David Sontag, Machine Learning for Healthcare
AI
- UC Berkeley CS188 Intro to AI
- Andrew Critch and Stuart Russell, UC Berkeley CS 294–149: Safety and Control for Artificial General Intelligence
Go to Table of Contents
Tutorials and Talks
Reinforcement Learning
- Rich Sutton, Introduction to Reinforcement Learning with Function Approximation
- Rich Sutton, Temporal Difference Learning
- Andrew Barto, A history of reinforcement learning
- Deep Reinforcement Learning, David Silver, Pieter Abbeel, Sergey Levine and Chelsea Finn
- David Silver, Principles of Deep RL
- Benjamin Recht, Optimization Perspectives on Learning to Control
- John Schulman, The Nuts and Bolts of Deep Reinforcement Learning Research
- Joelle Pineau, Introduction to Reinforcement Learning
- Deep Learning and Reinforcement Learning Summer School, 2018, 2017
- Deep Learning Summer School, 2016, 2015
- Yisong Yue and Hoang M. Le, Imitation Learning, ICML 2018 Tutorial
Deep Learning
- Andrew Ng, Nuts and Bolts of Building Applications using Deep Learning
- Christopher Manning and Russ Salakhutdinov, Introductory Overview Lecture: The Deep Learning Revolution, JSM 2018 Tutorial
- Sanjeev Arora, ICML 2018 Tutorial on Toward Theoretical Understanding of Deep Learning
- Generative adversarial networks (GANs), NIPS 2018 (Arxiv), CVPR 2018
- Simons Institute Interactive Learning Workshop, 2017
- Simons Institute Representation Learning Workshop, 2017
- Simons Institute Computational Challenges in Machine Learning Workshop, 2017
- Yann LeCun, Learning world models: The next step towards AI.
- Yoshua Bengio, From deep learning of disentangled representations to higher-level cognition
- Joshua Tenenbaum, Building machines that learn & think like people
- Michael I. Jordan, SysML 2018: Perspectives and Challenges
Robotics
- Pieter Abbeel, Deep learning for robotics, NIPS 2017 Invited Talk (slides, Dec 2018)
Computer Vision
- Jitendra Malik, IJCAI 2018 Research Excellence Award talk
- Nick Rhinehart, Paul Vernaza, and Kris Kitani, Inverse reinforcement learning for computer vision, CVPR 2018 Tutorial
NLP
- Jianfeng Gao, Michel Galley, and Lihong Li, Neural approaches to Conversational AI. ACL 2018 Tutorial.
- William Wang, Jiwei Li, and Xiaodong He, Deep reinforcement learning for NLP. ACL 2018 Tutorial.
Finance & Economics
- Sendhil Mullainathan, Machine Learning and Prediction in Economics and Finance, AFA 2017 Lecture
Healthcare
- Yan Liu and Jimeng Sun, Deep Learning Models for Health Care — Challenges and Solutions, ICML 2017 Tutorial
- Deep Reinforcement Learning for Medical Imaging
Education
- Curtis G. Northcutt, Artificial Intelligence in Online Education
Security
Transportation
- Deep Reinforcement Learning with Applications in Transportation, AAAI 2019 Tutorial
Go to Table of Contents
Conferences, Journals and Workshops
- NIPS: Neural Information Processing Systems
- ICML: International Conference on Machine Learning
- ICLR: International Conference on Learning Representation
- RLDM: Multidisciplinary Conference on Reinforcement Learning and Decision Making
- EWRL: European Workshop on Reinforcement Learning
- Deep Reinforcement Learning Workshop, NIPS 2018, 2017 (Symposium), 2016, 2015; IJCAI 2016
- AAAI, IJCAI, ACL, EMNLP, NAACL, CVPR, ICCV, ECCV, ICRA, IROS, RSS, SIGDIAL, KDD, SIGIR, WWW, etc.
- AI Frontiers Conference
- JMLR, MLJ, AIJ, JAIR, TPAMI, etc.
- Nature Machine Intelligence, Science Robotics
- Nature (May 2015) and Science (July 2015): survey papers on machine learning/AI
- Science, July 7, 2017 issue, The Cyberscientist, a special issue about AI
- http://distill.pub
Go to Table of Contents
Blogs
- DeepMind Blog, DeepMind Safety Research
- Google Research Blog
- The Google Brain Team — Looking Back on 2017 (1, 2), 2016
- Berkeley AI Research Blog
- OpenAI Blog, Spinning Up in Deep RL
- Facebook AI Research (FAIR) Blog
- http://rodneybrooks.com/blog/
- Bandit algorithms
- David Abel, notes: ICML 2018, AAAI 2018, NIPS 2017
- Denny Britz, AI and Deep Learning in 2017 — A Year in Review
- Denny Britz, Learning Reinforcement Learning (with Code, Exercises and Solutions)
- Andrej Karpathy, Deep Reinforcement Learning: Pong from Pixels
- Lilian Weng, A (Long) Peek into Reinforcement Learning
- Alexander Irpan, Deep Reinforcement Learning Doesn’t Work Yet (Note: The title is wrong.)
- Matthew Rahtz, Lessons Learned Reproducing a Deep Reinforcement Learning Paper
- Junling Hu, Reinforcement learning explained — learning to act based on long-term payoffs
- Li Deng, How deep reinforcement learning can help chatbots
- Deep Learning
- Reinforcement Learning
Go to Table of Contents
Benchmarks and Testbeds
I list some RL testbeds below. Common testbeds for general RL algorithms are Atari games, e.g., in the Arcade Learning Environment (ALE), for discrete control, and simulated robots, e.g., using MuJoCo in OpenAI Gym, for continuous control.
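All of these testbeds share the same agent–environment interaction loop. As a minimal sketch, here is a tiny Gym-style toy environment with the classic `reset()`/`step(action)` interface, where `step` returns an (observation, reward, done, info) tuple; the `CoinFlipEnv` environment and a random policy are hypothetical examples, not part of any of the libraries listed here:

```python
import random

class CoinFlipEnv:
    """Hypothetical Gym-style environment for illustration:
    the agent guesses a coin flip and earns reward 1 when correct."""

    def __init__(self, horizon=10):
        self.horizon = horizon  # episode length
        self.t = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.t = 0
        return 0  # dummy observation

    def step(self, action):
        """Take an action; return (observation, reward, done, info)."""
        coin = random.randint(0, 1)
        reward = 1.0 if action == coin else 0.0
        self.t += 1
        done = self.t >= self.horizon
        return 0, reward, done, {}

def run_episode(env):
    """Run one episode with a uniformly random policy; return the return."""
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = random.choice([0, 1])  # random policy
        obs, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward

if __name__ == "__main__":
    print(run_episode(CoinFlipEnv()))
```

With a real testbed such as OpenAI Gym, `CoinFlipEnv()` would be replaced by something like `gym.make("CartPole-v1")`, and the random policy by a learned one; note that recent Gym/Gymnasium versions return five values from `step`.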
- The Arcade Learning Environment (ALE) is a framework composed of Atari 2600 games to develop and evaluate AI agents.
- OpenAI Gym is a toolkit for developing RL algorithms, consisting of environments, e.g., Atari games and simulated robots, and a site for comparing and reproducing results. OpenAI Gym has the following environment families: algorithmic, Atari, Box2D, classic control, MuJoCo, robotics, and toy text.
- MuJoCo, Multi-Joint dynamics with Contact, a physics engine.
- DeepMind Control Suite
- DeepMind Lab, DeepMind first-person 3D game platform
- Deepmind PySC2 — StarCraft II Learning Environment
- Dopamine, a Tensorflow-based RL framework from Google AI
- TRFL: Reinforcement Learning Building Blocks
- David Churchill, CommandCenter: StarCraft 2 AI Bot
- ELF, an extensive, lightweight, and flexible platform for RL research. ELF OpenGo is a reimplementation of AlphaGo Zero / AlphaZero using ELF.
- FAIR TorchCraft is a library for Real-Time Strategy (RTS) games such as StarCraft: Brood War.
- FAIR Detectron, for computer vision.
- Ray RLlib: A Composable and Scalable Reinforcement Learning Library
- ParlAI is a framework for dialogue research, implemented in Python, open-sourced by Facebook.
- Natural language decathlon (decaNLP), an NLP benchmark suitable for multitask, transfer, and continual learning.
- Project Malmo, from Microsoft, is an AI research and experimentation platform built on top of Minecraft.
- Twitter open-sources torch-twrl, a framework for RL development.
- ViZDoom is a Doom-based AI research platform for visual RL.
- Baidu Apollo Project, self-driving open-source
- TORCS is a car racing simulator.
- CoQA, a large-scale dataset for building conversational QA systems
- WebNav Challenge for Wikipedia links navigation
- Psychlab: A Psychology Laboratory for Deep RL Agents
- RLGlue is a language-independent software for RL experiments.
- RLPy is a value-function-based reinforcement learning framework for education and research.
Go to Table of Contents