Self-Organizing Conference on Machine Learning 2017 — DAY 1

Session (morning): Interpretability

Session (early afternoon): Transfer learning/Domain adaptation/Few-shot learning

Session (late afternoon): Exploration in RL

  • Is reward, or the related notions of intrinsic/extrinsic motivation, all that we have to drive research on exploration? Shaping the reward signal to guide exploration is certainly a heavily explored arena within RL. See what I did there?
  • What would be good ways/metrics for evaluating exploration techniques, and how does one technique differ from another? Visualising policies as they evolve, and skills as they get discovered, may help. Tracking how the distribution of returns for actions evolves, compared against the distribution produced by a baseline policy (e.g. one derived from human demonstrations), may also help. The idea is that an efficient exploration technique should deliver policies whose return distributions approach the baseline distribution faster than a less efficient one would, provided, of course, that the human demonstrations are a good baseline.
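On the reward-shaping point above, one common way to guide exploration is to add a decaying, count-based novelty bonus to the extrinsic reward. A minimal sketch, with hypothetical names and a made-up bonus coefficient `beta` (none of this is from the session notes):

```python
import math
from collections import defaultdict

def shaped_reward(extrinsic, state, counts, beta=0.1):
    """Add a count-based novelty bonus to the extrinsic reward.

    The bonus decays as a state is revisited (beta / sqrt(visit count)),
    nudging the agent toward under-explored states.
    """
    counts[state] += 1
    return extrinsic + beta / math.sqrt(counts[state])

counts = defaultdict(int)
r_first = shaped_reward(0.0, "s0", counts)   # first visit: full bonus
r_again = shaped_reward(0.0, "s0", counts)   # revisit: smaller bonus
```

The shaping term vanishes as counts grow, so in the limit the agent optimises the unshaped objective.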
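The distribution-comparison idea in the second bullet can be made concrete with a distance between empirical return distributions, e.g. the 1-Wasserstein distance, tracked over training. A minimal sketch under assumed names, using synthetic returns whose mean drifts toward a demonstration baseline (standing in for a policy improving over training):

```python
import numpy as np

def w1_distance(a, b):
    """1-Wasserstein distance between two equal-size return samples:
    the mean absolute difference of their sorted values."""
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

rng = np.random.default_rng(0)

# Returns from the baseline policy, e.g. human demonstrations.
baseline = rng.normal(loc=10.0, scale=1.0, size=1000)

# Synthetic "training": the policy's return distribution drifts
# toward the baseline as its mean rises.
distances = []
for mean in [0.0, 4.0, 8.0, 10.0]:
    policy_returns = rng.normal(loc=mean, scale=1.0, size=1000)
    distances.append(w1_distance(policy_returns, baseline))
```

An efficient exploration technique would drive this distance down in fewer environment interactions than a less efficient one, which gives a single curve per technique to compare.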





Arjun Chandra
