At SOCML 2017, which I have been lucky to attend, I am participating in a few hour-long discussion sessions. Below are some of my personal musings on them. These do not necessarily reflect the points discussed during each session, nor do they accurately cover the opinions of others.
Session (morning): Interpretability
The crux of the matter here is to understand why a model does what it does when asked or potentially decides for itself that it has to do something. In my personal opinion, this is a particularly ill-posed problem.
Take human brains, for instance (unless the reader is not human). Our decisions and actions are driven by feelings, which one can interpret as motivation or goals. These can be either short- or long-sighted. You feel strongly about something, and thereby create a goal for yourself. In parts of the world where you have time to think (I luckily live in Norway, and I do), the goals tend to be long-sighted.
Every such goal drives one’s behaviour. This should imply that a behaviour can be explained by goals. But there is a problem here. It is possible to arrive at the same goal state via different paths, each potentially requiring different behaviours driven by various sub-goals. Furthermore, behaviours tend to overlap across different goals.
So, simply looking at the goals is not enough to understand behaviour, even if the goal gets achieved. One could ask the achiever to explain their behaviour in natural language, but that potentially adds further variation to this understanding. Perhaps, then, one has to probe the brain during the act. I do not think we have a good way of doing this with human brains, which is what makes the problem ill-posed. But we do have a better handle on the machine learning models we build, precisely because we build them: we can probe them, e.g. by checking what activates individual artificial neurons.
Following the dance between goals and catching models in the act makes the problem less ill-posed, and gives fair motivation for working on interpretability.
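To make the idea of probing individual artificial neurons a little more concrete, here is a minimal sketch of my own, not something shown in the session. It assumes a hypothetical toy network with one hidden ReLU layer and random weights standing in for a trained model; the names `hidden_activations` and `top_activating_inputs` are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: a single hidden ReLU layer with random weights.
# This is a stand-in for a trained network, purely for illustration.
W = rng.normal(size=(4, 8))  # 4 input features -> 8 hidden neurons
b = rng.normal(size=8)


def hidden_activations(x):
    """Return the ReLU activations of the hidden layer for a batch of inputs."""
    return np.maximum(0.0, x @ W + b)


def top_activating_inputs(inputs, neuron, k=3):
    """Return the indices of the k inputs that activate `neuron` most strongly."""
    acts = hidden_activations(inputs)[:, neuron]
    # argsort is ascending, so reverse it to get the strongest activations first.
    return np.argsort(acts)[::-1][:k]


# Probe neuron 2 with a batch of random inputs and inspect what excites it.
inputs = rng.normal(size=(100, 4))
top = top_activating_inputs(inputs, neuron=2)
print(inputs[top])
```

With a real model, the random inputs would be replaced by actual data, and looking at the top-activating examples is one crude way of guessing what a neuron responds to.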
Session (early afternoon): Transfer learning/Domain adaptation/Few-shot learning
My jet lag was kicking in at this point. But it turns out that researchers working within these domains need a naming convention. Aside from that, learning reusable skills does not hurt, unless old skills become useless. Furthermore, it also makes sense to continuously learn new skills, which either augment old skills or are completely novel, while making sure reusable skills aren’t forgotten. By the way, this is a common-sense description, and thus I expect researchers driving the field to stay calm and not worry about what I mean by skill in a mathematical sense.
Session (late afternoon): Exploration in RL
The jet lag was in full bloom by now, but I followed this discussion better than the others, since my day job involves all things RL.
Clever realisations of common-sense ideas like novelty, surprise, curiosity, safety, use of/guidance from old knowledge/skills, etc. have energised this research. All these areas and more were touched upon.
The discussion revolved around a couple of very interesting questions posed by the moderator of our discussion group, Milan Cvitkovic:
- Are reward, and the related ideas of intrinsic/extrinsic motivation, all that we have to drive research on exploration? Shaping the reward signal to guide exploration is indeed a heavily explored arena within RL. See what I did there?
- What could be good ways/metrics for evaluating exploration techniques? How does one technique differ from another? Perhaps visualising policies as they evolve and skills as they get discovered will help. Tracking how the distributions of returns for actions evolve, say compared with those obtained from baseline policies (e.g. from human demonstrations), may also help. The idea here is that an efficient exploration technique should deliver policies whose return distributions resemble the baseline distributions sooner than a less efficient one would, provided human demonstrations are good baselines of course.
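The return-distribution idea in the last bullet can be sketched roughly as follows. This is only my own illustration under loud assumptions: the returns are synthetic, the 1-D Wasserstein distance is just one possible choice of similarity measure, and the helpers `wasserstein_1d`, `checkpoints_to_match` and `fake_training_curve` are hypothetical names, not anything proposed in the discussion.

```python
import numpy as np


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-sized 1-D samples."""
    a, b = np.sort(a), np.sort(b)
    assert a.shape == b.shape, "this sketch assumes equal sample sizes"
    # For equal-sized empirical samples, W1 is the mean gap between order statistics.
    return float(np.mean(np.abs(a - b)))


def checkpoints_to_match(returns_per_checkpoint, baseline_returns, tol):
    """First checkpoint whose returns are within `tol` of the baseline, else None."""
    for i, returns in enumerate(returns_per_checkpoint):
        if wasserstein_1d(returns, baseline_returns) <= tol:
            return i
    return None


rng = np.random.default_rng(0)

# Synthetic "human demonstration" returns, purely illustrative.
baseline = rng.normal(loc=10.0, scale=1.0, size=500)


def fake_training_curve(speed, n_checkpoints=20):
    """Synthetic returns whose mean approaches the baseline mean at rate `speed`."""
    means = 10.0 * (1.0 - np.exp(-speed * np.arange(n_checkpoints)))
    return [rng.normal(loc=m, scale=1.0, size=500) for m in means]


# Two made-up exploration techniques, one converging faster than the other.
fast = checkpoints_to_match(fake_training_curve(0.5), baseline, tol=0.5)
slow = checkpoints_to_match(fake_training_curve(0.25), baseline, tol=0.5)
print(fast, slow)
```

Under this toy metric, the technique that reaches the baseline distribution in fewer checkpoints would count as the more efficient explorer. Whether that is a fair measure depends entirely on how good the baseline is.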
Apart from these sessions, there were some excellent posters, many of which involved GANs, and at least one involved wormholes to past/future memories.
And I am not quite sure why I am the only one with jet lag at this conference. Well, I am awake at 2 am writing this note for a reason! It is just not possible to sleep any longer this “morning”!
Day 2 begins shortly, and if I am again awake at 2 am tomorrow, there might just be another post.