Chris Vigorito
Nov 2 · 1 min read

Nice post! Have you looked at Universal Option Models as well? They also seek to decouple the reward function from the transition dynamics, but in the context of learning and representing option models in hierarchical RL. It’s been a few years since I’ve worked on this stuff, but I built on that work to explore intrinsically motivated exploration and skill hierarchy learning in Chapter 3 of my PhD thesis. Looking forward to seeing your work.