Hello Arthur,
I’ve been following your work on reinforcement learning closely, and, inspired by your article, I have been trying to combine RL with a genetic algorithm (GA) to beat Mario.
What I do is have an organism from the GA play the game for a fixed period of time. When that period is up, another organism picks up from the exact spot where the previous one left off.
The RL part decides which organism plays next: it takes in not only the image of the current screen but also each organism's set of input/output mappings, and both are fed through an RNN.
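Roughly, the loop looks like the sketch below. The names here (SelectorRNN, Organism.act, encode_screen, encode_mappings) are just placeholders rather than my actual code, and I'm assuming a gym-style Mario environment:

    import torch
    import torch.nn as nn

    class SelectorRNN(nn.Module):
        """RL policy: scores which organism should play next, given the
        current screen plus each organism's input/output mapping."""
        def __init__(self, screen_dim, mapping_dim, hidden_dim, n_organisms):
            super().__init__()
            self.rnn = nn.GRU(screen_dim + mapping_dim, hidden_dim, batch_first=True)
            self.head = nn.Linear(hidden_dim, n_organisms)

        def forward(self, screen_feats, mapping_feats, hidden=None):
            # One RNN step per hand-off: concatenate screen and mapping features.
            x = torch.cat([screen_feats, mapping_feats], dim=-1).unsqueeze(1)
            out, hidden = self.rnn(x, hidden)
            return self.head(out[:, -1]), hidden

    def run_episode(env, organisms, selector, encode_screen, encode_mappings,
                    slice_frames=300):
        """Each organism plays for slice_frames, then the selector picks
        who continues from the exact spot the last one left off."""
        state = env.reset()
        hidden, done = None, False
        while not done:
            logits, hidden = selector(encode_screen(state),
                                      encode_mappings(organisms), hidden)
            choice = torch.distributions.Categorical(logits=logits).sample().item()
            organism = organisms[choice]
            for _ in range(slice_frames):  # this organism's turn
                state, reward, done, info = env.step(organism.act(state))
                if done:
                    break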
This code can beat several levels, but it performs no better than choosing the organisms at random.
Do you have any tips on how to alter the code to make it work better in this situation?
A better explanation and the release version of the code are here:
I can answer more questions if you contact me.