Simple Reinforcement Learning with Tensorflow Part 7: Action-Selection Strategies for Exploration
Arthur Juliani


I see that the input in your tutorial is 84*84*3 data (mapped to 6 actions). How could I change it so that my 18*18*2 input (mapped to 140 actions) fits? Did you choose your parameters by experience?

Also, how would you deal with the argmax over the network's output if the argmax action turns out to be invalid, as can happen under the rules of a game like chess?
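
To make the question concrete, one approach I have seen is to mask out the invalid actions before taking the argmax. This is only a minimal sketch, assuming NumPy and a hypothetical `valid_mask` array that the game rules would supply; the `masked_argmax` helper is my own name and not something from the tutorial:

```python
import numpy as np

def masked_argmax(q_values, valid_mask):
    """Return the index of the highest-valued action among the valid ones.

    q_values:   1-D array of network outputs, one per action.
    valid_mask: 1-D boolean array (hypothetical, supplied by the game rules),
                True where the action is legal in the current state.
    """
    q_values = np.asarray(q_values, dtype=float)
    valid_mask = np.asarray(valid_mask, dtype=bool)
    if not valid_mask.any():
        raise ValueError("no valid action available")
    # Invalid actions get -inf so they can never win the argmax.
    masked = np.where(valid_mask, q_values, -np.inf)
    return int(np.argmax(masked))

# Example: action 1 has the highest value but is illegal in this state.
q = np.array([0.2, 1.5, 0.7, -0.3])
valid = np.array([True, False, True, True])
best = masked_argmax(q, valid)
```

Here `best` is the highest-valued action that is still legal. I am not sure whether this is the recommended way to handle it during training as well, which is part of what I am asking.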

Lastly, do you have any insights on building a chess engine?

Thank you for your time.

(I know I asked a lot, sorry about that.)