Simple Reinforcement Learning with Tensorflow Part 7: Action-Selection Strategies for Exploration
Arthur Juliani

Hi,

I see that the input in your tutorial is 84*84*3 data (=> 6 actions). How could I change it so that my 18*18*2 input (=> 140 actions) fits? Do you choose your parameters by experience?

Also, how would you deal with the argmax from the network if the argmax action turns out to be invalid, as can happen under the rules of a game like chess?

Lastly, do you have any insights on building a chess engine?

Thank you for your time.

(I know I asked a lot, sorry about that.)
