Jul 27, 2017 · 1 min read
I don’t understand how it selects the first move. The paper states that the first action from the root node is selected by the tree policy, so I am wondering whether a random rollout is also played from the root node.