Reinforcement Learning

Dylan
2 min read · Mar 19, 2022

Reinforcement Learning (RL) is a branch of machine learning, often combined with deep neural networks, that is widely used in games. When the training data is hard to label, but we can still tell whether an action turned out well in the environment, we can apply Reinforcement Learning to the problem.

The main components of Reinforcement Learning are the actor, the environment, the action, the observation, and the reward.

action = f(observation)
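To make these components concrete, here is a minimal sketch of the interaction loop. `CoinEnv`, `policy`, and `run_episode` are hypothetical names invented for this illustration, not part of any library: the environment shows a coin side as the observation, the actor answers with an action, and the environment returns a reward.

```python
import random


class CoinEnv:
    """Hypothetical toy environment: the actor must name the coin side it observes."""

    def __init__(self, steps=5, seed=0):
        self.rng = random.Random(seed)
        self.steps = steps

    def reset(self):
        self.t = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin  # observation: the current coin side (0 or 1)

    def step(self, action):
        # Reward 1 when the action matches the coin the actor just observed.
        reward = 1.0 if action == self.coin else 0.0
        self.t += 1
        done = self.t >= self.steps
        self.coin = self.rng.randint(0, 1)  # next observation
        return self.coin, reward, done


def policy(observation):
    # action = f(observation): here f simply copies the observation.
    return observation


def run_episode(env, policy):
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, done = env.step(action)
        total_reward += reward
    return total_reward
```

Calling `run_episode(CoinEnv(steps=5), policy)` plays one episode and returns the sum of the rewards the actor collected.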

We hope to find a function f, called the policy, that maximizes the expected sum of rewards.
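Written out, with notation that is assumed here rather than taken from the text (τ for one episode, r_t for the reward at step t, T for the episode length, θ for the parameters of the policy), the goal can be sketched as:

```latex
R(\tau) = \sum_{t=1}^{T} r_t, \qquad
\theta^{*} = \arg\max_{\theta} \; \mathbb{E}_{\tau \sim \pi_{\theta}} \left[ R(\tau) \right]
```

The expectation is there because both the actor and the environment can be random, so the same policy may earn a different total in different episodes.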

Steps of Reinforcement Learning

The actor can be a neural network; a CNN, an RNN, or a transformer can all serve as its architecture.

Loss: -(total reward)

Maximizing the total reward is the same as minimizing its negative: max total reward = min -(total reward)

Find the θ that minimizes this loss.
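One common way to search for θ is a policy-gradient update in the style of REINFORCE. The text does not name a specific algorithm, so treat this as an illustrative sketch under assumptions: a hypothetical two-armed bandit with noisy rewards, a single parameter `theta`, and made-up function names `pull` and `train`.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pull(action, rng):
    # Black-box, noisy reward: arm 1 pays off 80% of the time, arm 0 only 20%.
    win_prob = 0.8 if action == 1 else 0.2
    return 1.0 if rng.random() < win_prob else 0.0


def train(steps=3000, lr=0.1, seed=0):
    rng = random.Random(seed)
    theta = 0.0  # single policy parameter; P(action = 1) = sigmoid(theta)
    for _ in range(steps):
        p = sigmoid(theta)
        action = 1 if rng.random() < p else 0
        reward = pull(action, rng)
        # d/dtheta of log pi(action): (1 - p) for action 1, -p for action 0.
        grad_log_prob = (1.0 - p) if action == 1 else -p
        # Gradient descent on loss = -reward * log pi(action).
        theta += lr * reward * grad_log_prob
    return sigmoid(theta)  # learned probability of choosing arm 1
```

The update never looks inside `pull`. It only uses the sampled reward, which is why this kind of method still works when the environment is a black box.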

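This RL task is similar to training a GAN. The difference is that in a GAN the discriminator is a neural network, whereas in RL the environment and the reward are black boxes, possibly with randomness, so we cannot backpropagate through them directly. What the two share is that both search for the best θ to maximize an objective.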

We want our actor to take action a1 when it faces situation s1, and to avoid action a2 when it faces situation s2.
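One way to express "take a1 in s1, avoid a2 in s2" as a loss is a cross-entropy term per sample with a signed weight: +1 for actions to encourage, -1 for actions to discourage. The sketch below assumes a tiny tabular policy with two actions per state; the function name `signed_cross_entropy_step` and the index convention (a1 is index 0, a2 is index 1) are made up for illustration.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def signed_cross_entropy_step(logits, samples, lr=0.5):
    """One gradient-descent step on sum_n weight_n * (-log pi(action_n | state_n))."""
    new_logits = {state: list(values) for state, values in logits.items()}
    for state, action, weight in samples:
        probs = softmax(logits[state])
        for k, p in enumerate(probs):
            # Gradient of -log softmax(logits)[action] w.r.t. logit k is p_k - 1[k == action].
            grad = weight * (p - (1.0 if k == action else 0.0))
            new_logits[state][k] -= lr * grad
    return new_logits


# Two states, two actions each, starting from a uniform policy.
logits = {"s1": [0.0, 0.0], "s2": [0.0, 0.0]}
# (state, action index, weight): encourage a1 in s1, discourage a2 in s2.
samples = [("s1", 0, +1.0), ("s2", 1, -1.0)]
updated = signed_cross_entropy_step(logits, samples)
```

After one step, the probability of a1 in s1 rises above 0.5 and the probability of a2 in s2 falls below 0.5. In practice the weights are usually derived from rewards rather than fixed at ±1.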

Dylan

Software Engineer | Trader | Hong Kong Baptist University DS&AI Msc