Understanding the idea behind Q-Learning

Published in

DataSeries

4 min read · Jul 28, 2019

Reinforcement Learning is a subfield of Machine Learning whose tasks differ from ‘standard’ ways of learning. Rather than being given historical data and making predictions or inferences from it, a reinforcement learning algorithm learns from scratch by interacting with its surrounding environment. Basically, you want it to behave as you would in a similar situation (if you want to learn more about the structure of RL, see my previous article).

There are many scenarios where Reinforcement Learning offers a more suitable solution than Supervised or Unsupervised Learning. Think about video games, for example: imagine you want to develop a general AI solution that can play a game strategically, let’s say Flappy Bird. You could follow one of two approaches:

Pattern recognition approach: you collect historical data from the best Flappy Bird players and let your algorithm learn from it. The result is an algorithm that, whenever a scenario previously faced by a player occurs, can replicate that player’s strategy.
Reinforcement learning approach: in this case, you do not need any historical data. Instead, you want your algorithm to start from zero and learn by trial and error, without being biased by previous patterns.

If you consider that a game environment is stochastic (it never replays exactly the same sequence of states), you can easily see why the first approach will fail…
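To make the point about stochastic environments concrete, here is a minimal Python sketch. It is a toy simplification, not a model of the real game: the state is assumed to be a pair of integers (gap height, bird height), the `expert_action` rule is hypothetical, and the "pattern recognition" approach is reduced to an exact lookup table of states the expert has already faced. A real supervised model would generalize better than a lookup table, but the sketch shows how rarely a random environment repeats a recorded situation.

```python
import random


def expert_action(gap_height, bird_height):
    # Hypothetical expert rule: flap when the bird is below the gap.
    return "flap" if bird_height < gap_height else "wait"


def record_demonstrations(n_steps, rng):
    """Build a lookup table mapping each observed state to the expert's action."""
    table = {}
    for _ in range(n_steps):
        state = (rng.randrange(100), rng.randrange(100))
        table[state] = expert_action(*state)
    return table


def coverage(table, n_steps, rng):
    """Return the fraction of newly drawn states that appear in the table."""
    hits = 0
    for _ in range(n_steps):
        state = (rng.randrange(100), rng.randrange(100))
        if state in table:
            hits += 1
    return hits / n_steps


rng = random.Random(0)  # fixed seed so the run is reproducible
table = record_demonstrations(1000, rng)
seen_fraction = coverage(table, 1000, rng)
print(f"New states with a memorized answer: {seen_fraction:.0%}")
```

With 10,000 possible states and only 1,000 recorded steps, most new states have no memorized answer, so an agent that only replays past strategies has nothing to fall back on.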

Written by Valentina Alto