AI paradigms

ANASai
4 min read · May 31, 2022


We can split “Artificial Intelligence” into three main paradigms: supervised, unsupervised and reinforcement learning. The division is based on the type of data we are trying to understand and on the kind of information we want to extract from it afterwards. What type of information can we extract from each of these three paradigms? Here is the shortest and clearest answer:

  • Supervised -> predictive analysis
  • Unsupervised -> descriptive analysis
  • Reinforcement learning -> prescriptive analysis

Supervised Learning

The supervised learning paradigm is based on prior knowledge of the environment and of the desired outputs. Once we have trained a supervised model, the goal is to obtain predictions on new data of the same kind the model was trained on.

How do we train a supervised model? To train a supervised model we just need labeled data (rows with feature columns and a target column). Technically, we can use the Python libraries we presented in [1].

We will continue with an example about Ryan Reynolds and cacti:

  1. Let’s suppose we have a database full of images of Ryan Reynolds and cacti (raw data).
  2. We pre-process the images so that they carry useful information for our model, and split them into a train set (labeled data) and a test set (whose labels are kept hidden from the model).
  3. A supervised learning algorithm is trained on the train set from step 2.
  4. The model learnt in step 3 is applied to the images from the test set. In this step we try to predict what each image shows: is it a cactus or is it Ryan Reynolds?
Fig. 1: supervised learning algorithm flowchart example [2]
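
As a rough sketch of these four steps in Python (assuming scikit-learn, with a made-up numeric feature matrix X and made-up labels y standing in for the real images), the flow could look something like this:

    # Minimal supervised learning sketch; the data here are invented for illustration.
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    # Steps 1-2: raw data turned into features X and labels y (0 = cactus, 1 = Ryan Reynolds),
    # then split into a labeled train set and a test set
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 10))      # 200 pre-processed images, 10 features each
    y = (X[:, 0] > 0).astype(int)       # toy labels, invented for the example
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

    # Step 3: train the algorithm on the labeled train set
    model = LogisticRegression()
    model.fit(X_train, y_train)

    # Step 4: predict what each test image is, then compare with the held-back labels
    predictions = model.predict(X_test)
    print(accuracy_score(y_test, predictions))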

Unsupervised Learning

What’s the main difference from supervised learning? Labels. Unsupervised learning is used for datasets whose items are not labeled, so our goal is simply to learn patterns in the data.

How do we train an unsupervised model? We are no longer trying to predict which label corresponds to each image; instead, we want to discover which groups exist in our dataset. This is how we do it:

Using the same example as in the supervised learning section, the cactus and Ryan Reynolds images are now provided to the model without any labels and without a train/test split. Because the data are not labeled, in the unsupervised paradigm there is no test set, so we need to validate our model with other methods, such as cluster cohesion.

Once our model has been trained, we can run it and end up with two clusters: one full of cacti and the other full of Ryan Reynolds. The catch is that the model does not tell us what the labels would be; for us, they are only “Cluster 1” and “Cluster 2”.

Fig. 2: unsupervised learning algorithm flowchart example [2]
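
A minimal sketch of the same idea, again with made-up features standing in for the images and with scikit-learn's KMeans as one possible clustering algorithm; note that there are no labels and no train/test split, and that a cluster-cohesion measure (here the silhouette score) replaces test-set accuracy:

    # Minimal unsupervised learning sketch; the data here are invented for illustration.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score

    # The whole dataset is fed to the model: no labels, no train/test split
    rng = np.random.default_rng(0)
    X = np.vstack([
        rng.normal(loc=-2.0, size=(100, 10)),   # images that happen to be cacti
        rng.normal(loc=2.0, size=(100, 10)),    # images that happen to be Ryan Reynolds
    ])

    model = KMeans(n_clusters=2, n_init=10, random_state=0)
    clusters = model.fit_predict(X)

    # The model only returns anonymous groups ("Cluster 0" and "Cluster 1")
    print(clusters[:10])

    # With no test set, we validate with a cluster-cohesion measure instead,
    # e.g. the silhouette score (closer to 1 means tighter, better-separated clusters)
    print(silhouette_score(X, clusters))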

Reinforcement Learning

A reinforcement learning algorithm is composed of six main elements:

  • The environment contains every object the algorithm needs to take into account
  • The agent is an object capable of making decisions within the environment
  • The agent takes actions according to the policy, and these actions modify its state
  • The policy defines how the agent should behave
  • An observation is a particular state of the environment at a given point of the algorithm
  • The actions taken by the agent lead to rewards whenever certain conditions are met in the environment
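
To make these elements more concrete, here is a minimal, purely illustrative sketch in Python; the action names, reward values and initial policy are all invented for the example and do not come from any specific library:

    # Sketch mapping the six elements to code; every name and value here is invented.
    import random

    # Environment: every object to take into account, and how it reacts to an action
    def environment_step(action):
        reward = 1.0 if action == "go_right" else -1.0   # reward when a condition is met
        observation = "state_after_" + action            # the resulting environment state
        return observation, reward

    # Policy: defines how the agent should behave (here, preferences over the two actions)
    policy = {"go_left": 0.5, "go_right": 0.5}

    # Agent: makes a decision over the environment according to the policy
    action = random.choices(list(policy), weights=list(policy.values()))[0]

    # Observation and reward: what the agent gets back after taking the action
    observation, reward = environment_step(action)
    print(action, observation, reward)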

Find below a quick story about Ryan Reynolds and the Cactus.

Stage 1

Ryan Reynolds finds himself between a cactus and a lollipop. He has two possible actions: go left or go right. He doesn’t know whether the cactus or the lollipop is any good for him, and in order to find out, Ryan needs to make a decision.

Fig. 3: reinforcement learning example stage 1 [3][4]

Stage 2

Ryan Reynolds decides to move left and go for the cactus. Despite how good the cactus to his left looked, Ryan Reynolds receives a painful prick.

Fig. 4: reinforcement learning example stage 2 [3][4]

Stage 3

We go back to the initial state of the environment. Ryan Reynolds has now learnt from his previous decision and has modified his policy. After a careful analysis of the situation, he decides to go for the lollipop this time.

Fig. 5: reinforcement learning example stage 3 [3][4]

Stage 4

Hooray! The lollipop was the right decision and now Ryan Reynolds is very happy. Next time he finds himself between a cactus and a lollipop he will know what to do.

Fig. 6: reinforcement learning example stage 4 [3][4]
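
The four stages can be condensed into a tiny simulation; the reward values, learning rate and value-update rule below are illustrative choices, not the method of any particular library. With equal starting estimates, the agent happens to try the cactus first, which reproduces the story:

    # Toy simulation of the story; rewards and the update rule are invented for illustration.
    rewards = {"left": -1.0, "right": +1.0}   # left = cactus prick, right = lollipop
    values = {"left": 0.0, "right": 0.0}      # the agent's value estimate per action (its policy)
    learning_rate = 0.5

    for episode in range(4):
        # Stage 1: pick the action with the highest estimated value (ties go to "left")
        action = max(values, key=values.get)
        reward = rewards[action]              # Stage 2: the environment answers with a reward
        # Stage 3: the outcome modifies the policy by updating the value estimate
        values[action] += learning_rate * (reward - values[action])
        print(f"episode {episode + 1}: went {action}, reward {reward:+.1f}, values {values}")

    # Stage 4: after one prick, the estimates favour the lollipop and keep favouring it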

Post by Gonzalo Zabala García | Edited by Bea Matt
