Machine Learning Algorithms

Shubhangi Hora
6 min read · Sep 27, 2018

Just a quick refresher from the previous article: machine learning is the process through which a machine learns to perform a specific task without being explicitly programmed to do so. Various algorithms can do this, and they fall under the following categories:

1. Supervised Learning

This type of algorithm works with data that contains inputs as well as labelled outputs, hence the term “supervised”. The inputs are called the features and the labelled outputs are known as the target.

For example, I give you two labelled coins with their respective mass and diameter, as follows:

mass    diameter    label
3.8g    25mm        1 rupee
6.0g    23mm        5 rupee

Then I give you a third coin that doesn’t have a label, but has a mass of 3.8g and a diameter of 25mm. You can look at the above table of data and tell me that this third unlabeled coin is a 1 rupee coin. So, you have now learnt how to take unlabeled coins and determine whether they are 1 rupee coins or 5 rupee coins. This is what Supervised Learning is.

In the case of a machine, it would be fed with millions of instances of the mass and diameter of these two coins with their corresponding labels, so that ultimately when it is given an unlabeled coin, it can accurately match the coin’s features with its label.

A graph of such labelled data, for example mass plotted against diameter, is what you would typically see with a Supervised Learning algorithm: distinct groups of data points.
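The coin example can be sketched in a few lines of Python. This is only a minimal illustration, assuming a nearest-neighbour rule (the unlabeled coin gets the label of the most similar labelled coin); the article does not name a specific algorithm, and the names `training_data` and `predict` are made up for this sketch.

```python
import math

# Labelled training data from the coin table:
# features are (mass in g, diameter in mm), the target is the label.
training_data = [
    ((3.8, 25.0), "1 rupee"),
    ((6.0, 23.0), "5 rupee"),
]

def predict(features):
    """Return the label of the training coin closest to `features`.

    Assumption: "closest" means smallest straight-line distance
    between the (mass, diameter) pairs.
    """
    nearest = min(training_data, key=lambda item: math.dist(item[0], features))
    return nearest[1]

# The third, unlabeled coin from the example.
print(predict((3.8, 25.0)))  # 1 rupee
```

A real system would learn from many more labelled coins, but the idea is the same: features go in, a label comes out.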

2. Unsupervised Learning

This type of learning involves data consisting of inputs without labelled outputs. The inputs are called the features and there is no target label.

Using the same coin example: I give you two unlabeled coins with their respective mass and diameter, as follows:

mass    diameter
3.8g    25mm
6.0g    23mm

You can’t tell the label of either coin, but because the two coins differ in size and mass, you can tell that they are different kinds of coin. Hence, you could create two separate classes, A and B, and say that the first coin is from class A and the second coin is from class B. So if I give you a third coin of mass 6.0g and diameter 23mm, you can accurately say that this third coin belongs to class B.

This is Unsupervised Learning. In the case of a machine, it would be fed with millions of instances of the mass and diameter of these two coins, so that ultimately when it is given a coin after the learning has occurred, it can accurately match the coin’s features with the class. The machine doesn’t know what the target label is, but it is able to find patterns and form different classes / clusters to segregate the data.
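Here is a rough sketch of how a program might form the two classes on its own. It assumes a simple k-means style clustering, which is one common choice and not something the article prescribes. Only the first two measurements come from the table above; the other four are made-up values added so that there is something to group.

```python
import math

# Unlabelled measurements (mass in g, diameter in mm). Only the first two
# rows come from the coin table; the rest are invented for illustration.
coins = [
    (3.8, 25.0), (6.0, 23.0),
    (3.7, 25.1), (3.9, 24.9), (6.1, 22.9), (5.9, 23.1),
]

def nearest(point, centres):
    """Index of the centre closest to `point`."""
    return min(range(len(centres)), key=lambda i: math.dist(point, centres[i]))

def k_means(points, centres, rounds=10):
    """Alternate between assigning points to centres and re-averaging centres."""
    for _ in range(rounds):
        groups = [[] for _ in centres]
        for p in points:
            groups[nearest(p, centres)].append(p)
        # Move each centre to the mean of its group (keep it if the group is empty).
        centres = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centres)
        ]
    return centres

# Start from the first two coins as initial centres (an arbitrary choice).
centres = k_means(coins, centres=[coins[0], coins[1]])
names = ["A", "B"]  # the program invents its own class names

# A new coin of mass 6.0g and diameter 23mm lands in class B.
print(names[nearest((6.0, 23.0), centres)])  # B
```

Notice that the program never sees the words “1 rupee” or “5 rupee”. It only discovers that the measurements fall into two groups.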

Click here to read more: https://machinelearningmastery.com/supervised-and-unsupervised-machine-learning-algorithms/

3. Reinforcement Learning

Let’s say you want to train your dog to sit when you say the word “sit”. How will you go about doing this? Most probably, every time you say “sit” and she sits, you’ll reward her with a treat. After this process has been repeated a couple of times, your dog will associate the action of sitting upon hearing the word “sit” with receiving a treat, and so she will learn the behavior. This is what reinforcement learning is — the process of learning a particular task or action by receiving rewards.

Machines can learn how to perform particular tasks in the same manner. In this case there is no physical reward. Instead, there is a reward function that reinforces and trains a program to learn a particular action and perform a task.

The path of Reinforcement Learning is usually chosen when large amounts of data pertaining to the task do not exist. Since you can’t learn much from existing information, you just have to make your own mistakes and learn from them, like in life.

Let’s take the example of automating the process of finding the shortest path using Reinforcement Learning.

source: http://homes.sice.indiana.edu/classes/spring2016/csci/c343-yye/shortest.php

The circles numbered 1–6 are nodes and the lines connecting them are edges. Each time you choose an edge to move to a circle, the number along that edge gets added to your total distance. The aim of the game is to get from node 1 to node 5 by traveling the least distance.

A few keywords to keep in mind are:

· State: where you currently are in the game. For example, the start state is circle 1, with the paths leading out of it as shown.

· Action: choosing which path to take, and therefore which circle to move to next.

· Reward: let’s say you win this game if you can make it from node 1 to node 5 with a total distance less than 21. If you win the game, you get a +1 score, and if you lose you get a -1 score.

So now, what you have to do is choose which path to take given your current state. There is no previous data for you to learn from; you just have to choose a path at random and see if it leads you to a +1 or a -1. This is what will encourage you to, or discourage you from, choosing that path again.

Let’s say these instructions are fed into a program. How will it learn? We make it play this game 100 times, where it randomly chooses paths to go from node 1 to node 5. Each time it finishes with a total distance less than 21, we reward it with +1; otherwise it gets -1.
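To make the game concrete, here is a minimal sketch of a program playing it at random. The figure is not reproduced in this text, so the edge distances in `GRAPH` are an assumed stand-in, not necessarily the ones in the original picture. The 10-move cap (`max_steps`) is also my own addition so that a wandering game eventually ends and counts as a loss.

```python
import random

# Assumed edge distances for nodes 1-6 (a stand-in for the article's figure).
GRAPH = {
    1: {2: 7, 3: 9, 6: 14},
    2: {1: 7, 3: 10, 4: 15},
    3: {1: 9, 2: 10, 4: 11, 6: 2},
    4: {2: 15, 3: 11, 5: 6},
    5: {4: 6, 6: 9},
    6: {1: 14, 3: 2, 5: 9},
}
START, GOAL, WIN_BELOW = 1, 5, 21

def reward(total_distance, reached_goal):
    """+1 for reaching the goal with a total distance under 21, otherwise -1."""
    return 1 if reached_goal and total_distance < WIN_BELOW else -1

def play_random_game(rng, max_steps=10):
    """Walk from START by picking edges at random; return (path, total, reward)."""
    state, path, total = START, [START], 0
    for _ in range(max_steps):
        if state == GOAL:
            break
        action = rng.choice(sorted(GRAPH[state]))  # action = next node to move to
        total += GRAPH[state][action]
        state = action
        path.append(state)
    return path, total, reward(total, state == GOAL)

rng = random.Random(0)  # fixed seed so the run is repeatable
results = [play_random_game(rng) for _ in range(100)]
wins = sum(1 for _, _, r in results if r == 1)
print(f"won {wins} of {len(results)} random games")
```

At this point the program is only playing, not learning: it records the reward but does nothing with it yet.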

Let’s say that each game has 3 actions, so there are 3 * 100 = 300 actions in total, and that the program won 20 games and lost 80 games.

So there are 3 * 20 = 60 winning actions and 3 * 80 = 240 losing actions.

The program is discouraged from playing any of the losing actions in the future since it was not rewarded for them. The next time it finds itself in the same state, it is less likely to use that action, and more likely to use a winning action since it has been encouraged to play that.

But how does it discourage or encourage? A weight (value) is assigned to each action at the beginning of the games, and after each game is won or lost, the weights of the actions played in that game are updated. Depending on the updated weight, the program chooses whether to play that move or not. That’s what Reinforcement Learning is.
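The weight idea can be sketched as follows. Several things here are assumptions of mine rather than details from the article: the edge distances in `GRAPH` (the figure is not reproduced in this text), the starting weight of 1.0, the 10% update size (`step`), the 10-move cap per game, and the rule of picking actions in proportion to their weights. Treat it as one simple way to do it, not the only one.

```python
import random

# Assumed edge distances for nodes 1-6 (a stand-in for the article's figure).
GRAPH = {
    1: {2: 7, 3: 9, 6: 14},
    2: {1: 7, 3: 10, 4: 15},
    3: {1: 9, 2: 10, 4: 11, 6: 2},
    4: {2: 15, 3: 11, 5: 6},
    5: {4: 6, 6: 9},
    6: {1: 14, 3: 2, 5: 9},
}
START, GOAL, WIN_BELOW = 1, 5, 21

def new_weights():
    """Give every (state, action) pair the same starting weight."""
    return {(s, a): 1.0 for s, edges in GRAPH.items() for a in edges}

def play_game(weights, rng, max_steps=10):
    """Play one game, choosing each action in proportion to its weight."""
    state, total, moves = START, 0, []
    while state != GOAL and len(moves) < max_steps:
        actions = sorted(GRAPH[state])
        action = rng.choices(
            actions, weights=[weights[(state, a)] for a in actions]
        )[0]
        moves.append((state, action))
        total += GRAPH[state][action]
        state = action
    won = state == GOAL and total < WIN_BELOW
    return moves, (1 if won else -1)

def update_weights(weights, moves, reward, step=0.1):
    """Scale a move's weight up after a won game and down after a lost one."""
    for move in moves:
        weights[move] *= 1 + step * reward

def train(games=100, seed=0):
    """Play `games` games, updating the weights after each one."""
    rng = random.Random(seed)
    weights = new_weights()
    for _ in range(games):
        moves, reward = play_game(weights, rng)
        update_weights(weights, moves, reward)
    return weights

weights = train()
# Inspect what the program now thinks of its options in a few states.
for state in (1, 3, 6):
    print(state, {a: round(weights[(state, a)], 3) for a in GRAPH[state]})
```

Because actions with higher weights are picked more often, moves that showed up in winning games get played more, and moves that only ever led to losses fade away. With only 100 games the weights can still be noisy, so try raising `games` and see how they change.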

Inspired by a Quora answer: https://www.quora.com/Whats-the-difference-between-reinforcement-Learning-and-Deep-learning

Next, I’ll be starting a series on Supervised Learning Algorithms! Please let me know what you thought of this article and how I can improve it in the comments below :)

Shubhangi Hora

A python developer working on AI and ML, with a background in Computer Science and Psychology. Interested in healthcare AI, specifically mental health!