How To Implement Find-S Algorithm In Machine Learning?
In Machine Learning, concept learning can be described as “the problem of searching through a predefined space of potential hypotheses for the hypothesis that best fits the training examples” (Tom Mitchell). In this article, we will go through one such concept learning algorithm, known as the Find-S algorithm. The following topics are discussed in this article.
- What is Find-S Algorithm in Machine Learning?
- How Does it Work?
- Limitations of Find-S Algorithm
- Implementation of Find-S Algorithm
- Use Case
What is Find-S Algorithm in Machine Learning?
In order to understand Find-S algorithm, you need to have a basic idea of the following concepts as well:
- Concept Learning
- General Hypothesis
- Specific Hypothesis
1. Concept Learning
Let’s try to understand concept learning with a real-life example. Most human learning is based on past instances or experiences. For example, we are able to identify any type of vehicle based on a certain set of features, like make, model, etc., that are defined over a large set of features.
These special features differentiate the set of cars, trucks, etc. from the larger set of vehicles. The features that define the set of cars, trucks, etc. are known as concepts.
Similar to this, machines can also learn from concepts to identify whether an object belongs to a specific category or not. Any algorithm that supports concept learning requires the following:
- Training Data
- Target Concept
- Actual Data Objects
2. General Hypothesis
A hypothesis, in general, is an explanation for something. The general hypothesis states the general relationship between the major variables. For example, a general hypothesis for ordering food would be “I want a burger”. In the most general hypothesis, every attribute is ‘?’, which means that any value is acceptable for that attribute.
G = {‘?’, ‘?’, ‘?’, …, ‘?’}
3. Specific Hypothesis
The specific hypothesis fills in all the important details about the variables given in the general hypothesis. A more specific version of the example given above would be “I want a cheeseburger with a chicken pepperoni filling and a lot of lettuce”. In the most specific hypothesis, every attribute is ‘Φ’, which means that no value is acceptable for that attribute.
S = {‘Φ’, ‘Φ’, ‘Φ’, …, ‘Φ’}
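To make the two notations concrete, here is a minimal sketch of how a hypothesis could be represented and checked against an example. The names `ANY`, `PHI` and `matches`, and the burger attribute values, are assumptions made up for illustration only.

```python
# '?' accepts any value; 'Φ' accepts no value at all.
ANY = "?"
PHI = "Φ"


def matches(hypothesis, example):
    """Return True if every attribute constraint accepts the example's value."""
    return all(h == ANY or h == value for h, value in zip(hypothesis, example))


most_general = (ANY, ANY, ANY)
most_specific = (PHI, PHI, PHI)
order = ("burger", "cheese", "lettuce")  # hypothetical attribute values

print(matches(most_general, order))                # True
print(matches(most_specific, order))               # False
print(matches(("burger", ANY, "lettuce"), order))  # True
```

The most general hypothesis accepts every example, the most specific one accepts none, and everything in between accepts only the examples that agree on the fixed attributes.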
Now, let’s talk about the Find-S Algorithm in Machine Learning.
The Find-S algorithm follows the steps written below:
- Initialize ‘h’ to the most specific hypothesis.
- The Find-S algorithm considers only the positive examples and ignores the negative examples. For each positive example, the algorithm checks each attribute in the example. If the attribute value is the same as the hypothesis value, the algorithm moves on without any changes. If the hypothesis still holds ‘Φ’ for that attribute, it takes the value from the example. Otherwise, if the attribute value is different from the hypothesis value, the algorithm changes it to ‘?’.
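The update performed for a single positive example can be sketched as follows. This is only an illustration: the function name `generalize` and the sample values are assumptions. Starting from ‘Φ’, the first positive example simply fills in its own values, since that is the smallest generalization that covers it.

```python
ANY = "?"
PHI = "Φ"


def generalize(hypothesis, positive_example):
    """Return the least general hypothesis that also covers positive_example."""
    new_hypothesis = []
    for constraint, value in zip(hypothesis, positive_example):
        if constraint == PHI:
            # Nothing accepted yet: adopt the value seen in the example.
            new_hypothesis.append(value)
        elif constraint == ANY or constraint == value:
            # The constraint already accepts this value: keep it unchanged.
            new_hypothesis.append(constraint)
        else:
            # Two different values were seen: accept anything from now on.
            new_hypothesis.append(ANY)
    return new_hypothesis


h = [PHI, PHI, PHI]
h = generalize(h, ["Sunny", "Warm", "Strong"])
print(h)  # ['Sunny', 'Warm', 'Strong']
h = generalize(h, ["Sunny", "Cold", "Strong"])
print(h)  # ['Sunny', '?', 'Strong']
```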
Now that we are done with the basic explanation of the Find-S algorithm, let us take a look at how it works.
How Does It Work?
- The process starts by initializing ‘h’ with the most specific hypothesis. In practice, this is generally the first positive example in the data set.
- We go through the examples one by one. If an example is negative, we move on to the next one, but if it is positive, we consider it for the next step.
- We check whether each attribute value in the example is equal to the corresponding hypothesis value.
- If the value matches, then no changes are made.
- If the value does not match, the value is changed to ‘?’.
- We do this until we reach the last positive example in the data set.
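Put together, the steps above could look like the following plain-Python sketch. The function name `find_s`, the `(attributes, label)` layout and the three toy examples are assumptions chosen for illustration, not data from this article.

```python
def find_s(examples, positive_label="Yes"):
    """Run Find-S over a list of (attributes, label) pairs.

    Returns the learned hypothesis as a list, or None if there is
    no positive example to start from.
    """
    hypothesis = None
    for attributes, label in examples:
        if label != positive_label:
            continue  # negative examples are skipped
        if hypothesis is None:
            # The first positive example becomes the starting hypothesis.
            hypothesis = list(attributes)
            continue
        for i, value in enumerate(attributes):
            if hypothesis[i] != value:
                hypothesis[i] = "?"  # mismatch: generalize this attribute
    return hypothesis


toy_examples = [
    (("Sunny", "Warm", "Strong"), "Yes"),
    (("Rainy", "Cold", "Strong"), "No"),
    (("Sunny", "Cold", "Strong"), "Yes"),
]
print(find_s(toy_examples))  # ['Sunny', '?', 'Strong']
```

Returning None when no positive example exists is a design choice of this sketch. It avoids pretending that a hypothesis was learned from negative examples alone.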
Limitations of Find-S Algorithm
There are a few limitations of the Find-S algorithm, listed below:
- There is no way to determine whether the resulting hypothesis is consistent with all of the data, including the negative examples.
- Inconsistent training sets can mislead the Find-S algorithm, since it ignores the negative examples.
- The Find-S algorithm does not provide a backtracking technique to determine the best possible changes that could be made to improve the resulting hypothesis.
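The first two limitations can be illustrated with a small check that Find-S itself never performs. In this sketch, `covers` and `is_consistent` are hypothetical helper names and the examples are made up. The hypothesis is the one Find-S would learn from the two positive examples alone.

```python
def covers(hypothesis, attributes):
    """True if the hypothesis would classify these attributes as positive."""
    return all(h == "?" or h == a for h, a in zip(hypothesis, attributes))


def is_consistent(hypothesis, examples, positive_label="Yes"):
    """True if the hypothesis agrees with the label of every example."""
    return all(
        covers(hypothesis, attributes) == (label == positive_label)
        for attributes, label in examples
    )


learned = ["Sunny", "?", "Strong"]  # what Find-S yields from the positives below

clean_data = [
    (("Sunny", "Warm", "Strong"), "Yes"),
    (("Sunny", "Cold", "Strong"), "Yes"),
    (("Rainy", "Cold", "Weak"), "No"),
]
noisy_data = [
    (("Sunny", "Warm", "Strong"), "Yes"),
    (("Sunny", "Cold", "Strong"), "Yes"),
    (("Sunny", "Mild", "Strong"), "No"),  # contradicts the learned hypothesis
]

print(is_consistent(learned, clean_data))  # True
print(is_consistent(learned, noisy_data))  # False
```

Because negative examples never influence the result, Find-S returns the same hypothesis for both data sets, and only an extra check like this one reveals the conflict.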
Now that we are aware of the limitations of the Find-S algorithm, let us take a look at a practical implementation of the Find-S Algorithm.
Implementation of Find-S Algorithm
To understand the implementation, let us apply the algorithm to a small data set with a handful of examples that decide whether a person wants to go for a walk.
The concept in this particular problem is the kind of day on which a person likes to go for a walk.
Looking at the data set, we have six attributes and a final attribute that marks each example as positive or negative. In this case, ‘Yes’ marks a positive example, which means the person will go for a walk.
So now, the initial hypothesis, taken from the first positive example, is:
h0 = {‘Morning’, ‘Sunny’, ‘Warm’, ‘Yes’, ‘Mild’, ‘Strong’}
This is our most specific starting hypothesis, and now we will consider the remaining examples one by one, but only the positive ones.
h1 = {‘Morning’, ‘Sunny’, ‘?’, ‘Yes’, ‘?’, ‘?’}
h2 = {‘?’, ‘Sunny’, ‘?’, ‘Yes’, ‘?’, ‘?’}
We replaced every value that differed between the hypothesis and a positive example with ‘?’ to get the resultant hypothesis. Now that we know how the Find-S algorithm works, let us take a look at an implementation using Python.
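The progression from h0 to h2 can be reproduced with a short trace. Note that the data set itself is not reproduced in this article: only the first row below follows from h0, and the remaining rows are assumptions chosen so that the trace yields the same h1 and h2. The function name `trace_find_s` is also hypothetical.

```python
def trace_find_s(rows, positive_label="Yes"):
    """Return the hypothesis after each positive example, in order."""
    history = []
    hypothesis = None
    for *attributes, label in rows:
        if label != positive_label:
            continue  # negative rows do not change the hypothesis
        if hypothesis is None:
            hypothesis = list(attributes)
        else:
            hypothesis = [
                h if h == a else "?" for h, a in zip(hypothesis, attributes)
            ]
        history.append(list(hypothesis))
    return history


# Assumed rows: six attributes followed by the Yes/No target.
rows = [
    ("Morning", "Sunny", "Warm", "Yes", "Mild", "Strong", "Yes"),
    ("Evening", "Rainy", "Cold", "No", "Mild", "Normal", "No"),
    ("Morning", "Sunny", "Moderate", "Yes", "Normal", "Normal", "Yes"),
    ("Evening", "Sunny", "Cold", "Yes", "High", "Strong", "Yes"),
]

for step, h in enumerate(trace_find_s(rows)):
    print(f"h{step} = {h}")
# h0 = ['Morning', 'Sunny', 'Warm', 'Yes', 'Mild', 'Strong']
# h1 = ['Morning', 'Sunny', '?', 'Yes', '?', '?']
# h2 = ['?', 'Sunny', '?', 'Yes', '?', '?']
```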
Use Case
Let’s try to implement the above example using Python. The code to implement the Find-S algorithm using the above data is given below.
import pandas as pd
import numpy as np

# Read the data from the CSV file
data = pd.read_csv("data.csv")
print(data, "\n")

# Build an array of the attributes (every column except the last one)
d = np.array(data)[:, :-1]
print("\n The attributes are: ", d)

# Separate the target column, which marks positive and negative examples
target = np.array(data)[:, -1]
print("\n The target is: ", target)

# Training function that implements the Find-S algorithm
def train(c, t):
    # Start from the first positive example as the most specific hypothesis
    specific_hypothesis = None
    for i, val in enumerate(t):
        if val == "Yes":
            specific_hypothesis = c[i].copy()
            break
    if specific_hypothesis is None:
        return None  # no positive example, so nothing can be learned

    # Generalize with every positive example; negative examples are ignored
    for i, val in enumerate(c):
        if t[i] == "Yes":
            for x in range(len(specific_hypothesis)):
                if val[x] != specific_hypothesis[x]:
                    specific_hypothesis[x] = '?'
    return specific_hypothesis

# Obtain the final hypothesis
print("\n The final hypothesis is:", train(d, target))
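The script above expects a file named data.csv, which is not reproduced in this article. As a rough sketch, a file like that could be created as shown below. The column names and every row after the first are assumptions, picked only so that the positive rows agree with h0, h1 and h2 from the walkthrough. The helper name `write_sample_csv` is hypothetical too.

```python
import csv

# Assumed column names: six attributes followed by the Yes/No target.
HEADER = ["Time", "Weather", "Temperature", "Company", "Humidity", "Wind", "Goes"]

# Assumed rows; only the first one is implied by h0 in the walkthrough.
ROWS = [
    ["Morning", "Sunny", "Warm", "Yes", "Mild", "Strong", "Yes"],
    ["Evening", "Rainy", "Cold", "No", "Mild", "Normal", "No"],
    ["Morning", "Sunny", "Moderate", "Yes", "Normal", "Normal", "Yes"],
    ["Evening", "Sunny", "Cold", "Yes", "High", "Strong", "Yes"],
]


def write_sample_csv(path="data.csv"):
    """Write the header and the sample rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        writer.writerows(ROWS)


if __name__ == "__main__":
    write_sample_csv()
```

The target must be the last column, because the script takes every column except the last one as attributes.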
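Output: for the data set described above, the final hypothesis printed by the script should match h2, that is, ‘?’, ‘Sunny’, ‘?’, ‘Yes’, ‘?’, ‘?’.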
This brings us to the end of this article, where we have learned the Find-S Algorithm in Machine Learning along with its implementation and use case. I hope everything shared in this tutorial is clear to you.
Originally published at https://www.edureka.co