Decision Tree in Machine Learning
Hey! In this post we will dive into decision trees in machine learning. It will be fun, a bit like exploring a family tree. Let’s get started.
What is a Decision Tree?
A Decision Tree is a non-parametric supervised learning algorithm used for both classification and regression tasks. CART (Classification and Regression Trees) is a well-known algorithm for building such trees.
It has a hierarchical tree structure consisting of a root node, branches, internal nodes, and leaf nodes. Internal nodes represent tests on the features of a dataset, branches represent the decision rules, and leaves represent the outcomes.
Algorithm:
- Start at the root node, which represents the entire training set.
- Compare the value of the root attribute with the corresponding attribute of the record being classified. Based on the comparison, follow the matching branch to the next node.
- At the next node, compare its attribute with the record again and move further down the tree.
- Repeat the previous step until a leaf node is reached.
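The steps above can be sketched in a few lines of plain Python. This is only an illustration: the tree is built by hand, and the feature names, values, and the `predict` helper are hypothetical, not taken from the dataset used later.

```python
# A hand-built toy tree: internal nodes test one feature, leaves hold an outcome.
# Feature names and values are hypothetical, chosen only for illustration.
toy_tree = {
    "feature": "Degree",
    "branches": {
        "Masters": {"leaf": True, "outcome": 1},
        "Bachelors": {
            "feature": "Job",
            "branches": {
                "Manager": {"leaf": True, "outcome": 1},
                "Sales": {"leaf": True, "outcome": 0},
            },
        },
    },
}


def predict(node, record):
    """Walk from the root to a leaf, following the branch that matches the record."""
    while not node.get("leaf", False):
        value = record[node["feature"]]   # attribute tested at this node
        node = node["branches"][value]    # follow the matching branch
    return node["outcome"]


result = predict(toy_tree, {"Degree": "Bachelors", "Job": "Sales"})
print(result)
```

A real library learns the tests from data instead of having them written by hand, but a prediction is made with the same root-to-leaf walk.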
Types:
- Decision Tree Classifier (categorical target variable)
- Decision Tree Regressor (continuous target variable)
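To make the difference concrete, here is a small sketch that fits both types on the same feature with scikit-learn. The numbers are made up for illustration (years of experience as the only feature), and `random_state=0` is set only to keep the run repeatable.

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

# Toy data, invented for illustration: one numeric feature (years of experience)
X = [[1], [2], [3], [10], [11], [12]]

# Categorical target: 0 = salary at most 100k, 1 = salary above 100k
y_class = [0, 0, 0, 1, 1, 1]
# Continuous target: salary in thousands
y_reg = [40.0, 45.0, 50.0, 120.0, 125.0, 130.0]

clf = DecisionTreeClassifier(random_state=0).fit(X, y_class)
reg = DecisionTreeRegressor(random_state=0).fit(X, y_reg)

class_pred = clf.predict([[11]])  # the classifier returns a class label
value_pred = reg.predict([[11]])  # the regressor returns a number
print(class_pred, value_pred)
```

The classifier answers "which category?", while the regressor answers "how much?".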
Implementation:
We will use a salary dataset (salaries.csv) with the columns Company, Job, Degree, and Salary_more_than_100k.
Now it is implementation time.
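# Load the salary dataset into a DataFrame
import pandas as pd
df = pd.read_csv("salaries.csv")
df.head()
# Separate the feature columns from the target column
inputs = df.drop("Salary_more_than_100k", axis="columns")
target = df["Salary_more_than_100k"]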
# Encode each categorical column as integers, using one encoder per column
from sklearn.preprocessing import LabelEncoder
le_Company = LabelEncoder()
le_Job = LabelEncoder()
le_Degree = LabelEncoder()
inputs["Company_n"] = le_Company.fit_transform(inputs["Company"])
inputs["Job_n"] = le_Job.fit_transform(inputs["Job"])
inputs["Degree_n"] = le_Degree.fit_transform(inputs["Degree"])
inputs.head()
# Keep only the encoded numeric columns for training
inputs_n = inputs.drop(["Company", "Job", "Degree"], axis="columns")
inputs_n
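It helps to know which integer each label was given before reading the encoded table. The sketch below uses hypothetical degree values, not the actual contents of salaries.csv, to show how a fitted `LabelEncoder` exposes its mapping.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical degree values, used only to show how the encoder behaves
degrees = ["Masters", "Bachelors", "Masters", "Bachelors"]

le = LabelEncoder()
codes = le.fit_transform(degrees)

# classes_ holds the sorted original labels; each label's position is its code
mapping = {str(label): code for code, label in enumerate(le.classes_)}
print(mapping)

# inverse_transform turns codes back into the original labels
decoded = [str(label) for label in le.inverse_transform([0, 1])]
print(decoded)
```

Because the labels are sorted before codes are assigned, the integers carry no real order or distance. Keep that in mind when feeding them to a model.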
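# Train a decision tree classifier on the encoded features
from sklearn import tree
model = tree.DecisionTreeClassifier()
model.fit(inputs_n, target)
# This is accuracy on the training data itself, not on unseen data
model.score(inputs_n, target)
# Each row is [Company_n, Job_n, Degree_n]
model.predict([[2, 2, 1]])
model.predict([[2, 0, 1]])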
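Hard-coded integers like `[2, 2, 1]` are easy to get wrong, so one option is to encode readable labels with the fitted encoders before predicting. The following is a minimal sketch under stated assumptions: the rows are invented for illustration (they are not the article's salaries.csv), and `predict_from_labels` is a hypothetical helper. It also prints the learned rules with scikit-learn's `export_text`.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical rows, invented for illustration; not the article's salaries.csv
df = pd.DataFrame({
    "Company": ["Alpha", "Alpha", "Beta", "Beta", "Gamma", "Gamma"],
    "Job": ["Sales", "Engineer", "Sales", "Engineer", "Sales", "Engineer"],
    "Degree": ["Bachelors", "Masters", "Bachelors", "Masters", "Bachelors", "Masters"],
    "Salary_more_than_100k": [0, 1, 0, 1, 0, 1],
})

# Fit one encoder per categorical column and build the encoded feature table
feature_cols = ["Company", "Job", "Degree"]
encoders = {col: LabelEncoder().fit(df[col]) for col in feature_cols}
X = pd.DataFrame({col + "_n": encoders[col].transform(df[col]) for col in feature_cols})
y = df["Salary_more_than_100k"]

model = DecisionTreeClassifier(random_state=0).fit(X, y)


def predict_from_labels(company, job, degree):
    """Encode readable labels with the fitted encoders, then predict."""
    row = pd.DataFrame([{
        "Company_n": encoders["Company"].transform([company])[0],
        "Job_n": encoders["Job"].transform([job])[0],
        "Degree_n": encoders["Degree"].transform([degree])[0],
    }])
    return int(model.predict(row)[0])


prediction = predict_from_labels("Beta", "Engineer", "Masters")
print(prediction)

# Print the learned decision rules as indented text
rules = export_text(model, feature_names=list(X.columns))
print(rules)
```

Passing a label the encoder never saw during fitting raises an error, which is usually preferable to silently predicting from a wrong code.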
Here you can access the full code:
Decision_Tree/Decision_Tree.ipynb at main · kaviya2478/Decision_Tree (github.com)
Thank you! Let’s drink some chocolate milk :)