Compare Tensorflow Deep Learning Model with Classical Machine Learning models — KNN, Naive Bayes, Logistic Regression, SVM — IRIS Classification
In this exercise we will build classical machine learning models for IRIS flower prediction. You will learn how to build KNN, Naive Bayes, Logistic Regression and SVM models, then compare the results with a deep learning model built with the TensorFlow/Keras framework.
The IRIS dataset has 150 samples in total. There are 3 types of flowers, Iris Setosa, Iris Versicolor and Iris Virginica, with 50 samples of each type. Each sample has 4 attributes/features: the length and width of the sepals and petals, in centimeters.
Import Libraries
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn import datasets,svm, metrics
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn.linear_model import LogisticRegression
Load the Data
iris = datasets.load_iris()
Check the Data (iris.data, iris.target, iris.feature_names)
iris.data.shape
(150, 4)
iris.target.shape
(150,)
iris.feature_names
['sepal length (cm)',
'sepal width (cm)',
'petal length (cm)',
'petal width (cm)']
iris.target
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
Create a Pandas Dataframe to Hold iris.data and iris.target with features iris.feature_names
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = pd.Series(iris.target)
Check the DataFrame
df.head()
df['target'].value_counts()  # There are 50 flowers of each type
2 50
1 50
0 50
Name: target, dtype: int64
Plot Sepal Length against Sepal Width
Check the Feature Correlation
Feature correlation is computed by calling the pandas function df.corr(), which tells us how strongly each feature is correlated with the target variable. Here all 4 features are correlated with the target, so we use all 4 features in the ML models to predict the IRIS flower type.
sns.set(style="ticks", color_codes=True)
plt.figure(figsize=(10,10))
# Plot the correlation matrix as a heat map
corr_matrix = df.corr()
g = sns.heatmap(corr_matrix, annot=True, cmap="RdYlGn")
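To make explicit what df.corr() computes for each pair of columns (Pearson correlation is its default method), here is a minimal pure-Python sketch. The function name pearson and the toy values are assumptions made for illustration; they are not real IRIS measurements.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances are left unnormalized; the 1/n factors cancel
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy values (hypothetical, not real IRIS measurements)
feature = [1.0, 2.0, 3.0, 4.0]
target = [0, 0, 1, 1]
r = pearson(feature, target)   # 2 / sqrt(5), roughly 0.894
```

A value near +1 or -1 means the feature moves almost linearly with the target, while a value near 0 means there is little linear relationship.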
Extract X and Y
X = df.iloc[:, 0:4]
y = df['target']
Split Test and Train Data
Split the data into train and test sets. The split ratio here is 80:20, which is typical for many ML problems. Note that we set 'stratify=y', which keeps the ratio of target labels the same in the train and test sets. Here the classes are in a 1:1:1 ratio, and that ratio is preserved across the split.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=True, random_state=35, stratify=y)
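The effect of stratification can be sketched without scikit-learn. The following is a simplified illustration, not the algorithm train_test_split actually uses; the function name stratified_split is hypothetical. It splits each class separately so that both parts keep the 1:1:1 class ratio.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, test_size=0.2, seed=35):
    """Return (train_idx, test_idx), keeping each class's share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                        # shuffle within one class
        n_test = round(len(indices) * test_size)    # 20% of this class
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx

labels = [0] * 50 + [1] * 50 + [2] * 50   # same class layout as iris.target
train_idx, test_idx = stratified_split(labels)
train_counts = Counter(labels[i] for i in train_idx)   # 40 samples per class
test_counts = Counter(labels[i] for i in test_idx)     # 10 samples per class
```

With 50 flowers per class and an 80:20 split, each class contributes 40 training and 10 test samples, which is why every row of the confusion matrices in this exercise sums to 10.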
Define the Model, Train and Test the Model
Three steps are involved in this:
1. Define the Model by instantiating the scikit-learn class
2. model.fit() — Train the Model
3. model.predict() — Prediction on the Test Data
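The three steps above can be wrapped in one small helper that works with any object exposing fit() and predict(). This is only a sketch: train_and_score and MajorityClassifier are hypothetical names introduced here, and the stand-in classifier exists purely so the example runs without scikit-learn.

```python
def train_and_score(model, X_train, y_train, X_test, y_test):
    """Fit the model, predict on the test data, return (predictions, accuracy)."""
    model.fit(X_train, y_train)              # step 2: train
    y_pred = list(model.predict(X_test))     # step 3: predict
    correct = sum(1 for p, t in zip(y_pred, y_test) if p == t)
    return y_pred, correct / len(y_pred)

class MajorityClassifier:
    """Hypothetical stand-in model: always predicts the most common training label."""
    def fit(self, X, y):
        labels = list(y)
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]

# Step 1: define the model (any object with fit/predict works here)
model = MajorityClassifier()
y_pred, acc = train_and_score(model, [[1], [2], [3]], [0, 1, 1], [[4], [5]], [1, 0])
```

The same helper should accept the scikit-learn classifiers used in this exercise, since they follow the same fit/predict convention.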
1. K-Nearest Neighbor Classifier (KNN) from SKLearn
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
knn.score(X_test,y_test)
0.9333333333333333
confusion_matrix(y_test, y_pred)
array([[10, 0, 0],
[ 0, 9, 1],
[ 0, 1, 9]], dtype=int64)
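To show the idea behind KNN, here is a minimal sketch of a majority vote among the k nearest training points, using Euclidean distance. The function knn_predict and the toy points are assumptions for illustration; KNeighborsClassifier is far more optimized and configurable than this.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=5):
    """Label a query point by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy 2-feature points (hypothetical values, not real IRIS measurements)
train_points = [(1.0, 0.2), (1.2, 0.3), (1.1, 0.1), (4.5, 1.5), (4.7, 1.4), (4.4, 1.3)]
train_labels = [0, 0, 0, 1, 1, 1]
pred_a = knn_predict(train_points, train_labels, (1.0, 0.3), k=3)   # near class 0
pred_b = knn_predict(train_points, train_labels, (4.6, 1.5), k=3)   # near class 1
```

Note that KNN has no real training phase: fit() mostly stores the training data, and the work happens at prediction time.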
2. GaussianNB Classifier from SKLearn
classifier=GaussianNB()
classifier.fit(X_train,y_train)
y_pred=classifier.predict(X_test)
score = accuracy_score(y_test, y_pred)
print("Accuracy", score)
Accuracy 0.8666666666666667
confusion_matrix(y_test, y_pred)
array([[10, 0, 0],
[ 0, 8, 2],
[ 0, 2, 8]], dtype=int64)
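The accuracy score and the confusion matrix above describe the same predictions. As a small sketch (the helper name accuracy_from_confusion is hypothetical), accuracy is the sum of the diagonal divided by the total number of samples; in scikit-learn's confusion matrix, rows are true labels and columns are predicted labels.

```python
def accuracy_from_confusion(matrix):
    """Accuracy = correctly classified samples (diagonal) / all samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

# Confusion matrix reported above for GaussianNB
gnb_matrix = [[10, 0, 0],
              [0, 8, 2],
              [0, 2, 8]]
acc = accuracy_from_confusion(gnb_matrix)   # 26 correct out of 30
```

Here 26 of the 30 test flowers lie on the diagonal, which gives the 0.8667 accuracy printed above.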
3. Logistic Regression Classifier from SKLearn
logistic_regression = LogisticRegression(C=25.0, solver='lbfgs', multi_class='auto', max_iter=1000)
logistic_regression.fit(X_train,y_train)
y_pred=logistic_regression.predict(X_test)
score = accuracy_score(y_test, y_pred)
score
0.9333333333333333
confusion_matrix(y_test, y_pred)
array([[10, 0, 0],
[ 0, 9, 1],
[ 0, 1, 9]], dtype=int64)
4. SVM Classifier
classifier = svm.SVC(gamma=0.001)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
score = accuracy_score(y_test, y_pred)
score
0.8
confusion_matrix(y_test, y_pred)
array([[10, 0, 0],
[ 0, 9, 1],
[ 0, 5, 5]], dtype=int64)
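The confusion matrix also shows where a model goes wrong, which a single accuracy number hides. A minimal sketch of per-class recall (the helper name per_class_recall is hypothetical) divides each diagonal entry by its row sum:

```python
def per_class_recall(matrix):
    """Recall per class: diagonal entry divided by its row sum (rows = true labels)."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]

# Confusion matrix reported above for the SVM classifier
svm_matrix = [[10, 0, 0],
              [0, 9, 1],
              [0, 5, 5]]
recalls = per_class_recall(svm_matrix)   # [1.0, 0.9, 0.5]
```

For this SVM, class 0 is always recognized, while half of the class 2 flowers are predicted as class 1, which accounts for most of the drop to 0.8 accuracy.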
5. Deep Learning with Tensorflow/Keras
import tensorflow as tf
from tensorflow import keras
Define Tensorflow/Keras Model
Define the deep neural network model with input_shape=[4], as we have 4 input features, and 3 layers with 5, 3 and 3 neuron units respectively. The output layer has to have 3 units, as we have 3 classes to predict.
Input (4 Inputs) → X1,X2,X3,X4 → 5 Neurons → 3 Neurons → 3 Output Neurons
Note that one can experiment with different numbers of neurons in the 1st layer (currently 5 neurons) and the 2nd layer (currently 3 neurons).
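To get a feel for how small this network is, the number of trainable parameters can be counted by hand. This is a rough sketch with a hypothetical helper, dense_param_count, that assumes every layer is fully connected with one bias per unit, as Dense layers are by default.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out   # weight matrix plus one bias per unit
    return total

# 4 inputs -> 5 neurons -> 3 neurons -> 3 output neurons
params = dense_param_count([4, 5, 3, 3])   # (4*5+5) + (5*3+3) + (3*3+3) = 55
```

Changing the number of neurons in the first two layers changes this count, which is one way to reason about model capacity when experimenting.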
model = keras.Sequential([
keras.layers.Dense(5, activation=tf.nn.relu, input_shape=[4]),
keras.layers.Dense(3, activation=tf.nn.relu),
keras.layers.Dense(3, activation=tf.nn.softmax)
])
Define the Optimizer, Loss Function and Metrics
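optimizer = tf.keras.optimizers.Adam()
# The last layer already applies softmax, so the loss receives probabilities (from_logits=False)
model.compile(optimizer=optimizer,
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])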
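To illustrate what the loss function measures, here is a minimal pure-Python sketch of softmax followed by sparse categorical cross-entropy for a single sample. The function names and the raw scores are assumptions for illustration; Keras computes the same quantity in a batched and numerically hardened way.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = [s - max(scores) for s in scores]   # subtract max for numerical stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(true_label, probs):
    """Loss for one sample: negative log of the probability given to the true class."""
    return -math.log(probs[true_label])

probs = softmax([2.0, 0.5, -1.0])   # hypothetical raw scores for 3 classes
loss_right = sparse_categorical_crossentropy(0, probs)   # true class has highest probability
loss_wrong = sparse_categorical_crossentropy(2, probs)   # true class has lowest probability
```

The loss is small when the model puts high probability on the true class and large otherwise. "Sparse" means the labels are plain integers (0, 1, 2) rather than one-hot vectors. A model that assigns 1/3 to every class has a loss of ln(3), about 1.0986.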
Finally train the Model with model.fit function
model.fit(X_train,y_train,epochs=2000)
Model started with low accuracy at first epoch
Epoch 1/2000
4/4 [==============================] - 0s 2ms/step - loss: 1.1194 - accuracy: 0.3333
Converged with High accuracy near 2000 epochs
Epoch 2000/2000
4/4 [==============================] - 0s 2ms/step - loss: 0.6042 - accuracy: 0.9750
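The "4/4" in the training log is the number of batches per epoch. A short sketch of the arithmetic, assuming the Keras default batch size of 32 (no batch_size was passed to model.fit):

```python
import math

n_train = 150 * 80 // 100        # 80% of the 150 samples -> 120 training samples
batch_size = 32                  # Keras default when batch_size is not passed to fit()
steps_per_epoch = math.ceil(n_train / batch_size)   # 3 full batches plus 1 partial batch
total_updates = steps_per_epoch * 2000              # weight updates over 2000 epochs
```

So each epoch performs 4 weight updates, and the full run performs 8000.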
Predict on Test Data
y_pred = model.predict(X_test)
This produces an array of 3 numbers (one probability per class) for each input, but we need a single predicted target label per sample.
array([[1.4601685e-03, 3.7447622e-03, 9.9479502e-01],
[9.9999571e-01, 4.8816267e-09, 4.2581619e-06],
[1.4601685e-03, 3.7447622e-03, 9.9479502e-01],
[3.4476686e-03, 9.1026098e-02, 9.0552616e-01],
[1.4601685e-03, 3.7447622e-03, 9.9479502e-01],
[2.5093075e-06, 9.9999738e-01, 1.2542409e-07],
[1.4601685e-03, 3.7447622e-03, 9.9479502e-01],
[8.4952539e-04, 9.5092177e-01, 4.8228655e-02],
The predicted label is the index of the largest of these three numbers, which we get with np.argmax().
y_pred_new = np.argmax(y_pred, axis=1)
score = accuracy_score(y_test, y_pred_new)
score
0.9666666666666667
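To see what np.argmax(y_pred, axis=1) does row by row, here is a small pure-Python sketch applied to a few rows taken from the probability output shown above. The helper name argmax is introduced here only for illustration.

```python
def argmax(row):
    """Index of the largest value in a sequence."""
    return max(range(len(row)), key=lambda i: row[i])

# A few rows copied from the model.predict() output shown above
probs = [
    [1.4601685e-03, 3.7447622e-03, 9.9479502e-01],
    [9.9999571e-01, 4.8816267e-09, 4.2581619e-06],
    [3.4476686e-03, 9.1026098e-02, 9.0552616e-01],
    [2.5093075e-06, 9.9999738e-01, 1.2542409e-07],
]
labels = [argmax(row) for row in probs]   # [2, 0, 2, 1]
```

Each row collapses to the index of its most probable class, which is the form accuracy_score and confusion_matrix expect.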
confusion_matrix(y_test, y_pred_new)
array([[10, 0, 0],
[ 0, 9, 1],
[ 0, 0, 10]], dtype=int64)
Conclusion
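Note the accuracy of the different models here: we got 0.80, 0.867, 0.933, 0.933 and 0.967 for SVM, Gaussian Naive Bayes, Logistic Regression, KNN and the deep learning model respectively. With only 30 test samples, each misclassified flower changes the accuracy by about 0.033, so these differences amount to a handful of samples.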
Deep learning models are flexible enough to approximate a very wide range of target functions. We can fine-tune a deep learning model to perform better than classical machine learning models, but the downsides of deep learning are the complexity of the network and the need for a large dataset. Deep learning models also have a higher tendency to overfit, though there are techniques available to combat overfitting.