Compare Tensorflow Deep Learning Model with Classical Machine Learning models — KNN, Naive Bayes, Logistic Regression, SVM — IRIS Classification

Nagaraj B
Published in Analytics Vidhya
5 min read · Oct 19, 2020

In this exercise we will build classical machine learning models for IRIS flower prediction. You will learn how to build models with KNN, Naive Bayes, Logistic Regression and SVM, and then compare the results with a deep learning model built with the Tensorflow/Keras framework.

The IRIS dataset has 150 samples in total. There are 3 different types of flowers, Iris setosa, Iris virginica and Iris versicolor, with 50 of each type. Each flower sample consists of 4 attributes/features, namely the length and width of the sepals and petals, in centimeters.

Import Libraries

import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split

from sklearn import datasets,svm, metrics
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn.linear_model import LogisticRegression

Load the Data

iris = datasets.load_iris()

Check the Data (iris.data, iris.target, iris.feature_names)

iris.data.shape

(150, 4)

iris.target.shape

(150,)

iris.feature_names

['sepal length (cm)',
 'sepal width (cm)',
 'petal length (cm)',
 'petal width (cm)']

iris.target

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

Create a Pandas DataFrame to Hold iris.data and iris.target, with iris.feature_names as the Column Names

df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = pd.Series(iris.target)

Check the DataFrame

df.head()

df['target'].value_counts()  # There are 50 flowers of each type

2    50
1    50
0    50
Name: target, dtype: int64

Plot Sepal Length against Sepal Width
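The original post shows this plot only as an image; here is a minimal matplotlib sketch (an assumption about how it was produced, using the DataFrame built above) that colors each point by its target class:

# Scatter plot of sepal length vs. sepal width, colored by flower type
plt.figure(figsize=(8, 6))
scatter = plt.scatter(df['sepal length (cm)'], df['sepal width (cm)'],
                      c=df['target'], cmap='viridis')
plt.xlabel('sepal length (cm)')
plt.ylabel('sepal width (cm)')
plt.legend(*scatter.legend_elements(), title='target')
plt.show()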

Check the Feature Correlation

Feature correlation is computed by calling the pandas function df.corr(). This tells us which features are most correlated with the target variable. Here we notice that all 4 features are well correlated with the target, so we use all 4 features in the ML models to predict the IRIS flower type.

sns.set(style="ticks", color_codes=True)
plt.figure(figsize=(10, 10))
# Plot a heat map of the correlation matrix
corr_matrix = df.corr()
g = sns.heatmap(corr_matrix, annot=True, cmap="RdYlGn")

Extract X and Y

X = df.iloc[:, 0:4]
y = df['target']

Split Test and Train Data

Split the data into train and test sets. Here the split ratio is 80:20, which is typical for most ML problems. Also note that we set stratify=y, which ensures that the ratio of target labels in the train and test data remains the same. In this case we have a 1:1:1 ratio of target labels, and this ratio is maintained across the split.

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=True, random_state=35, stratify=y)
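As a quick sanity check (not in the original code), the label counts on each side of the split can be inspected; with stratify=y we expect 40 samples of each class in the train set and 10 of each class in the test set:

y_train.value_counts()  # expect 40 of each class (120 * 1/3)
y_test.value_counts()   # expect 10 of each class (30 * 1/3)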

Define the Model, Train and Test the Model

Three steps are involved:

1. Define the model by instantiating the corresponding sklearn class
2. model.fit(): train the model
3. model.predict(): predict on the test data

1. K-Nearest Neighbor Classifier (KNN) from SKLearn

from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
knn.score(X_test,y_test)

0.9333333333333333

confusion_matrix(y_test, y_pred)

array([[10,  0,  0],
       [ 0,  9,  1],
       [ 0,  1,  9]], dtype=int64)
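The choice of n_neighbors=5 here is a reasonable default rather than a tuned value. As an optional addition (not in the original post), a quick loop shows how the test score varies with k:

# Optional: compare test accuracy for several values of k
for k in [1, 3, 5, 7, 9]:
    model_k = KNeighborsClassifier(n_neighbors=k)
    model_k.fit(X_train, y_train)
    print(k, model_k.score(X_test, y_test))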

2. GaussianNB Classifier from SKLearn

classifier = GaussianNB()
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
score = accuracy_score(y_test, y_pred)
print("Accuracy", score)

Accuracy 0.8666666666666667

confusion_matrix(y_test, y_pred)

array([[10,  0,  0],
       [ 0,  8,  2],
       [ 0,  2,  8]], dtype=int64)

3. Logistic Regression Classifier from SKLearn

logistic_regression = LogisticRegression(C=25.0, solver='lbfgs', multi_class='auto', max_iter=1000)
logistic_regression.fit(X_train, y_train)
y_pred = logistic_regression.predict(X_test)
score = accuracy_score(y_test, y_pred)
score

0.9333333333333333

confusion_matrix(y_test, y_pred)

array([[10,  0,  0],
       [ 0,  9,  1],
       [ 0,  1,  9]], dtype=int64)

4. SVM Classifier

classifier = svm.SVC(gamma=0.001)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)
score = accuracy_score(y_test, y_pred)
score

0.8

confusion_matrix(y_test, y_pred)

array([[10,  0,  0],
       [ 0,  9,  1],
       [ 0,  5,  5]], dtype=int64)

5. Deep Learning with Tensorflow/Keras

A Simple Neural Network

import tensorflow as tf
from tensorflow import keras

Define Tensorflow/Keras Model

Define the deep neural network model with input_shape = 4, as we have 4 input features, and 3 layers with 5, 3 and 3 neuron units respectively. The output layer has to have 3 units, as we have 3 classes to predict.

Input (4 Inputs) → X1,X2,X3,X4 → 5 Neurons → 3 Neurons → 3 Output Neurons

Note that one can experiment with a different number of neurons in the 1st layer (currently 5) and the 2nd layer (currently 3); see the sketch after the model definition below.

model = keras.Sequential([
    keras.layers.Dense(5, activation=tf.nn.relu, input_shape=[4]),
    keras.layers.Dense(3, activation=tf.nn.relu),
    keras.layers.Dense(3, activation=tf.nn.softmax)
])
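To make that experimentation easier, the hidden-layer widths can be turned into parameters. This build_model helper is an illustrative sketch, not part of the original post:

# Hypothetical helper: same network with configurable hidden-layer sizes
def build_model(hidden1=5, hidden2=3):
    return keras.Sequential([
        keras.layers.Dense(hidden1, activation=tf.nn.relu, input_shape=[4]),
        keras.layers.Dense(hidden2, activation=tf.nn.relu),
        keras.layers.Dense(3, activation=tf.nn.softmax)
    ])

# e.g. model = build_model(8, 4) for a wider variant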

Define the Optimizer, Loss Function and Metrics

optimizer = tf.keras.optimizers.Adam()
model.compile(optimizer=optimizer,
              # The final layer already applies softmax, so the loss is
              # computed from probabilities, not logits
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['accuracy'])

Finally, train the model with the model.fit() function.

model.fit(X_train, y_train, epochs=2000)

The model starts with low accuracy at the first epoch:

Epoch 1/2000
4/4 [==============================] - 0s 2ms/step - loss: 1.1194 - accuracy: 0.3333

The model converges to high accuracy near 2000 epochs:

Epoch 2000/2000
4/4 [==============================] - 0s 2ms/step - loss: 0.6042 - accuracy: 0.9750

Predict on Test Data

y_pred = model.predict(X_test)

This produces an array of 3 class probabilities for each input, but we need a single predicted target label:

array([[1.4601685e-03, 3.7447622e-03, 9.9479502e-01],
       [9.9999571e-01, 4.8816267e-09, 4.2581619e-06],
       [1.4601685e-03, 3.7447622e-03, 9.9479502e-01],
       [3.4476686e-03, 9.1026098e-02, 9.0552616e-01],
       [1.4601685e-03, 3.7447622e-03, 9.9479502e-01],
       [2.5093075e-06, 9.9999738e-01, 1.2542409e-07],
       [1.4601685e-03, 3.7447622e-03, 9.9479502e-01],
       [8.4952539e-04, 9.5092177e-01, 4.8228655e-02],
       ...

The predicted label is the index of the maximum of these three probabilities, which is obtained with np.argmax():

y_pred_new = np.argmax(y_pred, axis=1)

score = accuracy_score(y_test, y_pred_new)
score

0.9666666666666667

confusion_matrix(y_test, y_pred_new)  # use the argmax labels, not the raw probabilities

array([[10,  0,  0],
       [ 0,  9,  1],
       [ 0,  0, 10]], dtype=int64)
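For a per-class breakdown beyond plain accuracy, one could also print sklearn's classification report (an optional addition, not in the original post):

from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred_new, target_names=iris.target_names))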

Conclusion

Note the accuracy of the different models: we got 0.80 → 0.86 → 0.93 → 0.93 → 0.967 for SVM, Gaussian Naive Bayes, Logistic Regression, KNN and the Deep Learning model respectively.

Deep learning techniques can learn to approximate a wide range of target functions, and a deep learning model can be fine-tuned to perform better than classical machine learning models. The downsides of deep learning are the complexity of the network and the need for a large dataset. Deep learning models also have a higher tendency to overfit, but there are techniques available to combat overfitting.

Nagaraj B
Machine Learning Trainer and Entrepreneur @Data Grounded