Similarities between two people using classification techniques.

DHANUSH PABBATHI
Nov 5 · 5 min read

We are using classification techniques such as the perceptron, decision tree, logistic regression, neural networks and random forest to classify images of Einstein and Hubble.

Modules involved:

1. Visualizing our data and finding correlations among the features and the target label

2. Splitting the data into training samples and testing samples

3. Using classification techniques and finding the accuracy of the model

4. Analyzing different classification metrics like MSE, RMSE, precision, recall, accuracy, etc.

5. Concluding the best model.
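To make the metrics in step 4 concrete, here is a minimal sketch using scikit-learn's metrics module. The true and predicted labels are made up for illustration (1 and 2 stand for the two classes) and are not results from this project.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, mean_squared_error,
                             precision_score, recall_score)

# Made-up true and predicted labels, for illustration only.
y_true = np.array([1, 1, 1, 2, 2, 2])
y_pred = np.array([1, 1, 2, 2, 2, 1])

accuracy = accuracy_score(y_true, y_pred)                  # share of correct predictions
precision = precision_score(y_true, y_pred, pos_label=1)   # of the predicted 1s, how many are right
recall = recall_score(y_true, y_pred, pos_label=1)         # of the real 1s, how many were found
mse = mean_squared_error(y_true, y_pred)                   # mean squared label error
rmse = np.sqrt(mse)                                        # RMSE is the square root of MSE

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
print(f"MSE={mse:.3f} RMSE={rmse:.3f}")
```

Note that MSE and RMSE treat the class labels as numbers, so for a two-class problem they carry roughly the same information as the error rate.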

About the Data:

Here we have taken the data from Google search. I collected images of two scientists (Einstein and Hubble) to build the data set for these models; the two have clearly different facial features. The models I am going to explain are the following:

1. Perceptron

2. Decision Tree

3. Logistic Regression

4. Neural Networks

5. Random Forest

I will be using Microsoft Azure Notebooks for this, so first we need to install OpenCV in our notebook.

We first install the opencv-python package, and after that we import cv2 and numpy.

1. Loading and preprocessing our data

First, we need to upload or clone our data set to the Azure notebook; then we load it into our Python file.

To load the Einstein images into our ipynb file, we need to run the following code.

In the above code, I first create an empty list. I then read each image as a numpy array, convert it to grayscale, resize it, flatten it into a one-dimensional array, and append it to my input1 list.

Similarly, we need to load all the remaining data set images of Einstein before we classify.

Now we will display the image in grayscale to detect the pointy nose.

Now we plot the image after passing it through a convolutional layer with 4 filters.

The main idea of our algorithm is to detect and differentiate between Einstein and Hubble using edges and the pointy nose, so we will detect the edges using 3x3 edge-detection kernels.
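As an illustration of 3x3 edge detection with four filters, the sketch below slides each kernel over a grayscale image using plain NumPy. The specific kernels (a vertical-edge kernel, its transpose, and the negatives of both) and the function name `apply_filter` are assumptions; the article does not state which filters were used, and it may have relied on a library routine instead.

```python
import numpy as np

def apply_filter(image, kernel):
    """Slide a small kernel over a 2-D image (no padding, stride 1)."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    response = np.zeros((out_h, out_w))
    # Accumulate one shifted, weighted copy of the image per kernel entry.
    for i in range(kh):
        for j in range(kw):
            response += kernel[i, j] * image[i:i + out_h, j:j + out_w]
    return response

# Four assumed 3x3 edge filters built from one vertical-edge kernel.
vertical = np.array([[-1, 0, 1],
                     [-1, 0, 1],
                     [-1, 0, 1]])
filters = {
    "vertical": vertical,
    "vertical_inverted": -vertical,
    "horizontal": vertical.T,
    "horizontal_inverted": -vertical.T,
}

# Tiny synthetic image: dark left half, bright right half (one vertical edge).
image = np.zeros((5, 6))
image[:, 3:] = 1.0

responses = {name: apply_filter(image, k) for name, k in filters.items()}
for name, response in responses.items():
    print(name, response[0])
```

On this synthetic image only the vertical filters respond, and only at the columns where the brightness changes, which is the behaviour the edge-based features rely on.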

Similarly, we will now display the Hubble image in grayscale to detect the pointy nose.

As we did for Einstein, we will now plot the Hubble image after passing it through a convolutional layer with 4 filters.

Resize the dataset:

Now we need to assign outputs to this data set and create a complete feature set that will be used by our learning algorithm. We also append the two sets of array data into one array.

Label the first 10 samples as 1 and the next 10 samples as 2 in a new array.
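A minimal sketch of this combining and labelling step follows. Random arrays stand in for the real flattened images, and the 64x64 image size and the name `input2` for the Hubble list are assumptions.

```python
import numpy as np

# Stand-in data: 10 flattened 64x64 images per person. In the article these
# lists would hold the real flattened Einstein and Hubble images.
rng = np.random.default_rng(0)
input1 = [rng.integers(0, 256, size=64 * 64) for _ in range(10)]  # Einstein
input2 = [rng.integers(0, 256, size=64 * 64) for _ in range(10)]  # Hubble (assumed name)

# One feature matrix with a row per image, Einstein first, then Hubble.
X = np.array(input1 + input2)

# Matching labels: 1 for the first 10 rows, 2 for the next 10.
y = np.array([1] * len(input1) + [2] * len(input2))

print(X.shape, y.shape)
```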

Splitting the data into training and test sets:

Make sure you take 60–70% of your data as training data and the rest as testing data. We are now ready to train our models.
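One common way to do such a split is scikit-learn's `train_test_split`; the sketch below keeps 70% for training. The stand-in data, the `random_state` value and the use of `stratify` (to keep both people represented in each part) are assumptions, not settings taken from the article.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in features and labels: 20 samples, 10 per class.
rng = np.random.default_rng(0)
X = rng.random((20, 64))
y = np.array([1] * 10 + [2] * 10)

# Keep 70% for training and 30% for testing, with both classes in each part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

print("train:", X_train.shape, "test:", X_test.shape)
```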

1) Perceptron:

A perceptron is a single neural network unit (an artificial neuron) that computes a weighted sum of its inputs and applies a threshold to decide which class the input belongs to.

The following is the code to implement the perceptron.

We have trained our perceptron model; let's now check its accuracy.
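Since the original cell is not shown, here is a sketch of training and scoring a perceptron with scikit-learn. The synthetic two-cluster data stands in for the flattened images, and the hyperparameters are assumptions.

```python
import numpy as np
from sklearn.linear_model import Perceptron
from sklearn.model_selection import train_test_split

# Stand-in data: two clusters of 64 "pixel" features scaled to roughly [0, 1].
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.4, 0.1, (10, 64)), rng.normal(0.6, 0.1, (10, 64))])
y = np.array([1] * 10 + [2] * 10)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Train the perceptron, then measure accuracy on the held-out images.
perceptron = Perceptron(max_iter=1000, random_state=0)
perceptron.fit(X_train, y_train)
perceptron_accuracy = perceptron.score(X_test, y_test)

print("Perceptron accuracy:", perceptron_accuracy)
```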

2) Decision Tree:

We will first import the necessary libraries and train our decision tree model.

With the decision tree classifier, we measure how well the model separates the two scientists using the F1 score.

Here is the code for it:

We can check the accuracy of the model using the following code.
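The cells for this step are missing from the text; a sketch of what they might look like with scikit-learn is shown below. The stand-in data and the choice of class 1 as the positive label for the F1 score are assumptions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data: two clusters of 64 "pixel" features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.4, 0.1, (10, 64)), rng.normal(0.6, 0.1, (10, 64))])
y = np.array([1] * 10 + [2] * 10)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Train the tree and predict the held-out samples.
tree = DecisionTreeClassifier(random_state=0)
tree.fit(X_train, y_train)
y_pred = tree.predict(X_test)

# F1 combines precision and recall; class 1 is treated as the positive class.
tree_f1 = f1_score(y_test, y_pred, pos_label=1)
tree_accuracy = accuracy_score(y_test, y_pred)

print("Decision tree F1:", tree_f1, "accuracy:", tree_accuracy)
```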

3) Logistic Regression:

We import the required libraries first and then train our model.

The accuracy can be found as follows:
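A possible version of this step with scikit-learn is sketched here, again on stand-in data. The `max_iter` value is an assumption chosen to give the solver room to converge.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-in data: two clusters of 64 "pixel" features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.4, 0.1, (10, 64)), rng.normal(0.6, 0.1, (10, 64))])
y = np.array([1] * 10 + [2] * 10)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# Fit the model and score it on the held-out samples.
log_reg = LogisticRegression(max_iter=1000)
log_reg.fit(X_train, y_train)
log_reg_accuracy = log_reg.score(X_test, y_test)

# Class probabilities show how confident the model is for each test image.
probabilities = log_reg.predict_proba(X_test)

print("Logistic regression accuracy:", log_reg_accuracy)
```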

4) Neural Networks:

Artificial neural networks or connectionist systems are computing systems that are inspired by, but not identical to, biological neural networks that constitute animal brains.

Neural networks often reach higher accuracy than simpler models on image data, although with a data set this small the result can vary from run to run.
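The article does not say which library or architecture was used for the neural network. As one possible sketch, scikit-learn's `MLPClassifier` with a single hidden layer can be trained in the same way as the other models; the layer size, iteration limit and stand-in data are all assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Stand-in data: two clusters of 64 "pixel" features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.4, 0.1, (10, 64)), rng.normal(0.6, 0.1, (10, 64))])
y = np.array([1] * 10 + [2] * 10)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# A small multi-layer perceptron with one hidden layer of 32 units.
mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
mlp.fit(X_train, y_train)
mlp_accuracy = mlp.score(X_test, y_test)

print("Neural network accuracy:", mlp_accuracy)
```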

5) Random Forest:

The random forest is a classification algorithm consisting of many decision trees. It uses bagging and feature randomness when building each individual tree to try to create an uncorrelated forest of trees whose prediction by committee is more accurate than that of any individual tree.

The following is the code and the accuracy:
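The original cell and its reported accuracy are not available in the text, so the sketch below only shows how such a model could be trained and scored with scikit-learn. The number of trees and the stand-in data are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Stand-in data: two clusters of 64 "pixel" features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.4, 0.1, (10, 64)), rng.normal(0.6, 0.1, (10, 64))])
y = np.array([1] * 10 + [2] * 10)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

# 100 trees, each grown on a bootstrap sample with random feature subsets.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
forest_accuracy = forest.score(X_test, y_test)

print("Random forest accuracy:", forest_accuracy)
```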

Summary:

We started with data exploration, where we got a feeling for the data set. We loaded our data set and converted it into a numpy array, did the same for our outputs, and created the features. That was our preprocessing part. After this our data was ready for training, so we created each model, trained it on our data set, and checked its accuracy. We can now tell Einstein and Hubble apart using different algorithms.
