Artificial Neural Networks — Multi-Layer Perceptron Applied to the Iris Data Set Classification
Artificial neural networks (ANN) or connectionist systems are computing systems that are inspired by, but not identical to, biological neural networks that constitute animal brains.
An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons. An artificial neuron that receives a signal then processes it and can signal neurons connected to it.
The original goal of the ANN approach was to solve problems in the same way that a human brain would. However, over time, attention moved to performing specific tasks, leading to deviations from biology. ANNs have been used on a variety of tasks, including computer vision, speech recognition, machine translation, social network filtering, playing board and video games and medical diagnosis.
Figure 1. An artificial neural network is an interconnected group of nodes, inspired by a simplification of neurons in a brain. Here, each circular node represents an artificial neuron and an arrow represents a connection from the output of one artificial neuron to the input of another.
Multi-Layer Perceptron Applied to the Iris Data Set Classification
Data Set Information
The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters.
Attribute Information
- sepal length in cm
- sepal width in cm
- petal length in cm
- petal width in cm
- class: (Iris setosa, Iris virginica and Iris versicolor)
Modeling in R code
Install Libraries
If you do not have the packages needed to develop the model, you can install them with the commands below.
install.packages("neuralnet")
install.packages("NeuralNetTools")
install.packages("ggplot2")
install.packages("GGally")
install.packages("caret")
Import Libraries
Import the libraries needed to develop the model and solve this problem.
library("neuralnet")
library("NeuralNetTools")
library("ggplot2")
library("GGally")
library("caret")
Load Iris
The Iris data set ships with R, so you can load it with the code below.
data(iris)
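Before modeling, it is worth a quick look at the structure, which confirms the four numeric features and the Species factor described above:

```r
# Inspect the data set: 150 rows, four numeric features, one factor column
str(iris)
head(iris)
table(iris$Species)  # 50 samples per species
```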
Exploratory Data Analysis
In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods.
Correlation Analysis
# Wrap ggplot so every plot uses the Purples brewer palette
ggplot <- function(...)
  ggplot2::ggplot(...) +
    scale_color_brewer(palette = "Purples") +
    scale_fill_brewer(palette = "Purples")
# Inject the wrapper into the environment GGally resolves ggplot from
unlockBinding("ggplot", parent.env(asNamespace("GGally")))
assign("ggplot", ggplot, parent.env(asNamespace("GGally")))
graph_corr <- ggpairs(iris, mapping = aes(color = Species),
columns = c('Sepal.Length',
'Sepal.Width',
'Petal.Length',
'Petal.Width',
'Species'),
columnLabels = c('Sepal.Length',
'Sepal.Width',
'Petal.Length',
'Petal.Width',
'Species'))
graph_corr <- graph_corr + theme_minimal()
graph_corr
Output Correlation Analysis
Normalization and Data Transformation
One of the most important procedures when building a neural network is data normalization. This involves adjusting the data to a common scale so that predicted and actual values can be compared accurately. Failure to normalize the data will typically result in the prediction value remaining the same across all observations, regardless of the input values.
We can do this in two ways in R:
- Scale the data frame automatically using the scale function in R
- Transform the data using a max-min normalization technique
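For comparison, the first option (z-score standardization with the built-in scale function) would look like the sketch below; note that it centers each column to mean 0 and unit variance rather than mapping values into [0, 1]:

```r
# Alternative: z-score standardization of the four numeric columns
scaled_iris <- as.data.frame(scale(iris[, 1:4]))
round(colMeans(scaled_iris), 10)  # each column now has mean 0
apply(scaled_iris, 2, sd)         # and standard deviation 1
```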
In this example, we will use the max-min normalization function.
norm.fun = function(x){(x - min(x))/(max(x) - min(x))}
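As a quick sanity check (the definition is repeated here so the snippet stands alone), the function maps the minimum of a vector to 0 and the maximum to 1:

```r
# Max-min normalization: rescales a numeric vector into [0, 1]
norm.fun = function(x){(x - min(x))/(max(x) - min(x))}
norm.fun(c(2, 4, 6))  # returns 0.0 0.5 1.0
```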
Apply the function to the data set to normalize the data.
df_iris = iris[,c("Sepal.Length","Sepal.Width",
"Petal.Length","Petal.Width" )]
df_iris = as.data.frame(apply(df_iris, 2, norm.fun))
df_iris$Species = iris$Species
# One-hot indicator columns for each species (useful with older versions of
# neuralnet, which require a numeric response in the model formula)
df_iris$setosa <- df_iris$Species == "setosa"
df_iris$virginica <- df_iris$Species == "virginica"
df_iris$versicolor <- df_iris$Species == "versicolor"
Split Data — Training and Test Data Set
Now we need to split the data into training and test sets. In the code below, the training sample is 75% of the data set and the test set is the remaining 25%.
## 75% of the sample size
smp_size <- floor(0.75 * nrow(df_iris))
## set the seed to make your partition reproducible
set.seed(123)
train_ind <- sample(seq_len(nrow(df_iris)), size = smp_size)
training.set <- df_iris[train_ind, ]
test.set <- df_iris[-train_ind, ]
Neural Network Model Fit
The neural network model has 4 variables in the input layer and 2 hidden layers with 10 neurons each.
The number of repetitions for the neural network's training is 5.
The activation function is the logistic (sigmoid) function, and the error function is ce (cross-entropy).
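As a reference, minimal standalone R definitions of these two functions are sketched below (these are illustrative versions, not the internals of the neuralnet package):

```r
# Logistic (sigmoid) activation: squashes any real input into (0, 1)
logistic <- function(x) 1 / (1 + exp(-x))

# Cross-entropy error of predicted probabilities p against 0/1 targets y
cross_entropy <- function(p, y) -sum(y * log(p) + (1 - y) * log(1 - p))

logistic(0)                          # 0.5
cross_entropy(c(0.9, 0.1), c(1, 0))  # small error for confident correct predictions
```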
model = as.formula("Species ~
Sepal.Length +
Sepal.Width +
Petal.Length +
Petal.Width")
iris.net <- neuralnet(model,
data=training.set,
hidden=c(10,10),
rep = 5,
act.fct = "logistic",
err.fct = "ce",
linear.output = FALSE,
lifesign = "minimal",
stepmax = 1000000,
threshold = 0.001)
Output Neural Network Model Fit
hidden: 10,10 thresh: 0.001 rep: 1/5 steps: 5796 error: 7e-04 time: 1.39 secs
hidden: 10,10 thresh: 0.001 rep: 2/5 steps: 1551 error: 0.00107 time: 0.33 secs
hidden: 10,10 thresh: 0.001 rep: 3/5 steps: 6682 error: 0.00063 time: 1.43 secs
hidden: 10,10 thresh: 0.001 rep: 4/5 steps: 3973 error: 0.00099 time: 0.86 secs
hidden: 10,10 thresh: 0.001 rep: 5/5 steps: 3525 error: 0.00109 time: 0.75 secs
Neural Network Model Visualization
Visualize the neural network architecture with the code below.
plotnet(iris.net,
alpha.val = 0.8,
circle_col = list('purple', 'white', 'white'),
bord_col = 'black')
Output Neural Network Model Visualization
Prediction
Use the test data set as input to the neural network model to predict the Iris classes.
# Pass only the four feature columns the network was trained on
iris.prediction <- compute(iris.net, test.set[, c("Sepal.Length", "Sepal.Width",
                                                  "Petal.Length", "Petal.Width")])
# Pick the class with the highest output activation for each observation
idx <- apply(iris.prediction$net.result, 1, which.max)
predicted <- as.factor(c('setosa', 'versicolor', 'virginica')[idx])
Metrics to Evaluate
Confusion Matrix
A confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one. Each row of the matrix represents the instances in a predicted class while each column represents the instances in an actual class (or vice versa).
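The same table can also be built by hand with base R's table function, shown here on a small hypothetical example (the labels below are made up for illustration):

```r
# Toy example: predicted vs. actual labels for six observations
pred_toy   <- factor(c("setosa", "setosa", "versicolor",
                       "virginica", "virginica", "versicolor"))
actual_toy <- factor(c("setosa", "versicolor", "versicolor",
                       "virginica", "versicolor", "versicolor"))
# Rows are predictions, columns are the true classes
table(Prediction = pred_toy, Reference = actual_toy)
```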
confusionMatrix(predicted, test.set$Species)
Output Confusion Matrix
Confusion Matrix and Statistics
Reference
Prediction setosa versicolor virginica
setosa 11 0 0
versicolor 0 13 1
virginica 0 0 13
Overall Statistics
Accuracy : 0.9737
95% CI : (0.8619, 0.9993)
No Information Rate : 0.3684
P-Value [Acc > NIR] : 2.196e-15
Kappa : 0.9604
According to the confusion matrix computed on the test set, the neural network model achieved approximately 97% accuracy.
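The reported accuracy follows directly from the confusion matrix: correct predictions (the diagonal) divided by the total number of test observations:

```r
# Correct classifications sit on the diagonal: 11 + 13 + 13 = 37 of 38
correct  <- 11 + 13 + 13
total    <- 11 + 13 + 1 + 13
accuracy <- correct / total
round(accuracy, 4)  # 0.9737
```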
Additional Information
Learn more about the various neural network architectures in the “Neural Network Zoo”.
Until next time…
I hope this approach can help those who are starting out in Data Science, whether Statisticians, Mathematicians, Computer Scientists or students with an interest in the subject.