Artificial Neural Networks — Multi-Layer Perceptron Applied to the Iris Data Set Classification
Artificial neural networks (ANN) or connectionist systems are computing systems that are inspired by, but not identical to, biological neural networks that constitute animal brains.
An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain. Each connection, like the synapses in a biological brain, can transmit a signal to other neurons. An artificial neuron that receives a signal then processes it and can signal neurons connected to it.
The original goal of the ANN approach was to solve problems in the same way that a human brain would. However, over time, attention moved to performing specific tasks, leading to deviations from biology. ANNs have been used on a variety of tasks, including computer vision, speech recognition, machine translation, social network filtering, playing board and video games and medical diagnosis.
Figure 1. An artificial neural network is an interconnected group of nodes, inspired by a simplification of neurons in a brain. Here, each circular node represents an artificial neuron and an arrow represents a connection from the output of one artificial neuron to the input of another.
Multi-Layer Perceptron Applied to the Iris Data Set Classification
Data Set Information
The data set consists of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured from each sample: the length and the width of the sepals and petals, in centimeters.
Attribute Information
- sepal length in cm
- sepal width in cm
- petal length in cm
- petal width in cm
- class: (Iris setosa, Iris virginica and Iris versicolor)
Modeling in R code
Install Libraries
If you do not have the packages needed to develop the model, you can install them with the commands below.
install.packages("neuralnet")
install.packages("NeuralNetTools")
install.packages("ggplot2")
install.packages("GGally")
install.packages("caret")
Import Libraries
Import the libraries needed to develop the model and solve this problem.
library("neuralnet")
library("NeuralNetTools")
library("ggplot2")
library("GGally")
library("caret")
Load Iris
The Iris data set ships with R, so you can load it with the code below.
data(iris)
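Before modeling, it is worth a quick look at the structure, which confirms the four numeric features and the Species factor described above:

```r
# Inspect the data set: 150 rows, four numeric features, one factor column
str(iris)
head(iris)
table(iris$Species)  # 50 samples per species
```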
Exploratory Data Analysis
In statistics, exploratory data analysis (EDA) is an approach to analyzing data sets to summarize their main characteristics, often with visual methods.
Correlation Analysis
# Wrap ggplot so every plot uses the Purples brewer palette
ggplot <- function(...)
  ggplot2::ggplot(...) +
    scale_color_brewer(palette = "Purples") +
    scale_fill_brewer(palette = "Purples")
# Inject the wrapper into the environment GGally resolves ggplot from
unlockBinding("ggplot", parent.env(asNamespace("GGally")))
assign("ggplot", ggplot, parent.env(asNamespace("GGally")))
graph_corr <- ggpairs(iris, mapping = aes(color = Species),
columns = c('Sepal.Length',
'Sepal.Width',
'Petal.Length',
'Petal.Width',
'Species'),
columnLabels = c('Sepal.Length',
'Sepal.Width',
'Petal.Length',
'Petal.Width',
'Species'))
graph_corr <- graph_corr + theme_minimal()
graph_corr
Output Correlation Analysis
Normalization and Data Transformation
One of the most important procedures when building a neural network is data normalization. This involves adjusting the data to a common scale so that predicted and actual values can be compared accurately. Failure to normalize the data will typically result in the prediction value remaining the same across all observations, regardless of the input values.
We can do this in two ways in R:
- Scale the data frame automatically using the scale function in R
- Transform the data using a max-min normalization technique
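For comparison, the first option (z-score standardization with the built-in scale function) would look like the sketch below; note that it centers each column to mean 0 and unit variance rather than mapping values into [0, 1]:

```r
# Alternative: z-score standardization of the four numeric columns
scaled_iris <- as.data.frame(scale(iris[, 1:4]))
round(colMeans(scaled_iris), 10)  # each column now has mean 0
apply(scaled_iris, 2, sd)         # and standard deviation 1
```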
In this example, we will use the max-min normalization function.
norm.fun = function(x){(x - min(x))/(max(x) - min(x))}
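As a quick sanity check (the definition is repeated here so the snippet stands alone), the function maps the minimum of a vector to 0 and the maximum to 1:

```r
# Max-min normalization: rescales a numeric vector into [0, 1]
norm.fun = function(x){(x - min(x))/(max(x) - min(x))}
norm.fun(c(2, 4, 6))  # returns 0.0 0.5 1.0
```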
Apply the function to the data set to normalize the data.
df_iris = iris[,c("Sepal.Length","Sepal.Width",
"Petal.Length","Petal.Width" )]
df_iris = as.data.frame(apply(df_iris, 2, norm.fun))
df_iris$Species = iris$Species
# One-hot indicator columns for each species (useful with older versions of
# neuralnet, which require a numeric response in the model formula)
df_iris$setosa <- df_iris$Species == "setosa"
df_iris$virginica <- df_iris$Species == "virginica"
df_iris$versicolor <- df_iris$Species == "versicolor"
Split Data — Training and Test Data Set
Now we need to split the data into training and test sets. In the code below, the training sample is 75% of the data set and the test set is the remaining 25%.
## 75% of the sample size
smp_size <- floor(0.75 * nrow(df_iris))
## set the seed to make your partition reproducible
set.seed(123)
train_ind <- sample(seq_len(nrow(df_iris)), size = smp_size)
training.set <- df_iris[train_ind, ]
test.set <- df_iris[-train_ind, ]
Neural Network Model Fit
The neural network model has 4 variables in the input layer and 2 hidden layers with 10 neurons each.
The number of repetitions for the neural network's training is 5.
The activation function is the logistic (sigmoid) function, and the error function is ce (cross-entropy).
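As a reference, minimal standalone R definitions of these two functions are sketched below (these are illustrative versions, not the internals of the neuralnet package):

```r
# Logistic (sigmoid) activation: squashes any real input into (0, 1)
logistic <- function(x) 1 / (1 + exp(-x))

# Cross-entropy error of predicted probabilities p against 0/1 targets y
cross_entropy <- function(p, y) -sum(y * log(p) + (1 - y) * log(1 - p))

logistic(0)                          # 0.5
cross_entropy(c(0.9, 0.1), c(1, 0))  # small error for confident correct predictions
```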
model = as.formula("Species ~
Sepal.Length +
Sepal.Width +
Petal.Length +
Petal.Width")
iris.net <- neuralnet(model,
data=training.set,
hidden=c(10,10),
rep = 5,
act.fct = "logistic",
err.fct = "ce",
linear.output = FALSE,
lifesign = "minimal",
stepmax = 1000000,
threshold = 0.001)
Output Neural Network Model Fit
hidden: 10,10 thresh: 0.001 rep: 1/5 steps: 5796 error: 7e-04 time: 1.39 secs
hidden: 10,10 thresh: 0.001 rep: 2/5 steps: 1551 error: 0.00107 time: 0.33 secs
hidden: 10,10 thresh: 0.001 rep: 3/5 steps: 6682 error: 0.00063 time: 1.43 secs
hidden: 10,10 thresh: 0.001 rep: 4/5 steps: 3973 error: 0.00099 time: 0.86 secs
hidden: 10,10 thresh: 0.001 rep: 5/5 steps: 3525 error: 0.00109 time: 0.75 secs
Neural Network Model Visualization
Visualize the neural network architecture with the code below.
plotnet(iris.net,
alpha.val = 0.8,
circle_col = list('purple', 'white', 'white'),
bord_col = 'black')
Output Neural Network Model Visualization
Prediction
Use the test data set as input to the neural network model to predict the Iris classes.
# Pass only the four feature columns the network was trained on
iris.prediction <- compute(iris.net, test.set[, c("Sepal.Length", "Sepal.Width",
                                                  "Petal.Length", "Petal.Width")])
# Pick the class with the highest output activation for each observation
idx <- apply(iris.prediction$net.result, 1, which.max)
predicted <- as.factor(c('setosa', 'versicolor', 'virginica')[idx])
Metrics to Evaluate
Confusion Matrix
A confusion matrix, also known as an error matrix, is a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one. Each row of the matrix represents the instances in a predicted class while each column represents the instances in an actual class (or vice versa).
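The same table can also be built by hand with base R's table function, shown here on a small hypothetical example (the labels below are made up for illustration):

```r
# Toy example: predicted vs. actual labels for six observations
pred_toy   <- factor(c("setosa", "setosa", "versicolor",
                       "virginica", "virginica", "versicolor"))
actual_toy <- factor(c("setosa", "versicolor", "versicolor",
                       "virginica", "versicolor", "versicolor"))
# Rows are predictions, columns are the true classes
table(Prediction = pred_toy, Reference = actual_toy)
```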
confusionMatrix(predicted, test.set$Species)
Output Confusion Matrix
Confusion Matrix and Statistics
Reference
Prediction setosa versicolor virginica
setosa 11 0 0
versicolor 0 13 1
virginica 0 0 13
Overall Statistics
Accuracy : 0.9737
95% CI : (0.8619, 0.9993)
No Information Rate : 0.3684
P-Value [Acc > NIR] : 2.196e-15
Kappa : 0.9604
According to the confusion matrix computed on the test set, the neural network model achieved approximately 97% accuracy.
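The reported accuracy follows directly from the confusion matrix: correct predictions (the diagonal) divided by the total number of test observations:

```r
# Correct classifications sit on the diagonal: 11 + 13 + 13 = 37 of 38
correct  <- 11 + 13 + 13
total    <- 11 + 13 + 1 + 13
accuracy <- correct / total
round(accuracy, 4)  # 0.9737
```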
Additional Information
Learn more about the various neural network architectures in the “Neural Network Zoo”.
Until next time…
I hope this approach can help those who are starting out in Data Science, whether Statisticians, Mathematicians, Computer Scientists or students with an interest in the subject.