Probabilistic Neural Network and Competitive Probabilistic Neural Network

Hastuadi Harsa · Published in Analytics Vidhya · Sep 28, 2019 · 6 min read

Good morning, everybody. Today I would like to deliver a (hopefully) short presentation on the Probabilistic Neural Network (PNN) and the Competitive Probabilistic Neural Network (CPNN). This post is based on the article by Zeinali and Story:

Zeinali, Y., & Story, B. (2017). Competitive probabilistic neural network. Integrated Computer-Aided Engineering, 24, 1–14. doi:10.3233/ICA-170540.

The PNN is reviewed there as the predecessor algorithm. I am going to use R as the programming tool here. There is already an R library called “pnn” that implements the PNN functionality and is straightforward to use. However, I had to write my own R script for the CPNN procedure, as an extension of the PNN, because I could not find a CPNN library in R’s repositories. In the following script I provide my own PNN function as well. Let’s start with the definition of the squared-distance function. By the way, please note that I keep the variable names as close as possible to those in the paper. I also use the “dplyr” and “ggplot2” libraries, so let’s load them first.

suppressMessages(suppressWarnings(library(dplyr)))
suppressMessages(suppressWarnings(library(ggplot2)))

# Squared Euclidean distance between two input vectors.
squared.distance <- function(input1, input2) input1 %>% rbind(input2) %>% dist %>% '^'(2)
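
As a quick sanity check (my own toy example, not from the paper): the distance between (0, 0) and (3, 4) is 5, so the squared distance should be 25.

as.numeric(squared.distance(c(0, 0), c(3, 4)))
## [1] 25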

Then I defined the omega.ij function. Please read the paper first so that you have at least a preliminary understanding of, for example, what omega.ij describes. Nevertheless, it is also fine if you are not interested in what happens behind the scenes: just copy every part of the script and apply the functions to your dataset (I will show you how at the end of this post).
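
For reference, omega.ij implements the multivariate Gaussian kernel from the paper, with d the dimensionality of the input (length(x.new) in the code):

$$\omega_{ij}(\mathbf{x}_{\mathrm{new}}) = \frac{1}{(2\pi)^{d/2}\,\sigma^{d}} \exp\left(-\frac{\lVert \mathbf{x}_{\mathrm{new}} - \mathbf{x}_{ij}^{c} \rVert^{2}}{2\sigma^{2}}\right)$$

Here x_ij^c is the j-th training pattern of the i-th class and sigma is the smoothing parameter.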

# Gaussian kernel value contributed by one training pattern x.c.ij to x.new.
omega.ij <- function(x.new, x.c.ij, sigma.input) {
  argument <- -squared.distance(x.new, x.c.ij) %>% '/'(2 * (sigma.input^2)) %>% exp
  return(argument / (
    ((2 * pi)^(length(x.new) / 2)) * (sigma.input^length(x.new))
  ))
}

Then I defined the p.x.new.ci function, which estimates the class-conditional density of x.new, as follows.
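
In equations (matching the code below), plain PNN estimates the density of x.new given class C_i as the average kernel value over all n_i patterns of that class:

$$p(\mathbf{x}_{\mathrm{new}} \mid C_i) = \frac{1}{n_i} \sum_{j=1}^{n_i} \omega_{ij}(\mathbf{x}_{\mathrm{new}})$$

CPNN instead sorts the kernel values in decreasing order and averages only the top round(gamma * n_i) of them, so only the most competitive patterns of each class contribute.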

# Class-conditional density estimate of x.new from one class's patterns.
# With gamma.input set: CPNN, average only the top round(gamma * n) kernels.
# With gamma.input = NULL: plain PNN, average over all kernels.
p.x.new.ci <- function(x.new, data.input, sigma.input, gamma.input = NULL) {
  if(!is.null(gamma.input) && is.numeric(gamma.input)) {
    omega.sum <- data.input %>% nrow %>% '*'(gamma.input) %>% round
    omega <- apply(data.input, 1, function(input1)
      omega.ij(x.new, input1, sigma.input)) %>% sort %>% rev
    return(omega[1:omega.sum] %>% mean(na.rm = TRUE))
  } else {
    apply(data.input, 1, function(input1)
      omega.ij(x.new, input1, sigma.input)) %>% mean(na.rm = TRUE)
  }
}

Below is the definition of the main PNN and CPNN function. Both algorithms share the same function. To choose between them, the ‘gamma.input’ argument must be NULL for PNN, or a real number between 0 and 1 for CPNN. I gave ‘gamma.input’ a default value of NULL, which means that calling the function with the defaults executes the PNN algorithm.

# Classify x.new: estimate a density per class, normalize to probabilities,
# and return the winning class together with its probability.
c.pnn <- function(x.new, data.input, class.input, sigma.input, gamma.input = NULL) {
  # Row indices of data.input grouped by class label.
  class.index <- tapply(1:length(class.input), class.input, function(input1) return(input1))
  p.x.new.result <- sapply(
    class.index, function(input1) {
      p.x.new.ci(x.new, data.input[input1, ], sigma.input, gamma.input)
    }
  )
  p.x.new.result <- (p.x.new.result / sum(p.x.new.result)) %>% round(3)
  output <- data.frame(
    class.output = (class.input %>% as.factor %>% levels)[p.x.new.result %>% which.max],
    probability = p.x.new.result[p.x.new.result %>% which.max]
  )
  rownames(output) <- NULL
  return(output)
}
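
Here is a tiny smoke test on made-up points (my own example, not from the paper): two well-separated classes and a query point near class ‘a’. Both modes should pick ‘a’ with probability close to 1.

X <- rbind(c(0, 0), c(0, 1), c(5, 5), c(5, 6))
y <- c('a', 'a', 'b', 'b')
c.pnn(x.new = c(0.2, 0.5), data.input = X, class.input = y, sigma.input = 1)  # PNN
c.pnn(x.new = c(0.2, 0.5), data.input = X, class.input = y, sigma.input = 1,
      gamma.input = 0.5)  # CPNN, keeping the top 50% of kernel values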

Now let’s try the algorithm. The authors illustrated a non-linearly separable dataset (it was displayed as a figure in their paper). The class property of each observation is either rectangle or circle.

data.input <- expand.grid(x1 = seq(0.5, 1.5, length.out = 3), x2 = seq(-0.5, 0.5, length.out = 5))
angle.1 <- seq(0, 2*pi, length.out = 24) + runif(24, 0, .2)
angle.2 <- seq(0, 2*pi, length.out = 12) + runif(12, 0, .2)
data.input <- rbind(
  data.input, cbind(
    x1 = c(cos(angle.1) + 3.5, (cos(angle.2) * 0.8) + 3.5),
    x2 = c(sin(angle.1), sin(angle.2) * 0.8)
  )
)
data.input <- rbind(
  data.input, cbind(x1 = c(3, 3.5, 3.5, 4), x2 = c(0, -0.5, 0.5, 0))
)
data.input$Class <- c(
  rep('rect', 15), rep('circ', 36), rep('rect', 4)
)

Let’s take a look at the data

print(data.input)
##          x1           x2 Class
## 1  0.500000 -0.500000000  rect
## 2  1.000000 -0.500000000  rect
## 3  1.500000 -0.500000000  rect
## 4  0.500000 -0.250000000  rect
## 5  1.000000 -0.250000000  rect
## 6  1.500000 -0.250000000  rect
## ...
## 50 4.180493 -0.420629481  circ
## 51 4.292294  0.110774459  circ
## 52 3.000000  0.000000000  rect
## 53 3.500000 -0.500000000  rect
## 54 3.500000  0.500000000  rect
## 55 4.000000  0.000000000  rect

and the figure.

ggplot(data.input, aes(x1, x2)) +
  geom_point(
    aes(color = Class), size = 3, shape = ifelse(data.input$Class == 'rect', 15, 16)
  ) +
  scale_x_continuous(limits = c(0, 6)) + scale_y_continuous(limits = c(-1.5, 1.5)) +
  theme_minimal()

The training dataset.

Alright, now let’s create our testing data. The scenario here is that I want to decide, for every location in the data space, whether it belongs to a rectangle or a circle. The first step is creating a grid of values spanning the space.

data.test <- expand.grid(x1 = seq(0, 6, length.out = 50), x2 = seq(-1.5, 1.5, length.out = 50))

The topmost rows of data.test look like this:

head(data.test)
##          x1   x2
## 1 0.0000000 -1.5
## 2 0.1224490 -1.5
## 3 0.2448980 -1.5
## 4 0.3673469 -1.5
## 5 0.4897959 -1.5
## 6 0.6122449 -1.5

Now let’s take a look at their location inside the data space.

ggplot(data.input, aes(x1, x2)) +
  geom_point(mapping = aes(x1, x2), data = data.test, color = gray(0, 0.2)) +
  geom_point(
    aes(color = Class), size = 3, shape = ifelse(data.input$Class == 'rect', 15, 16)
  ) +
  theme_minimal()

Test data in the data dimensional space.

One thing to remember is that we have to scale the values of both data.input and data.test. This process does not change the relative positions of the points; only the values change, proportionally with respect to the reference data. Here, data.input is our reference dataset, so it is scaled to itself, while data.test is scaled using data.input’s center and spread.

data.input.scaled <- scale(data.input[, 1:2])
data.test.scaled <- scale(data.test,
                          center = attr(data.input.scaled, 'scaled:center'),
                          scale = attr(data.input.scaled, 'scaled:scale'))
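
As a quick illustrative check: after scaling, each training column has mean (approximately) zero and standard deviation one, while the test grid is shifted and stretched by the same center and scale.

round(colMeans(data.input.scaled), 10)  # ~ c(0, 0)
apply(data.input.scaled, 2, sd)         # c(1, 1)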

Let’s verify that the data keep the same formation relative to each other, although at a different location in the dimensional space.

ggplot(data.input.scaled %>% as.data.frame, aes(x1, x2)) +
  geom_point(mapping = aes(x1, x2), data = data.test.scaled %>% as.data.frame, color = gray(0, 0.2)) +
  geom_point(
    aes(color = data.input$Class), size = 3, shape = ifelse(data.input$Class == 'rect', 15, 16)
  ) +
  labs(color = 'Class') +
  theme_minimal()

The scaled version of train and test data.

Now we are ready to identify the class of each point in data.test using PNN and/or CPNN. It will take some time, because our test dataset has 2500 observations.

data.test.result <- apply(data.test.scaled, 1, function(input) {
  c.pnn(x.new = input, data.input = data.input.scaled, class.input = data.input$Class,
        sigma.input = 0.1, gamma.input = 0.5)
})
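
If this step is too slow, one possible tweak (a sketch only, not benchmarked) is to parallelize the per-point loop with base R’s parallel package. Note that mclapply() relies on forking, so on Windows you would use parLapply() instead.

library(parallel)
# Run each test point through c.pnn on two cores; the result is a list,
# just like the apply() version above.
data.test.result <- mclapply(
  seq_len(nrow(data.test.scaled)),
  function(i) c.pnn(x.new = data.test.scaled[i, ],
                    data.input = data.input.scaled,
                    class.input = data.input$Class,
                    sigma.input = 0.1, gamma.input = 0.5),
  mc.cores = 2
)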

Let’s see the first few elements of the output.

head(data.test.result)
## [[1]]
## class.output probability
## 1 rect 1
##
## [[2]]
## class.output probability
## 1 rect 1
##
## [[3]]
## class.output probability
## 1 rect 1
##
## [[4]]
## class.output probability
## 1 rect 1
##
## [[5]]
## class.output probability
## 1 rect 1
##
## [[6]]
## class.output probability
## 1 rect 1

The c.pnn function returns a one-row data frame with two elements, class.output and probability. The first element is the class that the algorithm decided on; the second is the probability of the selected class among the classes available in the dataset. Therefore, data.test.result is a list of 2500 elements, each holding such a pair.
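
Since each list element is a one-row data frame, the whole result can also be flattened into a single 2500-row data frame if that is more convenient:

head(do.call(rbind, data.test.result))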

Now we plot our result by coloring the test dataset according to the class the model assigned. To do this, we first have to extract the class.output element from each list element.

test.class <- sapply(data.test.result, function(input) input$class.output)

combined.data <- data.frame(
  rbind(data.input, cbind(data.test, Class = test.class)),
  origin = c(rep('train', nrow(data.input)), rep('test', nrow(data.test)))
)

ggplot(combined.data %>% filter(origin == 'test'), aes(x1, x2)) +
  geom_point(aes(color = Class)) +
  geom_point(aes(x1, x2, fill = Class, shape = Class), data.input, inherit.aes = FALSE, size = 3) +
  scale_shape_manual(values = c(21, 22)) +
  theme_minimal()

The CPNN output.
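
Before wrapping up, here is a minimal sketch of the plain PNN run on the same test grid. It is identical to the CPNN call above, except that gamma.input is left at its NULL default:

data.test.result.pnn <- apply(data.test.scaled, 1, function(input) {
  c.pnn(x.new = input, data.input = data.input.scaled,
        class.input = data.input$Class, sigma.input = 0.1)
})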

From the figure above, we can conclude that our CPNN script works: it is able to separate the non-linearly separable regions of circles and rectangles. I think it is time to leave the reader to try the other algorithm, PNN, which is run by simply leaving the ‘gamma.input’ argument at NULL, as sketched above. There are still many things that we can tweak in this PNN/CPNN script, for example the execution time, the sensitivity of the response to changes in the sigma and/or gamma inputs, and mixing categorical and numerical features. I would also like to close with a disclaimer: I do not take any responsibility regarding the usage of the scripts above. They are my personal work and come with ABSOLUTELY NO WARRANTY.
