# Using Learning Vector Quantization for Classification in R

The Learning Vector Quantization (LVQ) algorithm is an artificial neural network algorithm that lets you choose how many representative instances to keep and learns exactly what those instances should look like.

In this article, we’ll look at the following key points:

- The representation used by the LVQ algorithm that you actually save to a file.
- The procedure that you can use to make predictions with a learned LVQ model.
- How to learn an LVQ model from training data.

# LVQ Model Representation

LVQ is best understood as a classification algorithm. It supports both binary and multi-class classification problems.

The representation for LVQ is a collection of codebook vectors. An LVQ model learns these codebook vectors from the training dataset, and each one represents a region of a class. For example, if your problem is a binary classification with classes 0 and 1 and the inputs are car insurance, health insurance, and home insurance, then a codebook vector comprises all four attributes: car insurance, health insurance, home insurance, and the class.
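To make the insurance example concrete, here is a minimal sketch of what a tiny codebook could look like as an R data frame. The column names and all values are made up for illustration; a real codebook would hold values learned from your data.

```r
# Hypothetical codebook for the insurance example.
# Every value below is invented purely for illustration.
codebook <- data.frame(
  car_insurance    = c(0.2, 0.8, 0.5),
  health_insurance = c(0.1, 0.9, 0.4),
  home_insurance   = c(0.3, 0.7, 0.6),
  class            = factor(c(0, 1, 1))
)

# Each row is one codebook vector: three inputs plus the class.
str(codebook)
ncol(codebook)
```

Each row has the same shape as a training instance, which is why codebook vectors are often described as prototypes of their class.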

The model representation is a fixed pool of codebook vectors that look like training instances, but the values of each attribute have been adapted based on the learning procedure.
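Once such a pool of codebook vectors exists, a prediction is made by finding the codebook vector closest to the new instance and returning its class. The sketch below shows that idea in base R with Euclidean distance; `predict_nearest()` and the toy values are hypothetical names and numbers for this illustration, not part of any package.

```r
# Toy codebook: two input attributes and one class label per vector.
# Values are illustrative only.
codebook_x  <- rbind(c(2, 12), c(8, 15), c(5, 18))
codebook_cl <- factor(c("yellow", "red", "green"))

# Predict the class of one instance as the class of its nearest codebook vector.
predict_nearest <- function(x, codes, labels) {
  # Squared Euclidean distance from x to every codebook vector
  diffs <- codes - matrix(x, nrow(codes), length(x), byrow = TRUE)
  d <- rowSums(diffs^2)
  labels[which.min(d)]
}

predict_nearest(c(9, 14), codebook_x, codebook_cl)
```

Here the instance (9, 14) is closest to the codebook vector (8, 15), so the function returns that vector's class, red.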

# Building an LVQ Model

Install and load the ‘class’ package, which provides the functions required for this classification. It includes several variants of the LVQ training function: lvq1(), olvq1(), lvq2(), and lvq3(). In this tutorial we use olvq1(), the optimized LVQ function.

```r
library(class)    # lvqinit(), olvq1(), lvqtest()
library(caret)    # confusionMatrix()
library(caTools)  # sample.split(), used for the train/test split below

# Preparing the dataset
set.seed(88)
n = 10000
a = sample(1:10, n, replace = TRUE)
b = sample(10:20, n, replace = TRUE)
f = ifelse(a > 5 & b > 10, "red",
           ifelse(a < 3 | b < 4, "yellow", "green"))
df = data.frame(a = a, b = b, flag = as.factor(f))
head(df)
```

```
   a  b   flag
1  3 13  green
2  8 13    red
3  5 19  green
4  9 13    red
5 10 11    red
6  1 13 yellow
```

```r
# Splitting the data into training and test sets
set.seed(88)
split <- sample.split(df$flag, SplitRatio = 0.8)  # sample.split() comes from caTools
train_d <- subset(df, split == TRUE)
test_d <- subset(df, split == FALSE)

# Convert the split datasets into matrices
train = data.matrix(train_d[, c("a", "b")])
test = data.matrix(test_d[, c("a", "b")])
train_label = factor(train_d[, "flag"])
test_label = test_d$flag

# Building a codebook for LVQ
codeBook = lvqinit(train, train_label, size = 100)
```

olvq1() moves the vectors in the codebook so that they better represent the training set.
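To give a feel for what "moving" a codebook vector means, here is a minimal sketch of a single update step of the basic LVQ1 rule: the nearest codebook vector is pulled toward a training instance of the same class and pushed away from one of a different class. This is a simplified illustration, not the code inside olvq1(), which additionally adapts the learning rate per codebook vector; `lvq1_step()` and `alpha` are hypothetical names used only here.

```r
# One LVQ1 update for a single training instance (x, y).
# codes:  matrix of codebook vectors, one per row
# labels: class of each codebook vector
# alpha:  learning rate (assumed value, chosen for illustration)
lvq1_step <- function(codes, labels, x, y, alpha = 0.3) {
  d <- rowSums(sweep(codes, 2, x)^2)   # squared distance from x to each vector
  best <- which.min(d)                 # index of the nearest codebook vector
  direction <- if (labels[best] == y) 1 else -1
  # Move toward x when the classes match, away from x otherwise
  codes[best, ] <- codes[best, ] + direction * alpha * (x - codes[best, ])
  codes
}

codes  <- rbind(c(2, 12), c(8, 15))
labels <- c("yellow", "red")

# Same class: the nearest vector (8, 15) moves halfway toward (9, 14)
updated <- lvq1_step(codes, labels, x = c(9, 14), y = "red", alpha = 0.5)
updated[2, ]
```

Training repeats steps like this over the training instances, so the codebook vectors gradually settle into the regions of their classes.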

```r
buildCodeBook = olvq1(train, train_label, codeBook)

# Prediction phase
predict = lvqtest(buildCodeBook, test)
```

Now follow the common practice of creating a confusion matrix to check the accuracy:

```r
# caret expects the predictions first and the reference labels second
confusionMatrix(predict, test_label)
```

```
Confusion Matrix and Statistics

          Reference
Prediction green red yellow
    green    703   0      0
    red        0 896      0
    yellow     0   0    399

Overall Statistics

               Accuracy : 1
                 95% CI : (0.9982, 1)
    No Information Rate : 0.4484
    P-Value [Acc > NIR] : < 2.2e-16

                  Kappa : 1

 Mcnemar's Test P-Value : NA

Statistics by Class:

                     Class: green Class: red Class: yellow
Sensitivity                1.0000     1.0000        1.0000
Specificity                1.0000     1.0000        1.0000
Pos Pred Value             1.0000     1.0000        1.0000
Neg Pred Value             1.0000     1.0000        1.0000
Prevalence                 0.3519     0.4484        0.1997
Detection Rate             0.3519     0.4484        0.1997
Detection Prevalence       0.3519     0.4484        0.1997
Balanced Accuracy          1.0000     1.0000        1.0000
```
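If you want to see where the accuracy figure comes from, it can be recomputed by hand from a cross-tabulation of predictions against true labels. The sketch below uses small made-up label vectors (`truth` and `pred` are hypothetical, not the tutorial's results) and only base R.

```r
# Made-up labels for illustration; not the tutorial's data.
lv    <- c("green", "red", "yellow")
truth <- factor(c("green", "red", "red", "yellow", "green"), levels = lv)
pred  <- factor(c("green", "red", "green", "yellow", "green"), levels = lv)

# Rows are predictions, columns are the reference labels
cm <- table(Prediction = pred, Reference = truth)
cm

# Accuracy is the share of instances on the diagonal
accuracy <- sum(diag(cm)) / sum(cm)
accuracy
```

In this toy case four of the five predictions sit on the diagonal, so the accuracy is 0.8.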

In this brief article, we worked through how to classify data in R using LVQ. I hope you found it useful.