Scale Invariant Feature Transform for Cirebon Mask Classification Using MATLAB

Fendy Hendriyanto · Published in The Startup · Jun 9, 2020 · 9 min read
Panji Sutrawinangun or Pamindo in the Losari mask, which is part of the Cirebon mask dance

The vast diversity of art in Indonesia attracts a great deal of interest both domestically and internationally. The arts of Java in particular come in many unique forms, especially masks. Mask arts can be found in many regions, including West Java, Cirebon, Surakarta, Yogyakarta, and East Java; each carries distinctive features that reflect the folklore and values of the region it represents. The masks from the Betawi region, for instance, look quite different from those of Situbondo, Malang, Ponorogo, or Madura.

Cirebon is considered to have one of the most popular mask designs. Following the local stories of Cirebon, there are five types of human-shaped masks: Panji, Samba or Pamindo, Rumyang, Tumenggung, and Klana. Each type shows unique expressions and features. Sadly, a traditional mask such as the Cirebon mask is on the brink of extinction, because the younger generation takes little interest in learning about its own culture. Motivated by this problem, we tried to help preserve knowledge of the Cirebon mask with an artificial intelligence approach based on digital image processing techniques.

In this research, I implemented SIFT as the feature extractor for the digital images and used Support Vector Machine (SVM), K-Nearest Neighbour (KNN), and Random Forest as the classifiers, all in MATLAB.

Methodology

Fig. 1. The methodology of Cirebon Mask Classification

Data Collection

The dataset was collected in two different ways: capturing Cirebon mask images with a camera and gathering Cirebon mask images through a search engine. This was done to obtain more diversity within each class.

Fig. 2. The Cirebon mask has five types: Panji, Pamindo or Samba, Rumyang, Tumenggung, and Klana.
Characteristics of Cirebon Mask

Pre-Processing

In this step, the image backgrounds were removed manually, leaving the background black. After background removal, each image is resized to 50 × 50 pixels.
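The background removal in this work was done by hand, but as a rough illustration of the resize step (and of blacking out a background once a binary mask is available), a MATLAB sketch could look like the following. The file name and the use of imbinarize as a stand-in for the manual cut-out are assumptions for illustration only.

% Minimal pre-processing sketch (illustrative; file name is an assumption).
img = imread('topeng_panji_01.png');       % one Cirebon mask photo
bw  = imbinarize(rgb2gray(img));           % crude foreground mask (assumption:
                                           % in the article the masks were cut out by hand)
img(repmat(~bw, [1 1 3])) = 0;             % set background pixels to black
img = imresize(img, [50 50]);              % resize to 50 x 50 as in the article
imwrite(img, 'topeng_panji_01_clean.png');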

Feature Extraction

After pre-processing removes the background and resizes the images, the SIFT algorithm is applied to extract keypoint features from each Cirebon mask image.

Scale Invariant Feature Transform

The SIFT algorithm was introduced by Lowe. The method is invariant to image translation, scaling, and rotation, and partially invariant to affine projection and illumination changes. There are four main steps in computing SIFT features:

  1. Scale-Space Construction
  2. Keypoint Localization
  3. Orientation Assignment
  4. Keypoint Descriptor
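As a rough illustration of the first two steps (this is not the find_sift helper used later in the article), the Gaussian scale space and its difference-of-Gaussians (DoG) can be sketched in MATLAB as follows; the file name, the number of levels, and the sigma schedule are illustrative assumptions.

% Sketch of scale-space construction and DoG (illustrative parameters).
I = im2double(rgb2gray(imread('topeng_panji_01.png')));  % assumed file name
num_scales = 5;
k = sqrt(2);                 % scale multiplier between levels
sigma0 = 1.6;                % base sigma (Lowe's suggested value)

gauss = cell(1, num_scales);
for s = 1:num_scales
    gauss{s} = imgaussfilt(I, sigma0 * k^(s-1));   % Gaussian-blurred levels
end

dog = cell(1, num_scales-1);
for s = 1:num_scales-1
    dog{s} = gauss{s+1} - gauss{s};                % difference of Gaussians
end
% Candidate keypoints are the local extrema of dog{s} across space and scale;
% these are then refined and filtered in the keypoint localization step.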

The MATLAB code for feature extraction with the Scale Invariant Feature Transform method is shown below and is also available on my GitHub (fendy07). First, point the script at your dataset directories and list all image files. Then count the images in each class and assign a numeric label to each class. Finally, define the keypoint circle for SIFT as the vector [32 32 32], which the find_sift helper interprets as the circle centre (x, y) and its radius.

clear;
clc;
% Point these paths at your dataset directories and list all image files per class
files_kelana=dir('C:\Users\User\Documents\MATLAB\SIFT\Data Image\Topeng Kelana\*.png');
files_tumenggung=dir('C:\Users\User\Documents\MATLAB\SIFT\Data Image\Topeng Tumenggung\*.png');
files_rumyang=dir('C:\Users\User\Documents\MATLAB\SIFT\Data Image\Topeng Rumyang\*.png');
files_samba=dir('C:\Users\User\Documents\MATLAB\SIFT\Data Image\Topeng Samba\*.png');
files_panji=dir('C:\Users\User\Documents\MATLAB\SIFT\Data Image\Topeng Panji\*.png');
% Count the images in each class and assign a numeric class label
n_kelana =numel(files_kelana);
n_tumenggung =numel(files_tumenggung);
n_rumyang =numel(files_rumyang);
n_samba =numel(files_samba);
n_panji =numel(files_panji);
class_kelana=1;
class_tumenggung=2;
class_rumyang=3;
class_samba=4;
class_panji=5;
% Define the SIFT keypoint circle: [centre_x centre_y radius]
circle = [32 32 32];

With the keypoint circle defined, the next step is to extract SIFT features for each of the five Cirebon mask classes. Each image is described by the SIFT descriptor computed over that circle, and the numeric class label is appended as the last column of its feature row.

%Klana feature with SIFT
for i=1:n_kelana
str = strcat('C:\Users\User\Documents\MATLAB\SIFT\Data Image\Topeng Kelana\', files_kelana(i).name);
temp_image=imread(str);
feature_sift_kelana(i,:) = find_sift(temp_image, circle);
feature_class_kelana(i,:)=[feature_sift_kelana(i,:) class_kelana];

end

%Tumenggung Feature
for i=1:n_tumenggung
str = strcat('C:\Users\User\Documents\MATLAB\SIFT\Data Image\Topeng Tumenggung\', files_tumenggung(i).name);
temp_image=imread(str);
feature_sift_tumenggung(i,:) = find_sift(temp_image, circle);
feature_class_tumenggung(i,:)=[feature_sift_tumenggung(i,:) class_tumenggung];

end

%Rumyang Feature
for i=1:n_rumyang
str = strcat('C:\Users\User\Documents\MATLAB\SIFT\Data Image\Topeng Rumyang\', files_rumyang(i).name);
temp_image=imread(str);
feature_sift_rumyang(i,:) = find_sift(temp_image, circle);
feature_class_rumyang(i,:)=[feature_sift_rumyang(i,:) class_rumyang];

end

%Pamindo or Samba Feature
for i=1:n_samba
str = strcat('C:\Users\User\Documents\MATLAB\SIFT\Data Image\Topeng Samba\', files_samba(i).name);
temp_image=imread(str);
feature_sift_samba(i,:) = find_sift(temp_image, circle);
feature_class_samba(i,:)=[feature_sift_samba(i,:) class_samba];

end


%Panji Feature
for i=1:n_panji
str = strcat('C:\Users\User\Documents\MATLAB\SIFT\Data Image\Topeng Panji\', files_panji(i).name);
temp_image=imread(str);
feature_sift_panji(i,:) = find_sift(temp_image, circle);
feature_class_panji(i,:)=[feature_sift_panji(i,:) class_panji];

end
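As a side note, the five loops above differ only in the folder name and the class label. A hedged refactor, assuming the same folder layout and the same find_sift helper, could build the combined feature matrix in a single loop (producing the same feature_all that is saved in the next step):

% Sketch: one loop over all classes instead of five copies (assumes the
% folder layout above and the find_sift helper).
base    = 'C:\Users\User\Documents\MATLAB\SIFT\Data Image\';
folders = {'Topeng Kelana', 'Topeng Tumenggung', 'Topeng Rumyang', ...
           'Topeng Samba', 'Topeng Panji'};
circle  = [32 32 32];
feature_all = [];
for c = 1:numel(folders)
    files = dir(fullfile(base, folders{c}, '*.png'));
    for i = 1:numel(files)
        img  = imread(fullfile(base, folders{c}, files(i).name));
        desc = find_sift(img, circle);            % SIFT descriptor for this image
        feature_all = [feature_all; desc c];      %#ok<AGROW> append class label c
    end
end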

After extracting SIFT features for all five classes of the Cirebon mask dataset, concatenate them into one matrix and save the feature data as a .mat (MATLAB) file.

%Feature all data with SIFT and save result feature data
feature_all = [feature_class_kelana; feature_class_tumenggung; feature_class_rumyang; feature_class_samba; feature_class_panji];
save('feature.mat', 'feature_all');

Classifier Theory

Feature classification was implemented using K-Nearest Neighbour, Random Forest, and Support Vector Machine. These classifiers were used to build the models for Cirebon mask classification.

K-Nearest Neighbour

K-NN is a simple supervised learning algorithm. The distance metric usually used in K-NN is the Euclidean distance, but in this article three different distance metrics were implemented. The distance metrics are:

Minkowski’s distance

d(x, y) = (Σᵢ |xᵢ − yᵢ|^r)^(1/r)

Euclidean’s distance

d(x, y) = √(Σᵢ (xᵢ − yᵢ)²)

Chebyshev’s distance

d(x, y) = maxᵢ |xᵢ − yᵢ|

where d(x, y) is the distance between two feature vectors x and y, and r is the order parameter of the Minkowski distance (r = 2 gives the Euclidean distance, and the limit r → ∞ gives the Chebyshev distance).
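To make the three metrics concrete, here is a small sketch that computes them for two made-up feature vectors, together with the equivalent pdist2 calls (the metric names match those accepted by fitcknn below):

% Sketch: the three K-NN distance metrics on two made-up feature vectors.
x = [0.2 0.5 0.1 0.9];
y = [0.4 0.1 0.3 0.8];
r = 3;                                            % Minkowski order parameter

d_minkowski = sum(abs(x - y).^r)^(1/r);           % (sum |x_i - y_i|^r)^(1/r)
d_euclidean = sqrt(sum((x - y).^2));              % Minkowski with r = 2
d_chebyshev = max(abs(x - y));                    % limit of Minkowski as r -> inf

% The same values via pdist2 (same metric names as fitcknn's 'Distance' option):
d2 = [pdist2(x, y, 'minkowski', r), ...
      pdist2(x, y, 'euclidean'), ...
      pdist2(x, y, 'chebychev')];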

In the K-NN classifier, the label of a query point X is predicted as the most common class among its k closest training points.

Fig. 3. An example of K-NN, where the triangle represents k = 3 and the rectangle represents k = 5.

Support Vector Machine

The other classifier, SVM, is one of the prominent classifiers that works well on high-dimensional data and can use a kernel function to map the data into a higher-dimensional space. In contrast to many other classification methods, SVM does not use all of the data in the learning process; only a subset of chosen data points (the support vectors) contributes to building the model. A simple outline of SVM:

  • Define two parallel separating hyperplanes
  • Use a penalty for misclassification; this handles non-linearly separable problems
  • Maximize the distance between the two hyperplanes (the margin)
Fig. 4. Left: separating hyperplanes; Right: the maximum margin hyperplane
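As a minimal two-class sketch of these ideas (separate from the multi-class fitcecoc models built later), fitcsvm exposes the misclassification penalty as the box constraint and returns the support vectors that define the maximum-margin hyperplane. The toy data below is made up for illustration only:

% Toy binary SVM: soft margin controlled by the box constraint (penalty C).
rng(1);
X = [randn(30,2) + 1.5; randn(30,2) - 1.5];   % two made-up Gaussian clusters
y = [ones(30,1); -ones(30,1)];

mdl = fitcsvm(X, y, 'KernelFunction', 'linear', 'BoxConstraint', 1);

% w and b describe the separating hyperplane w'*x + b = 0;
% the margin between the two supporting hyperplanes is 2/norm(w).
w      = mdl.Beta;
b      = mdl.Bias;
margin = 2 / norm(w);
n_sv   = sum(mdl.IsSupportVector);            % only these points shape the model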

Random Forest

Random Forest is an ensemble algorithm that applies the ”bagging” method to decision trees. At each split, Random Forest searches for the best feature within a random subset of the features.

Fig. 5. Random Forest Scheme
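Because bagging trains every tree on a bootstrap sample, the samples left out of each bootstrap give a built-in out-of-bag (OOB) error estimate. A hedged sketch with TreeBagger, assuming the feature_train and label_train variables built in the next section, could look like this:

% Sketch: out-of-bag error of a bagged forest (assumes feature_train /
% label_train as built in the model section below).
rf = TreeBagger(100, feature_train, label_train, ...
                'Method', 'classification', 'OOBPrediction', 'on');
oob = oobError(rf);          % misclassification error after 1..100 trees
plot(oob);
xlabel('Number of grown trees');
ylabel('Out-of-bag classification error');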

Make a Model Classifier

In this step, after the extracted features have been saved, load the SIFT feature file and split it into a feature matrix and a label vector.

clear;
clc;

%load feature data after preprocessing with SIFT
load('feature.mat');

%get row and columns in feature data
[row, col] =size(feature_all);

label = feature_all(:,col);
feature = feature_all(:,1:col-1);

The next step is to split the data into training and test sets using crossvalind's hold-out method:

%Split the data with a hold-out split (crossvalind)
%Note: crossvalind's second output holds the fraction P = 0.7 of the samples,
%so assigning the outputs as [test, train] puts about 70% of the data in
%'train' and 30% in 'test'.
[test,train] = crossvalind('HoldOut',label,0.7);

[row_train,col_train] = size(train);
temp_train = 0;
temp_test = 0;
for i=1:row_train
    if train(i,:) == 1
        temp_train = temp_train+1;
        feature_train(temp_train,:) = feature(i,:);
        label_train(temp_train,:) = label(i,:);
    else
        temp_test = temp_test+1;
        feature_test(temp_test,:) = feature(i,:);
        label_test(temp_test,:) = label(i,:);
    end
end

After splitting the data, build the prediction models with each classifier. First, the K-Nearest Neighbour method with the three distance metrics.

%make classification models with K-NN using three distance metrics
model_knn_chebychev = fitcknn(feature_train,label_train,'Distance','chebychev');
model_knn_minkowski = fitcknn(feature_train,label_train, 'Distance','minkowski');
model_knn_euclidean = fitcknn(feature_train,label_train, 'Distance','euclidean');
%training loss of each K-NN model
loss_knn_chebychev = loss(model_knn_chebychev,feature_train, label_train, 'LossFun','hinge');
loss_knn_minkowski = loss(model_knn_minkowski,feature_train, label_train, 'LossFun','hinge');
loss_knn_euclidean = loss(model_knn_euclidean,feature_train, label_train, 'LossFun', 'hinge');
%testing: predict the test data with each model
%knn
%distance metric chebychev
result_knn_chebychev = predict(model_knn_chebychev,feature_test);
counter_knn_chebyshev = 0;
[row_test,col_test] = size(feature_test);
for i=1:row_test
if result_knn_chebychev(i)==label_test(i);
counter_knn_chebyshev=counter_knn_chebyshev+1;
end
end
accuracy_knn_chebychev=(counter_knn_chebyshev/row_test)*100;
conf_mat_knn_chebychev=confusionmat(label_test,result_knn_chebychev);

%knn
%distance metric minkowski
result_knn_minkowski = predict(model_knn_minkowski,feature_test);
counter_knn_minkowski = 0;
for i=1:row_test
if result_knn_minkowski(i)==label_test(i)
counter_knn_minkowski=counter_knn_minkowski+1;
end
end
accuracy_knn_minkowski=(counter_knn_minkowski/row_test)*100;
conf_mat_knn_minkowski=confusionmat(label_test,result_knn_minkowski);

%knn
%distance metric euclidean
result_knn_euclidean = predict(model_knn_euclidean,feature_test);
counter_knn_euclidean = 0;
for i=1:row_test
if result_knn_euclidean(i)==label_test(i)
counter_knn_euclidean=counter_knn_euclidean+1;
end
end
accuracy_knn_euclidean=(counter_knn_euclidean/row_test)*100;
conf_mat_knn_euclidean=confusionmat(label_test,result_knn_euclidean);
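As a side note, each of the counting loops above can be collapsed into a single vectorized line; a hedged equivalent for the Chebyshev model:

% Vectorized equivalent of the counting loop above.
accuracy_knn_chebychev = mean(result_knn_chebychev == label_test) * 100;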

Next, build the prediction models with the Support Vector Machine method, using both One-vs-One and One-vs-All coding with three kernels.

%svm kernel templates (note: in MATLAB, 'gaussian' and 'rbf' are two names for the same kernel)
t_gaussian=templateSVM('KernelFunction','gaussian');
t_rbf=templateSVM('KernelFunction','rbf');
t_linear=templateSVM('KernelFunction','linear');

%onevsone
model_svm_onevsone_gaussian=fitcecoc(feature_train, label_train,'Learners',t_gaussian);
model_svm_onevsone_rbf=fitcecoc(feature_train, label_train,'Learners',t_rbf);
model_svm_onevsone_linear=fitcecoc(feature_train, label_train,'Learners',t_linear);

%onevsall
model_svm_onevsall_gaussian=fitcecoc(feature_train, label_train,'Learners',t_gaussian,'Coding','onevsall');
model_svm_onevsall_rbf=fitcecoc(feature_train, label_train,'Learners',t_rbf,'Coding','onevsall');
model_svm_onevsall_linear=fitcecoc(feature_train, label_train,'Learners',t_linear,'Coding','onevsall');
%svm
%distance metric One Vs One
%Gaussian
result_svm_onevsone_gaussian = predict(model_svm_onevsone_gaussian,feature_test);
counter_svm_onevsone_gaussian = 0;
for i=1:row_test
if result_svm_onevsone_gaussian(i)==label_test(i)
counter_svm_onevsone_gaussian=counter_svm_onevsone_gaussian+1;
end
end
accuracy_svm_onevsone_gaussian=(counter_svm_onevsone_gaussian/row_test)*100;
conf_mat_svm_onevsone_gaussian=confusionmat(label_test,result_svm_onevsone_gaussian);

%svm
%Distance Metric One Vs All
%Gaussian
result_svm_onevsall_gaussian = predict(model_svm_onevsall_gaussian,feature_test);
counter_svm_onevsall_gaussian = 0;
for i=1:row_test
if result_svm_onevsall_gaussian(i)==label_test(i)
counter_svm_onevsall_gaussian=counter_svm_onevsall_gaussian+1;
end
end
accuracy_svm_onevsall_gaussian=(counter_svm_onevsall_gaussian/row_test)*100;
conf_mat_svm_onevsall_gaussian=confusionmat(label_test,result_svm_onevsall_gaussian);

%SVM
%One vs All Linear
result_svm_onevsall_linear = predict(model_svm_onevsall_linear,feature_test);
counter_svm_onevsall_linear = 0;
for i=1:row_test
if result_svm_onevsall_linear(i)==label_test(i)
counter_svm_onevsall_linear=counter_svm_onevsall_linear+1;
end
end
accuracy_svm_onevsall_linear=(counter_svm_onevsall_linear/row_test)*100;
conf_mat_svm_onevsall_linear=confusionmat(label_test,result_svm_onevsall_linear);

%One vs One Linear
result_svm_onevsone_linear = predict(model_svm_onevsone_linear,feature_test);
counter_svm_onevsone_linear = 0;
for i=1:row_test
if result_svm_onevsone_linear(i)==label_test(i)
counter_svm_onevsone_linear=counter_svm_onevsone_linear+1;
end
end
accuracy_svm_onevsone_linear=(counter_svm_onevsone_linear/row_test)*100;
conf_mat_svm_onevsone_linear=confusionmat(label_test,result_svm_onevsone_linear);

%Radial Basis Function(RBF)
result_svm_onevsall_rbf = predict(model_svm_onevsall_rbf,feature_test);
counter_svm_onevsall_rbf = 0;
for i=1:row_test
if result_svm_onevsall_rbf(i)==label_test(i)
counter_svm_onevsall_rbf=counter_svm_onevsall_rbf+1;
end
end
accuracy_svm_onevsall_rbf=(counter_svm_onevsall_rbf/row_test)*100;
conf_mat_svm_onevsall_rbf=confusionmat(label_test,result_svm_onevsall_rbf);

%Radial Basis Function(RBF)
result_svm_onevsone_rbf = predict(model_svm_onevsone_rbf,feature_test);
counter_svm_onevsone_rbf = 0;
for i=1:row_test
if result_svm_onevsone_rbf(i)==label_test(i)
counter_svm_onevsone_rbf=counter_svm_onevsone_rbf+1;
end
end
accuracy_svm_onevsone_rbf=(counter_svm_onevsone_rbf/row_test)*100;
conf_mat_svm_onevsone_rbf=confusionmat(label_test,result_svm_onevsone_rbf);
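To see how the two coding schemes differ, designecoc can display the coding matrices that fitcecoc uses for five classes: one-vs-one trains a binary SVM for every pair of classes (10 learners), while one-vs-all trains one per class (5 learners). This is only an illustration of the coding, not part of the training script:

% Coding matrices for 5 classes (rows = classes, columns = binary learners).
M_ovo = designecoc(5, 'onevsone');   % 5 x 10: every pair of classes
M_ova = designecoc(5, 'onevsall');   % 5 x 5 : each class against the rest
size(M_ovo)   % ans = 5 10
size(M_ova)   % ans = 5 5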

Next, build the prediction model with the Random Forest method using TreeBagger.

%random forest
model_randomforest = TreeBagger(100,feature_train, label_train);
%random forest
%predict the test data with the random forest model
result_randomforest_randforest = predict(model_randomforest,feature_test);
result_randomforest_randforest=str2num(cell2mat(result_randomforest_randforest));
counter_randomforest_randforest = 0;
[row_test,col_test] = size(feature_test);
for i=1:row_test
if result_randomforest_randforest(i)==label_test(i);
counter_randomforest_randforest=counter_randomforest_randforest+1;
end
end
accuracy_randomforest_randforest=(counter_randomforest_randforest/row_test)*100;
conf_mat_randomforest_randforest=confusionmat(label_test,result_randomforest_randforest);
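The experiment section below reports results for 10, 100, and 1000 trees, while the listing above builds only the 100-tree forest. A hedged loop to reproduce that comparison, reusing the variables defined above, might look like this:

% Sketch: accuracy for the three tree counts used in the experiment below.
tree_counts = [10 100 1000];
acc_rf = zeros(size(tree_counts));
for t = 1:numel(tree_counts)
    mdl  = TreeBagger(tree_counts(t), feature_train, label_train);
    pred = str2double(predict(mdl, feature_test));   % predictions come back as a cell array of char
    acc_rf(t) = mean(pred == label_test) * 100;
end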

Experiment and Result

Experiment Setting
The models were trained on a machine with an Intel Core i5 CPU and 4 GB of RAM.

Experiment Scenario

The first scenario uses the K-NN classifier with different distance metrics. The results of this scenario can be seen in Table II:

From this table, Minkowski and Euclidean give better performance than the Chebyshev distance metric. The result is still not satisfying, however, because the overall accuracy stays below 60%, so the model is not good enough.

Fig. 6. Accuracy using KNN

The second scenario uses SVM with two different coding approaches, one-vs-all and one-vs-one, and three different kernels: linear, Radial Basis Function, and Gaussian. The results can be seen in Table III:

From this table, I conclude that the one-vs-all approach gives better results than the one-vs-one approach. While the kernel function makes no difference in the one-vs-all approach, the one-vs-one approach yields different accuracy depending on the kernel function.

Fig. 7. Accuracy using SVM

The last scenario uses Random Forest. In this method, we tried three different numbers of trees: 10, 100, and 1000. The result can be seen in Table IV:

From Table IV, the number of trees affects the accuracy: the accuracy improves as more trees are added to the forest.

Fig. 8. Accuracy using Random Forest

Based on the results, the K-NN classifier, regardless of distance metric, does not perform as well as the other classifiers, SVM and Random Forest, which give the same accuracy on our data across their respective settings.

Fig. 9. Accuracy each classifier

Conclusion

In this research, I used SIFT for feature extraction and compared several machine learning approaches, namely K-NN, SVM, and Random Forest, for the classification of the Cirebon mask. Based on the accuracy, SVM and Random Forest give the best results compared to K-NN. In future work, we want to use different feature extraction, such as colour or other texture descriptors, and also try other machine learning approaches to solve the Cirebon mask classification problem.

References

D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vision, vol. 60, 2004. [Online]. Available: http://dx.doi.org/10.1023/B:VISI.0000029664.99615.94

S. Hatko, “k-nearest neighbor classification of datasets with a family of distances,” arXiv preprint arXiv:1512.00001, 2015.

R. Kamimura and O. Uchida, “Greedy network-growing by Minkowski distance functions,” in 2004 IEEE International Joint Conference on Neural Networks, Budapest, 2004.

T. Cover and P. Hart, “Nearest neighbor pattern classification,” IEEE Trans. Inf. Theor., vol. 13, no. 1, pp. 21–27, Jan. 1967. [Online]. Available: https://doi.org/10.1109/TIT.1967.1053964

T. Kløve, T.-T. Lin, S.-C. Tsai, and W.-G. Tzeng, “Permutation arrays under the Chebyshev distance,” 2009.

SVM Tutorial, “SVM — understanding the math — the optimal hyperplane.” [Online]. Available: https://www.svm-tutorial.com/2015/06/svm-understanding-math-part-3/

L. Breiman, “Random forests,” Machine Learning, vol. 45, no. 1, pp. 5–32, Oct 2001. [Online]. Available: https://doi.org/10.1023/A:1010933404324
