Week 3: Identification of Artists and Movements from Paintings with Machine Learning

Didar Tüfekçigil
BBM406 Spring 2021 Projects
May 2, 2021

This week, we obtained our first test results with a Naive Bayes classifier, which we used to select the best-fit image features.

Selecting Best-fit Features

Before moving on to more complex models for painting classification, we used a Naive Bayes classifier to select the best-fit features for our dataset. Three image features have been tested so far: Hu Moments, GIST, and color histograms. Each feature was tested individually at first, then in combinations. The unscaled image features we extracted could cause misclassification in some models, since features with a broad range dominate the others. However, classifiers that are not distance-based are not affected by feature scaling, and the Naive Bayes classifier is one of these algorithms.
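As a rough sketch of how such a feature-combination test could look (the feature matrices, class count, and dimensions below are stand-ins for illustration, not our actual extracted features), one can concatenate feature vectors horizontally and score each combination with Gaussian Naive Bayes:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Stand-in feature matrices: in a real pipeline these would be
# Hu Moments, GIST, and color-histogram vectors per painting.
n_paintings = 200
hu = rng.normal(size=(n_paintings, 7))
gist = rng.normal(size=(n_paintings, 512))
hist = rng.normal(size=(n_paintings, 96))
labels = rng.integers(0, 5, size=n_paintings)  # 5 hypothetical artists

# Test each feature individually, then in combination,
# by horizontally concatenating the feature matrices.
feature_sets = {
    "hu": hu,
    "gist": gist,
    "hu+gist+hist": np.hstack([hu, gist, hist]),
}
for name, X in feature_sets.items():
    score = cross_val_score(GaussianNB(), X, labels, cv=5).mean()
    print(f"{name}: {score:.3f}")
```

With real features, the combination with the highest cross-validation score would be the one carried forward to more complex models.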

Hu Moments

Hu Moments are well-known image features that can be used to describe, classify, and measure the shape of an object in a picture; they are calculated from the outline of an object in an image.

Hu Moments (or rather Hu moment invariants) are a set of 7 numbers calculated using central moments that are invariant to image transformations. The first 6 moments have been proved to be invariant to translation, scale, rotation, and reflection, while the 7th moment's sign changes under image reflection. — Hu Moments / OpenCV

Color Histogram

A color histogram depicts the color distribution in an image: it counts how many pixels fall into each color bin, giving a frequency distribution over colors.

[Figure: sample image to apply a color histogram (Dr. Grant) and the resulting histogram, from https://www.pyimagesearch.com/2014/01/22/clever-girl-a-guide-to-utilizing-color-histograms-for-computer-vision-and-image-search-engines/]
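A small sketch of turning an image into a color-histogram feature vector (the random image and the 8-bins-per-channel choice are illustrative assumptions):

```python
import numpy as np

# Hypothetical RGB image; in practice this would be a loaded painting.
rng = np.random.default_rng(1)
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Per-channel histogram with 8 bins each, concatenated into one
# 24-dimensional feature vector and normalized to sum to 1.
bins = 8
channels = [np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
            for c in range(3)]
feature = np.concatenate(channels).astype(float)
feature /= feature.sum()
print(feature.shape)
```

Normalizing makes the histogram independent of image size, which matters when paintings in the dataset have different resolutions.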

GIST

The GIST descriptor summarizes the gradient information for various regions of an image, producing a rough description of the scene that characterizes the image with a compact set of statistics.
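The real GIST descriptor applies a bank of Gabor filters at several scales and orientations and averages the responses over a spatial grid; as a simplified, GIST-like sketch, plain image gradients can stand in for that filter bank:

```python
import numpy as np

def gistlike_descriptor(gray, grid=4, n_orient=8):
    """Simplified GIST-style descriptor: orientation energy per grid cell.

    True GIST uses Gabor filter responses; here plain gradients are a
    stand-in, but the grid-of-cells structure is the same.
    """
    gy, gx = np.gradient(gray.astype(float))
    magnitude = np.hypot(gx, gy)
    orientation = np.mod(np.arctan2(gy, gx), np.pi)  # angles in [0, pi)

    h, w = gray.shape
    ch, cw = h // grid, w // grid
    features = []
    for i in range(grid):
        for j in range(grid):
            m = magnitude[i*ch:(i+1)*ch, j*cw:(j+1)*cw]
            o = orientation[i*ch:(i+1)*ch, j*cw:(j+1)*cw]
            # Gradient-magnitude-weighted orientation histogram per cell.
            hist, _ = np.histogram(o, bins=n_orient, range=(0, np.pi),
                                   weights=m)
            features.append(hist)
    return np.concatenate(features)

rng = np.random.default_rng(2)
desc = gistlike_descriptor(rng.random((64, 64)))  # 4*4 cells * 8 bins
print(desc.shape)
```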

Test Results

In light of the test results we obtained with the Naive Bayes classifier, classification accuracy on our dataset varies between 70% and 90%. That variation may be caused by the imbalance between data classes. The misclassification rate can be reduced through data augmentation with procedures such as cropping, rotation, and random erasing.
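The three augmentation procedures mentioned above can be sketched with plain NumPy (patch sizes and the random test image are illustrative assumptions, not our actual augmentation parameters):

```python
import numpy as np

rng = np.random.default_rng(3)

def random_crop(img, size):
    """Crop a random size x size patch from the image."""
    h, w = img.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top+size, left:left+size]

def random_erase(img, patch=16):
    """Zero out a random square patch (random erasing)."""
    out = img.copy()
    h, w = img.shape[:2]
    top = rng.integers(0, h - patch + 1)
    left = rng.integers(0, w - patch + 1)
    out[top:top+patch, left:left+patch] = 0
    return out

image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
augmented = [
    random_crop(image, 48),
    np.rot90(image),      # 90-degree rotation
    random_erase(image),
]
print([a.shape for a in augmented])
```

Applying such transforms to under-represented classes yields extra training samples and can soften the class imbalance.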

Even though feature scaling has not been used in this initial phase, it will be applied to our features when we deploy state-of-the-art models.
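A minimal sketch of the standardization step that would be applied (the two feature ranges below are made-up stand-ins for, e.g., tiny Hu moment values next to large histogram counts):

```python
import numpy as np

rng = np.random.default_rng(4)

# Features with wildly different ranges: e.g. Hu moments (~1e-3)
# next to raw histogram counts (~1e3).
X = np.hstack([rng.normal(0, 1e-3, size=(100, 7)),
               rng.normal(0, 1e3, size=(100, 24))])

# Standardization: zero mean, unit variance per feature column,
# so no single feature dominates a distance-based model.
mean = X.mean(axis=0)
std = X.std(axis=0)
X_scaled = (X - mean) / std
print(X_scaled.mean(axis=0).max(), X_scaled.std(axis=0).max())
```

In practice the same statistics (mean, std) computed on the training split must also be applied to the test split, which is what `sklearn.preprocessing.StandardScaler` automates.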

Stay tuned…
