Building a high-accuracy model with fewer images for Core ML to detect facial emotion

Dimple R
Published in YML Innovation Lab
5 min read · Mar 24, 2018

Machine learning uses statistical techniques to give computer systems the ability to learn from given data and predict results on new test data.

First, to start with machine learning, you need to create a trained model. You can create a model using any language; I prefer Python. In this article, I am going to explain how you can create the dataset and the ML model.

Step 1: Prepare the image dataset

As we are working on facial emotion detection, we need many images of different people showing different emotions. A face can certainly express many emotions; let's consider some basic human facial emotions: happy, sad, angry, surprise, contempt, disgust, fear and neutral. To create a better model with the highest accuracy, a lot of images in each emotion category is normally necessary. You can use any online face dataset, or you can collect a few images from your surroundings if you are doing this for learning purposes. In this experiment I collected images of 15 different people showing the above facial emotions. Now we have a collection of images in each category.

Step 2: Organize dataset

  • Each coloured image contains RGB values in each pixel, and each colour channel has a value from 0–255, so each pixel in a coloured image has 3 channel values ranging from 0–255. If coloured images are used, creating a trained model with the highest accuracy requires a large number of images and a highly efficient GPU.
  • If grayscale images are used, each pixel has a single value from 0–255. Compared to the earlier case the burden is reduced a little, but this still needs a lot of images and a powerful GPU.

So, to get high accuracy with only a few images in each category, you need to extract certain important information from each image. As this article is only about facial emotion detection, you can consider only the face-related data in each image.

Let's explore this now!

Image 1
Image 2

Consider Image 1.

  1. First, you need to detect the face in the image. You can use the OpenCV Haar cascade classifier to find the exact face bounds.
  2. The image needs to be cropped to the calculated face bounds.
  3. To maintain consistency in the image dataset, all images need to be scaled to the same pixel size.
  4. To capture the facial features, you need to detect face points in the image; you can use the dlib library. The resulting image, which contains 68 face points, is shown in Image 2.
  5. To minimize data overhead, the data can be reduced a little more by deriving facial features according to the Facial Action Coding System. For basic human emotions, you can compute 12 features: outerBrowRaiser, browLowerer, upperLidRaiser, cheekRaiser, lidTightener, noseWrinkler, lipCornerPuller, lipCornerDepressor, lowerLipDepressor, lipStretcher, lipTightener, jawDrop.

Now you have the facial emotion data for a single image. Similarly, you need to calculate these 12 features for all the images. To know which image belongs to which category, you need to add one more feature that tells the emotion type. All this data needs to be stored as 13 columns per image in a csv file.
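One possible way to go from the 68 face points to the 13 csv columns is sketched below. The particular landmark distances are illustrative placeholders chosen for this sketch, not the official FACS action-unit formulas, and the stand-in data at the bottom only exists so the snippet runs on its own.

```python
import csv
import numpy as np

def extract_features(points):
    """Map the 68 (x, y) landmarks to 12 numbers. These specific distances
    are illustrative stand-ins, not the official FACS definitions."""
    points = np.asarray(points, dtype=float)
    def d(i, j):  # Euclidean distance between landmarks i and j
        return float(np.linalg.norm(points[i] - points[j]))
    return [
        d(19, 37),  # outerBrowRaiser
        d(21, 22),  # browLowerer
        d(37, 41),  # upperLidRaiser
        d(41, 31),  # cheekRaiser
        d(38, 40),  # lidTightener
        d(31, 35),  # noseWrinkler
        d(48, 54),  # lipCornerPuller
        d(48, 57),  # lipCornerDepressor
        d(57, 8),   # lowerLipDepressor
        d(54, 35),  # lipStretcher
        d(51, 57),  # lipTightener
        d(8, 33),   # jawDrop
    ]

# One row per image: 12 features + the emotion label = 13 columns.
labelled_faces = [(np.random.rand(68, 2) * 96, 'happy')]  # stand-in data
with open('dataset.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for points, emotion in labelled_faces:
        writer.writerow(extract_features(points) + [emotion])
```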

Step 3: Create the Python model

Now you need to create a machine learning model using the csv file data. Before starting to train, you need to divide the dataset into two parts: a training dataset and a testing dataset. Here, I am going to explain two methods with which we can create the ML model.
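One way to do that split is with scikit-learn's train_test_split; the random stand-in data below just takes the place of the 13-column csv from Step 2 so the snippet runs on its own, and the 80/20 ratio is a common choice rather than anything mandated here.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in for the real data, e.g.:
# data = np.genfromtxt('dataset.csv', delimiter=',', dtype=str)
rng = np.random.RandomState(0)
X = rng.rand(120, 12)                                # 12 facial features
Y = rng.choice(['happy', 'sad', 'angry'], size=120)  # emotion labels

# Hold out 20% of the rows as the testing dataset
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42)
```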

  • SVM classifier: The SVM classifier is one of the best classifiers for small datasets. It avoids overfitting the data. By choosing appropriate parameters and the right type of kernel, an SVM can be robust. You can use the SVM classifier to create a model. First you need to split the data into X and Y vectors: X contains the face features and Y contains the corresponding emotion type. There are different types of SVM classifiers, as given below; you can choose the kernel that fits your dataset best.
from sklearn import svm

c = 1.0  # SVM regularization parameter
# SVC with linear kernel
svc = svm.SVC(kernel='linear', C=c).fit(X, Y)
# LinearSVC (linear kernel)
lin_svc = svm.LinearSVC(C=c).fit(X, Y)
# SVC with RBF kernel
rbf_svc = svm.SVC(kernel='rbf', gamma=0.01, C=c).fit(X, Y)
# SVC with polynomial (degree 3) kernel
poly_svc = svm.SVC(kernel='poly', degree=3, C=c).fit(X, Y)

You can check the accuracy of the classifier for different kernels using the following code.

from sklearn.metrics import accuracy_score

predicted = svc.predict(X)
# get the accuracy
print("\n%s: %.2f%%" % ('svc', accuracy_score(Y, predicted) * 100))

Using my training dataset, I got the following accuracies for the different SVM kernels:

svc: 96.97%

lin_svc: 46.97%

rbf_svc: 96.97%

poly_svc: 95.45%
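Rather than fitting each kernel by hand as above, scikit-learn's GridSearchCV can search kernels and C values together with cross-validation; this is a sketch on random stand-in data, and the particular grid values are arbitrary examples.

```python
import numpy as np
from sklearn import svm
from sklearn.model_selection import GridSearchCV

# Stand-in data in place of the real 12-feature vectors
rng = np.random.RandomState(0)
X = rng.rand(120, 12)
Y = rng.choice(['happy', 'sad'], size=120)

param_grid = {
    'kernel': ['linear', 'rbf', 'poly'],
    'C': [0.1, 1.0, 10.0],
}
search = GridSearchCV(svm.SVC(), param_grid, cv=5)
search.fit(X, Y)
print(search.best_params_)  # kernel/C pair with the best cross-validated accuracy
```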

  • Keras deep learning: You can also apply Keras deep learning to the same dataset. A deep learning model gives more accuracy compared to the SVM classifier.
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import np_utils
from sklearn.preprocessing import LabelEncoder

X = data[:, 0:12]
Y = data[:, 12]
encoder = LabelEncoder()
encoder.fit(Y)
encoded_y = encoder.transform(Y)
new_y = np_utils.to_categorical(encoded_y)

model = Sequential()
model.add(Dense(512, input_dim=12, activation='relu'))  # 12 input features
model.add(Dense(256, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(8, activation='sigmoid'))  # 8 emotion categories
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X, new_y, epochs=200, batch_size=10)
# evaluate the model
scores = model.evaluate(X, new_y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

Using my training dataset, deep learning model gave 99.24% accuracy.
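To turn the model's one-hot output back into an emotion name, you can reuse the LabelEncoder from above together with argmax; the probability vector below is a stand-in for what model.predict would return, so the sketch needs only numpy and scikit-learn.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

emotions = ['angry', 'contempt', 'disgust', 'fear',
            'happy', 'neutral', 'sad', 'surprise']
encoder = LabelEncoder().fit(emotions)  # sorts labels alphabetically

# Stand-in for probs = model.predict(features)
probs = np.array([[0.01, 0.02, 0.01, 0.01, 0.90, 0.02, 0.02, 0.01]])
predicted = encoder.inverse_transform(np.argmax(probs, axis=1))
print(predicted[0])  # 'happy' for this stand-in vector
```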

Step 4: Convert the Python model to a Core ML model

If you want to use this model in an iOS application, you need to convert it to a .mlmodel file using coremltools.

import coremltools

coreml_model = coremltools.converters.keras.convert(model)
# Set model metadata
coreml_model.author = 'Dimple'
coreml_model.license = 'BSD'
coreml_model.short_description = 'Predicts facial expressions'
# Save model
coreml_model.save('../FacialEmotionsDLKeras.mlmodel')

You can use this model directly in any iPhone application to predict human facial emotions. To use the model to predict an emotion, you need to find the facial features in the same way explained above. To learn how to integrate the model into an iOS application in detail, stay tuned for the next article.

Predict facial emotion
