Facial Emotion Recognition: A Deep Learning approach with Keras

Düzgün İlaslan
Aug 6, 2023 · 6 min read


Although understanding and describing the expressions on a human face seems like a uniquely human skill, deep learning algorithms can now handle this task surprisingly well.

We will try to demonstrate this in today's work.

At its core, this task is solved with image classification methods. I examined image classification in detail in my previous article; if you want to learn more about it, you can go here.

Since that article already covers the theory of image classification in detail, I will not go deep into the technical background in this blog and will focus more on the implementation.

So, what is the emotion recognition model we will develop good for, and where can it be used?

Emotion recognition already plays an important role in human relations, and it is also valuable in business life, in companies' advertising strategies, in job interviews, and so on. The model we develop here could be applied in these and other areas.

Let's put this into practice.

These are the versions installed in the environment:
keras 2.12.0
matplotlib 3.7.1
Python 3.9.12
numpy 1.25.0

About Dataset

The data consists of 48x48 pixel grayscale images of faces. The faces have been automatically registered so that the face is more or less centred and occupies about the same amount of space in each image.

The task is to categorize each face based on the emotion shown in the facial expression into one of seven categories (0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral). The training set consists of 28,709 examples and the public test set consists of 3,589 examples.

Now, import all the necessary libraries:

from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense,Dropout,Flatten
from keras.layers import Conv2D,MaxPooling2D,BatchNormalization,Activation
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
import os
from matplotlib import pyplot as plt
import numpy as np

Determine the image dimensions and batch size:

#image height
IMG_HEIGHT=48
#image width
IMG_WIDTH = 48
#batch size
batch_size=32

Next, set the dataset directories:

train_data_dir='data/train/'
validation_data_dir='data/test/'
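flow_from_directory, which we will use below, expects one subfolder per emotion class inside each of these directories. Assuming the folders are named after the seven emotions, a quick sanity check of the layout looks like this:

#quick check of the expected directory layout (folder names are an assumption)
print(sorted(os.listdir(train_data_dir)))
print(sorted(os.listdir(validation_data_dir)))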

Now we move on to image preprocessing; augmentation will be done with Keras' ImageDataGenerator.

#for the train data set
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=30,
    shear_range=0.3,
    zoom_range=0.3,
    horizontal_flip=True,
    fill_mode='nearest')

A few augmentation parameters are set here; a short preview sketch follows the list. These are:
Random Rotations (rotation_range):
randomly rotates each image by an angle within the given range; with rotation_range=30 the rotation is between -30 and 30 degrees.
Random Flips (horizontal_flip):
flips the image along the horizontal axis. Whether flipping is appropriate depends on the object in the image; for faces a horizontal flip is safe.
Random Zoom (zoom_range):
randomly zooms in on or out of the image.
Shear (shear_range):
applies a random shear transformation.
Fill Mode (fill_mode):
the default value is “nearest”, which simply fills the empty areas created by the transformations with the nearest pixel values.
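As a minimal sketch of what these settings do, you can pass a single image through the generator a few times and plot the variants. Here a random dummy 48x48 array stands in for a real training image:

#preview a few augmented variants of one (dummy) 48x48 grayscale image
sample = np.random.rand(1, 48, 48, 1)  #stand-in for a real face image
aug_iter = train_datagen.flow(sample, batch_size=1)
plt.figure(figsize=(8, 2))
for i in range(4):
    augmented = next(aug_iter)[0]  #one randomly augmented variant
    plt.subplot(1, 4, i + 1)
    plt.imshow(augmented[:, :, 0], cmap='gray')
    plt.axis('off')
plt.show()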

#for validation dataset
validation_datagen = ImageDataGenerator(rescale=1./255)

After creating the two generators, initialize them on the data directories:

# create training set
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    color_mode='grayscale',
    target_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True)

# create validation set
validation_generator = validation_datagen.flow_from_directory(
    validation_data_dir,
    color_mode='grayscale',
    target_size=(IMG_HEIGHT, IMG_WIDTH),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=True)

Let’s check our dataset before moving on to the model training part.

Create a list that includes all the emotions:

#emotions list
class_labels=['Angry','Disgust', 'Fear', 'Happy','Neutral','Sad','Surprise']
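Note that flow_from_directory assigns class indices alphabetically by subdirectory name, which is why this list is in alphabetical order. Assuming the emotion folders are named like the labels above, you can verify the mapping with:

#check the label-to-index mapping used by the generator
print(train_generator.class_indices)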

We will take a random sample image from the train dataset.

#import the random module
import random

#take one batch of images and labels from the train generator
img, label = next(train_generator)
#pick a random index within the batch
i = random.randint(0, (img.shape[0])-1)
#get image
image = img[i]
#get image label
labl = class_labels[label[i].argmax()]
#plotting
plt.imshow(image[:,:,0], cmap='gray')
plt.title(labl)
plt.show()

Now, create the model. The input shape is (48, 48, 1); four Conv + MaxPooling blocks reduce it step by step to 3x3 feature maps before the flatten and fully connected layers.

The model details are:

model = Sequential()
#1st CNN layer
model.add(Conv2D(64,(3,3),padding = 'same',input_shape = (48,48,1)))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Dropout(0.25))

#2nd CNN layer
model.add(Conv2D(128,(5,5),padding = 'same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Dropout(0.25))

#3rd CNN layer
model.add(Conv2D(512,(3,3),padding = 'same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Dropout(0.25))

#4th CNN layer
model.add(Conv2D(512,(3,3), padding='same'))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())

#Fully connected 1st layer
model.add(Dense(256))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))


# Fully connected layer 2nd layer
model.add(Dense(512))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.25))


model.add(Dense(7, activation='softmax'))
#get model summary
model.summary()

Determine the model hyperparameters: the metric is “accuracy”, the optimizer is Adam with a learning rate of 0.001, and the loss function is categorical crossentropy.

#compile the model
model.compile(optimizer=Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])

Get the number of train and test images:

train_path = "data/train/"
test_path = "data/test/"

num_train_imgs = 0
for root, dirs, files in os.walk(train_path):
    num_train_imgs += len(files)

num_test_imgs = 0
for root, dirs, files in os.walk(test_path):
    num_test_imgs += len(files)

print("number of train images: ", num_train_imgs)
print("number of test images: ", num_test_imgs)

>> number of train images: 28710

Define the training callbacks: early stopping, learning rate reduction, and a model checkpoint.

# early stopping mechanism
early_stopping = EarlyStopping(monitor='val_loss',
                               min_delta=0,
                               patience=3,
                               verbose=1,
                               restore_best_weights=True)

# reduce learning rate mechanism
reduce_learningrate = ReduceLROnPlateau(monitor='val_loss',
                                        factor=0.2,
                                        patience=3,
                                        verbose=1,
                                        min_delta=0.0001)

# model checkpoint mechanism (the checkpoint file name is arbitrary)
checkpoint = ModelCheckpoint('emotion_model_checkpoint.h5',
                             monitor='val_accuracy',
                             save_best_only=True,
                             verbose=1)

# gather the callbacks in a list
callbacks_list = [early_stopping, checkpoint, reduce_learningrate]

Okay, let's train the model.

#number of epochs
epochs = 50

# fit the model
history = model.fit(train_generator,
                    steps_per_epoch=num_train_imgs//train_generator.batch_size,
                    epochs=epochs,
                    validation_data=validation_generator,
                    validation_steps=num_test_imgs//validation_generator.batch_size,
                    callbacks=callbacks_list)

During training, the early stopping mechanism stopped the run before all 50 epochs were completed.

After that, check the training curves:

#plot theme
plt.style.use('dark_background')
#figure size
plt.figure(figsize=(20,10))
#make subplot
plt.subplot(1, 2, 1)
plt.suptitle('Optimizer : Adam', fontsize=10)
plt.ylabel('Loss', fontsize=16)
#get training loss
plt.plot(history.history['loss'], label='Training Loss')
#get validation loss
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.legend(loc='upper right')

plt.subplot(1, 2, 2)
plt.ylabel('Accuracy', fontsize=16)
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.legend(loc='lower right')
plt.show()

Okay, after all these stages, save the model:

model.save('emotion_detection_model.h5')

Test The Model

First of all, load the model using Keras' load_model function:

from keras.models import load_model

and use it to load the saved model:

#load the model
my_model = load_model('emotion_detection_model.h5', compile=False)

Take one batch of test images and run predictions on it:

#generate a batch of test images
test_img, test_lbl = next(validation_generator)
#predict this batch of test images
predictions = my_model.predict(test_img)

Use np.argmax to convert the one-hot predictions and labels into class indices:

predictions = np.argmax(predictions, axis=1)
test_labels = np.argmax(test_lbl, axis=1)

Let's check the accuracy of the model on this batch using sklearn's accuracy_score:

from sklearn import metrics
print ("Accuracy = ", metrics.accuracy_score(test_labels, predictions))

>> Accuracy = 0.65625

The accuracy on this batch is about 66%. That is not bad, but keep in mind it is computed on a single batch of 32 images rather than the whole test set.
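For a more representative number, you could evaluate over the whole validation generator instead of a single batch. A minimal sketch (the model was loaded with compile=False, so it has to be compiled again first):

#recompile the loaded model so that evaluate() can report accuracy
my_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
val_loss, val_acc = my_model.evaluate(validation_generator, verbose=0)
print("Validation accuracy: ", val_acc)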

Now, check the model predictions

class_labels=['Angry','Disgust', 'Fear', 'Happy','Neutral','Sad','Surprise']
#Check results on a few select images
n=random.randint(0, test_img.shape[0] - 1)
image = test_img[n]
orig_labl = class_labels[test_labels[n]]
pred_labl = class_labels[predictions[n]]
plt.imshow(image[:,:,0], cmap='gray')
plt.title("Original label is:"+orig_labl+" Predicted is: "+ pred_labl)
plt.show()

As you can see, the original image is labeled angry and the model’s prediction correctly predicted angry.
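If you want to try the saved model on an arbitrary external image, a minimal sketch could look like this ('some_face.jpg' is a hypothetical path; the image should contain a roughly centered face):

from keras.preprocessing.image import load_img, img_to_array

#load an external image as 48x48 grayscale (hypothetical file name)
img = load_img('some_face.jpg', color_mode='grayscale', target_size=(48, 48))
x = img_to_array(img) / 255.0  #same rescaling as during training
x = np.expand_dims(x, axis=0)  #shape (1, 48, 48, 1)
pred = my_model.predict(x)
print("Predicted emotion: ", class_labels[np.argmax(pred)])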

In summary, we have built a deep learning model that recognizes human facial expressions. It is of course not the best possible predictor; you can improve it by tuning the architecture, the hyperparameters, and the data augmentation.
