Emotion Recognition Using Keras

CHAUDHARI AMOL MOHAMMAD
10 min read · Nov 20, 2019

In this article we will see how to implement a convolutional neural network (CNN) capable of predicting emotions and facial expressions not only in images but also in video streams. We will do a deep analysis of each snippet of code, then apply our pre-trained CNN to recognize dominant facial expressions in real time.

(Results I got when I applied my pre-trained CNN to a couple of images):

BHAI…..is ANGRY!!!😡😡

Why restrict ourselves to detecting the emotion of a single dominant face in an image, when we can detect the facial expressions of multiple faces? 😎

Sorry, Aladdin… 😜

Without even going through this article you can do emotion detection in real time on your system. The only thing you need to do is go to my GitHub repository and follow the very simple steps I mention there. The link to my GitHub repository is given at the end.

Note:

1. In this article I will explain some snippets of code from my repository that I think will be very useful to those of you who want to build your own CNN model in Keras and improve on existing accuracy. I will also share the results of the experiments I ran before achieving my best accuracy.

2. To get the most benefit from this article, I suggest you go through my repository first and then come back here.

Let’s get started:

We will approach this step by step:

  1. Build the dataset.
  2. Build our CNN model.
  3. Train the model.
  4. Evaluate performance in real time.

1. Building the Dataset:

We will train our model on the FER2013 facial expression dataset, which you can download from the Kaggle challenge Challenges in Representation Learning: Facial Expression Recognition Challenge.

After downloading the dataset, you'll find a .csv file, fer2013.csv, with three columns:

  1. emotion.
  2. pixels (a flattened list of pixel intensities).
  3. usage (whether the image belongs to Training, PrivateTest (validation), or PublicTest (testing)).

The training set consists of 28,709 examples. The public test set used for the leaderboard consists of 3,589 examples. The final test set, which was used to determine the winner of the Kaggle competition, consists of another 3,589 examples.

Our goal here is to take this .csv file and convert it to HDF5 format so that we can train our CNN model on top of it more easily.

HDF5 is a binary data format: an open source file format for storing huge amounts of numerical data on disk. It gives us very easy access to the data and allows us to do computation on the rows of the datasets. We can store huge amounts of data in our HDF5 dataset and manipulate it in a NumPy-like fashion.
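For example, here is a minimal sketch of reading from an HDF5 file with h5py once we have built it (the file and dataset names "train.hdf5", "images", and "labels" are the ones we will use below):

```python
import h5py

# open an HDF5 file read-only and slice it like a NumPy array
with h5py.File("train.hdf5", "r") as db:
    print(db["images"].shape)       # e.g., (28709, 48, 48)
    images = db["images"][0:32]     # reads only these rows from disk
    labels = db["labels"][0:32]
```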

In the first part of build_dataset.py we read the .csv file and initialize the lists of data and labels for the training, validation, and testing sets.

It is a 7-class problem (angry, disgust, fear, happy, sad, surprise, and neutral). After doing exploratory data analysis (the code is in my GitHub repository), I realized that the data is highly imbalanced: we have very few data points belonging to the "disgust" class, so we merge the "angry" and "disgust" classes into one, leaving six classes.

At this point each image is just a string of integers. We need to take this string, split it into a list, convert it to an unsigned 8-bit integer data type, and reshape it into a 48×48 grayscale image. Then we check the usage column and assign the image and label to the respective training, validation, or testing list.
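A minimal sketch of that loop (the column layout follows fer2013.csv; the variable names are my own):

```python
import numpy as np

trainImages, trainLabels = [], []
valImages, valLabels = [], []
testImages, testLabels = [], []

# skip the header row, then parse each line of fer2013.csv
for row in open("fer2013.csv").read().strip().split("\n")[1:]:
    (label, pixels, usage) = row.split(",")
    label = int(label)

    # merge the under-represented "disgust" class (1) into "angry" (0),
    # then shift the remaining labels down so they stay contiguous
    if label == 1:
        label = 0
    elif label > 1:
        label -= 1

    # the pixels column is a space-separated string of 48*48 integers
    image = np.array(pixels.split(" "), dtype="uint8")
    image = image.reshape((48, 48))

    if usage == "Training":
        trainImages.append(image)
        trainLabels.append(label)
    elif usage == "PrivateTest":
        valImages.append(image)
        valLabels.append(label)
    else:  # "PublicTest"
        testImages.append(image)
        testLabels.append(label)
```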

Next we create a list pairing the training, validation, and testing images with their labels and output HDF5 paths. We then loop over this list and call the add method of the HDF5DatasetWriter class, which creates the three HDF5 files for you.
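A sketch of that loop (the output paths are illustrative):

```python
# pair each split with its labels and an output HDF5 path
datasets = [
    (trainImages, trainLabels, "hdf5/train.hdf5"),
    (valImages, valLabels, "hdf5/val.hdf5"),
    (testImages, testLabels, "hdf5/test.hdf5"),
]

for (images, labels, outputPath) in datasets:
    # one writer, and therefore one HDF5 file, per split
    writer = HDF5DatasetWriter((len(images), 48, 48), outputPath)
    for (image, label) in zip(images, labels):
        writer.add([image], [label])
    writer.close()
```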

Wait… wait… wait… what is this class HDF5DatasetWriter? 😒

It is a Python class that takes raw images as input and writes them to disk in HDF5 format. Its constructor accepts two main parameters:

dims: the dimensions (shape) of the data we will be storing in the dataset.

outputPath: the path where our output HDF5 file will be stored on disk.

The constructor first checks whether outputPath already exists. If it does, we raise an error to the end user (as we don't want to overwrite an existing database).

It then opens the HDF5 file for writing using the supplied outputPath and creates two datasets: one with the dataKey name and the supplied dims, where we will store the raw images, and a second one to store the (integer) class label for each record in the dataset.

Finally, we initialize a buffer; once we reach bufSize entries, we flush the buffer to the HDF5 datasets.

After this we add two methods to the same class, named add and flush:

The add method requires two parameters: the rows we are adding to the dataset, along with their corresponding class labels. Both the rows and the labels are appended to their respective buffers. If the buffer fills up, we call the flush method to write the buffers to file and reset them.

The very important point in the flush method is that we need to keep track of the index of the next available row where we can store data (without overwriting existing data); that is exactly what its first few lines do.
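A minimal sketch of the whole class, assuming h5py is installed (the attribute names follow the description above; close is a small helper I add so the last partially filled buffer is also written out):

```python
import os
import h5py

class HDF5DatasetWriter:
    def __init__(self, dims, outputPath, dataKey="images", bufSize=1000):
        # refuse to overwrite an existing database
        if os.path.exists(outputPath):
            raise ValueError("The supplied outputPath already exists: "
                             + outputPath)

        # open the HDF5 file and create two datasets: one for the raw
        # images and one for the integer class labels
        self.db = h5py.File(outputPath, "w")
        self.data = self.db.create_dataset(dataKey, dims, dtype="uint8")
        self.labels = self.db.create_dataset("labels", (dims[0],),
                                             dtype="int")

        # in-memory buffer, flushed to disk every bufSize entries
        self.bufSize = bufSize
        self.buffer = {"data": [], "labels": []}
        self.idx = 0  # index of the next available row in the dataset

    def add(self, rows, labels):
        self.buffer["data"].extend(rows)
        self.buffer["labels"].extend(labels)
        if len(self.buffer["data"]) >= self.bufSize:
            self.flush()

    def flush(self):
        # write the buffers to disk starting at the next free row,
        # advance the index, then reset the buffers
        i = self.idx + len(self.buffer["data"])
        self.data[self.idx:i] = self.buffer["data"]
        self.labels[self.idx:i] = self.buffer["labels"]
        self.idx = i
        self.buffer = {"data": [], "labels": []}

    def close(self):
        # flush anything still sitting in the buffer, then close the file
        if len(self.buffer["data"]) > 0:
            self.flush()
        self.db.close()
```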

If you run build_dataset.py, it will create the three HDF5 files (training, validation, and testing) on your system.

2. Our Model Architecture:

The network we are going to implement is inspired by the family of VGG networks.

VGGNet was first introduced by Simonyan and Zisserman in their 2014 paper, Very Deep Convolutional Networks for Large-Scale Image Recognition. Before 2014, most CNN architectures followed a common trend: large kernels in the early layers that progressively shrink deeper in the network. VGGNet is unique in that it uses 3×3 kernels throughout the entire architecture. The use of these small kernels helps VGGNet generalize to classification problems outside what the network was originally trained on. Any time you see a network architecture that consists entirely of 3×3 filters, you can rest assured it was inspired by VGGNet.

Our network architecture:

Note: After every CONV layer, we will apply an activation followed by batch normalization.
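The exact architecture is in my repository; here is a minimal Keras sketch of one VGG-style block following that CONV, activation, batch normalization rule (the filter counts and the elided middle blocks are illustrative, not my exact architecture):

```python
from keras.models import Sequential
from keras.layers import Conv2D, Activation, BatchNormalization
from keras.layers import MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()

# first VGG-style block: (CONV => RELU => BN) x 2, then POOL
model.add(Conv2D(32, (3, 3), padding="same", input_shape=(48, 48, 1)))
model.add(Activation("relu"))
model.add(BatchNormalization())
model.add(Conv2D(32, (3, 3), padding="same"))
model.add(Activation("relu"))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

# ... more blocks with increasing filter counts (64, 128, ...) ...

# classifier head: six classes after merging "angry" and "disgust"
model.add(Flatten())
model.add(Dense(64))
model.add(Activation("relu"))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(6))
model.add(Activation("softmax"))
```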

3. Train the Model:

We will use the Keras fit_generator function to train our model. This function accepts batches of data, performs backpropagation, and updates the weights in our model.

Keras uses the following process when training a model with .fit_generator:

  1. Keras calls the generator function supplied to .fit_generator (in this case, we generate batches of images and class labels using the method named generator, which is defined in the class HDF5DatasetGenerator).
  2. The generator function yields a batch of size BS to the .fit_generator function.
  3. The .fit_generator function accepts the batch of data, performs backpropagation, and updates the weights in our model.
  4. This process is repeated until we have reached the desired number of epochs.

Let's look at the contents of the class HDF5DatasetGenerator. In it we define the method generator, which is responsible for yielding batches of images and class labels to the Keras .fit_generator function when training a network. Its constructor takes the following arguments:

Too… many… arguments… 😢😢😢

dbPath: the path to the HDF5 dataset that stores our images and corresponding class labels. (We converted our training, validation, and testing images into HDF5 format and stored them on disk; dbPath points to one of those files.)

batchSize: the size of the mini-batches to yield when training our network.

preprocessors: a list of image preprocessors to apply; in simple words, it takes care of the Keras image_data_format (putting the channel axis in the right place).

aug: defaulting to None, but we can supply a Keras ImageDataGenerator to apply data augmentation directly inside our HDF5DatasetGenerator.
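A minimal sketch of the class (the dataset names "images" and "labels" match the writer above; the channel-axis fallback stands in for the real preprocessors):

```python
import h5py
import numpy as np
from keras.utils import np_utils

class HDF5DatasetGenerator:
    def __init__(self, dbPath, batchSize, preprocessors=None,
                 aug=None, classes=6):
        self.batchSize = batchSize
        self.preprocessors = preprocessors
        self.aug = aug
        self.classes = classes

        # open the HDF5 database for reading and count the images
        self.db = h5py.File(dbPath, "r")
        self.numImages = self.db["labels"].shape[0]

    def generator(self, passes=np.inf):
        epochs = 0
        # loop indefinitely; fit_generator stops when it has run the
        # requested number of epochs
        while epochs < passes:
            for i in np.arange(0, self.numImages, self.batchSize):
                images = self.db["images"][i:i + self.batchSize]
                labels = np_utils.to_categorical(
                    self.db["labels"][i:i + self.batchSize], self.classes)

                if self.preprocessors is not None:
                    # apply each preprocessor to each image in turn
                    images = np.array(
                        [self.preprocess(image) for image in images])
                else:
                    # add the channel axis for 48x48 grayscale images
                    images = np.expand_dims(images, axis=-1)

                if self.aug is not None:
                    # data augmentation (this also applies the rescale)
                    (images, labels) = next(self.aug.flow(
                        images, labels, batch_size=self.batchSize))

                yield (images, labels)
            epochs += 1

    def preprocess(self, image):
        for p in self.preprocessors:
            image = p.preprocess(image)
        return image

    def close(self):
        self.db.close()
```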

Wait… wait… wait… the aug argument is a little bit tricky, so let's go into a bit more detail.

You may have noticed that we stored these images as raw, unnormalized grayscale images, meaning that pixel values lie in the range [0, 255]. However, it is common practice to perform mean normalization or scaling. Luckily, the ImageDataGenerator class provided by Keras can automatically perform this scaling for us: we simply provide a rescale value of 1/255, every image is multiplied by this ratio, and the pixels are scaled down to [0, 1].
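A sketch of how the pieces fit together (BS, the file paths, and the specific augmentation parameters are illustrative):

```python
from keras.preprocessing.image import ImageDataGenerator

BS = 64  # mini-batch size

# scale pixels to [0, 1]; additionally augment the training data
trainAug = ImageDataGenerator(rotation_range=10, zoom_range=0.1,
                              horizontal_flip=True, rescale=1 / 255.0)
valAug = ImageDataGenerator(rescale=1 / 255.0)

trainGen = HDF5DatasetGenerator("hdf5/train.hdf5", BS, aug=trainAug)
valGen = HDF5DatasetGenerator("hdf5/val.hdf5", BS, aug=valAug)

model.fit_generator(
    trainGen.generator(),
    steps_per_epoch=trainGen.numImages // BS,
    validation_data=valGen.generator(),
    validation_steps=valGen.numImages // BS,
    epochs=20, verbose=1)
```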

In a nutshell, what the two snippets above do is this: first, with the help of the Keras ImageDataGenerator, we scale down our images; then we perform data augmentation on those images inside the generator method of HDF5DatasetGenerator, which yields the images in batches to .fit_generator.

When you open the train_model.py file you will see the classes and methods mentioned above. I have tried my level best to explain what .fit_generator does and how I perform data augmentation; to understand it in more detail, I strongly suggest the blog by Adrian Rosebrock (the link is at the end).

4. Evaluate Model Performance:

Experiment zero (my first model's parameters):

I decreased the learning rate after every 20 epochs to see whether I could increase accuracy or decrease loss. The effect of decreasing the learning rate from 10^-2 to 10^-3 was practically unnoticeable; with order-of-magnitude drops like these, we would expect to see at least some rise in accuracy and a corresponding drop in loss.

In the last 40 epochs the model did not learn anything; given that SGD led to stagnation in learning when dropping the learning rate, I switched to Adam.

I trained my model with the Adam optimizer and ReLU activations, using a learning rate of 1e-3 for the first 20 epochs, and stopped training at this checkpoint:
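A sketch of that setup (the loss matches the six-class softmax head; everything else is standard Keras):

```python
from keras.optimizers import Adam

# Adam with an initial learning rate of 1e-3
opt = Adam(lr=1e-3)
model.compile(loss="categorical_crossentropy", optimizer=opt,
              metrics=["accuracy"])
```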

As you can see, both losses are decreasing and both accuracies are increasing, reaching almost 60%.

Note: When training your own neural networks, pay attention to the loss and accuracy curves for both the training data and the validation data.

During the first few epochs, it may seem that a neural network is tracking well, perhaps underfitting slightly — but this pattern can change quickly, and you might start to see a divergence in training and validation loss.

Therefore I created a TrainingMonitor callback (the code is in my repository) that is called at the end of every epoch when training a network with Keras (in my case, I plot after every 5 epochs). This monitor serializes the loss and accuracy for both the training and validation sets to disk, followed by constructing a plot of the data.
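A stripped-down sketch of such a callback (figPath is where the plot is saved; the "acc"/"val_acc" metric keys follow the Keras versions of 2019, and serializing the full history to disk is omitted here):

```python
import matplotlib
matplotlib.use("Agg")  # render plots without a display
import matplotlib.pyplot as plt
import numpy as np
from keras.callbacks import BaseLogger

class TrainingMonitor(BaseLogger):
    def __init__(self, figPath):
        super(TrainingMonitor, self).__init__()
        self.figPath = figPath

    def on_train_begin(self, logs={}):
        self.H = {}  # history of losses and accuracies

    def on_epoch_end(self, epoch, logs={}):
        # append this epoch's metrics to the history
        for (k, v) in logs.items():
            self.H.setdefault(k, []).append(float(v))

        # plot training/validation loss and accuracy so far
        if len(self.H["loss"]) > 1:
            N = np.arange(0, len(self.H["loss"]))
            plt.figure()
            plt.plot(N, self.H["loss"], label="train_loss")
            plt.plot(N, self.H["val_loss"], label="val_loss")
            plt.plot(N, self.H["acc"], label="train_acc")
            plt.plot(N, self.H["val_acc"], label="val_acc")
            plt.title("Training Loss and Accuracy [Epoch {}]".format(
                len(self.H["loss"])))
            plt.xlabel("Epoch #")
            plt.ylabel("Loss/Accuracy")
            plt.legend()
            plt.savefig(self.figPath)
            plt.close()
```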

Now let's train the model for another 20 epochs with the same learning rate of 1e-3.

Rules of thumb while training any deep learning model:

1. Reduce the training loss as much as possible.

2. Ensure the gap between the training and validation loss/accuracy stays reasonably small.

Considering the above two points, we are on the right track. If you follow the steps I mention in my repository to check performance on the test data, you will find the test accuracy is equal to 64.43%.

Let's see whether we can increase model performance further…

Now let's decrease the learning rate to 1e-5 and train the model for another 25 epochs.
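One way to do this, sketched under the assumption that the previous run was checkpointed to disk (the checkpoint path is illustrative), is to reload the model and lower the optimizer's learning rate before resuming training:

```python
import keras.backend as K
from keras.models import load_model

# reload the model from the previous checkpoint
model = load_model("checkpoints/epoch_40.hdf5")

# lower the learning rate, then resume training with fit_generator
K.set_value(model.optimizer.lr, 1e-5)
```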

After 65 epochs I got a test accuracy of almost 65.93%. That accuracy is enough to secure a top-5 spot on the Kaggle leaderboard.

Again, let's try to improve the accuracy…

I decreased the learning rate to 1e-9 and trained the model for another 25 epochs, getting an accuracy of almost 66.13%. That was my best. After that I made several changes, such as swapping the activation function and the optimizer and playing with the learning rate, but the results were almost the same.

I have clearly explained the directory structure in my repository, along with the order in which you should run the Python scripts.

Finally, I would like to mention that I learned the content of this case study, and most of the code, from Adrian Rosebrock's PyImageSearch blog and Guru course. Its in-depth explanations helped me not only solve this case study but also achieve results good enough to claim a top-5 spot on the Kaggle leaderboard.

My GitHub repository link: https://github.com/Amol2709/EMOTION-RECOGITION-USING-KERAS

References:

Adrian Rosebrock, PyImageSearch: https://www.pyimagesearch.com/
