This is a beginner-friendly project that takes four different approaches to the same problem, to show how the model becomes deeper and more effective with each approach. I have used the fer2013 dataset to recognise the facial expression in an image, like the one shown. You will see how these models are structured differently and how that affects the results.
Here’s the list of models covered, and you can find the links to those notebooks right beside their names:
For this project, I have used the ‘fer2013’ dataset. You can find it available here. This dataset consists of 35,887 data entries in CSV file format. You can view the data after converting it into a Pandas DataFrame as shown below:
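As a rough sketch of the loading step, the snippet below reads a CSV into a Pandas DataFrame. It assumes the file has the three columns `emotion`, `pixels` and `Usage`; the file name `fer2013.csv` in the comment and the tiny in-memory sample (4 pixels per row instead of 2,304) are stand-ins for illustration, not the real data.

```python
import io

import pandas as pd

# A tiny stand-in for the real CSV (assumption: same three columns).
# Real rows hold 48 * 48 = 2304 pixel values; 4 are used here for readability.
sample_csv = io.StringIO(
    "emotion,pixels,Usage\n"
    "0,70 80 82 72,Training\n"
    "3,151 150 147 155,PublicTest\n"
    "6,231 212 156 164,PrivateTest\n"
)

# For the real dataset this would be something like pd.read_csv("fer2013.csv"),
# where the path is an assumption about where you saved the file.
df = pd.read_csv(sample_csv)

print(df.head())
print(df["Usage"].value_counts())
```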
The data is divided into three categories by USAGE (Training, Validation and Test sets) and into seven categories by EMOTION, i.e., our LABELS (‘angry’, ‘disgust’, ‘fear’, ‘happy’, ‘sad’, ‘surprise’, ‘neutral’). I have created three separate data frames to match, i.e., train_df, valid_df, and test_df. The pixel values can be converted into an image by applying the required transformations. I wrote a function to do this; you can find it in the notebook.
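Here is one way the split and the pixel-to-image conversion could look. This is a sketch, not the notebook's function: the helper names `pixels_to_image` and `split_by_usage` are hypothetical, and it assumes the `Usage` column uses the values `Training`, `PublicTest` and `PrivateTest`, with `PublicTest` treated as the validation set. The demo rows are random pixels, not real faces.

```python
import numpy as np
import pandas as pd

# Label order follows the list in the text (index = value in the emotion column).
LABELS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]


def pixels_to_image(pixel_string, size=48):
    """Turn a space-separated pixel string into a (size, size) uint8 array."""
    values = np.array([int(p) for p in pixel_string.split()], dtype=np.uint8)
    return values.reshape(size, size)


def split_by_usage(df):
    """Return (train_df, valid_df, test_df) based on the Usage column.

    Assumption: PublicTest is used for validation, PrivateTest for testing.
    """
    train_df = df[df["Usage"] == "Training"].reset_index(drop=True)
    valid_df = df[df["Usage"] == "PublicTest"].reset_index(drop=True)
    test_df = df[df["Usage"] == "PrivateTest"].reset_index(drop=True)
    return train_df, valid_df, test_df


# Demo on synthetic rows (random pixels, for illustration only).
rng = np.random.default_rng(0)


def fake_pixels():
    return " ".join(str(v) for v in rng.integers(0, 256, size=48 * 48))


demo_df = pd.DataFrame({
    "emotion": [0, 3, 6, 4],
    "pixels": [fake_pixels() for _ in range(4)],
    "Usage": ["Training", "Training", "PublicTest", "PrivateTest"],
})

train_df, valid_df, test_df = split_by_usage(demo_df)
image = pixels_to_image(train_df.loc[0, "pixels"])
print(image.shape, LABELS[train_df.loc[0, "emotion"]])
```

The resulting array can be passed to `matplotlib.pyplot.imshow(image, cmap="gray")` to view it.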
Let’s further explore our dataset:
Because our dataset comes as a data frame and not as a built-in PyTorch dataset, I have created a class ‘expressions’ that takes a data frame as input and returns each image as a Tensor together with its label. Each item in our dataset now consists of two values:
- Tensor (containing 48x48 grayscale images) and
- image label
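A custom dataset like the one described above can be sketched by subclassing `torch.utils.data.Dataset`. The class name `ExpressionDataset`, the scaling to the 0 to 1 range, and the column names `pixels` and `emotion` are assumptions made for this example; the class in the notebook may differ.

```python
import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset


class ExpressionDataset(Dataset):
    """Wraps a data frame with 'pixels' and 'emotion' columns (assumed names)."""

    def __init__(self, df, transform=None):
        self.df = df.reset_index(drop=True)
        self.transform = transform

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        # Parse the pixel string and scale values from 0..255 to 0..1.
        pixels = np.array([int(p) for p in row["pixels"].split()], dtype=np.float32)
        image = torch.from_numpy(pixels).reshape(1, 48, 48) / 255.0
        if self.transform is not None:
            image = self.transform(image)
        return image, int(row["emotion"])


# Demo with two synthetic rows: one mid-gray image and one black image.
demo_df = pd.DataFrame({
    "emotion": [3, 4],
    "pixels": [" ".join(["128"] * 48 * 48), " ".join(["0"] * 48 * 48)],
})

dataset = ExpressionDataset(demo_df)
image, label = dataset[0]
print(len(dataset), image.shape, label)
```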
We need to import transforms from torchvision to apply transformations to our dataset. These transforms are important for image processing, and the transformed images can then be used for further analysis. In any case, PyTorch does not work with images directly, so we convert the images into Tensors. TorchVision contains helper classes and utilities for working with image data.
Our expression datasets are of the type:
“from torch.utils.data import DataLoader”
DataLoaders split the data into batches of a predefined size during training. This is very important when we are dealing with millions of samples, since they cannot all be processed at once. We specify the batch size up front; here I set batch_size = 400, so a batch of 400 images is loaded into the model at a time.
Let's have a look at a batch of our data:
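To make the batching concrete, here is a small sketch that uses random tensors in place of the real images. Only `batch_size = 400` comes from the text; the dataset size of 1,000 and the use of `TensorDataset` are assumptions for the demo.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in data: 1000 random 48x48 grayscale "images" with labels 0..6.
images = torch.rand(1000, 1, 48, 48)
labels = torch.randint(0, 7, (1000,))
train_ds = TensorDataset(images, labels)

batch_size = 400
train_dl = DataLoader(train_ds, batch_size, shuffle=True)

# 1000 samples with a batch size of 400 give two full batches and one of 200.
batch_shapes = [tuple(xb.shape) for xb, yb in train_dl]
print(batch_shapes)
```

With `shuffle=True`, the samples are drawn in a different order every epoch, which is usually what you want for the training set but not for validation or test data.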
I think this looks amazing, especially considering how much time it took for me to get this output right! Phew!
Training on GPUs:
A GPU is a specialized single-chip processor with dedicated memory, used for intensive graphical and mathematical computation, which frees up the CPU. GPUs help keep training time manageable, because training time grows as the amount of data increases. In PyTorch, we check the availability of a GPU using torch.cuda.is_available(). To use a GPU, we have to move both our model and our data into GPU memory. For this, I have created a function and a class.
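A common pattern for this is sketched below: one helper that picks the device, one function that moves tensors or models, and a thin wrapper class around a DataLoader. The names `get_default_device`, `to_device` and `DeviceDataLoader` are assumptions for this example and may not match the notebook exactly. The code falls back to the CPU when no GPU is present.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset


def get_default_device():
    """Pick the GPU if one is available, otherwise the CPU."""
    return torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")


def to_device(data, device):
    """Move a tensor, a model, or a list/tuple of tensors to the given device."""
    if isinstance(data, (list, tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)


class DeviceDataLoader:
    """Wraps a DataLoader and moves each batch to the device as it is yielded."""

    def __init__(self, dl, device):
        self.dl = dl
        self.device = device

    def __iter__(self):
        for batch in self.dl:
            yield to_device(batch, self.device)

    def __len__(self):
        return len(self.dl)


# Demo with random stand-in data.
device = get_default_device()
ds = TensorDataset(torch.rand(10, 1, 48, 48), torch.randint(0, 7, (10,)))
dl = DeviceDataLoader(DataLoader(ds, batch_size=4), device)
model = to_device(torch.nn.Linear(48 * 48, 7), device)

for xb, yb in dl:
    print(xb.device, yb.device)
```

Moving batches lazily, one at a time, keeps GPU memory usage low compared with copying the whole dataset up front.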
Creating the Logistic Regression Model:
Logistic Regression is a statistical and ML technique used to model the probability of a certain class or event, i.e., to classify the records of a dataset based on the values of the input fields. In its basic form, we use one or more independent variables to predict a binary outcome, but it can be extended to multiclass classification as well.
We have used this as our starting/base model, and we will advance toward deeper models. You can find out more about Logistic Regression in this notebook by Aakash NS, here. That notebook covers the MNIST dataset.
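In PyTorch, a multiclass logistic regression model boils down to a single linear layer whose outputs are turned into probabilities by softmax. The sketch below assumes flattened 48x48 inputs and 7 classes, as described earlier; the class name `LogisticRegressionModel` is made up for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LogisticRegressionModel(nn.Module):
    """A single linear layer mapping 48*48 pixels to 7 class scores."""

    def __init__(self, in_features=48 * 48, num_classes=7):
        super().__init__()
        self.linear = nn.Linear(in_features, num_classes)

    def forward(self, xb):
        xb = xb.reshape(xb.size(0), -1)  # flatten (N, 1, 48, 48) -> (N, 2304)
        return self.linear(xb)           # raw scores (logits), one per class


model = LogisticRegressionModel()
batch = torch.rand(4, 1, 48, 48)         # random stand-in for a batch of images
logits = model(batch)
probs = F.softmax(logits, dim=1)         # each row now sums to 1

print(logits.shape, probs.sum(dim=1))
```

The model returns logits rather than probabilities because PyTorch's cross-entropy loss applies softmax internally.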
Training: Our objective is to adjust the parameters of the model so that its predictions best estimate the labels of the samples in the dataset. During training, we look at the cost function (or loss function) and how it depends on the parameters θ, so we first need to formulate the cost function.
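Accuracy is a good evaluation metric for classification, but it's not a good loss function. Here’s why: it is computed from the predicted class alone, so it is not differentiable with respect to the model's parameters, and it ignores how confident the model was in each prediction, so small improvements go unnoticed.
Hence we use Cross-Entropy as our loss function. It is continuous and differentiable, which gives us good feedback for incremental improvements.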
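The difference between the two measures is easy to see on a toy example. In the sketch below, the second set of scores is slightly better than the first (the correct classes get higher scores), yet accuracy stays the same while cross-entropy drops. The logits are made-up numbers chosen for illustration, and the `accuracy` helper is an assumed implementation.

```python
import torch
import torch.nn.functional as F


def accuracy(outputs, labels):
    """Fraction of samples whose highest-scoring class matches the label."""
    preds = torch.argmax(outputs, dim=1)
    return (preds == labels).float().mean().item()


labels = torch.tensor([0, 1])

# Two samples, three classes. The second sample is misclassified in both cases.
logits_a = torch.tensor([[2.0, 1.0, 0.0], [0.0, 1.0, 2.0]])
logits_b = torch.tensor([[2.5, 1.0, 0.0], [0.0, 1.5, 2.0]])  # slightly better

acc_a, acc_b = accuracy(logits_a, labels), accuracy(logits_b, labels)
loss_a = F.cross_entropy(logits_a, labels).item()
loss_b = F.cross_entropy(logits_b, labels).item()

print("accuracy:", acc_a, acc_b)   # unchanged
print("loss:", loss_a, loss_b)     # the second value is lower
```

Because the loss moves even when accuracy does not, gradient descent always has a direction to follow.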
The fit function below will train our model on the basis of the mentioned hyperparameters.
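For readers without the notebook at hand, a generic training loop of this kind might look as follows. This is a sketch under assumptions, not the notebook's code: the signatures of `fit` and `evaluate`, the SGD optimizer, the learning rate, and the random stand-in data are all chosen for the demo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset


@torch.no_grad()
def evaluate(model, val_loader):
    """Return the average of per-batch losses and the overall accuracy."""
    model.eval()
    losses, correct, total = [], 0, 0
    for xb, yb in val_loader:
        out = model(xb)
        losses.append(F.cross_entropy(out, yb).item())
        correct += (out.argmax(dim=1) == yb).sum().item()
        total += yb.size(0)
    return {"val_loss": sum(losses) / len(losses), "val_acc": correct / total}


def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    """Train for the given number of epochs and record validation results."""
    optimizer = opt_func(model.parameters(), lr)
    history = []
    for epoch in range(epochs):
        model.train()
        for xb, yb in train_loader:
            loss = F.cross_entropy(model(xb), yb)
            loss.backward()        # compute gradients
            optimizer.step()       # update the parameters
            optimizer.zero_grad()  # reset gradients for the next batch
        result = evaluate(model, val_loader)
        print(f"Epoch {epoch}: {result}")
        history.append(result)
    return history


# Demo on random stand-in data (not the real images).
torch.manual_seed(0)
train_dl = DataLoader(
    TensorDataset(torch.rand(64, 1, 48, 48), torch.randint(0, 7, (64,))),
    batch_size=16, shuffle=True)
val_dl = DataLoader(
    TensorDataset(torch.rand(32, 1, 48, 48), torch.randint(0, 7, (32,))),
    batch_size=16)
model = nn.Sequential(nn.Flatten(), nn.Linear(48 * 48, 7))

history = fit(2, 0.01, model, train_dl, val_dl)
```

The returned `history` list is what you would plot to see how loss and accuracy change over the epochs.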
After 25 epochs, our model’s final accuracy comes out to be approximately 31%.
When you run the notebook, you will see that the accuracy does not go beyond this limit. I have plotted a graph of the accuracy below:
Predicting some Outputs:
I made a predict function to get our model's predictions for images from the test dataset.
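A prediction helper along these lines could be sketched as below. The name `predict_image` and its signature are assumptions, and the model here is untrained and fed a random image, so the printed label is meaningless; the point is only the mechanics of going from a single image to a label name.

```python
import torch
import torch.nn as nn

LABELS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]


def predict_image(img, model, labels=LABELS):
    """Return the label name the model assigns to one (1, 48, 48) image."""
    model.eval()
    with torch.no_grad():
        out = model(img.unsqueeze(0))   # add a batch dimension: (1, 1, 48, 48)
    return labels[out.argmax(dim=1).item()]


# Demo: an untrained stand-in model and a random image.
model = nn.Sequential(nn.Flatten(), nn.Linear(48 * 48, 7))
img = torch.rand(1, 48, 48)
predicted = predict_image(img, model)
print("Predicted:", predicted)
```

If the model lives on a GPU, the image has to be moved to the same device before calling it.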
Hey, Look at that!! One prediction is correct. yay!
FeedForward Neural Network:
A feedforward neural network, as the name suggests, is an artificial neural network in which the connections between nodes do not form a cycle: information flows in one direction, from the input through the hidden layers to the output.
Due to the nonlinearity in the hidden neurons, the output of an artificial neural network is a nonlinear function of the inputs. In a classification context, this means that the decision boundary can be nonlinear as well, making the model more flexible than logistic regression. Although higher flexibility may be desirable in general, it carries a higher risk of overfitting (“memorizing the training cases”), which can reduce a model’s accuracy on previously unseen cases. This is where we add data transformations that create variations in the training data, for example RandomCrop and RandomHorizontalFlip.
You can find a good comparison between logistic regression models and artificial neural networks here under topic 3.
FNN - Model:
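As an illustration of the idea, a feedforward network for this task could be built from a few linear layers with ReLU activations in between. The hidden layer sizes (512 and 128) and the class name `FeedForwardModel` are assumptions for this sketch; the architecture in the notebook may differ.

```python
import torch
import torch.nn as nn


class FeedForwardModel(nn.Module):
    """Fully connected network: 2304 inputs -> hidden layers -> 7 class scores."""

    def __init__(self, in_features=48 * 48, hidden_sizes=(512, 128), num_classes=7):
        super().__init__()
        layers = []
        prev = in_features
        for h in hidden_sizes:
            # Each hidden layer is a linear map followed by a ReLU nonlinearity.
            layers += [nn.Linear(prev, h), nn.ReLU()]
            prev = h
        layers.append(nn.Linear(prev, num_classes))
        self.network = nn.Sequential(nn.Flatten(), *layers)

    def forward(self, xb):
        return self.network(xb)


model = FeedForwardModel()
batch = torch.rand(8, 1, 48, 48)   # random stand-in for a batch of images
out = model(batch)
print(out.shape)
```

Without the ReLU layers, the stacked linear layers would collapse into a single linear map, and the model would be no more expressive than logistic regression.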
We will move the model to the GPU and use the same fit and evaluation functions mentioned above. Let's see the accuracy before training.
Let's begin training:
The accuracy had stopped improving, which is why I only trained for 25 epochs in total. Let's see how our accuracy and loss change with training.
We can see that the loss has decreased steadily with training. Let's check the final accuracy on the test DataLoader and predict some images from the test dataset.
To not make this too long, I have continued the article HERE. I hope this is of good use to you.
You can find me on LinkedIn and reach out to me there.