Read the FER2013 / Face Expression Recognition dataset using PyTorch Torchvision

Hervind Philipe
Published in Analytics Vidhya
Feb 17, 2020

One popular Artificial Intelligence (AI) use case is detecting human emotions/expressions: whether a face depicts happiness, sadness, disgust, etc.

There are tons of applications for this kind of AI, from market research to human safety, but we are not going to discuss them here; let's get more technical.

Behind the scenes, the way the “AI” recognizes our emotions is simply image classification, just like classifying hot dog vs. not hot dog.

The commonly used dataset for this image classification task is FER2013 / Face Expression Recognition, which was prepared by Pierre-Luc Carrier and Aaron Courville as part of an ongoing research project (according to Kaggle).

You can access and download the dataset at the link below:

The dataset contains 35,887 grayscale images of faces at 48×48 pixels. There are 7 categories: Angry, Disgust, Fear, Happy, Sad, Surprise, and Neutral.

The thing is, these “images” are stored in CSV format. YES, CSV FORMAT!

As with any tabular data stored in CSV format, the first row holds the column names: emotion, pixels, usage.

Each of the remaining 35,887 rows then contains:

- the emotion index (0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral),
- 2,304 space-separated integers, the grayscale intensities of the pixels of the 48×48 image (2,304 = 48×48), and
- the usage: whether the row belongs to the training, public test, or private test set.
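If you want to verify that layout yourself, a quick peek with pandas (assuming the downloaded file is named fer2013.csv, as Kaggle ships it) would look like:

import pandas as pd

df = pd.read_csv("fer2013.csv")  # path to wherever you saved the CSV
print(df.shape)    # one row per image: (35887, 3)
print(df.iloc[0])  # emotion index, long pixel string, usage split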

Back to what is written in the title: we are going to read the dataset using the Torchvision package.

I will provide two ways to extract it.

This is the first one:
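A minimal sketch of this approach, assuming a torch.utils.data.Dataset that reads a single CSV row from disk on every access (the class name FER2013Dataset and the returned dict keys are my own choices):

import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset

class FER2013Dataset(Dataset):
    """Lazy version: reads one row of the CSV from disk per access."""

    def __init__(self, file_path):
        self.file_path = file_path
        # Count the data rows once, subtracting 1 for the header line.
        with open(file_path) as f:
            self.total_images = sum(1 for _ in f) - 1

    def __len__(self):
        return self.total_images

    def __getitem__(self, idx):
        # Skip the header plus the first `idx` data rows, then read one row.
        row = pd.read_csv(self.file_path, skiprows=idx + 1, nrows=1,
                          header=None, names=["emotion", "pixels", "usage"])
        emotion = int(row["emotion"].iloc[0])
        # 2,304 space-separated intensities -> 48x48 uint8 image tensor.
        pixels = np.array(row["pixels"].iloc[0].split(), dtype=np.uint8)
        image = torch.from_numpy(pixels).reshape(48, 48)
        return {"image": image, "emotion": emotion}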

And the second:
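Again a sketch: the eager variant parses the entire CSV up front. The class name FER2013Dataset_Alternative matches the usage example below; the parsing details are my assumptions:

import numpy as np
import pandas as pd
import torch
from torch.utils.data import Dataset

class FER2013Dataset_Alternative(Dataset):
    """Eager version: parses the whole CSV into memory at initialization."""

    def __init__(self, file_path):
        df = pd.read_csv(file_path)
        self.emotions = df["emotion"].values
        # Parse every pixel string up front into one (N, 48, 48) uint8 array.
        self.images = np.stack([
            np.array(p.split(), dtype=np.uint8).reshape(48, 48)
            for p in df["pixels"]
        ])

    def __len__(self):
        return len(self.emotions)

    def __getitem__(self, idx):
        return {"image": torch.from_numpy(self.images[idx]),
                "emotion": int(self.emotions[idx])}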

To use either one, instantiate the class and index into (or iterate over) the resulting object, for example:

dataset = FER2013Dataset_Alternative(fer_path)
dataset[1000] # RETURN IMAGE and EMOTION of row 1000
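From there, either class plugs straight into a standard torch.utils.data.DataLoader. A small sketch (the batch size of 64 is my own choice):

from torch.utils.data import DataLoader

loader = DataLoader(dataset, batch_size=64, shuffle=True)
batch = next(iter(loader))     # default collate stacks the returned dicts
print(batch["image"].shape)    # torch.Size([64, 48, 48])
print(batch["emotion"].shape)  # torch.Size([64])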

The difference: the second method loads all rows of the CSV at initialization, while the first loads a single row from the CSV only when it is needed.

The second method is much faster, but it needs enough memory to hold all the CSV data, around 301 MB. That is relatively small here, but this approach is not recommended (or even possible) for a huge dataset.

I hope this article is beneficial for you.

CHEERS!!!!
