Face Mask 😷 Detection Using Deep Neural Networks (PyTorch)

Saurabh Palhade · GDSC DYPCOE · Jun 30, 2020

ABOUT THE PROJECT

Face masks play a vital role in protecting individuals against respiratory diseases, and they are one of the few precautions available for COVID-19 in the absence of a vaccine. Wearing a mask is now compulsory almost everywhere. So I created a model that detects whether a person in an image is wearing a face mask or not. In this article I will explain my approach to this project and how I built it from scratch.

ABOUT THE DATASET

The dataset contains around 12,000 images of different sizes belonging to two classes (with mask and without mask). The dataset is available on Kaggle; you can find it here.

Training set size : 10,000 images

Validation set size : 800 images

Test set size : 992 images

Let's get started…

PREPARING THE DATA

We begin by importing all the required libraries and functions.

We import os and tarfile for extracting the data from the dataset archives, ImageFolder to load images from the class folders as PyTorch tensors, random_split and DataLoader to batch the data for training and validation, matplotlib and make_grid for visualization, and the torch.nn package from PyTorch, which contains utility classes for building neural networks.
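A minimal import cell for this kind of project might look like the following (a sketch; the original notebook may import more):

```python
import os
import tarfile

import matplotlib.pyplot as plt
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms as tt
from torch.utils.data import DataLoader, random_split
from torchvision.datasets import ImageFolder
from torchvision.utils import make_grid
```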

Let's take a look at the data directories and classes, and load the data as PyTorch tensors.

Training, Validation, and Test Datasets

Since our dataset contains images of different sizes, we apply some transformations to the training, validation, and test sets to bring every image to the same size (most computer vision pipelines expect fixed-size inputs). Using PyTorch's built-in transforms, we resize all images to 224x224 pixels and then convert them into PyTorch tensors.
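A sketch of the transforms and dataset loading; the directory names below are assumptions, so adjust them to match your copy of the dataset:

```python
# Resize every image to 224x224 and convert it to a tensor.
transform = tt.Compose([
    tt.Resize((224, 224)),
    tt.ToTensor()
])

# Directory layout is an assumption; the Kaggle dataset ships with
# separate Train/Validation/Test folders.
data_dir = './face-mask-dataset'
train_ds = ImageFolder(data_dir + '/Train', transform=transform)
val_ds = ImageFolder(data_dir + '/Validation', transform=transform)
test_ds = ImageFolder(data_dir + '/Test', transform=transform)

print(train_ds.classes)  # e.g. ['WithMask', 'WithoutMask']
```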

Dataloaders

Now we create dataloaders to split the data into batches for training and validation. A dataloader returns the dataset batch by batch with a predefined batch size; instead of loading the whole dataset at once, we load it batch by batch so that we don't run out of memory and the training process doesn't slow down. Here we use a batch size of 128.
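A sketch of the dataloaders (num_workers and pin_memory are common performance settings, not requirements):

```python
batch_size = 128

# Shuffle the training data every epoch; validation and test order don't matter.
train_dl = DataLoader(train_ds, batch_size, shuffle=True, num_workers=2, pin_memory=True)
val_dl = DataLoader(val_ds, batch_size, num_workers=2, pin_memory=True)
test_dl = DataLoader(test_ds, batch_size, num_workers=2, pin_memory=True)
```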

Let’s take a look at a batch of data from the training dataset,for that we have to create a helper function show_batch .

Since our dataset is large, we need a GPU to train the model in a reasonable amount of time. For that, we define a helper function that checks whether a GPU (with the required NVIDIA CUDA drivers installed) is available and returns the available device (CPU or GPU).
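A minimal sketch of such a device-selection helper:

```python
def get_default_device():
    """Pick the GPU if available, else the CPU."""
    if torch.cuda.is_available():
        return torch.device('cuda')
    return torch.device('cpu')

device = get_default_device()
```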

Now let's define a few helper functions and a class to move our training, validation, and test data to the available device (CPU or GPU).
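A sketch of these helpers; DeviceDataLoader wraps an existing dataloader and moves each batch to the device as it is yielded:

```python
def to_device(data, device):
    """Move a tensor (or a list/tuple of tensors) to the chosen device."""
    if isinstance(data, (list, tuple)):
        return [to_device(x, device) for x in data]
    return data.to(device, non_blocking=True)

class DeviceDataLoader():
    """Wrap a dataloader so batches land on the device automatically."""
    def __init__(self, dl, device):
        self.dl = dl
        self.device = device

    def __iter__(self):
        for batch in self.dl:
            yield to_device(batch, self.device)

    def __len__(self):
        return len(self.dl)

train_dl = DeviceDataLoader(train_dl, device)
val_dl = DeviceDataLoader(val_dl, device)
```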

DEFINING THE MODEL

To include some additional functionality in our model, we define a custom model by extending PyTorch's nn.Module class, which is the base class for all neural network modules. Before defining the model, let's create some helper functions.

Here we have a class Dnn with a training_step method that computes the loss on a batch of training data, a validation_step method that computes the loss and accuracy on a batch of validation data, and validation_epoch_end and epoch_end methods that compute and print the validation loss and accuracy after every epoch. Note that the loss function we use is cross-entropy, which works well for classification problems like this one; you can read more about it here.
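A sketch of the Dnn base class described above (the accuracy helper it calls is defined in the training section below):

```python
class Dnn(nn.Module):
    def training_step(self, batch):
        images, labels = batch
        out = self(images)                   # forward pass
        loss = F.cross_entropy(out, labels)  # cross-entropy loss
        return loss

    def validation_step(self, batch):
        images, labels = batch
        out = self(images)
        loss = F.cross_entropy(out, labels)
        acc = accuracy(out, labels)          # accuracy() is defined later
        return {'val_loss': loss, 'val_acc': acc}

    def validation_epoch_end(self, outputs):
        # Average the per-batch losses and accuracies over the epoch.
        epoch_loss = torch.stack([x['val_loss'] for x in outputs]).mean()
        epoch_acc = torch.stack([x['val_acc'] for x in outputs]).mean()
        return {'val_loss': epoch_loss.item(), 'val_acc': epoch_acc.item()}

    def epoch_end(self, epoch, result):
        print("Epoch [{}], val_loss: {:.4f}, val_acc: {:.4f}".format(
            epoch, result['val_loss'], result['val_acc']))
```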

Now we will extend the Dnn class to complete the model definition.

Here we have created a neural network with four layers: one input layer, two hidden layers, and an output layer. Each layer is created with PyTorch's built-in nn.Linear module, which takes two arguments: the input size (number of features) and the output size (number of classes or output labels). The hidden layers have sizes 128 and 256 respectively, and the activation function we use is ReLU (rectified linear unit). It has a simple formula: relu(x) = max(0, x), i.e. if an element is negative we replace it with 0, otherwise we leave it unchanged. Introducing hidden layers and an activation function allows the model to learn more complex, non-linear relationships between the inputs and the targets (outputs or labels).
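A sketch of the model definition; the class name MaskModel is illustrative, and the input size follows from flattening a 3x224x224 image:

```python
input_size = 3 * 224 * 224  # each image is flattened into one long vector
hidden_size1 = 128
hidden_size2 = 256
num_classes = 2

class MaskModel(Dnn):
    def __init__(self):
        super().__init__()
        self.linear1 = nn.Linear(input_size, hidden_size1)
        self.linear2 = nn.Linear(hidden_size1, hidden_size2)
        self.linear3 = nn.Linear(hidden_size2, num_classes)

    def forward(self, xb):
        out = xb.view(xb.size(0), -1)    # flatten the batch of images
        out = F.relu(self.linear1(out))  # hidden layer 1 + activation
        out = F.relu(self.linear2(out))  # hidden layer 2 + activation
        return self.linear3(out)         # output layer (raw logits)
```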

TRAINING MODEL

Before we actually begin the training process, we need to define some helper functions for model training and evaluation.

We define an accuracy function that computes the overall accuracy of the model on an entire batch of outputs, so that we can use it as a metric in the fit function; an evaluate function that computes the loss and accuracy on the validation data after every epoch; and the most important utility function, fit, which trains the model for a given number of epochs. For each batch it performs the following operations:

1. Generate predictions
2. Calculate the loss
3. Compute gradients w.r.t. the weights and biases
4. Adjust the weights by subtracting a small quantity proportional to the gradient
5. Reset the gradients to zero

At the end of every epoch, it evaluates the model on the validation data and prints the loss and accuracy. Note that the optimization algorithm we use here is stochastic gradient descent (optim.SGD); you can learn more about it here.
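A sketch of these helpers, assuming the Dnn interface from above:

```python
def accuracy(outputs, labels):
    """Fraction of predictions in a batch that match the labels."""
    _, preds = torch.max(outputs, dim=1)
    return torch.tensor(torch.sum(preds == labels).item() / len(preds))

@torch.no_grad()
def evaluate(model, val_loader):
    """Compute the validation loss and accuracy over a whole dataloader."""
    model.eval()
    outputs = [model.validation_step(batch) for batch in val_loader]
    return model.validation_epoch_end(outputs)

def fit(epochs, lr, model, train_loader, val_loader, opt_func=torch.optim.SGD):
    history = []
    optimizer = opt_func(model.parameters(), lr)
    for epoch in range(epochs):
        model.train()
        for batch in train_loader:
            loss = model.training_step(batch)  # 1-2: predictions and loss
            loss.backward()                    # 3: compute gradients
            optimizer.step()                   # 4: adjust the weights
            optimizer.zero_grad()              # 5: reset the gradients
        result = evaluate(model, val_loader)   # validate after every epoch
        model.epoch_end(epoch, result)
        history.append(result)
    return history
```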

Before we train the model, we need to ensure that the data and the model's parameters (weights and biases) are on the same device (CPU or GPU). We can reuse the to_device function to move the model's parameters to the right device.

Let’s initialize our model and move it on the available device.

Now we are ready to train our model. Let's train it for 10 epochs with an initial learning rate of 0.01.
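For example:

```python
history = fit(10, 0.01, model, train_dl, val_dl)
```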

After this, I trained the model for a few more epochs with learning rates of 0.001 and 0.0001, and after evaluating it on the test data I was able to achieve an accuracy of 95.37%.

Now let’s test our model on the predefined test data and see how it performs.

As you can see, we got a test accuracy of 95.37%, which is pretty good; our model performs quite well. Check out the entire notebook for this project here.

If you are reading this, I hope this article helped you gain some knowledge about creating an end-to-end model using neural networks and encouraged you to learn more.

Thank you so much for taking the time to read this! I hope you enjoyed it. You can connect with me on LinkedIn and Twitter.

REFERENCE LINKS

Check out the playlist of the course “Deep Learning With PyTorch” here.

Check out jovian.ml, a sharing and collaboration platform for data science projects and Jupyter notebooks.

Check out my other projects here.
