Not so long ago, age prediction apps were trending among iPhone users. In this post, we will create a deep learning model that predicts a person’s age from an image of their face.
Let’s get started!
We will be using the Fastai2 library for this model. It contains the sub-libraries needed for NLP, recommendation systems and computer vision. For this computer vision task, we will use the vision sub-library.
!pip install fastai2 -q
from fastai2.vision.all import *
from fastai2.basics import *
We need a dataset that contains facial images along with the corresponding ages. Our model should extract features from these images by passing them through several layers of matrix multiplications and output a number (i.e. the age); this is called the ‘forward pass’. The predicted age is then compared with the actual age to calculate the loss, and the loss is used to go back and adjust the values (weights) of our matrices; this, unsurprisingly, is called the ‘backward pass’.
Let’s get the data from Kaggle (link below) and upload it to our Google Drive.
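To make the forward and backward pass concrete, here is a minimal sketch in plain Python. It is not the Fastai model: it fits a single weight and bias with mean squared error, and the data, learning rate and function names are made up for illustration.

```python
# Toy forward/backward pass: one weight, one bias, mean squared error.
# All numbers below are hypothetical and chosen only for illustration.

def forward(w, b, xs):
    """Forward pass: turn each input into a prediction."""
    return [w * x + b for x in xs]

def mse(preds, ys):
    """Loss: mean squared difference between predictions and targets."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

def backward(w, b, xs, ys):
    """Backward pass: gradients of the MSE loss with respect to w and b."""
    n = len(xs)
    preds = forward(w, b, xs)
    grad_w = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / n
    grad_b = sum(2 * (p - y) for p, y in zip(preds, ys)) / n
    return grad_w, grad_b

xs = [0.1, 0.4, 0.7, 1.0]        # stand-in for image features
ys = [12.0, 30.0, 48.0, 66.0]    # stand-in for ages (generated as 60 * x + 6)

w, b, lr = 0.0, 0.0, 0.5
losses = []
for step in range(200):
    losses.append(mse(forward(w, b, xs), ys))   # forward pass + loss
    grad_w, grad_b = backward(w, b, xs, ys)     # backward pass
    w -= lr * grad_w                            # move the weights against the gradient
    b -= lr * grad_b

print(f"first loss: {losses[0]:.2f}, last loss: {losses[-1]:.6f}")
print(f"learned w = {w:.2f}, b = {b:.2f}")
```

A real CNN repeats the same loop with millions of weights, and the library computes the gradients for us.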
Once uploaded to Drive, start the Google Colaboratory environment, connect to a runtime and mount the Drive. We need a path to where our data is stored.
In this dataset, there are 99 folders, each named after the age of the people whose images it contains. So the output, the y value, is the name of the folder.
Now we shall create a get_y function that gets the name of the folder and converts it into an integer so that we can do regression. Pipeline is specific to Fastai; it applies a list of functions in sequence.
def to_num(x:str): return int(x)
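The labelling step can be sketched without Fastai. The snippet below is only an illustration of the idea: parent_folder and compose are hypothetical helpers standing in for Fastai’s folder-label function and Pipeline, and the example path is made up.

```python
from pathlib import Path

def parent_folder(path):
    """Return the name of the folder that directly contains the file."""
    return Path(path).parent.name

def to_num(x: str):
    """Convert the folder name (a string such as '25') into an integer age."""
    return int(x)

def compose(*funcs):
    """Apply the given functions one after another, left to right."""
    def run(x):
        for f in funcs:
            x = f(x)
        return x
    return run

# Hypothetical stand-in for a Pipeline of two steps.
get_y = compose(parent_folder, to_num)

# Made-up path that follows the one-folder-per-age layout.
example = "face_age/25/1234.png"
print(get_y(example))
```

Converting the label to an integer matters here: a string label would be treated as a category, whereas a number lets us train a regression model.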
We’ll use the DataBlock API to get the data, apply transformations and augmentations, split the data into training and validation sets, get the y-value and normalise.
batch_tfms=[*aug_transforms(size=224, max_warp=0, max_rotate=7.0, max_zoom=1.0)]
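The splitting step can be illustrated in plain Python. This is a sketch under assumptions: the post does not say how the split is made, so the random 80/20 split, the seed, the file names and the random_split helper are all hypothetical.

```python
import random

def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle item indices and hold out valid_pct of them for validation."""
    idxs = list(range(len(items)))
    random.Random(seed).shuffle(idxs)
    cut = int(valid_pct * len(items))
    valid = [items[i] for i in idxs[:cut]]
    train = [items[i] for i in idxs[cut:]]
    return train, valid

# Made-up file list: 3 age folders with 5 images each.
files = [f"face_age/{age}/{n}.png" for age in range(1, 4) for n in range(5)]
train, valid = random_split(files)
print(len(train), "training files,", len(valid), "validation files")
```

Fixing the seed keeps the validation set the same between runs, which makes results easier to compare.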
Now, let’s make this data block into a data loader with a batch size of 64.
Let’s create a default CNN learner using the cnn_learner() function with the resnet18 architecture. As this is a regression problem, we should specify y_range so that the predictions are constrained to a plausible range of ages.
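To see what y_range does, here is a small sketch of a scaled sigmoid, which is how the output of the final layer can be squashed into a fixed interval. The bounds (0, 100) are an assumption for illustration, not the values used in this post.

```python
import math

def sigmoid_range(x, low, high):
    """Squash any real number x into the open interval (low, high)."""
    return 1 / (1 + math.exp(-x)) * (high - low) + low

y_range = (0, 100)  # hypothetical age bounds

# Raw activations of any size end up inside the chosen range.
for raw in (-10.0, 0.0, 2.5, 10.0):
    print(raw, round(sigmoid_range(raw, *y_range), 2))
```

Because the sigmoid only approaches its limits, it is common to set the upper bound slightly above the largest target value.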
Now let’s train the model, i.e. fit, for 5 epochs and make a prediction on an image.
The output age I got was 50.8; of course, it may vary each time you run the model because of the randomness in parameter initialisation.
That is, by all means, a fairly good result. Still, we’ll hack into cnn_learner() and customise it with a new architecture, activation function, self-attention and optimiser. If we look inside the learner.py notebook in Fastai’s GitHub repo, cnn_learner() creates a model with create_cnn_model() and passes it into a Learner(), and create_cnn_model() calls create_body() and create_head() to build the model.
We shall copy the create_body() function and make some changes so that we can pass in an activation function and self-attention. (changes made in the respective lines are shown below)
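def create_custom_body(arch, n_in=3, pretrained=True, act_cls=nn.ReLU, sa=False, cut=None):
    # act_cls is the activation class (not an instance); sa switches self-attention on or off
    model = arch(pretrained=pretrained, act_cls=act_cls, sa=sa)
    # ... the rest is unchanged from create_body()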
Now, we shall use xresnet18's architecture, use Mish as activation and set self-attention as True.
body=create_custom_body(xresnet18, pretrained=True, act_cls=Mish, sa=True)
To create the head, we need the number of input features to the head (i.e. the number of output features of the body) and the number of outputs from the head. We have to double the number of input features because our head starts with both average-pooling and max-pooling layers, whose outputs are concatenated. As we are doing regression, we also have to set a y_range (i.e. output boundaries).
nf=num_features_model(nn.Sequential(*body.children())) * 2; nf
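The reason for the factor of 2 can be seen in a small plain-Python sketch of concatenated pooling. The concat_pool helper and the tiny 3-channel input are made up for illustration; the real layer works on tensors.

```python
def concat_pool(feature_maps):
    """Pool each channel to one max and one mean value, then concatenate.

    feature_maps is a list of channels; each channel is a 2-D grid (list of rows).
    """
    flat = [[v for row in channel for v in row] for channel in feature_maps]
    max_part = [max(values) for values in flat]
    avg_part = [sum(values) / len(values) for values in flat]
    return max_part + avg_part

# Hypothetical body output: 3 channels of 2x2 activations.
body_out = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 0.0], [0.0, 8.0]],
    [[5.0, 5.0], [5.0, 5.0]],
]
pooled = concat_pool(body_out)
print(len(body_out), "channels in,", len(pooled), "features out")
```

Each channel contributes one max value and one average value, so the first linear layer of the head sees twice as many features as the body has channels.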
Now that our head is ready, we can pass our body and head into an nn.Sequential() and initialise our head.
We can now pass our model into a Learner(). Since Fastai uses discriminative learning rates, we need to split the model into parameter groups so that each group can be trained at a different learning rate; the split function can be taken from the same learner.py notebook. Also, we’ll use the ranger optimiser, which is nothing but a RAdam optimiser wrapped in Lookahead().
def _xresnet_split(m): return L(m[0][:3], m[0][3:], m[1:]).map(params)
learn=Learner(dls, model, loss_func=MSELossFlat(), splitter=_xresnet_split, opt_func=ranger)
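To illustrate what the three parameter groups are for, here is a sketch of how a range of learning rates could be spread across them. The geometric spacing, the geometric_lrs helper and the range 1e-5 to 1e-3 are assumptions for illustration, not values taken from this post.

```python
def geometric_lrs(start, stop, n):
    """Return n values from start to stop, each a constant multiple of the previous."""
    if n == 1:
        return [stop]
    step = (stop / start) ** (1 / (n - 1))
    return [start * step ** i for i in range(n)]

# One learning rate per parameter group returned by the splitter.
groups = ["early body layers", "later body layers", "head"]
lrs = geometric_lrs(1e-5, 1e-3, len(groups))
for name, lr in zip(groups, lrs):
    print(f"{name}: {lr:.0e}")
```

The early layers get the smallest learning rate, since they hold general features that need little change, while the newly initialised head gets the largest.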
Now our learner is ready! We shall freeze the learner and train it. After roughly 10 epochs, the validation loss drops sharply, and we can now make a prediction on an image.
As we can see, the loss has not yet started to shoot up, which means we can still train for a few more epochs with a reduced learning rate. We can also unfreeze the model and train for a few more epochs to get even better results. Another trick that could improve the results is to increase the size of the images, say from 240 to 360, and train again.
You can get the full code from the GitHub link below, and check out Jeremy Howard’s course on Deep Learning to learn more about Fastai.
Thank you and keep learning!