How to Make a Naruto Hand Signs Classifier using Deep Learning
Introduction: How I got the idea and the process of how the dataset was developed.
Naruto is an anime classic that will never go out of style, no matter the era. It is an anime that gave you goosebumps 99% of the time (1% deducted for the fillers).
It teaches you many profound things, such as, “No matter how hard life puts you down, you have to get back up and move forward,” as one of my stoned friends put it. He was high, but I couldn’t refute him, not because I was stoned too but because I agreed with him completely.
A little context about how I got the idea: I am currently a visiting researcher at UTS in Australia, working on research in the ML, XR, and quantum domains.
So, being the geek I am, I marveled at the possible implications of making a Naruto game in AR where you make the signs and a jutsu follows up.
Think: When you do the dog, hare, dragon, boar, tiger signs, a fire dragon comes into existence and crashes onto the enemy, killing him if he doesn’t counter-attack with a water or earth jutsu (no dodging allowed, lol).
It was then that this moment of “OMG!!!!” with 4 exclamation marks hit me like a bull on steroids ramming a person covered in red paint.
I immediately got to work. Sadly, it had never been done before, so I had to start with recognizing the hand signs and work my way up to the AR game. I knew I couldn’t do this alone, as I would need hundreds of images of each hand sign, which would be a huge time commitment and a lot of work.
So, being the lazy smartass that I am, I decided to enlist the social power of otakus. We had a group back at VIT for anime lovers, literally called “VIT ANIME LOVERS”. I posted there about what I wanted to do, and many were willing to help out, but in the end, the images we collected weren’t enough.
So, after gathering a certain number of images, I made a video of myself doing all the signs in order and used OpenCV to extract the hand signs frame by frame, removing the unclear ones. This is how the dataset was developed; the link to the dataset is below.
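For the curious, here is a minimal sketch of that frame-extraction step with OpenCV. The video filename, output folder, and sampling rate are placeholders I am assuming for illustration; the removal of unclear frames was a manual pass afterwards.

```python
# Sketch: pull frames out of a hand-sign video with OpenCV.
# "hand_signs.mp4", the "frames" folder, and the every-5th-frame rate are illustrative.
import cv2
import os

os.makedirs('frames', exist_ok=True)
cap = cv2.VideoCapture('hand_signs.mp4')

frame_idx, saved = 0, 0
while True:
    ok, frame = cap.read()          # grab the next frame; ok is False at end of video
    if not ok:
        break
    if frame_idx % 5 == 0:          # keep every 5th frame to avoid near-duplicates
        cv2.imwrite(f'frames/frame_{saved:04d}.jpg', frame)
        saved += 1
    frame_idx += 1

cap.release()
print(f'Saved {saved} frames')
```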
Nitty-Gritty Technical Part:
Dataset: https://www.kaggle.com/vikranthkanumuru/naruto-hand-sign-dataset
Github Repo: https://github.com/kanlanc/naruto-hand-sign-detection
You can find the trained models in the Github repo, so thank me by clapping here and starring the repo. Thanks mate!
Tools used: FastAI
It gets shit done faster with less code, and modern techniques that improve performance and reduce overfitting, like dropout, come already configured. I wasn’t kidding when I said I was a lazy smartass.
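To give a feel for how little code that means, here is a rough sketch of the kind of fastai training run involved. It assumes fastai v2, a folder-per-class dataset layout, and illustrative paths and hyperparameters; the exact setup lives in the GitHub repo above.

```python
# Sketch of a fastai image-classification run (paths and hyperparameters are assumptions).
from fastai.vision.all import *

path = Path('naruto-hand-sign-dataset')          # hypothetical local dataset path
dls = ImageDataLoaders.from_folder(
    path, valid_pct=0.2, seed=42,
    item_tfms=Resize(224),                       # resize every image to 224x224
    batch_tfms=aug_transforms(),                 # standard augmentation (rotate, zoom, lighting)
)

learn = cnn_learner(dls, resnet34, metrics=accuracy)  # transfer learning from a pretrained ResNet
learn.fine_tune(4)                                    # train the head, then unfreeze and fine-tune
learn.export('naruto_signs.pkl')                      # save the trained model for inference
```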
Environment setup: Kaggle for storing the dataset and Colab for training the model.
If you are asking why not use Kaggle for everything: to be honest, I don’t remember exactly, but it was causing a lot of problems during training and some tools were missing. Also, its GPU time is more limited than Colab’s. Some of you might ask, why not only Colab then? Because then I would have to store the data in my Drive, which also has limited space. So, I decided to combine Kaggle’s free dataset storage with Colab’s free GPU time. I know! I am awesome and I’m a lazy…
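If you want to replicate the setup, here is a hedged sketch of pulling the Kaggle dataset into a Colab session with the official Kaggle API client. It assumes the kaggle package is installed and your API token is saved at ~/.kaggle/kaggle.json; the download path is just an example.

```python
# Sketch: download the dataset from Kaggle inside Colab (assumes a valid API token).
from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()  # reads the token from ~/.kaggle/kaggle.json
api.dataset_download_files(
    'vikranthkanumuru/naruto-hand-sign-dataset',  # dataset slug from the link above
    path='naruto-hand-sign-dataset',              # illustrative local folder
    unzip=True,
)
```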
Lol, moving on. Now comes the cream of the entire article.
Output
Despite limiting the dataset to images of real humans and augmenting it only by turning the images left, right, up, and down, a welcome surprise was that the model still works on anime screenshots, as shown below.
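As a rough illustration of running inference on such a screenshot with the exported model (the filenames here follow the training sketch above and are assumptions, not the exact ones in the repo):

```python
# Sketch: classify a single anime screenshot with the exported fastai model.
from fastai.vision.all import *

learn = load_learner('naruto_signs.pkl')                 # assumed export filename
img = PILImage.create('anime_screenshot.jpg')            # hypothetical test image
pred_class, pred_idx, probs = learn.predict(img)
print(pred_class, float(probs[pred_idx]))                # predicted sign and its confidence
```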
Concluding the article, I would like to say that for me, the hardest part of the project was making the dataset, hands down!
Hope you enjoyed reading the article as much as I enjoyed making it.
If you have any questions regarding the project, let me know in the comments or talk to me on LinkedIn. Cheers!
More content at kanlanc.com