My M2M: August - Week 1
This month’s challenge is to develop a project for a science fair I plan to enter. The project must make use of genetic algorithms or neural networks, and be as ambitious as the rest of my monthly challenges.
This month’s goal is to use Deep Learning to create a translator for sign language, specifically American Sign Language. So far this week, progress has been slow for a number of reasons. For one, I’m still a newbie when it comes to Deep Learning, and I’m pretty far out of my comfort zone. Having said that, that’s part of why I chose such an ambitious project: to close that gap faster.
Secondly, I’ve been pretty busy this week, between working on a couple of other projects, catching up on some books I had to read, and going round in circles dealing with some infuriating nonsense from Amazon customer service in a foreign language (long story) as I try to get a replacement for my faulty Kindle.
But let’s skip the excuses, and get straight to the progress I’ve made so far this week. My first task for the week was to find a suitable dataset to train my model on. Fortunately, there seems to be an abundance of data for American Sign Language. While I eventually hope to mix multiple libraries of data, I’m starting with one dataset of more than 12,000 data points!
All of these data points are .mov files, so I’m planning to extract frames from them at 5 frames per second. I haven’t been able to download all the .mov files yet due to space constraints on my MBP, but keeping only the extracted frames should help alleviate that problem.
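As a rough sketch of that extraction step, the snippet below builds an ffmpeg command that samples a video at 5 frames per second. To be clear, ffmpeg is my assumption of a common command-line tool for this, and the file names and output pattern are placeholders, not the actual dataset paths:

```python
# Sketch: build an ffmpeg command that extracts frames at 5 fps.
# Assumes ffmpeg is installed; all file names here are placeholders.
def frame_extract_cmd(video_path, out_pattern, fps=5):
    """Return an ffmpeg command list that samples `video_path` at
    `fps` frames per second, writing numbered images to `out_pattern`."""
    return [
        "ffmpeg",
        "-i", video_path,     # input video (.mov)
        "-vf", f"fps={fps}",  # sample at the target frame rate
        out_pattern,          # e.g. frames/clip_%04d.png
    ]

cmd = frame_extract_cmd("signs/hello.mov", "frames/hello_%04d.png")
print(" ".join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list (rather than one shell string) makes it easy to hand off to `subprocess.run` later without worrying about shell quoting.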
Additionally, I will have to do a bit of cropping and cutting. So far, the best free tool I’ve found for this is VLC Media Player. I’ve found some commands to run these operations from the terminal, so in the morning I’ll try to find a way to loop them over the entire folder.
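While I work out the exact incantation, here is a sketch of what that loop might look like, using ffmpeg’s crop filter as a stand-in for the VLC commands. The folder layout, crop box, and the use of ffmpeg itself are all assumptions on my part:

```python
from pathlib import Path

def crop_extract_cmd(video_path, out_dir, fps=5, crop="480:480:80:0"):
    """Build an ffmpeg command that crops each frame (w:h:x:y syntax)
    and samples at `fps`. The crop box values are placeholders."""
    stem = Path(video_path).stem
    return [
        "ffmpeg", "-i", str(video_path),
        "-vf", f"fps={fps},crop={crop}",
        str(Path(out_dir) / f"{stem}_%04d.png"),
    ]

# Loop over every .mov in a (hypothetical) dataset folder and print
# each command; swap print() for subprocess.run(cmd, check=True)
# to actually execute them.
for mov in sorted(Path("asl_videos").glob("*.mov")):
    print(" ".join(crop_extract_cmd(mov, "frames")))
```

Naming each output after the source file’s stem keeps frames from different videos from colliding in the shared output folder.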
Finally, I’ve been looking into which algorithms to try. As expected, Convolutional Neural Networks (CNNs) are the standard choice for image recognition, so my first attempt will be to pass each frame through a CNN and record the accuracy. Results from other architectures, such as recurrent networks and multilayer perceptrons (MLPs), will then be benchmarked against this baseline.
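As a first stab at that baseline, something like the small Keras CNN below is what I have in mind. The input size and number of sign classes are placeholders, not properties of the actual dataset:

```python
from tensorflow.keras import layers, models

NUM_CLASSES = 10           # placeholder: actual number of signs TBD
INPUT_SHAPE = (64, 64, 3)  # placeholder frame size after cropping/resizing

# A small CNN baseline: two conv/pool stages, then a dense classifier.
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=INPUT_SHAPE),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
print(model.output_shape)  # one softmax probability per sign class
```

Recording this model’s test accuracy first gives every later architecture a fixed number to beat.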
For the next week, I’m going to focus on executing my planned pre-processing of every video in the dataset. Then I will format the resulting frames into a Keras-compliant dataset for supervised learning. After that, if I have time, I will run a CNN on it.
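Formatting-wise, what Keras expects for supervised learning is a pair of NumPy arrays: frames stacked as `(num_samples, height, width, channels)` plus one-hot labels. A minimal sketch with synthetic data, since the real frame sizes and class count are still unknown:

```python
import numpy as np

NUM_CLASSES = 10  # placeholder: number of distinct signs
H, W = 64, 64     # placeholder frame size

# Pretend we extracted 20 frames; in practice these would be loaded
# from the .png frames written during pre-processing.
frames = [np.random.rand(H, W, 3) for _ in range(20)]
labels = [i % NUM_CLASSES for i in range(20)]  # integer class per frame

# Stack into the (samples, height, width, channels) array Keras
# expects, with labels one-hot encoded via an identity-matrix lookup.
X = np.stack(frames).astype("float32")
y = np.eye(NUM_CLASSES, dtype="float32")[labels]

print(X.shape, y.shape)  # (20, 64, 64, 3) (20, 10)
```

With the data in this shape, training is just `model.fit(X, y)` on whichever architecture is being benchmarked.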
I look forward to reporting the results next week!
