A.I. Odyssey, part 2: Implementation Details

This is the follow-up to my story “Use your eyes and Deep Learning to command your computer”. Here, I’ll go into more detail about the implementation of the eye motion detection. If you haven’t read the original post yet, you should do so first!

Finding the eyes

The main problem with the approach described in the original article for finding the eyes is that Haar cascades, although very accurate, produce bounding boxes whose position and shape jitter from frame to frame. This does not look like a problem, because the eye always appears centered in the crop, but it completely messes up the difference frames: a box that shifts by a pixel or two makes a motionless eye look as if it had moved.

Figure: Gamma motion with no eye tracking (jitter)
Figure: Gamma motion with eye tracking
Figure: Eyes in current frame (left), previous frame (center) and difference (right)
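
To make the jitter problem concrete, here is a minimal sketch using NumPy on a synthetic image. It is not the project's code: the helper names (`difference_frame`, `crop`), the 24x24 crop size and the 2 px shift are assumptions chosen for illustration. Nothing moves between the two "frames", yet a shifted bounding box alone produces a non-zero difference.

```python
import numpy as np

def difference_frame(current, previous):
    """Absolute per-pixel difference between two same-sized grayscale crops."""
    # Cast to a signed type first so the subtraction cannot wrap around in uint8.
    diff = current.astype(np.int16) - previous.astype(np.int16)
    return np.abs(diff).astype(np.uint8)

def crop(frame, box):
    """Cut an (x, y, w, h) box out of a 2-D grayscale frame."""
    x, y, w, h = box
    return frame[y:y + h, x:x + w]

# Synthetic static image: a horizontal intensity ramp (values 0, 4, ..., 252).
frame = np.tile(np.arange(64, dtype=np.uint8) * 4, (48, 1))

stable_box = (20, 10, 24, 24)   # the box the detector returned for the previous frame
jittery_box = (22, 10, 24, 24)  # same scene, but the detector's box moved by 2 px

same = difference_frame(crop(frame, stable_box), crop(frame, stable_box))
jitter = difference_frame(crop(frame, jittery_box), crop(frame, stable_box))

print(same.max())    # 0: identical boxes, no spurious motion
print(jitter.max())  # 8: the scene is static, but the box shift shows up as "motion"
```

Any stabilisation of the boxes between frames (the "with eye tracking" case in the figures above) aims to keep the first situation and avoid the second.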

Neural network

No pooling

I chose to use a single convolutional layer with no pooling, as the images were pretty small (24px wide). This ensures that we keep as much spatial information as possible.
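
A quick bit of arithmetic shows what pooling would cost on an input this small. The 24 px width comes from the text; the 3x3 kernel, the lack of padding and the 2x2 pooling window are assumptions made for this sketch, not the model's confirmed settings.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_output_size(size, pool):
    """Spatial size after non-overlapping pooling along one dimension."""
    return size // pool

width = 24   # eye image width, from the article
kernel = 3   # assumed kernel size
pool = 2     # assumed pooling window, had pooling been used

after_conv = conv_output_size(width, kernel)
after_pool = pool_output_size(after_conv, pool)

print(after_conv)  # 22 columns left after the convolution alone
print(after_pool)  # 11 columns left if a 2x2 pooling layer followed
```

Under these assumptions, a single pooling layer would halve an already small feature map in each dimension, which is the information loss the no-pooling choice avoids.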

Sharing the weights

I could also have shared the weights between the two eyes, but I chose not to, just in case the slight differences in pose and shape between the eyes mattered. I did not have time to test whether keeping them separate actually helped; sharing the weights would have made the model lighter.
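
To give a rough idea of how much lighter, the sketch below counts the parameters of the per-eye convolutional layer with and without sharing. The kernel size, channel count and number of filters are hypothetical values picked for illustration; only the convolutional layers are counted, not whatever follows them.

```python
def conv2d_params(kernel, in_channels, filters):
    """Trainable parameters of a 2-D convolution: weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

# Hypothetical layer settings, not taken from the actual model.
KERNEL, IN_CHANNELS, FILTERS = 3, 1, 16

per_eye = conv2d_params(KERNEL, IN_CHANNELS, FILTERS)

separate_weights = 2 * per_eye  # one independent layer per eye
shared_weights = per_eye        # the same layer applied to both eyes

print(separate_weights)  # 320
print(shared_weights)    # 160
```

Sharing halves the parameter count of the eye branches whatever the exact settings are, at the cost of forcing both eyes through identical filters.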

Making predictions & Multithreading

It is absolutely essential to run the classifier in a separate thread from the webcam capture and eye detection. The model takes quite some time to make a prediction (tens to hundreds of milliseconds, which matters here), and in a single thread we would miss whatever happens in the meantime. As the eye motions are quick (~1 sec), we want to capture them at the highest frame rate possible and then make predictions on the latest frames available.
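
The capture/predict split described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's code: `LatestFrames`, `capture_loop`, `predict_loop` and `run_demo` are hypothetical names, and the webcam and the model are replaced by stand-in functions that simply sleep.

```python
import itertools
import threading
import time
from collections import deque

class LatestFrames:
    """Thread-safe buffer that keeps only the most recent `size` frames."""

    def __init__(self, size):
        self._frames = deque(maxlen=size)  # older frames are dropped automatically
        self._lock = threading.Lock()

    def push(self, frame):
        with self._lock:
            self._frames.append(frame)

    def snapshot(self):
        with self._lock:
            return list(self._frames)

def capture_loop(buffer, read_frame, stop):
    """Grab frames as fast as the source allows, regardless of the classifier."""
    while not stop.is_set():
        buffer.push(read_frame())

def predict_loop(buffer, predict, results, stop, window):
    """Classify the latest `window` frames whenever the model is free."""
    while not stop.is_set():
        frames = buffer.snapshot()
        if len(frames) == window:
            results.append(predict(frames))
        else:
            time.sleep(0.001)  # buffer not full yet, wait for more frames

def run_demo(duration=0.5, window=5):
    counter = itertools.count()
    captured = []

    def fake_read_frame():
        time.sleep(0.001)  # stand-in for webcam read + eye detection
        index = next(counter)
        captured.append(index)
        return index

    def fake_predict(frames):
        time.sleep(0.05)   # stand-in for a slow model
        return frames[-1]  # report the newest frame the model saw

    buffer, results, stop = LatestFrames(window), [], threading.Event()
    threads = [
        threading.Thread(target=capture_loop, args=(buffer, fake_read_frame, stop)),
        threading.Thread(target=predict_loop,
                         args=(buffer, fake_predict, results, stop, window)),
    ]
    for thread in threads:
        thread.start()
    time.sleep(duration)
    stop.set()
    for thread in threads:
        thread.join()
    return len(captured), results

captured_count, predictions = run_demo()
print(captured_count, len(predictions))  # many frames captured per prediction made
```

The bounded deque is the key design choice in this sketch: the capture thread never waits for the model, and each prediction always sees the newest frames instead of working through a growing backlog.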

Making the predictions

The goal of this project was to evaluate the feasibility of eye motion recognition with deep learning and a laptop webcam. As such, I did not spend much time on the last step of the process (using the predictions to trigger commands on the computer). However, I wanted the software to work as well as it could.

Final words

Thank you again for your interest and support!

Written by

Deep Learning Scientist @ L’Oréal AI Research | Creator of AI-Odyssey.com
