A Large-scale Gaze Tracking Dataset, Method, and Application for Robust 3D Gaze Estimation

Towards Robust Gaze Estimation

Christopher Dossman
Nov 1 · 3 min read


Gaze direction is an important cue for guiding conversations and other social interactions. It helps us understand people's intents, desires, state of mind, interest, and attention in social settings.

The ability to accurately estimate the human gaze direction also has many applications in assistive technologies for people with physical impairments, human-computer interaction, augmented reality, virtual reality, consumer behavior research, visual attention analysis, and more.

In the past, gaze estimation relied on specialized hardware. Thanks to deep learning-based techniques, researchers have since made progress towards fully unconstrained gaze estimation. For instance, they have built models that stay accurate under variations in gaze, head pose, and image quality. However, challenges remain, such as obtaining gaze data and estimates that are both highly accurate and highly varied.

Gaze360: Physically Unconstrained Gaze Estimation in the Wild

In this newly published paper, researchers present an approach to help with the gaze estimation task and narrow the existing performance gap. First, they describe a method to collect annotated 3D gaze data efficiently in arbitrary environments. They then use the method to obtain one of the largest 3D gaze datasets, which they call Gaze360. Gaze360 is thus both a large-scale gaze-tracking dataset and a method for robust 3D gaze estimation in unconstrained images.

It comprises video of 238 subjects in indoor and outdoor environments, with labeled 3D gaze across a wide range of head poses and distances. According to the authors, it is the largest publicly available dataset of its kind by both number of subjects and variety.

Gaze Estimation Models

The researchers also train a variety of 3D gaze estimation models on the dataset before settling on a model that takes a multi-frame input and uses a pinball loss for quantile regression of the error, which provides an estimate of gaze uncertainty.
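To make the pinball loss idea more concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names, the choice of the 10% and 90% quantiles, and the idea of forming the two quantiles as the prediction minus and plus a predicted offset `sigma` are assumptions made for illustration.

```python
def pinball_loss(y_true, y_quantile, tau):
    """Mean pinball (quantile) loss for a quantile level tau in (0, 1).

    Under-predictions are weighted by tau and over-predictions by (1 - tau),
    so minimizing this loss pushes y_quantile towards the tau-quantile.
    """
    total = 0.0
    for y, q in zip(y_true, y_quantile):
        diff = y - q
        total += max(tau * diff, (tau - 1.0) * diff)
    return total / len(y_true)


def interval_pinball_loss(y_true, y_pred, sigma, tau_low=0.1, tau_high=0.9):
    """Pinball loss on a lower and an upper quantile built from one prediction.

    Assumption for illustration: the model outputs a point estimate y_pred
    and a non-negative offset sigma, and the two quantile estimates are
    y_pred - sigma and y_pred + sigma. A larger sigma then signals a more
    uncertain prediction.
    """
    lower = [p - s for p, s in zip(y_pred, sigma)]
    upper = [p + s for p, s in zip(y_pred, sigma)]
    return pinball_loss(y_true, lower, tau_low) + pinball_loss(y_true, upper, tau_high)
```

In this sketch, a confident but wrong prediction (small `sigma`, large error) is penalized heavily, while an unnecessarily wide interval around a correct prediction also costs something, which is what lets the offset act as an uncertainty estimate.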

Gaze360 was evaluated against conventional datasets through a cross-dataset comparison of model performance. The researchers also show how the model can be applied to real-world use cases, including estimating a customer’s focus of attention in a supermarket.
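This summary does not say which metric the cross-dataset comparison uses. A common choice for 3D gaze is the angle between the predicted and the ground-truth gaze vectors, and the sketch below assumes that metric. The function names and the tuple representation of gaze vectors are hypothetical.

```python
import math


def angular_error_deg(pred, target):
    """Angle in degrees between two non-zero 3D gaze vectors."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm_pred = math.sqrt(sum(p * p for p in pred))
    norm_target = math.sqrt(sum(t * t for t in target))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_pred * norm_target)))
    return math.degrees(math.acos(cosine))


def mean_angular_error_deg(preds, targets):
    """Average angular error over paired predictions and ground-truth labels."""
    errors = [angular_error_deg(p, t) for p, t in zip(preds, targets)]
    return sum(errors) / len(errors)
```

Under this assumption, a cross-dataset comparison would train a model on one dataset and report `mean_angular_error_deg` on the test split of each of the others.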

Why Does this Matter?

This work demonstrates a methodology for collecting annotated gaze data at scale and uses it to generate a large and diverse dataset that’s suitable for deep learning of 3D gaze from images and videos. Its value is demonstrated through a cross-dataset performance comparison against three existing 3D gaze datasets, as well as through application to unconstrained, unseen imagery from YouTube videos.

Both quantitative and qualitative evaluation results show that the proposed approach achieves higher accuracies than the state-of-the-art methods and is robust to variation in gaze, head pose, and image quality.

The researchers hope that the application of the model and dataset across a range of fields will help better leverage gaze as a cue to improve the vision-based understanding of human behavior. I think the work goes a long way to help improve existing gaze estimation literature and models and has significant potential to help achieve robust 3D gaze estimation.

The dataset can be accessed here: http://gaze360.csail.mit.edu/

Read more: Physically Unconstrained Gaze Estimation


AI³ | Theory, Practice, Business
