Gaze-Based Interaction: 30 Years in Retrospect and Future Outlook

V2XR
Aug 27, 2023


History of Eye Tracking Technology

The development of eye tracking technology can be divided into three stages over the past 30 years:

- Before 2000: beginning in the 19th century, eye-gaze research mainly served academic fields such as physiology, psychology, and ophthalmology. It aimed to understand how the human eye works and how people process information consciously and unconsciously (Javal, 1990).

- 2000–2020: with the rise of the IT industry and the attention economy, lightweight and portable eye trackers found use in web analytics, advertising, and similar fields.

- After 2020: eye tracking expanded into more areas, especially consumer XR headsets; early research-oriented examples include Microsoft’s HoloLens 2 (2019) and the HTC VIVE Pro Eye (2019).

Eye tracking techniques include:

1. Electrooculography (EOG)
2. Scleral search coils
3. Video-based pupil monitoring
4. Infrared corneal reflection

Consumer XR devices mostly use infrared corneal reflection. Infrared illuminators create reflections (glints) on the cornea that contrast with the darker iris and pupil, and infrared cameras track the relative positions of the glints and the pupil to estimate gaze direction.
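To make this concrete, here is a minimal sketch (not any vendor’s actual pipeline) of the classic pupil-center corneal-reflection mapping: the 2D vector from the corneal glint to the pupil center is mapped to an on-screen gaze point through a polynomial fitted during a short calibration. The function names and the polynomial form are illustrative assumptions.

```python
import numpy as np

def pupil_glint_vectors(pupil_centers, glint_centers):
    """Per-frame 2D vectors from corneal glint to pupil center (image coordinates)."""
    return np.asarray(pupil_centers, dtype=float) - np.asarray(glint_centers, dtype=float)

def fit_calibration(vectors, screen_points):
    """Fit a 2nd-order polynomial mapping pupil-glint vectors to known gaze targets.

    vectors: (N, 2) pupil-glint vectors recorded while the user fixates targets.
    screen_points: (N, 2) corresponding on-screen target positions (N >= 6).
    """
    x, y = vectors[:, 0], vectors[:, 1]
    # Design matrix: [1, x, y, xy, x^2, y^2]
    A = np.column_stack([np.ones_like(x), x, y, x * y, x**2, y**2])
    coeff_x, *_ = np.linalg.lstsq(A, screen_points[:, 0], rcond=None)
    coeff_y, *_ = np.linalg.lstsq(A, screen_points[:, 1], rcond=None)
    return coeff_x, coeff_y

def estimate_gaze(vector, coeff_x, coeff_y):
    """Map a single pupil-glint vector to an estimated on-screen gaze point."""
    x, y = vector
    features = np.array([1.0, x, y, x * y, x**2, y**2])
    return features @ coeff_x, features @ coeff_y
```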

Physiology and Psychology of Eye Movements

The six extraocular muscles control eye movement:

  • Superior rectus: moves eyes up
  • Inferior rectus: moves eyes down
  • Medial rectus: rotates eyes towards nose
  • Lateral rectus: rotates eyes away from nose
  • Superior oblique: rotates eyes down and outward
  • Inferior oblique: rotates eyes up and outward

Two important eye-movement metrics in HCI are fixations and saccades. A fixation is a pause of the eye on a location, typically lasting 200–300 ms. A saccade is a rapid jump between fixations, spanning roughly 1° to 45° of visual angle; saccades larger than about 30° usually involve head movement as well.
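As an illustration, fixations and saccades can be separated from raw gaze samples with a simple velocity-threshold classifier (often called I-VT): samples whose angular velocity exceeds a threshold, commonly around 30°/s, are treated as saccadic and the rest as fixational. The sampling rate and threshold below are assumed values for the sketch.

```python
import numpy as np

def classify_ivt(gaze_deg, sample_rate_hz=120.0, velocity_threshold=30.0):
    """Label each gaze sample as part of a fixation or a saccade (I-VT).

    gaze_deg: (N, 2) gaze angles in degrees (horizontal, vertical).
    Returns one 'fixation' / 'saccade' label per sample.
    """
    gaze_deg = np.asarray(gaze_deg, dtype=float)
    dt = 1.0 / sample_rate_hz
    # Angular velocity between consecutive samples (deg/s).
    deltas = np.diff(gaze_deg, axis=0)
    velocities = np.linalg.norm(deltas, axis=1) / dt
    labels = ["fixation"]  # the first sample has no preceding velocity
    labels += ["saccade" if v > velocity_threshold else "fixation" for v in velocities]
    return labels
```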

Categories of Gaze Interaction

Past applications of gaze tracking in HCI can be classified as:

I. Active

Gaze serves as direct input for selection, confirmation, and similar actions, e.g. gaze-based dialing or unlocking.

Apple Vision Pro’s eye-hand coordination also enables active gaze input. See the previous article on the 10 years of eye-hand HCI research behind it.

Gaze input also works for game control, e.g. weapon switching in PSVR 2.

Active input requires high spatial accuracy and low latency from eye tracking.
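As a rough sketch of what such active input looks like in code, assume a hypothetical XR layer that reports, per frame, each target’s angular distance from the gaze ray and whether the hand-tracking stack detected a pinch; gaze hovers a target and a pinch confirms the selection. All names and the 1.5° tolerance are illustrative, not any platform’s actual API.

```python
from dataclasses import dataclass

@dataclass
class Target:
    name: str
    # Angular distance (deg) between the gaze ray and the target center,
    # as reported by a hypothetical hit-testing layer.
    angular_error_deg: float

def select_with_gaze_and_pinch(targets, pinch_detected, max_error_deg=1.5):
    """Return the gazed-at target if the user pinches while gaze rests on it."""
    if not targets:
        return None
    hovered = min(targets, key=lambda t: t.angular_error_deg)
    if hovered.angular_error_deg > max_error_deg:
        return None          # gaze is not close enough to any target
    if not pinch_detected:
        return None          # gaze hover alone does not trigger selection
    return hovered

# Example frame: gaze rests on "call" while the user pinches.
frame_targets = [Target("call", 0.6), Target("hang_up", 4.2)]
print(select_with_gaze_and_pinch(frame_targets, pinch_detected=True))
```

Note how the tolerance directly reflects the tracker’s spatial accuracy: the worse the accuracy, the larger (and less precise) gaze targets must be.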

II. Passive

Optimizing rendering using real-time gaze data. E.g. foveated rendering only renders high resolution at the fovea.
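A toy sketch of the idea, assuming the renderer lets us pick a shading level per screen tile: resolution falls off with angular distance (eccentricity) from the current gaze point. The eccentricity bands below are illustrative, not values from any shipping headset.

```python
import math

def shading_level(tile_center_deg, gaze_deg):
    """Pick a shading level for a screen tile based on eccentricity from gaze.

    tile_center_deg, gaze_deg: (x, y) angular positions in degrees.
    Returns 0 (full resolution) to 2 (coarsest), using illustrative bands.
    """
    eccentricity = math.dist(tile_center_deg, gaze_deg)
    if eccentricity < 5.0:     # foveal region: render at full resolution
        return 0
    if eccentricity < 15.0:    # parafoveal region: half resolution
        return 1
    return 2                   # periphery: quarter resolution
```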

Gaze-contingent varifocal displays adjust optical focus based on gaze. This mitigates the vergence-accommodation conflict (VAC) and the visual fatigue it causes in fixed-focus XR displays.

Meta Varifocal Prototype
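One way such a display can choose its focal distance is from binocular vergence: the angle between the two eyes’ gaze rays implies a fixation depth, which the optics can then track. A minimal sketch, assuming both gaze directions are unit vectors in a shared head coordinate frame and an assumed average interpupillary distance of 63 mm:

```python
import numpy as np

def vergence_depth_m(gaze_dir_left, gaze_dir_right, ipd_m=0.063):
    """Estimate fixation depth (meters) from the angle between the two gaze rays."""
    l = np.asarray(gaze_dir_left, dtype=float)
    r = np.asarray(gaze_dir_right, dtype=float)
    l /= np.linalg.norm(l)
    r /= np.linalg.norm(r)
    cos_angle = np.clip(np.dot(l, r), -1.0, 1.0)
    vergence = np.arccos(cos_angle)      # angle between the two gaze rays (rad)
    if vergence < 1e-4:                  # nearly parallel rays: looking far away
        return float("inf")
    return (ipd_m / 2.0) / np.tan(vergence / 2.0)

# A varifocal display would low-pass filter this estimate and drive its optics toward it.
```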

Passive use cases demand high temporal resolution from eye tracking: a high sampling rate and, ideally, end-to-end latency under 30 ms.

III. Expressive & IV. Diagnostic

Driving digital avatars with gaze data creates more natural eye contact and emotional expression.

E.g. Apple’s EyeSight feature reconstructs the user’s eyes and gaze on Vision Pro’s external display to improve social connection.
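As a small illustration of the expressive case, a gaze direction can be converted to yaw/pitch angles that drive an avatar’s eye joints, with a little smoothing so the virtual eyes do not jitter with tracker noise. The smoothing factor and coordinate conventions are assumptions for the sketch.

```python
import math

def gaze_to_eye_angles(gaze_dir, prev_angles=(0.0, 0.0), smoothing=0.3):
    """Convert a unit gaze direction (x right, y up, z forward) into smoothed
    yaw/pitch angles (degrees) for an avatar's eye joints."""
    x, y, z = gaze_dir
    yaw = math.degrees(math.atan2(x, z))                     # left/right rotation
    pitch = math.degrees(math.atan2(y, math.hypot(x, z)))    # up/down rotation
    # Exponential smoothing against eye-tracker noise.
    prev_yaw, prev_pitch = prev_angles
    yaw = prev_yaw + smoothing * (yaw - prev_yaw)
    pitch = prev_pitch + smoothing * (pitch - prev_pitch)
    return yaw, pitch
```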

Diagnostic applications, by contrast, record gaze for offline analysis, e.g. usability studies and attention research, rather than driving real-time interaction. Both expressive and diagnostic uses place lower demands on eye tracking than interactive use.

Eye Tracking Performance Requirements

XR gaze interaction promises more natural, comfortable and immersive user experience, but also imposes higher demands on eye tracking accuracy and latency.

Key metrics are **spatial resolution** (accuracy and precision) and **temporal resolution** (sampling rate and end-to-end latency).
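For reference, these spatial metrics are typically measured while the user fixates a known target: accuracy is the mean angular offset of the gaze samples from the target, and precision is often reported as the RMS of sample-to-sample angular differences. A small sketch, treating gaze and target as 2D angular positions in degrees (a small-angle approximation):

```python
import numpy as np

def accuracy_deg(gaze_samples_deg, target_deg):
    """Mean angular offset between gaze samples and the known fixation target."""
    offsets = np.linalg.norm(np.asarray(gaze_samples_deg, dtype=float)
                             - np.asarray(target_deg, dtype=float), axis=1)
    return float(offsets.mean())

def precision_rms_deg(gaze_samples_deg):
    """RMS of sample-to-sample angular differences (a common precision measure)."""
    diffs = np.diff(np.asarray(gaze_samples_deg, dtype=float), axis=0)
    step_angles = np.linalg.norm(diffs, axis=1)
    return float(np.sqrt(np.mean(step_angles**2)))
```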

Current XR eye tracking capabilities are compared in a table in the original article.

---

In summary, XR gaze interaction remains an active research area, as existing solutions cannot yet match the spatial and temporal sensitivity of the human eye, especially for headset-integrated eye tracking. Apple Vision Pro pushes the state of the art for consumer devices and may enable new gaze-based experiences. Further advances in this space can lead to more natural and intuitive XR interactions.

---

See original article for references.
