Basics of using eye-tracking in Hololens 2

Gaze is a very powerful resource that we, as UX designers, should understand when designing an experience for mixed reality if we want users to find the designs we propose satisfying and convincing.

Plain Concepts Design Team
Plain Concepts
Oct 15, 2021 · 8 min read


Designing an intuitive user interface in mixed reality can be complicated, especially because, as designers, it is sometimes difficult for us to get access to this type of device to test the interactions we propose. In this article, we share some tips to keep in mind when designing gaze-based user interactions.

Overview of HoloLens 2

At Plain Concepts, the first Microsoft mixed reality partner in Spain, we specialize in developing customized solutions for companies that decide to push ahead with technology.

We design applications for HoloLens 2, Microsoft's mixed reality glasses. Released in 2019, they are equipped with eye-tracking and hand-tracking technology, which means they can detect the hands without any additional device. This gives the user a sense of freedom, because they don't have to hold any controllers to interact.

Short video by Javier Cantón, Research Team Lead at Plain Concepts, in which he describes how it feels to try HoloLens 2 for the first time.

The HoloLens glasses have their own processor, and the battery lasts approximately two to three hours. They are ergonomically designed, and people who wear prescription glasses can use both without a problem.

Mixed reality in HoloLens 2 is displayed through a small rectangular window built into the headset. This rectangular screen, known as the holographic display, lets the user see digital content integrated into the room they are in at that moment.

Thanks to the incorporation of eye-tracking, HoloLens 2 can be used by different user profiles.

The glasses have a calibration system that makes the experience more comfortable: both the alignment of the holograms and hand tracking are adjusted to the physical characteristics of the user. This calibration is necessary, for example, the first time the glasses are used or when the calibration profiles have been deleted from the device.

As with voice commands, whenever eye-tracking functionality is activated, the user has to accept the corresponding permissions.

Head tracking and Eye-tracking

HoloLens 2 has two key ways to detect the user's intention and focus:

  • Head tracking (present in HoloLens 1 and 2).
  • Eye-tracking (only available in HoloLens 2).

Both technologies help us understand the user's field of view (FOV).

Head tracking

Head tracking allows the glasses to orient themselves in the wearer's environment, detect the position of the head, and determine the position of holograms in the real world. When the user logs in, the holograms are positioned relative to the origin, known as (0, 0, 0).

For the user to realize that head tracking is being used, we should show a visual cue or cursor, which should be smaller than the objects so that it does not interfere with the action.

We should not rely on head tracking to know where the user is looking, because their eyes may be pointed elsewhere.

Using head tracking for long periods can cause fatigue and discomfort, especially for people with neck problems. It is also a slow type of interaction for the user, but it can be very useful when the interaction requires a lot of precision.

Types of head tracking

Head gaze and commit. The user moves their head towards a specific object and complements the action with a voice command or a gesture.

Head gaze and dwell. This interaction is suited for cases in which the user cannot use voice or gestures. When the user keeps their head pointed at the object for a period of time, the action is triggered, supported by a cursor that has some visual prominence.
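To make the difference between the two modes concrete, here is a minimal sketch in plain Python (it is not tied to any HoloLens or MRTK API; the target class, the ray hit test and the one-second dwell time are illustrative assumptions of ours): commit requires an explicit confirmation event, while dwell fires after the head ray has rested on the target long enough.

```python
import math
import time

DWELL_TIME = 1.0  # seconds the head ray must stay on a target (illustrative value)


class GazeTarget:
    """A hypothetical selectable object, hit-tested against the head ray."""

    def __init__(self, name, position, radius):
        self.name = name
        self.position = position  # (x, y, z) in meters
        self.radius = radius      # rough selection radius in meters

    def is_hit(self, ray_origin, ray_direction):
        # Point-to-line distance between the target center and the head ray.
        p = [t - o for t, o in zip(self.position, ray_origin)]
        d = ray_direction
        cross = (p[1] * d[2] - p[2] * d[1],
                 p[2] * d[0] - p[0] * d[2],
                 p[0] * d[1] - p[1] * d[0])
        dist = math.sqrt(sum(c * c for c in cross)) / math.sqrt(sum(c * c for c in d))
        return dist <= self.radius


def head_gaze_and_commit(target, ray_origin, ray_direction, commit_event):
    """Fires only while the head points at the target AND the user confirms
    with a voice command or gesture, so nothing triggers by accident."""
    return target.is_hit(ray_origin, ray_direction) and commit_event


class HeadGazeDwell:
    """Fires once the head ray has stayed on the target for DWELL_TIME seconds."""

    def __init__(self, target):
        self.target = target
        self.dwell_start = None

    def update(self, ray_origin, ray_direction, now=None):
        now = time.monotonic() if now is None else now
        if not self.target.is_hit(ray_origin, ray_direction):
            self.dwell_start = None           # looked away: reset the progress
            return False
        if self.dwell_start is None:
            self.dwell_start = now            # just arrived on the target
        return (now - self.dwell_start) >= DWELL_TIME
```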

Eye-tracking

Eye-tracking is a technology that allows us, as designers, to understand which area the user is focusing on and what their intention is. It works by following the path of the eyes and has several interaction uses in HoloLens 2. The device has eye-tracking cameras and, at login, performs an iris scan to authenticate the user.

We could say that in HoloLens 2 the gaze works as an input device, much like the mouse does on a computer.

For the most accurate and user-friendly interaction, the object should be no more than 2 meters away and its size should be between 5 and 10 cm.

Microsoft recommendation for target size at a distance of 2 meters.
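To get an intuition for those numbers, it helps to convert size and distance into visual angle, which is what the eye tracker actually resolves. The snippet below is plain geometry, not an official Microsoft formula:

```python
import math

def visual_angle_deg(size_m, distance_m):
    """Visual angle (in degrees) subtended by an object of a given size at a given distance."""
    return math.degrees(2 * math.atan(size_m / (2 * distance_m)))

# The 5-10 cm targets at 2 m recommended above subtend roughly 1.4-2.9 degrees:
print(round(visual_angle_deg(0.05, 2.0), 2))  # ~1.43
print(round(visual_angle_deg(0.10, 2.0), 2))  # ~2.86
```

The same function can be used to check whether a target placed closer or farther than 2 meters still covers a comparable angle.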

When the interactions we design are based on eye-tracking, we use attentive holograms: objects that, when looked at, show an effect to inform the user that the system has correctly interpreted what they are looking at. We do not use a cursor, because it can distract the user; it is better to use subtle highlights or small animations to indicate that the hologram is being targeted.

Types of eye gaze

Eye gaze and commit
As with head gaze and commit, eye gaze and commit means that the user looks at an object, keeps their gaze on it, and confirms the action with a voice command or gesture.

Eye gaze and dwell
We can use it to select objects without gestures or voice. The gaze is held on the object for a certain time, the object shows that it is being looked at through subtle animations or color changes, and after that period of time the activation is triggered. Deactivation works the same way: the user looks at the object, holds their gaze, and it is deselected.
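Below is a minimal sketch of that dwell toggle in plain Python (again, this is not HoloLens API code; the 0.8-second dwell, the cooldown and the progress value are our own assumptions). The `dwell_progress` value is meant to drive a subtle highlight or color change on the attentive hologram rather than an explicit cursor.

```python
import time

DWELL_TIME = 0.8   # assumed seconds of sustained gaze needed to toggle selection
COOLDOWN = 0.5     # assumed pause before the same object can toggle again


class AttentiveHologram:
    """Toggles between selected and deselected after a sustained eye gaze."""

    def __init__(self):
        self.selected = False
        self._gaze_start = None
        self._last_toggle = float("-inf")

    @property
    def dwell_progress(self):
        """0.0-1.0 value to drive a subtle highlight animation while the gaze dwells."""
        if self._gaze_start is None:
            return 0.0
        return min((time.monotonic() - self._gaze_start) / DWELL_TIME, 1.0)

    def update(self, is_gazed_at):
        """Call every frame with whether the eye gaze ray currently hits this object."""
        now = time.monotonic()
        if not is_gazed_at:
            self._gaze_start = None            # gaze left the object: reset progress
            return
        if self._gaze_start is None:
            self._gaze_start = now             # gaze just arrived on the object
        dwell_done = (now - self._gaze_start) >= DWELL_TIME
        if dwell_done and (now - self._last_toggle) >= COOLDOWN:
            self.selected = not self.selected  # the same dwell selects and deselects
            self._last_toggle = now
            self._gaze_start = None            # require a fresh dwell for the next toggle
```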

Microsoft Recommendations for Designing Dwell States

Pros and cons of using eye-tracking

Pros

  • It is very easy to use. We always use our gaze when interacting with any system. In addition, the eye muscles are very fast, which makes for great speed of interaction.
  • Interaction with virtual reality devices sometimes produces arm fatigue; interacting through eye-tracking helps prevent the discomfort of working against gravity.
  • Interaction can be faster with the gaze than with the hands.
  • It is hygienic, especially in times of COVID-19, because it allows us to interact without having to touch anything.
  • It guarantees that we have the user's attention, which is also very interesting from a UX point of view because it tells us what the user is paying attention to. We could even go a step further and detect mood, feelings…
  • It allows the user to keep their hands free. There are times when they cannot use their hands to interact with the system.
  • Eye-tracking tends to produce a “wow” effect on users; many of them get the feeling that it is magic or that the system is reading their minds.

Cons

  • Not everyone is able to control the movement of their eyes, and involuntary movements sometimes occur. The interface should separate involuntary eye movements from intentional gaze gestures (see the sketch after this list).
  • There is an effect known in virtual reality as the “Midas touch”: sometimes it is difficult for the system to decide whether the user is simply inspecting the scene or really wants to perform an action, which is why objects are sometimes activated by mistake.
  • This technique is not suitable if we need to design an experience in which the user has to manipulate small objects or interact with precision.
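One common way to deal with the first two points is to separate fixations (the eye resting on a spot, which we treat as intentional) from saccades (the very fast jumps between spots) using a simple angular-velocity threshold, and to only let fixations drive the interface. The sketch below is a generic velocity-threshold filter in Python; the threshold and minimum duration are illustrative values, not figures from the HoloLens documentation.

```python
import math

VELOCITY_THRESHOLD_DEG_S = 100.0   # above this angular speed we assume a saccade (illustrative)
MIN_FIXATION_DURATION_S = 0.15     # ignore fixations shorter than this (illustrative)


def angular_distance_deg(dir_a, dir_b):
    """Angle in degrees between two unit gaze-direction vectors."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(dir_a, dir_b))))
    return math.degrees(math.acos(dot))


def detect_fixations(samples):
    """samples: list of (timestamp_s, unit_gaze_direction) tuples.
    Returns (start, end) time spans where the eye was roughly stationary."""
    fixations = []
    start = None
    for (t0, d0), (t1, d1) in zip(samples, samples[1:]):
        dt = t1 - t0
        velocity = angular_distance_deg(d0, d1) / dt if dt > 0 else float("inf")
        if velocity < VELOCITY_THRESHOLD_DEG_S:
            if start is None:
                start = t0                     # a fixation begins
        else:
            if start is not None and (t0 - start) >= MIN_FIXATION_DURATION_S:
                fixations.append((start, t0))  # the fixation just ended
            start = None                       # saccade: these samples are ignored
    if start is not None and (samples[-1][0] - start) >= MIN_FIXATION_DURATION_S:
        fixations.append((start, samples[-1][0]))
    return fixations
```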

Which situations are the most suitable for using eye-tracking in HoloLens 2?

  • To select holograms.
  • To move through a scene and for the user to position themselves in it.
  • For general navigation.
  • For reading texts: it is very pleasant for the user to see how, when they reach the end of a paragraph, the system scrolls smoothly and in a controlled way so they can keep reading.
  • To pan and zoom on maps; like the previous point, it is a surprising interaction for the user and therefore full of possibilities.
  • To select large objects.
  • When we need to create heat maps, because a lot of information is obtained about what draws the user's attention (see the sketch after this list).
    Heat map showing the amount of time the user has been looking at the object, changing color as time progresses.
  • To highlight elements, giving interaction feedback.
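As a sketch of how such a heat map can be built up: accumulate, for each small cell of the object's surface, how long the gaze has rested on it, and map that time to a color ramp. The cell size, the time scale and the blue-to-red ramp below are arbitrary illustrative choices.

```python
from collections import defaultdict

CELL_SIZE = 0.02   # 2 cm cells on the object's surface (arbitrary resolution)


class GazeHeatmap:
    """Accumulates gaze time per surface cell and maps it to a color."""

    def __init__(self):
        self.seconds_per_cell = defaultdict(float)

    def add_sample(self, hit_point_uv, frame_dt):
        """hit_point_uv: (u, v) surface coordinates of the gaze hit; frame_dt: frame time in s."""
        cell = (int(hit_point_uv[0] // CELL_SIZE), int(hit_point_uv[1] // CELL_SIZE))
        self.seconds_per_cell[cell] += frame_dt

    def color_for(self, cell, max_seconds=5.0):
        """Blue (barely looked at) fading to red (looked at for a long time)."""
        t = min(self.seconds_per_cell[cell] / max_seconds, 1.0)
        return (int(255 * t), 0, int(255 * (1 - t)))  # (R, G, B)
```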

What aspects should we pay special attention to when using eye-tracking?

  • The gaze is an element that is always active. We must pay special attention when designing the experience, because it can be very frustrating for the user if an action is triggered when they have looked at an object for a certain time without any intention of activating it.
    The feedback should be gentle; for this, Microsoft proposes what we mentioned earlier, the use of “attentive holograms”.
    In general, in AR/VR/MR experiences it is necessary to be subtle and restrained with the effects we offer the user.
  • If the size of the target is too small. As mentioned before, eye-tracking is not recommended for selections that require precision, because we can fatigue the user in the same way as when they try to read a text set in a font that is too small.
  • Light changes. We must keep in mind that the user may lose precision in the interaction until their vision adapts to the change in lighting. Sometimes the user must recalibrate the device.
  • If we need the user to perform tasks that require smooth movements, such as drawing or annotating, eye gaze is not recommended; it is easier to use hand gestures or head movement.

Eye gaze targeting vs Head gaze

To summarize…

Eye-tracking is a very powerful input method. It gives the user the feeling that the system is reading their mind and, if well designed, gives the experience a magical feel.

The use of eye-tracking helps the user avoid arm fatigue: the kind of gestures we remember Tom Cruise making in the movie Minority Report would exhaust the user, and it is a trap we can easily fall into as designers working with mixed reality devices.

And for interactions that require precision, it is best not to use it, because the eyes constantly make many very fast movements.

This article was previously published on the Plain Concepts blog.

If you feel like it, you can follow our UX/UI channel on Medium ;)
