Eye Tracking and User Interaction in a Spatial Interface
Although spatial interface design is itself a nascent topic, it is hard to shake the feeling that the field is incomplete on both the software and the hardware sides. In this article we will explore some of the issues that currently afflict users of virtual and augmented reality, and then introduce some concepts for a more idealized user experience that utilizes eye tracking, among other technologies, and new interaction patterns alongside what we currently have.
The fact that there are so few truly useful productivity applications in VR and AR as of mid-2017 can be attributed to a variety of issues.
First, large-scale software development for spatial environments has only just begun. Until a couple of years ago it was a true niche, and even now it can barely be considered thriving. The Oculus Rift got the hype going, the HTC Vive brought accurate room-scale experiences, and Microsoft’s HoloLens gave us the first truly viable AR platform. Now Apple has joined the fray with ARKit and by purchasing SMI, probably the world’s leading eye tracking company (Tobii might disagree).
Second, it is extremely difficult to change some of the fundamental assumptions we make about how software works and how people use it. There’s also very little data thus far to lead us in a given direction.
Third, many VR/AR software companies are developer-heavy, when at this nascent stage we really need far more spatial UX and UI folks working to figure out the best practices for the new modalities. At the moment there is little guidance. There is no Bootstrap for VR. There is no Sketch or Adobe XD for rapidly sketching out design patterns. Those applications simplify, and make more accessible, patterns that have been shown to work well; since things are still so much in flux, we have nothing yet to simplify.
Fourth, the hardware technology itself is still quite early, with new capabilities being added at a rapid pace. This makes it hard for organizations to keep up with the changing capabilities and needs.
Fifth, the development software itself is either very young or has been retrofitted. The most advanced tools are game engines: Unreal and Unity.
Lastly, most regular designers and developers do not have the skill set to create engaging 3D interactive experiences. Those who do tend to be game developers, who are generally less interested in creating productivity software or ergonomic user interfaces.
All of these can be summarized by noting that the industry is early and the truly fundamental, foundational pieces are still falling into place. Let’s look into some of the problems users face, and then at some possible solutions.
Biomechanics and Physics
Gravity is a fickle mistress. Granted, we couldn’t survive without it, but we sure get beaten up and worn out by the constant and inexorable pull toward the center of the Earth. Any interface that requires you to hold your arms up is not a good interface for productivity in the long run. On top of this we run into the problem of standing versus sitting.
Despite many studies showing that sitting is killing us, I for one don’t plan on working standing up for 8, 10, 12, or more hours a day. I will be ensconced in a chair. This means a chair is in the way of my arms if I’m trying to reach something in the virtual environment. I might also have a desk in the way, or the floor if the target sits low. Many things will be out of reach, and that’s just no good.
Stability is also an issue. The freedom that comes with three-axis tracking of motion controllers or hands carries a caveat: we’re physiologically twitchy, and what’s more, the tech isn’t actually that accurate when compared to a mouse. To illustrate the point, let’s do a test: pick something about ¼ to ½ inch in diameter, located at arm’s length and shoulder height. Try to hold the index finger of your primary hand precisely over that item for 10 seconds without moving and without touching it.
Now try pushing your finger against the item and holding it there. Finally, try holding a mouse cursor over a single letter on the screen for 10 seconds. If you’re a normal human, the first was pretty shaky, the second less shaky, and the third not shaky at all. We rely on that physical resistance for precision. Our heads aren’t as unsteady as our outstretched arms, but our eyes are even worse than our hands.
Depth perception is another issue. Despite the stereoscopic displays used in current HMDs, it’s challenging to be precise along the z-axis: the focal plane isn’t right, and things generally don’t appear at the proper scale. HoloLens fares better than the VR headsets in this respect, but it also has the advantage of a real physical environment to work from. Let’s look at some things that can help the situation. Don’t be discouraged!
I Need My Fix
I have long considered an AR HMD with eye and hand tracking, coupled with a motion-tracked stylus, a mouse and keyboard, and a physical monitor, to be the gold standard of productivity. The mouse and keyboard are not easily matched for precision and ergonomics over long periods of use.
Research and Define Best Practices
As we mentioned, at the moment the fundamentals of design in spatial interfaces are fluid and changeable. Every developer is figuring it out alone, and over time we will arrive at standards. It behooves us to accelerate this process by focusing specifically on these practices, testing and investigating various ideas, and measuring the outcomes.
The more of us doing that, the faster we’ll have a set of general rules that can form the foundation of best-practice guidelines. At Holographic Interfaces we do a lot of internal experimentation, each experiment intended to address some piece of functionality currently available to users of normal computers, tablets, and laptops.
Promote Training for Designers and Developers
It’s a huge mistake to throw developers and designers into a 3D environment without proper training and expect them to figure it out on the fly. Doing so will inevitably blow budgets and timelines out of scope and risks damaging the prospects of future projects in this space, both within that organization and across the industry.
Easy and Accurate Typing
When using our productivity gold-standard setup this is easy: just type on your keyboard. In any other scenario it is a tedious endeavour at best. Eye tracking affords us the opportunity to use a swipe-style keyboard, where we mark the beginning of a word, sweep across the needed letters with our eyes, and mark the end of the word with a click or a blink. [Edit: before we could even publish, Microsoft added such a feature to Windows 10! Check it out: https://www.youtube.com/watch?v=X1UWxvUCkPU]
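To make the idea concrete, here is a minimal, illustrative sketch of how a gaze-swipe decoder might work once the start and end of a word have been marked: the noisy sequence of fixated keys is matched against a dictionary, accepting any word whose letters appear in order along the gaze path. The tiny dictionary and the scoring rule are assumptions for illustration, not a production model.

```python
# Minimal sketch of a gaze-swipe word decoder. The user marks the start of
# a word, sweeps their gaze across the letters, and marks the end with a
# click or blink; we then match the fixated keys against a word list.

def is_subsequence(word, gaze_keys):
    """True if the word's letters appear, in order, within the gaze path."""
    it = iter(gaze_keys)
    return all(ch in it for ch in word)

def decode(gaze_keys, dictionary):
    """Return candidate words, longest (most path-explaining) first."""
    candidates = [w for w in dictionary if is_subsequence(w, gaze_keys)]
    return sorted(candidates, key=len, reverse=True)

DICTIONARY = ["hello", "help", "hole", "lo"]  # stand-in word list
path = list("heqllo")  # noisy gaze path with a stray fixation on 'q'
print(decode(path, DICTIONARY))  # ['hello', 'lo']
```

A real decoder would weight fixation durations and key distances rather than requiring an exact subsequence, but the shape of the problem is the same.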
Understanding User Intent
Knowing what a user wants to do is a complicated thing. A major part of it is telling the user what we *think* he wants to do before he does it, and letting him correct us as necessary; this is what mouse-over effects on websites are all about. Another thing we can do, once eye tracking is in vogue, is utilize multiple tracking points. For example, say we want to select all the items in a traditional folder. We could look to the upper left of the window, hold the left mouse button, look to the lower right to form the selection rectangle, and let go of the button.
I hear you complaining that this is easily done with a mouse already. That is true. However, with this methodology you can perform the same motion using motion controllers or gamepads, and you can do so in three dimensions rather than in simple 2D. Or you may wish to confirm the user’s intent by requiring both a look and a grab/click for certain functions. Using eye tracking in conjunction with feedback about what’s selected, confirmed with a hand motion, will go a long way toward alleviating these infernal vagaries!
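The gaze-anchored selection described above can be sketched in a few lines. This is an assumed, simplified 2D model: the gaze point sampled at button-press becomes one corner of the rectangle, the gaze point at release becomes the other, and anything inside is selected. The item names and coordinates are hypothetical.

```python
# Sketch of gaze-anchored rectangle selection: look at one corner, press;
# look at the opposite corner, release. (2D for brevity; the same idea
# extends to a 3D selection volume.)

def selection_rect(gaze_at_press, gaze_at_release):
    """Normalize two gaze points into (left, top, right, bottom)."""
    (x1, y1), (x2, y2) = gaze_at_press, gaze_at_release
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))

def select_items(items, rect):
    """items: dict of name -> (x, y) position. Returns names inside rect."""
    left, top, right, bottom = rect
    return {name for name, (x, y) in items.items()
            if left <= x <= right and top <= y <= bottom}

items = {"a.txt": (10, 10), "b.txt": (50, 40), "c.txt": (200, 200)}
rect = selection_rect((5, 5), (60, 60))  # look upper-left, press; look lower-right, release
print(select_items(items, rect))         # selects a.txt and b.txt, not c.txt
```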
As touched on in a previous article, visual complexity in spatial interfaces should be kept to a minimum, but to do so requires intelligently understanding user intentions and eventually utilizing an AI to regulate the interface the user is presented with at a given moment.
In order for us, and for the AI, to understand user intent we need to track and measure user behavior, and eye tracking affords us a wealth of information about it. We know what the user is looking at. We know what word he’s reading. We know how long he’s looking at an image. We know how much he hunts around before clicking a certain button or object, which indicates it wasn’t immediately visible to him. Combined with general HMD motion and hand motion, we can infer his posture and energy levels. We can even see whether something scares the user by watching pupil patterns.
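A basic building block for all of these measurements is turning raw gaze samples into per-object dwell times. The sketch below assumes a hypothetical hit-test that already maps each gaze sample to the object it lands on; a long dwell before a click is the "hunting" signal mentioned above.

```python
# Sketch: aggregate raw gaze samples into per-object dwell times. Each
# sample is (timestamp_seconds, object_id), sorted by time, from an
# assumed hit-test of the gaze ray against the scene.

def dwell_times(samples):
    """Return object_id -> total seconds the gaze rested on it."""
    totals = {}
    # Each sample 'owns' the interval until the next sample arrives.
    for (t0, obj), (t1, _next_obj) in zip(samples, samples[1:]):
        totals[obj] = totals.get(obj, 0.0) + (t1 - t0)
    return totals

samples = [(0.0, "menu"), (0.4, "menu"),
           (0.8, "save_btn"), (1.0, "save_btn"), (1.5, "save_btn")]
print({k: round(v, 2) for k, v in dwell_times(samples).items()})
# {'menu': 0.8, 'save_btn': 0.7}
```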
Sure, these capabilities will be used by groups like Google to fill your head with their paid-for advertisements. But they will also be used by developers to make your computing experiences substantially easier, more pleasurable, and frictionless.
Using the Z-Axis
As we discussed in our article on spatial databases and spreadsheets ( link ), using eye tracking to navigate through a third dimension of content offers far-reaching power to manipulate a greater volume of content and applications more intuitively. In one concept I visualized concentric spheres of program content that can be rapidly sifted through using eye tracking. Combined with the naturally increased FOV of a headset, this technique would allow the navigation of dozens of ‘windows’ without so much as a click.
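One way the concentric-spheres concept could be driven is by gaze depth: some eye trackers can estimate a focus distance from the convergence of the two eyes. The sketch below assumes such a vergence-based reading is available (an assumption, not a given on today's hardware) and simply activates whichever content shell is nearest to where the eyes converge, with no click required.

```python
# Sketch: selecting a concentric 'shell' of windows by gaze depth. We
# assume the tracker reports a vergence-based focus distance in metres
# (hypothetical reading); each shell sits at a known radius, and the
# nearest shell becomes the active one.

SHELL_RADII = [0.5, 1.0, 2.0, 4.0]  # metres from the user's head

def active_shell(focus_distance):
    """Index of the content shell closest to the eyes' convergence depth."""
    return min(range(len(SHELL_RADII)),
               key=lambda i: abs(SHELL_RADII[i] - focus_distance))

print(active_shell(0.6))  # 0 -> innermost shell
print(active_shell(2.8))  # 2 -> third shell out
```

In practice the raw depth signal would need smoothing and hysteresis so the interface doesn't flicker between shells, but the mapping itself is this simple.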
Combining Screens and Virtual
Eye tracking allows us to know where the user is looking, and what may be non-obvious is that this applies to both screens and virtual objects. If we consider a design like ours above, where we’ve moved the Photoshop menus and a reference document into virtual space, the system can track what the user is interested in at that moment, irrespective of how it’s being displayed.
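Architecturally, this suggests a focus router that treats physical monitors and virtual panels uniformly: each surface registers itself, and whatever region the gaze ray currently hits receives focus. The sketch below is a hypothetical structure (the region names and handlers are invented for illustration), with the gaze hit-test reduced to a region name.

```python
# Sketch: a focus router that treats monitors and virtual panels the same.
# A hypothetical gaze hit-test reports which registered region the user is
# looking at; the router dispatches focus to that surface's handler.

class FocusRouter:
    def __init__(self):
        self.handlers = {}  # region name -> focus callback

    def register(self, region, handler):
        self.handlers[region] = handler

    def on_gaze(self, region):
        """Called with the region the gaze ray currently hits (or misses)."""
        handler = self.handlers.get(region)
        return handler(region) if handler else None

router = FocusRouter()
router.register("monitor", lambda r: f"focus: Photoshop canvas on {r}")
router.register("virtual_menu", lambda r: f"focus: floating menu panel ({r})")
print(router.on_gaze("virtual_menu"))  # focus: floating menu panel (virtual_menu)
```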
The mistake with something as complex as a new computing platform is to think it will reach the maturity of Windows 10 or iOS overnight. This is going to take a long time, and it’s going to be a labor of love for the vast majority of folks for years to come.
The biggest danger with problems is not realizing that they exist. We hope we’ve helped lay bare some of the challenges the spatial industry faces, and perhaps offered a miniature mental model for how to address some of them. If you have any questions, would like to discuss these issues, or need us to design and develop a VR or AR project, please email us at email@example.com