Designing the Perfect VR Keyboard for Hand Tracking

Lessons learned from building six (now seven) VR hand tracking prototypes.

Danny Yaroslavski
Aug 24, 2020

The Wild West for UX Design

VR hand tracking is still in its infancy, and when it comes to designing UX for it, it’s very much the Wild West.

One of the biggest unsolved questions with hand tracking is text entry. Unlike physical VR controllers, hand tracking creates several challenges for UX. Now that the Oculus Quest browser has released its WebXR API for hand tracking, I took a stab at tackling the problem.

I’ll be going over the current limitations of hand tracking, the prototypes I built, and the key takeaways from it all.

To note: the Oculus Quest does have a built-in option for text entry using pinching & raycasting. It’s a little clunky, slow, and far from ideal, but I have some guesses for why it’s the default: likely because of its similarity to the headset’s controller method for input, as well as the Oculus Quest’s limitations on detecting most other poses reliably (more on that below).

Hand Tracking Limitations: Occlusion and Jitter

Before starting, it’s important to cover what some of the limitations are when it comes to hand tracking, specifically on the Oculus Quest.

Designing for hand tracking in VR isn’t so much about ideals as it is about tradeoffs.

It’s easy to conceive what the ideal UX might be by imagining the interaction outside of a VR headset. Unfortunately, the very real limitations of the hardware and software can make many of those ideas dead on arrival.

  • Hand-on-hand occlusion: Overlapping hands will lose tracking for both hands. Implication: No typing on the opposite hand.
  • Finger occlusion: With hands facing downwards, your ring and pinky fingers are often occluded by the fingers in front of them just enough that hand tracking can’t properly track them. Implication: No touch typing in general.
  • Pose occlusion: Pointing with your hand and then angling it forward (or worse, forward and down) loses tracking for that hand if the index finger is too closely in line with the headset’s view. Implication: No finger ‘guns’.
  • Jitter: Hand and finger positioning is still fairly jittery and gets worse in poor lighting; even in the best conditions, it’s not uncommon for a static pose to jitter entire centimeters between frames. Implication: Minimize small, precise movements (one common mitigation is sketched after this list).
  • Raycast jitter: If you use any part of the hand to raycast, that jitter gets even more pronounced the farther the ray travels from the hand or fingers. Implication: Seriously, no finger ‘guns’.
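
As an aside, the usual mitigation for jitter is to low-pass filter joint positions before using them. Here’s a minimal sketch in TypeScript with three.js (which the prototypes use); the class name and alpha value are illustrative, not taken from the prototypes:

```typescript
import * as THREE from 'three';

// Minimal exponential low-pass filter for a tracked joint position.
// alpha close to 1 trusts the raw (jittery) data; closer to 0 smooths
// more at the cost of added latency. The value here is a guess, not tuned.
class JointSmoother {
  private smoothed = new THREE.Vector3();
  private initialized = false;

  constructor(private alpha = 0.35) {}

  update(raw: THREE.Vector3): THREE.Vector3 {
    if (!this.initialized) {
      this.smoothed.copy(raw);
      this.initialized = true;
    } else {
      // lerp(raw, alpha): smoothed = smoothed * (1 - alpha) + raw * alpha
      this.smoothed.lerp(raw, this.alpha);
    }
    return this.smoothed;
  }
}

// Usage: feed the raw index fingertip position in every frame.
const indexTipSmoother = new JointSmoother();
// const stablePos = indexTipSmoother.update(rawIndexTipPosition);
```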

Prototypes

I created 6 (update: 7) little demos over the course of a week or so. The first few prototypes felt awfully gimmicky, but by the last few, I could see myself actually using them day to day.

Tech stuff: All of the prototypes run in the Oculus Quest browser. They were built with three.js and TypeScript, and use the WebXR hand tracking API.
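
For the curious, here’s roughly what the browser side of this looks like. This is a minimal sketch against the current WebXR Hand Input API (requires @types/webxr); the experimental API the Quest browser shipped in 2020 indexed joints a little differently, so treat the specifics as illustrative:

```typescript
// Minimal sketch of reading hand joint poses via the WebXR Hand Input API.
async function initHandTracking(): Promise<void> {
  const session = await navigator.xr!.requestSession('immersive-vr', {
    optionalFeatures: ['hand-tracking'],
  });
  const refSpace = await session.requestReferenceSpace('local');

  session.requestAnimationFrame(function onFrame(_time, frame) {
    for (const source of frame.session.inputSources) {
      const hand = source.hand; // undefined when controllers are in use
      if (!hand) continue;

      const tipSpace = hand.get('index-finger-tip');
      const pose = tipSpace && frame.getJointPose(tipSpace, refSpace);
      if (pose) {
        const { x, y, z } = pose.transform.position;
        // Feed (x, y, z) into hit testing / smoothing for the keyboard.
      }
    }
    frame.session.requestAnimationFrame(onFrame);
  });
}
```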

Draggable Keys

Use the left hand to drag the row of characters, and select each character by tapping with the right hand.

Numbered Clusters

Hold up a number of fingers with the left hand to choose a row of characters, and use the right hand to pick out a character in that row.
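
One way a prototype like this might detect how many fingers are raised is a simple distance heuristic: a finger counts as extended when its tip is farther from the wrist than its middle knuckle. This sketch is my own illustration (the 1.15 factor is a guess), not the prototype’s actual logic:

```typescript
import * as THREE from 'three';

// Per-finger joint positions, assumed pre-extracted from the hand pose.
interface FingerJoints {
  tip: THREE.Vector3;
  middle: THREE.Vector3; // proximal interphalangeal joint
}

// A finger is "extended" when its tip is meaningfully farther from the
// wrist than its middle knuckle. The 1.15 margin absorbs some jitter.
function countExtendedFingers(
  wrist: THREE.Vector3,
  fingers: FingerJoints[],
): number {
  return fingers.filter(
    (f) => f.tip.distanceTo(wrist) > f.middle.distanceTo(wrist) * 1.15,
  ).length;
}
```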

Vertical QWERTY

A QWERTY keyboard, but laid out vertically in space so that down-strokes can be used to select characters.

SciFi Keys

Select a character by hovering over a cluster and swiping in one of four directions.
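
Classifying a four-direction swipe can be as simple as comparing the dominant displacement axis between where the gesture started and where it ended. The sketch below is illustrative; the distance threshold is a guess rather than the prototype’s tuned value:

```typescript
import * as THREE from 'three';

type SwipeDir = 'up' | 'down' | 'left' | 'right';

// Classify a swipe by its dominant axis in the keyboard's plane.
// start/end are fingertip positions sampled at gesture start and release.
function classifySwipe(
  start: THREE.Vector3,
  end: THREE.Vector3,
): SwipeDir | null {
  const dx = end.x - start.x;
  const dy = end.y - start.y;
  if (Math.max(Math.abs(dx), Math.abs(dy)) < 0.03) return null; // too small
  if (Math.abs(dx) > Math.abs(dy)) return dx > 0 ? 'right' : 'left';
  return dy > 0 ? 'up' : 'down';
}
```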

Morse Code

Use the right hand’s index and middle fingers to tap out morse code, and the left hand’s index and middle fingers to either advance or delete characters.
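
Decoding the taps comes down to timing: short contacts become dots, long ones dashes, and a separate gesture commits the current letter. A minimal sketch, with the threshold and the (truncated) code table as illustrative assumptions rather than the prototype’s values:

```typescript
// Partial Morse table; extend with the full alphabet as needed.
const MORSE: Record<string, string> = {
  '.-': 'a', '-...': 'b', '-.-.': 'c', '-..': 'd', '.': 'e',
};

// Contacts shorter than this become dots; longer ones become dashes.
const DASH_THRESHOLD_MS = 250;

let current = '';

// Called with the duration of each right-hand tap.
function onTap(durationMs: number): void {
  current += durationMs < DASH_THRESHOLD_MS ? '.' : '-';
}

// Called when the left hand signals "advance character".
function commitLetter(): string | undefined {
  const letter = MORSE[current];
  current = '';
  return letter; // undefined when the sequence isn't valid Morse
}
```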

Mini QWERTY

Use the right hand to hover over the character selection and the left index and middle fingers to select or delete characters.

After building out all of these, the Morse Code variant was the one I most enjoyed playing with. In terms of functionality, however, the Mini QWERTY design wound up being the most practical: it had a short learning curve, was very accurate, took up very little virtual real estate, and allowed for a (relatively) high WPM.

Middle Pinch QWERTY

A day after publishing this article, I sat down and created one last prototype.

This variant took up very little real estate as well, but best of all, it allowed for the highest WPM of all the prototypes. I initially expected to use this variant with my hands facing downwards, but the tracking was simply too poor for it to be usable. Nonetheless, hands oriented sideways ended up feeling just as comfortable.

Takeaways

These aren’t hard-and-fast rules by any means, just things I’ve noticed after building and testing the experiments.

  • Pinching your index finger to your thumb is the most consistently recognized hand pose. It is also the only reliable basis for raycasting: in most hand orientations, neither the index finger nor the thumb is occluded by other fingers. With today’s tracking, I’d only recommend raycasting this way.
  • Projecting along a common axis (say, the keyboard’s normal) is far preferable to raycasting, whether from a fingertip or the hand: anything to remove the finicky orientation component of the hand tracking data from the input scheme (see the sketch after this list).
  • Pinching your middle finger to your thumb is often well detected too. Ring and pinky finger pinches are also somewhat reliable if your palms face up towards you.
  • Because hand tracking is finicky, mistakes are super common. Dedicating an action to undo/backspace keeps the flow of typing intact when you inevitably make mistakes.
  • ‘Depth’ based interactions require a good amount of precision, so they aren’t the best choice for differentiating between keys and modes. On the other hand, they’re great for single-action confirmations (like fully pressing in a button to confirm an action).
  • Swiping in the air feels most comfortable when done in a downwards direction. Other directions are more strenuous to perform over a longer period of time, and changing swipe directions in the air felt less intuitive and was harder to develop muscle memory for.
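
To make the first two takeaways concrete, here is a minimal sketch combining them: pinch detection by fingertip distance (with hysteresis so the pinch doesn’t flicker), plus projecting the fingertip straight onto the keyboard plane instead of raycasting from the hand’s orientation. The thresholds are illustrative guesses, not values from the prototypes:

```typescript
import * as THREE from 'three';

const PINCH_START_M = 0.015; // fingertips closer than ~1.5 cm => pinching
const PINCH_END_M = 0.03;    // must separate past ~3 cm to release

let pinching = false;

// Hysteresis keeps a jittery distance reading from toggling the pinch
// state every frame.
function updatePinch(
  thumbTip: THREE.Vector3,
  indexTip: THREE.Vector3,
): boolean {
  const d = thumbTip.distanceTo(indexTip);
  if (!pinching && d < PINCH_START_M) pinching = true;
  else if (pinching && d > PINCH_END_M) pinching = false;
  return pinching;
}

// Project a fingertip onto the keyboard plane along the plane's normal,
// ignoring the finger's (jittery) orientation entirely.
function projectOntoKeyboard(
  tip: THREE.Vector3,
  keyboardPlane: THREE.Plane,
): THREE.Vector3 {
  const projected = new THREE.Vector3();
  keyboardPlane.projectPoint(tip, projected);
  return projected;
}
```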

Further Experiments

I saw several different prototypes created by others tackling this space. Some notable ones were an American Sign Language (ASL) demo as well as a fingers-to-thumb 6-dot Braille demo. These worked about 90% of the time, with a few exceptions for letters that either didn’t track well or, in the case of some ASL letters, didn’t track at all.

Many readers suggested exploring more predictive keyboards, with lots of callouts for Swype-like designs. I imagine this could work, but since the performance of such a keyboard is mostly tied to how good the predictive engine is, it wasn’t something I pursued this time around.

I hope you enjoyed this deep dive into VR hand tracking keyboards!

Want to see more? I’m always working on various VR projects and post regular updates on Twitter. If you want to see more experiments like these, give me a follow: @dannyaroslavski
