A Look at the Future of Gesture Design
What gesture-based interactions reveal about a multimodal future
Recent advancements in machine learning, computer vision, natural language processing, and sensor technology enable us to interact with the digital world like never before. Once confined to the mouse, keyboard, or touchscreen, we can now use gesture and voice as increasingly popular modes of interaction.
My recent visit to NASA's Jet Propulsion Laboratory (JPL) revealed how gesture-based interactions are helping designers and engineers collaborate in mixed reality. Engineers there use ProtoSpace, a HoloLens application that enables teams to collaborate around 3D models at any scale. Earlier this year, I was fortunate enough to work with JPL's immersive technology team for my master's project, looking at how gesture interfaces impacted team collaboration.
This project sparked my interest in understanding where gestures fit into the future of interaction design.
Below, I share my initial impressions of gesture interfaces and look at what gestures might reveal about a multimodal future:
First Impressions of a Gesture-Based Interface
As I put on the HoloLens, a 3D spacecraft appears before me. I walk around the semi-transparent structure and peer into its internal mechanisms. Seeing first-hand how ProtoSpace brings data into the physical world (and away from the 2D screen) revealed to me the vast possibilities of immersive technology. I could envision mixed reality transforming the way we work, particularly in domains like engineering, architecture, and scientific research.
I was delighted to be able to walk around the object and manipulate it directly with gesture control. The small set of gestures enabled me to quickly learn the interface and to explore the system functionalities.
However, I noticed some limitations to gesture interactions:
1. Demanding precision from users. The air-tap gesture requires fine motor control, which can be exclusionary for some users. For this reason, gesture control doesn't seem ideal as the sole interface.
2. Lack of feedback. Gesture controls lack physical affordances, which means the interface needs to communicate clear and immediate feedback. This becomes a problem when users don’t know they’ve activated a certain state (intentionally or unintentionally).
3. User fatigue. HoloLens gestures require users to keep a hand raised, roughly at chest height. This can be fatiguing when combined with active discussion and constant standing, as was the case with ProtoSpace.
4. False positives. Hand gestures that occur naturally in conversation were sometimes misinterpreted as system controls. Though not entirely avoidable, false positives can be minimized through design.
5. Inability to support complex interactions. I found the core gestures easy to remember but had trouble with more complex interactions. For tasks that required multiple steps (e.g., navigating through submenus) or precise interactions (e.g., rotating an object 35 degrees), gestures were not ideal. The human hand moving through 3D space is not very precise.
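To make the fourth limitation concrete: one common design mitigation is to filter raw recognizer output before treating it as a command. The sketch below is purely illustrative (the `GestureEvent` type, confidence values, and thresholds are my own assumptions, not part of ProtoSpace or the HoloLens SDK) — it accepts a gesture only when the recognizer is confident and enough time has passed since the last accepted command, so conversational hand movements are less likely to trigger the system.

```python
from dataclasses import dataclass

@dataclass
class GestureEvent:
    name: str          # e.g. "airtap" (hypothetical label)
    confidence: float  # recognizer confidence, 0.0-1.0
    timestamp: float   # seconds

class GestureFilter:
    """Suppress likely false positives by requiring a minimum
    recognizer confidence and a short cooldown between commands."""

    def __init__(self, min_confidence: float = 0.8, cooldown: float = 0.5):
        self.min_confidence = min_confidence
        self.cooldown = cooldown
        self._last_accepted = float("-inf")

    def accept(self, event: GestureEvent) -> bool:
        if event.confidence < self.min_confidence:
            return False  # probably a conversational hand movement
        if event.timestamp - self._last_accepted < self.cooldown:
            return False  # too soon after the last accepted command
        self._last_accepted = event.timestamp
        return True
```

Tuning the two thresholds is itself a design decision: a higher confidence floor trades missed commands for fewer accidental activations.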
My team and I concluded that gesture controls alone were not enough to support the needs of NASA’s engineering teams. We considered voice input as a secondary interface, but the disruption it would cause in a group setting ruled out this option.
We discovered that combining gestures with a physical controller could help engineers carry out more complex tasks. The controller minimized some of the problems we observed with gestures — for example, users could access submenus with the controller rather than having to navigate through the menu using gestures. For ProtoSpace, our team recommended using the Nintendo Joy-Con, which can support a wide range of interactions and more precise ways to rotate and scale 3D objects.
Gestures as Part of a Multimodal Solution
Our solution combined gestures with a physical controller to support added functionality and to give users more choice in interacting with the system. Each interface alone is quite limited in capacity, but coupling them together can help users carry out more complex tasks.
A designer’s role involves considering when gestures will add value, and when other modalities may be more fitting. Looking across gesture, voice, and tangible interfaces, when and how might each modality be used? Are there scenarios in which a multimodal approach could be a good solution?
Observing users in context will usually reveal the answer.
With ProtoSpace, we discovered that group scenarios constrained the use of voice commands, which meant gestures became the primary interface. We identified the constraint and designed a controller interface to be used in conjunction with gesture controls. While the combination of gesture and voice may work well in individual settings, it all comes down to finding the best combination of modalities to support your users within their contexts.
Thanks to advancements in camera and sensor technology, microprocessors, and machine learning, we can now track body movements, hand gestures, and even finger movements with increasing precision. Soon we may see these technologies more commonly in our homes and workspaces.
So where do gesture-based interactions fit in? And how can gestures be used to create meaningful experiences and useful solutions? These are current areas of debate, and as designers, we can help shape the future of gesture-based interactions.
“Gestures will form a valuable addition to our repertoire of interaction techniques, but they need time to be better developed, for us to understand how best to deploy them and for standards to develop so that the same gestures mean the same things in different systems.” — Don Norman, Natural User Interfaces are Not Natural