Exploring Gestures at Different Scales

School of Design, CMU | Seminar III | Research for Design(ing)

Jay Huh
Research for/into/through design(ing)

--

Introduction

Gesture interactions are expected to become much more popular in the near future (Falk, 2014). They will not only allow us to interact and communicate with the devices and spaces around us but also enable more realistic, full-body experiences in virtual space.

How the computer sees us (Dan O'Sullivan & Tom Igoe, 2004)

Dan O'Sullivan and Tom Igoe's model (2004) of how the computer sees us illustrates how unevenly we use our bodies when interacting with computers. Most gesture interactions are still designed to use only a minimal part of the body, especially a few fingers.

Humans are born with a tool kit at least 15,000 years old. (…) The most nearly muscular mentality that we use in computation is pointing with a mouse. We use such a tiny part of our repertoire of sound and motion and vision in any interaction with an electronic system. In retrospect, that seems strange and not very obvious why it should be that way.

— Stanford professor David Liddle, 1993

Gestural interfaces, however, can take advantage of the whole body, allowing fuller use of the human body to trigger system responses (Saffer, 2008).

Through this pilot project, I wanted to explore gestures at different scales, from micro to macro, and the possibility of leveraging them in gestural interfaces.

Gesture types

Informative and Communicative

To understand the general landscape of gestures that people use in everyday life, I drew a diagram and categorized gestures by type.

Diagram of types of gestures

Human gestures can be divided into two groups: informative and communicative. The informative-communicative dichotomy focuses on the intentionality of meaning and communication in co-speech gestures (Abner et al., 2015). Informative gestures are passive gestures that provide information about the speaker as a person, not about what the speaker is trying to communicate (Krauss et al., 1996). Scratching, adjusting clothing, tapping, and shivering are good examples of informative gestures. Body postures that reveal depression, confidence, or a lack of interest can also be informative.

In contrast, communicative gestures are produced intentionally and meaningfully by a person as a way of intensifying or modifying speech produced in the vocal tract (or with the hands in the case of sign languages), even though the speaker may not be actively aware of producing them (Abner et al., 2015). Body language and symbolic, deictic, motor, and lexical gestures all fall into this category, such as nodding the head, shrugging the shoulders, or raising the eyebrows.
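
As a loose sketch of how this taxonomy might eventually be codified for a gesture toolkit, the split could be represented as a small data model. The class structure and the example meanings (e.g. nodding as agreement) are my own assumptions; only the category names and example gestures come from the diagram and text above.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class GestureCategory(Enum):
    """Top-level split from the diagram: informative vs. communicative."""
    INFORMATIVE = auto()    # passive; reveals something about the person
    COMMUNICATIVE = auto()  # intentional; intensifies or modifies speech


@dataclass
class Gesture:
    name: str
    category: GestureCategory
    intended_meaning: Optional[str] = None  # only communicative gestures carry intent


# Examples drawn from the text above (meanings are assumed for illustration).
EXAMPLES = [
    Gesture("scratching", GestureCategory.INFORMATIVE),
    Gesture("adjusting clothing", GestureCategory.INFORMATIVE),
    Gesture("nodding head", GestureCategory.COMMUNICATIVE, "agreement"),
    Gesture("shrugging shoulders", GestureCategory.COMMUNICATIVE, "uncertainty"),
]
```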

Affordances and Technology

Another interesting thing I found during this research was that people's gestures can be shaped by physical affordances or by technology. When we look at a chair, we almost unconsciously assume we could sit on it, and we form expectations about how sitting on it would feel based on its shape, material, and our past experience. A soft sofa affords jumping or throwing our bodies onto it, which a wooden stool does not. It is also common to see people wave their hands under water taps or tissue dispensers, and lights equipped with infrared sensors invite people to wave their hands, jump, or even dance.

Decision Involvement

Gestures can also be categorized, as below, based on whether or not they are the consequence of a decision.

Process of Reaction and Action

At an unconscious level, humans sense the atmosphere and receive information through various sensory channels. When that information is extreme (e.g. a very bright light, a very loud sound, or a huge bug), people unconsciously react to the environment with natural gestures (e.g. covering the eyes with the hands, blocking the ears with the fingers, or screaming). While reacting to the environment, they also perceive it, which enables them to perform deliberate actions (e.g. leaving the room, turning off a speaker, or running away from the bug). Sometimes reactions are unrelated to the actions that follow (screaming does not help one escape the bug), but sometimes they are related (leaving the room and covering the eyes share the same underlying purpose: protecting the eyes).

These unconscious reactions are not always visible; they may be very subtle (e.g. staring or a small facial expression) or not externalized at all. For example, when music is only slightly too loud, we show no noticeable reaction and simply turn the volume down.

Body Storming

After mapping the general gesture types, I conducted body storming studies with a total of 5 participants. Body storming was conducted in two different ways: gesture-given and function-given. In the gesture-given study, I asked participants to perform certain gestures and learned about their feelings, their expectations, and the possibility of applying those gestures to digital worlds. In the function-given study, I gave participants a more concrete setting, a smart home environment, and asked how they would control the sight, hearing, and touch channels of devices through gestures.

These two studies were intended to uncover general mental models of controlling digital information through body gestures of various scales, and to find common knowledge that would help me further develop gesture interaction principles or toolkits.

Phase 1: Gesture-given Study

For the first body storming session, I collected a variety of gestures involving different scales of body engagement, from micro finger movements to macro whole-body movements. I then picked several of them and mapped them onto each area of the diagram I had developed earlier.

While participants performed each gesture, I asked several questions to help them elaborate on their feelings, the memories or past experiences the gesture reminded them of, and their expectations if the gesture were applied to digital worlds, especially augmented reality (AR) or mixed reality (MR).

Through this study, I learned the following:

  1. People care about how they look, especially for bigger, more noticeable, or faster gestures. When designing gestures, context (public vs. private, quiet vs. noisy, inside vs. outside, crowded vs. not crowded) should be considered. For example, people feel awkward wiping, rotating their arms, or making hula-hooping gestures in a cafe, but much less so in a park. The speed of a gesture also matters: in the same context, participants felt slightly more comfortable with slower gestures than with faster ones.
  2. The scale of a gesture sets expectations: the more body parts engaged, the greater or higher-level the expected change. For example, if swiping with the hand controls part of the elements on a screen, swiping with the whole arm and upper body is expected to control the whole screen or the entire space.
  3. There is an inverse correlation between precision and the speed and scale of a gesture. People expect larger-scale, faster gestures to be suitable for rough control, whereas smaller, slower gestures are expected to control things precisely (a rough sketch of this mapping follows this list).
  4. There is a possibility of taking advantage of semantic or metaphoric correlations between gestures and their functions as an interface. For example, gestures of opening or closing a lock are easily connected to security, hold mode, or turning on/off. Gestures of rubbing or tapping the belly carry the social meaning of being hungry or full, which could easily map to similar digital functions, such as 'please recommend me more options' or 'okay, that's enough'.
  5. Frequently used gestures feel more natural. Gestures that belong only to specific, infrequent tasks, such as jumping rope or hula hooping, did not feel natural to participants. It was also difficult to associate those gestures with new meanings or digital functions because they are too strongly tied to their original tasks.
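
As referenced in points 2 and 3, here is a minimal sketch of how these expectations might be encoded in a gesture toolkit. The thresholds, the scale and speed scores, and the function name are illustrative assumptions, not values measured in the study.

```python
def expected_control(scale: float, speed: float) -> dict:
    """Map a gesture's scale and speed to the kind of control participants expected.

    scale: rough fraction of the body engaged (0.0 = a fingertip, 1.0 = whole body)
    speed: normalized gesture speed (0.0 = very slow, 1.0 = very fast)

    Point 2: more body engagement -> a larger, higher-level target.
    Point 3: larger and faster gestures -> coarser control; smaller and
    slower gestures -> more precise control.
    """
    target = "element" if scale < 0.3 else "screen" if scale < 0.7 else "space"
    # Precision falls as either scale or speed grows (the inverse correlation).
    precision = max(0.0, 1.0 - max(scale, speed))
    return {"target": target, "precision": round(precision, 2)}


# A fingertip swipe vs. a fast whole-arm-and-body swipe:
print(expected_control(scale=0.1, speed=0.2))  # {'target': 'element', 'precision': 0.8}
print(expected_control(scale=0.8, speed=0.9))  # {'target': 'space', 'precision': 0.1}
```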

Phase 2: Function-given Study

For the second body storming session, I mapped smart home devices and their output modalities onto tables. Depending on the device, the visual and auditory channels can be divided into private or public categories. For example, a laptop display is considered a private screen compared to a TV display, while a laptop speaker is considered a public sound source compared to headphones.
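
A minimal sketch of that device-and-modality mapping might look like the following, assuming that the laptop, TV, headphone, and room-lighting examples mentioned in this article stand in for the full table used in the study.

```python
# Output modality channels per device, tagged as private or public.
# Devices and tags follow the examples in the text; the study's table covered more.
DEVICE_CHANNELS = {
    "laptop":     {"visual": "private", "auditory": "public"},
    "tv":         {"visual": "public"},
    "headphone":  {"auditory": "private"},
    "room light": {"visual": "public"},
}

# e.g. list every channel that other people in the room would also notice:
public_channels = [(device, channel)
                   for device, channels in DEVICE_CHANNELS.items()
                   for channel, visibility in channels.items()
                   if visibility == "public"]
print(public_channels)  # [('laptop', 'auditory'), ('tv', 'visual'), ('room light', 'visual')]
```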

This study used the Wizard of Oz method. While a participant performed gestures to control a device, I manually controlled certain features of that device (e.g. increasing the volume of a laptop) to make the interaction feel real and to help the participant find the gestures that felt most appropriate to them.

Through this study, I learned the following:

  1. There is a common mental model for direction. Forward, upward, or rightward movement is perceived as increasing, next, faster, turning on, brighter, or tightening. In contrast, backward, downward, or leftward movement means decreasing, previous, slower, turning off, dimmer, or loosening.
  2. Beyond direction, participants held very different mental models of the interactions themselves. For example, when asked to decrease the volume, their gestures varied widely: rotating a hand to the left, slowly lowering a hand from high to low, pinching and dragging from right to left, and even miming a button press on a remote control. It was therefore difficult to find a common thread.
  3. Expectations link the scale of a gesture to the type of modality channel (private vs. public). When people control things that others can also notice, such as laptop speaker volume or room lighting, they do not feel awkward using bigger, larger-scale gestures. However, when they control private modality channels, such as headphone volume, where only they notice the change, gestures should be discreet, subtle, and nearly invisible to avoid awkwardness. Gestures around the palm or inside a pocket were suggested.
  4. The scope and direction of control: controlling one device vs. acting across devices. A gesture does not have to be confined to controlling one device; based on its social meaning, multiple devices could respond to it. For example, putting a finger to our lips and covering our ears with our hands both carry a social meaning of silence or reducing sound, yet their scope differs: the first is directed at a specific device, while the second addresses every device that makes sound (a rough sketch combining this with the directional model from point 1 follows this list).
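
As a hedged sketch of points 1 and 4, the directional mental model and gesture scope could be encoded roughly as below. The gesture identifiers, device names, and routing function are illustrative assumptions rather than part of the study.

```python
from typing import Optional

# Point 1: a shared directional mental model.
DIRECTION_MEANING = {
    "forward": "increase", "upward": "increase", "right": "increase",
    "backward": "decrease", "downward": "decrease", "left": "decrease",
}

# Point 4: a gesture's social meaning decides its scope. Some gestures
# target one device; others address every device that makes sound.
SOUND_DEVICES = {"laptop", "tv", "headphone"}


def route_gesture(gesture: str, pointed_device: Optional[str] = None) -> dict:
    """Return which devices should respond and how (illustrative only)."""
    if gesture == "finger_to_lips":      # "shh" aimed at one specific source
        return {"devices": {pointed_device}, "action": "decrease volume"}
    if gesture == "cover_ears":          # broadcast: everything should get quieter
        return {"devices": set(SOUND_DEVICES), "action": "decrease volume"}
    return {"devices": set(), "action": None}


# Shushing the TV vs. covering the ears:
print(route_gesture("finger_to_lips", "tv"))  # only the TV responds
print(route_gesture("cover_ears"))            # all sound-producing devices respond
```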

Reflection and Next Steps

Through this pilot project, I gained several crucial insights for designing gesture interface systems. At the same time, I realized that the focus should be adjusted slightly for the next phases.

Leveraging Natural Gestures

As mentioned above, the gestures for changing the volume or turning off a speaker differed from participant to participant, depending on their understanding of technology and their mental model of the controller. Natural gestures, however, share common understandings and behaviors because they are grounded in the physical body. For example, when a sound is extremely loud, most people react by covering their ears, simply because we hear through our ears; that is a very natural gesture. Looking at the natural gestures people use inherently or unconsciously might offer cues for designing gestures that are more universal.

What are the common gestures (reactions) that we use? Maybe I could start with gestures related to our eyes, ears, and skin.

How can I extract their implicit meanings and codify them?

My next step could be another study: observing people and discovering their natural gestures in a given context. This process could help me understand the implicit meanings of our natural body gestures in specific contexts and how to respond to them appropriately. Then, based on the principles learned through this pilot project, I could develop natural gestures into gesture interaction systems that most people find intuitive and easy to learn.

References

Abner, N., Cooperrider, K., & Goldin-Meadow, S. (2015). Gesture for Linguists: A Handy Primer. Language and Linguistics Compass, 9(11), 437–451. doi: 10.1111/lnc3.12168

Falk, C. (2014, October 16). The Future of Gesture-Based UI. Retrieved from https://www.altia.com/2014/10/16/the-future-of-gesture-based-ui/.

Krauss, R. M., Chen, Y., & Chawla, P. (1996). Nonverbal Behavior and Nonverbal Communication: What Do Conversational Hand Gestures Tell Us? Advances in Experimental Social Psychology, 28, 389–450. doi: 10.1016/s0065-2601(08)60241-5

Norman, D. A. (2010). The way I see it: Natural user interfaces are not natural. Interactions, 17(3), 6. doi: 10.1145/1744161.1744163

O'Sullivan, D., & Igoe, T. (2004). Physical computing: Sensing and controlling the physical world with computers. Boston: Thomson Course Technology.

Saffer, D. (2008). Designing gestural interfaces. Beijing: O'Reilly Media.
