Data@Hand: Combining Speech and Touch for Data Exploration on Smartphones
How can we facilitate visual exploration of personal data on a small smartphone screen?
This article introduces a mobile application named Data@Hand, which leverages speech and touch interaction to foster visual exploration of personal data on mobile devices. This work is a collaborative effort of Young-Ho Kim (University of Maryland), Bongshin Lee (Microsoft Research), Arjun Srinivasan (then at Georgia Tech), and Eun Kyoung Choe (University of Maryland).
Smartphones have become a dominant way to view and access personal data. People track their activities and physiological metrics using smartwatches and wearables, and review the data in the companion apps on smartphones. For example, people can browse their activity data captured by their Fitbit device in the Fitbit mobile app, and the data captured by Apple Watch in the Apple Health app. Mobile health apps commonly incorporate interactive visualizations such as bar and line charts to deliver comprehensive trends in various health metrics.
However, smartphones have small displays that can’t show much information at once, and they lack precise input methods such as a mouse. For these reasons, visualization support in mobile health apps is limited. For example, these apps usually restrict people to viewing data in predefined time segments, such as one week, one month, three months, or one year.
As a result, it is difficult to browse through a large amount of activity data on smartphones. People’s questions about their personal data often involve time components, e.g., “How has my step count changed before and after the COVID-19 pandemic?” or “How does my sleep pattern of April 2021 differ from that of April 2020?” To answer these questions, we need to jump between different time periods, which is tedious to do on a smartphone.
How can we effectively facilitate “personal” data exploration on a small smartphone screen? In this work, we propose Data@Hand, a personal data exploration app that leverages both speech and touch modalities to facilitate flexible exploration. Our approach is to use natural language for specifying interaction parameters such as time and activity types. Natural language is flexible enough to cover different ways of specifying dates and date ranges (e.g., “October 7th”, “Last Sunday”, “This month”). Specifying time by speech can be much easier than using screen widgets like a calendar (or time) picker. It is also straightforward to specify multiple parameters in a short phrase. For example, we can jump directly to a specific activity by uttering, “Show heart rate in March.” This command specifies both the activity type and the period at once, which would likely take longer with graphical widgets.
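To make the idea of specifying several parameters in one phrase concrete, here is a minimal sketch in Python. It is not Data@Hand's actual parser: the `ACTIVITIES` vocabulary, the `parse_command` function, and the handful of phrasings it recognizes are all assumptions made for illustration, and a real system would need far broader coverage of date expressions.

```python
import calendar
import re
from datetime import date

# Hypothetical vocabulary for illustration; not the app's real list of data sources.
ACTIVITIES = ("heart rate", "step count", "sleep", "weight")
# Map lowercase English month names to month numbers (1-12).
MONTHS = {name.lower(): i for i, name in enumerate(calendar.month_name) if name}


def parse_command(utterance, today):
    """Extract an activity type and a (start, end) date range from a short command."""
    text = utterance.lower()
    # Pick the first known activity mentioned in the utterance, if any.
    activity = next((a for a in ACTIVITIES if a in text), None)

    period = None
    if "this month" in text:
        last_day = calendar.monthrange(today.year, today.month)[1]
        period = (today.replace(day=1), today.replace(day=last_day))
    else:
        # Match a month name, optionally followed by a four-digit year.
        pattern = r"\b(" + "|".join(MONTHS) + r")\b(?:\s+(\d{4}))?"
        match = re.search(pattern, text)
        if match:
            month = MONTHS[match.group(1)]
            # Assumption: a month without a year refers to the current year.
            year = int(match.group(2)) if match.group(2) else today.year
            last_day = calendar.monthrange(year, month)[1]
            period = (date(year, month, 1), date(year, month, last_day))
    return {"activity": activity, "period": period}


result = parse_command("Show heart rate in March.", today=date(2021, 5, 10))
print(result["activity"], result["period"][0], result["period"][1])
```

A single utterance yields both the activity type and the date range, which is the property the paragraph above describes; with widgets, each would typically be a separate selection.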
Motivated by recent advances in multimodal interfaces in data visualization and HCI, we combined speech and touch interaction to leverage the complementary strengths of the two input methods. Speech is flexible and fast for specifying long commands with multiple parameters, whereas touch is faster for pointing at a screen element or executing simple actions mapped to gestures. What if we could use both at the same time? With Data@Hand, we can issue a speech command while holding a finger on a screen element, such as a chart mark or label, to indicate the context of the command.
For example, we can change the start date by simply uttering a specific date while holding the start date label. Similarly, while comparing two periods, we can quickly replace one period with another by holding on its chart mark while speaking the new period.
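The way a held element can give context to a spoken date can be sketched as follows. This is an illustrative Python model only, not the app's implementation: the `ExplorationState` class, the `apply_spoken_date` function, the element names, and the fallback behavior when nothing is held are all hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class ExplorationState:
    """The date range currently shown on screen (hypothetical model)."""
    start: date
    end: date


def apply_spoken_date(state, spoken_date, held_element=None):
    """Apply a spoken date, using the element under the finger as context."""
    if held_element == "start_date_label":
        # Holding the start label: only the start of the range changes.
        return replace(state, start=spoken_date)
    if held_element == "end_date_label":
        # Holding the end label: only the end of the range changes.
        return replace(state, end=spoken_date)
    # Assumption: with no touch context, show just the spoken day.
    return ExplorationState(start=spoken_date, end=spoken_date)


state = ExplorationState(start=date(2021, 3, 1), end=date(2021, 3, 31))
updated = apply_spoken_date(state, date(2021, 2, 15), held_element="start_date_label")
print(updated)
```

The same utterance produces a different result depending on what is being touched, so the speech command can stay short while touch supplies the target.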
Using Data@Hand: The Study
To understand how people who are not necessarily experts in data analytics and visualization use Data@Hand, we observed 13 long-term Fitbit users exploring their Fitbit data using Data@Hand for about 20 minutes. Participants actively mixed speech and touch throughout their exploration. Among all operations participants performed, about half included the speech modality. Of the speech commands, 70% involved time manipulation, implying that time manipulation was a strong facilitator of participants’ visual exploration.
As long-term Fitbit users, participants expressed excitement about viewing their data using multimodal interaction.
“…it was cool to say like ‘around this year’ or ‘this month’ and it would get what you were talking about, whereas I think it could be hard to do in Fitbit.”
They liked the flexible time navigation and comparison features Data@Hand offered, describing time manipulation in natural language as fast and flexible.
Toward Better Exploration of Personal Data on Smartphones
Our work contributes the first mobile app that leverages the synergy of speech and touch input modalities for personal data exploration. Our study revealed that participants could quickly learn and adopt multimodal interaction for fast and fluid exploration, which suggests that our design approach holds promise. However, research in mobile data visualization remains sparse, despite the proliferation of mobile devices as a tool for personal tracking and data exploration. We believe that multimodal interaction leveraging speech and touch is a promising approach to address the inherent limitations of smartphones’ small screen space.
For more information, please check out our full version of the Data@Hand paper, our project website, and source code:
- Paper published in ACM CHI 2021 (Honorable Mention Award) [Link]
- Project website [Link]
- Source code on GitHub [Link]
- Young-Ho Kim, Bongshin Lee, Arjun Srinivasan, and Eun Kyoung Choe. 2021. Data@Hand: Fostering Visual Exploration of Personal Data on Smartphones Leveraging Speech and Touch Interaction. In CHI Conference on Human Factors in Computing Systems (CHI ’21), May 8–13, 2021, Yokohama, Japan. ACM, New York, NY, USA, 17 pages. https://doi.org/10.1145/3411764.3445421