Oops, smart home assistants make wrong predictions. What to do?

Designing and communicating “confidence” of smart home assistants

This blog post is a condensed version of “Exploring the Design Space of User-System Communication for Smart-home Routine Assistants,” published at CHI ’21.

Motivation

Smart-home routine assistants (SHRAs) are capable of predicting and automating household actions based on residents’ past behaviors. However, the communication between users and these intelligent systems remains underexplored. Without clear communication, users may harbor unrealistic expectations or become frustrated with errors, potentially leading to the abandonment of these technologies.

Methods

We conducted a user enactment study with 20 participants. User enactment asks participants to imagine and act out assigned scenarios. Our participants imagined themselves in various contexts, such as being busy or tired, and reacted to SHRA notifications indicating different levels of confidence in the SHRA’s automation predictions.

Results

Should SHRA ask for permission, or only notify?

People generally wanted to stay on top of the system’s automated actions, and wanted to be asked for permission or notified, rather than having the SHRA execute such actions “silently” or “secretly.”

However, whether participants preferred the SHRA to ask for permission or simply notify them depended on other factors, including the perceived likelihood that an action would be wrong, the desired level of user control, their whereabouts, and their emotional state.

How did participants understand smart home assistants’ “confidence” score?

Participants understood it as probability, possibility, likelihood, accuracy, certainty, and strength of association.

Did participants even want to learn about the "confidence" score?

Participants did not always find confidence information necessary. Here are some factors that affected their willingness to see it:

  • Perceived opportunities to intervene in automated actions: If the system had already executed an action, i.e., it was too late for a human to intervene, participants tended not to care about its confidence.
  • The correctness of automated actions: Even when the confidence score was low, participants generally did not care as long as the prediction was correct. If the system made a wrong prediction, however, they were eager to check how high or low the confidence had been.
  • Participants’ availability at the moment: If participants were physically or mentally unavailable for extra tasks, confidence scores were usually not of interest to them.

What did participants do when they got a wrong prediction, e.g., the AC turning on when it was very cold?

  • Assessment: First, participants judged whether the prediction was outright incorrect or merely inappropriate.
  • Diagnosis: They then tried to diagnose how and why the SHRA went wrong, usually considering the following factors: what information the system had collected, how it had predicted their intentions, and why the automated action was chosen.
  • Real-time feedback: “Hey, wrong prediction!” Participants would then provide real-time feedback to the SHRA.

Did participants try to improve the system predictions?

Some participants did not expend effort on improving the system, either because they thought doing so was the sole responsibility of product developers, or because they regarded themselves as unable to improve it.

For those who did, the attempts fell into four categories: configuration, teaching, simplification, and compliance.

  • Configuration: Adjusting SHRA’s confidence levels and reconfiguring its measurements and thresholds, like distance or time, and specifying locations on a home map.
  • Teaching: Re-training SHRA by repeating gestures and feedback to correct errors.
  • Simplification: Rearranging physical settings to make behavior-intention associations clearer for SHRA, like moving appliances or furniture for better detection.
  • Compliance: Repeating actions that previously triggered correct SHRA responses, like staring at cameras or deliberately using remotes to reinforce learned patterns.

Design spaces and suggestions

We propose the following four design spaces for SHRA user-system communication, listed roughly in chronological order.

  • At Onboarding
    1. Assessment: Evaluate household’s AI knowledge and mental models to tailor future interactions and expectations with SHRAs.
    2. Customization: Allow customization of user-system communication, including notification preferences and automated action permissions, ensuring SHRAs adapt to different household needs.
    3. Setup: Provide tutorials on feedback provision, expectation management, and system training to enhance willingness for system learning.
  • During Routine Prediction and Execution
    Initially, SHRAs should seek permission for actions to improve prediction quality and respect user autonomy, gradually earning household trust. SHRAs should base permission requests on their confidence level, automating actions with high confidence and seeking consent when less sure (see the sketch after this list). SHRAs should also offer preset communication content, with further explanation available on user request, especially after a user intervenes in a high-confidence action.
  • Occasional tips, tricks, and facts
    These would educate users and prevent early abandonment due to unrealistic expectations.
  • Administration
    Provide a dashboard for users to review and understand SHRA actions and predictions, allowing for preference adjustments and advanced configurations.
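
To make the confidence-gated communication above concrete, here is a minimal sketch (not from the paper) of how an SHRA might choose between automating, asking for permission, and deferring a predicted action. The confidence score, the threshold values, and the CommunicationPolicy class are all hypothetical illustrations, assuming thresholds are set during onboarding and remain adjustable from the administration dashboard.

```python
from dataclasses import dataclass

@dataclass
class CommunicationPolicy:
    # Hypothetical thresholds; assumed to be set at onboarding and
    # tunable later from the administration dashboard.
    automate_above: float = 0.9   # high confidence: act, then notify
    ask_above: float = 0.5        # medium confidence: ask before acting

    def decide(self, confidence: float, user_available: bool) -> str:
        """Return how the SHRA should communicate a predicted action."""
        if confidence >= self.automate_above:
            # Execute the routine, then notify, so the user stays on top
            # of automated actions rather than having them happen "silently".
            return "automate_and_notify"
        if confidence >= self.ask_above and user_available:
            # Less certain: ask for explicit permission before acting.
            return "ask_permission"
        # Low confidence, or the user is busy/tired: hold the action and
        # surface it later (e.g., on the dashboard) instead of interrupting.
        return "defer_and_log"

policy = CommunicationPolicy()
print(policy.decide(confidence=0.95, user_available=False))  # automate_and_notify
print(policy.decide(confidence=0.70, user_available=True))   # ask_permission
print(policy.decide(confidence=0.40, user_available=True))   # defer_and_log
```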

Footnote

* For a more detailed explanation of how we crafted the scenarios that participants were asked to act out, please refer to our full paper.
