The definitions and limitations of voice control for home appliances

Martin G. Kienzle
The Future of Electronics
9 min read · Jan 24, 2019

There’s still plenty of work ahead.

“No, Alexa, I don’t want help with that” (photo by Tony Webster, https://www.flickr.com/photos/diversey/31501674177)

Voice assistants and smart speakers: battling for the smart home

There appears to be no escape from voice assistants and smart speakers. Some forecasts predict market growth from $2.7B in 2018 to $11.8B in 2023. In September 2018, 18% of US adults were using a smart speaker monthly, and it was expected that by year-end 2018 almost 30% of US adults would have access to a smart speaker, with 80% of those using the service monthly.

Voice assistants are the cloud services that receive voice commands from smart speakers and other devices, turn speech into text, interpret the resulting language, detect the intent of the command, and use “skills” to determine and execute appropriate actions, either through the smart speakers or through other Internet-connected devices in the home.
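To make that flow concrete, here is a minimal sketch (in Python) of the final “skill” stage: an intent, already extracted by the cloud service, is mapped to a device action. The intent name, slot layout, and send_command helper are all invented for illustration and are not taken from any particular assistant’s SDK.

```python
# Minimal sketch of the "skill" stage of a voice assistant pipeline.
# The cloud service has already converted speech to text and extracted an
# intent plus its slots; the skill's job is to map that to a device action.
# All names here (SetTemperatureIntent, send_command, ...) are hypothetical.

def send_command(device: str, action: str, value=None) -> None:
    # Placeholder: in practice this would call the manufacturer's cloud API
    # or a local hub to reach the connected appliance.
    print(f"{device}: {action} -> {value}")

def handle_intent(intent: str, slots: dict) -> str:
    if intent == "SetTemperatureIntent":
        room = slots.get("room", "living room")
        degrees = int(slots["temperature"])
        send_command(device=f"thermostat.{room}", action="set_target", value=degrees)
        return f"Setting the {room} thermostat to {degrees} degrees."
    return "Sorry, I can't help with that yet."

# Example: the spoken request "set the bedroom to 68 degrees" arrives as:
print(handle_intent("SetTemperatureIntent", {"room": "bedroom", "temperature": "68"}))
```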

Amazon Alexa currently leads the voice assistant market, with Google Assistant catching up rapidly and Apple Siri expanding as well. While each assistant depends on smart speakers as an entry point, the services delivered through them are the real value for each company’s business model. Amazon enables purchases from its commerce service, Google favors access to its services and uses the data for advertising, and Apple uses Siri to support its ecosystem of hardware and services.

In addition, third-party speaker providers such as B&O, Sonos, and Harman Kardon offer their own smart speakers built on the voice assistant services of the three market leaders.

Each of the big three voice assistant providers wants to build a large ecosystem of connected devices controlled through its assistant. Some of those ecosystems have expanded rapidly. Amazon claims more than 28,000 Alexa-compatible smart home devices from more than 4,500 manufacturers, and over 70,000 Alexa skills. Google Assistant claims more than 10,000 devices from 1,600 brands.

An important question for consumers is to which voice assistant ecosystem they should subscribe. While Amazon Alexa and Google Assistant have a dominant lead in smart speaker-based voice assistants, Apple’s Siri is used widely with smart phones, tablets, and computers.

But are people using them to control smart appliances?

So far, consumers appear to use their voice assistants mostly for entertainment and information services, with the responses conveyed by the associated smart speakers. Control of smart home devices does not yet appear to be a major factor in voice assistant use.

In fact, according to some surveys, the use of voice assistants with large appliances seems to be declining.

[Chart: survey results on changes in voice assistant use by device category; green denotes increase, orange denotes decrease]

While that usage might be slowing down, smart speaker adoption has not. This means appliance manufacturers must now support all major voice assistants or risk being dropped from a buyer’s consideration. To simplify the use of their appliances, some manufacturers are looking to use built-in microphones and speakers to connect them directly to voice assistant services in the cloud, eliminating the need for smart speakers. However, this trend is nascent, with no significant amount of use yet, according to the analyst firm Voicebot.(1)

Yet designing out the smart speaker has its own challenge: if appliance designers use dedicated voice chips, e.g. one for Amazon Alexa or one for Google Assistant, would a manufacturer need a separate chip for each voice assistant it wants to support?

When voice assistants are used with consumer devices, simple single-action commands are most popular. For instance, changing the temperature in a townhouse where thermostats on several floors can respond to a single voice command saves the user going up and down the stairs. This is clearly a convenience.
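As a rough sketch, assuming a hypothetical Thermostat class, the fan-out behind such a command could look like the following; a real implementation would reach the thermostats through the vendor’s API or a local hub.

```python
# Sketch: one voice command fanning out to several thermostats.
# The Thermostat class and set_target() method are invented for illustration.

class Thermostat:
    def __init__(self, location: str):
        self.location = location
        self.target = None

    def set_target(self, degrees: int) -> None:
        self.target = degrees
        print(f"{self.location} thermostat set to {degrees} degrees")

def set_house_temperature(thermostats, degrees: int) -> None:
    # A single "set the house to 70 degrees" request updates every floor,
    # saving the trip up and down the stairs.
    for t in thermostats:
        t.set_target(degrees)

set_house_temperature([Thermostat("ground floor"), Thermostat("upstairs")], 70)
```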

What is different about controlling major appliances?

Beyond the humble thermostat, manufacturers are adding Internet connectivity to the rest of the kitchen, from refrigerators to ranges. However, our informal survey shows that only a small fraction of consumers actually connect their connectable appliances to the Internet, and fewer still enable voice assistant support. There is no immediate consumer value from connecting: controlling a range as part of cooking a meal requires far more complex interactions than setting a room temperature with a single command.

The disconnect between what consumers want and what manufacturers think they want in connected experiences was made clear by IBM’s Institute for Business Value, which surveyed manufacturers’ executives and consumers and asked each group to rank the motivations for digital consumer experiences.

Executives might call a time-out and rethink how they deliver what customers really want: more time, more convenience, faster results, and easier processes. Where does a speaker, or a voice command, fit in? How, and whom, does it help?

Voice assistants have some inherent limitations

In many ways, voice assistants simply replace button pushes with voice commands. It is still up to consumers to consider the context of an action. When multiple devices are involved in achieving a high-level objective such as cooking a meal, users have to orchestrate the device actions with each other and with the other activities required to reach the goal. The assistant isn’t smart, and it’s arguable whether telling a speaker to raise the oven temperature is all that helpful. Consider these points:

  • Voice assistants use single commands. For now, these consist mostly of fixed phrases. Effectively, they push one button or set one dial.
  • As more flexible natural language understanding technology becomes available, interpretations of speech commands may become ambiguous. With commands resulting in actions, misunderstandings can be risky. Did I really want to set the oven to 600 degrees? Do we need “guard rails”? (A sketch of such guard rails follows this list.)
  • Voice assistants support only one-way “conversations”. The appliances cannot talk back, asking for clarification of intent. Building checks into the skills executed in the cloud does not completely solve this problem.
  • The commands are independent of the state of the device. The user has to know whether an oven is on, when the heat should be turned lower, etc.
  • The stateless aspect of the voice commands also limits the ability to support action sequences if those actions depend on the state of the device. Have I turned on the exhaust before I turn on a burner on the stove?
  • Appliances generally cannot initiate conversations, or give alerts by saying, for instance, that the clothes washer is finished, or that the pot on the stove top is boiling over.
  • In many cases, only a subset of the appliance functionality is accessible via voice assistant. This can be due to safety reasons. A stove top burner should be turned on only when somebody is in the kitchen. Or it can be because a function is complex and depends on the state of the appliance, e.g. bring the water to a boil, and cook the pasta until tender.
  • Voice assistants cannot integrate context data, such as who is in the kitchen or whether there is milk in the refrigerator.
  • They typically do not remember history — how did we do this the last time?
  • They depend on an Internet connection, which in many homes can be less than reliable.
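To illustrate what such “guard rails” and state awareness could look like, here is a small, purely hypothetical sketch; the Oven class, its safety limit, and its responses are invented for illustration and do not reflect any shipping appliance.

```python
# Hypothetical sketch of guard rails and state-aware responses for a
# voice-controlled oven. The class and the safety limit are invented.

MAX_SAFE_OVEN_TEMP_F = 550  # assumed safety limit, for illustration only

class Oven:
    def __init__(self):
        self.is_on = False
        self.target_f = 0

    def set_temperature(self, degrees_f: int) -> str:
        # Guard rail: refuse implausible or unsafe values instead of
        # blindly executing a possibly misheard command.
        if degrees_f > MAX_SAFE_OVEN_TEMP_F:
            return f"Refusing: {degrees_f}F exceeds the {MAX_SAFE_OVEN_TEMP_F}F safety limit."
        # State awareness: the action depends on whether the oven is already on.
        if not self.is_on:
            self.is_on = True
            self.target_f = degrees_f
            return f"Turning the oven on and preheating to {degrees_f}F."
        self.target_f = degrees_f
        return f"Adjusting the oven to {degrees_f}F."

oven = Oven()
print(oven.set_temperature(600))  # triggers the guard rail
print(oven.set_temperature(400))  # turns the oven on
```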

These shortcomings keep voice assistants from graduating from transactional to genuinely helpful. They would need a semantic level of interaction to support more complex activities. Voice assistant actions cannot be tailored to specific situations in a household, such as who is home for dinner or what ingredients are on hand. The lack of history inhibits the ability to learn consumers’ preferences, and it prevents the automation of actions so that they require no voice input, or any input, at all. Managing lights or temperature could become completely automated simply by human presence, without a single word spoken / heard / misheard / misinterpreted.

Safety, Security, and Privacy must be considered as well!

Privacy considerations are becoming an increasing barrier to consumers’ adoption of voice assistants. The voice data are transmitted to the cloud for analysis and often stored there indefinitely.

While in general only voice commands issued after the wake word are sent to the cloud, some mishaps illustrate serious risks. The associated software is inherently complex, so occasional failures are to be expected. However, even in the absence of software failures, voice assistant data, particularly when combined with other data, creates a potentially serious privacy exposure; it is an extremely detailed view of people’s home lives, as explained here, and here, and here. There are sayings about not airing dirty laundry on social media; voice assistants may be much more intrusive. This could lead to a backlash against the use of connected devices.

For now, we are just at the beginning of the journey towards mitigating privacy risks. Voice assistant privacy risks are part of a larger public conversation about consumer privacy that appears to be gathering momentum. A great part of the solution will have to come from regulations such as the GDPR legislation in Europe; fines are starting to be levied, and the one against Google was among the first and the largest. However, there can also be technical solutions that mitigate the risks and give consumers more control over their data. A recent proposal using blockchain technology to protect consumer data is an example.

Also consider one last point. Voice commands to devices are often “physical” requests: start an appliance, open a door. This opens the potential for serious safety and security risks in addition to privacy risks. While the potential damage in home use is limited to that home, these risks are part of a broader set of risks associated with the Internet of Things.

True advances require integration of context

As computing power in appliances increases while becoming less expensive, much of the voice assistant function that is performed in the cloud today can be built into the appliances themselves. This will improve both reliability and privacy. It will also give consumers more control, and support personalization and customization for specific contexts.

In order to provide true convenience and ease of use, we have to use voice control in conjunction with other sensor inputs and external data sources. This is best accomplished by apps that have access to a broad set of data: the appliance state, all appliance functions, sensors in the environment, related devices, and external data sources. These apps can also develop and utilize personal profiles and histories. Keeping the data local improves privacy.
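As a rough illustration of such a local, context-aware app, here is a hypothetical sketch in which presence data, appliance state, and a small action history all live on a device in the home; every class and data source is invented for illustration.

```python
# Hypothetical sketch of a local, context-aware controller: appliance state,
# presence sensing, and history are kept in the home rather than the cloud.

from datetime import datetime

class LocalContext:
    def __init__(self):
        self.presence = {"kitchen": False}                      # from a presence sensor
        self.appliance_state = {"oven": {"on": False, "target_f": 0}}
        self.history = []                                       # past actions, stored locally

    def record(self, action: str) -> None:
        self.history.append((datetime.now(), action))

def start_preheat(ctx: LocalContext, degrees_f: int) -> str:
    # Combine the voice request with sensor context: only start the oven
    # if someone is actually in the kitchen.
    if not ctx.presence["kitchen"]:
        return "No one is in the kitchen; not starting the oven."
    ctx.appliance_state["oven"] = {"on": True, "target_f": degrees_f}
    ctx.record(f"preheat oven to {degrees_f}")
    return f"Preheating the oven to {degrees_f}F."

ctx = LocalContext()
ctx.presence["kitchen"] = True
print(start_preheat(ctx, 375))
```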

However, true advances occur not by simply translating button pushes and knob turns into voice commands. We need to use artificial intelligence to raise the semantic level of the interactions by focusing on the high-level objectives of an activity, rather than on individual operational actions. This shifts the focus from the appliances to the life-purpose for which they are being used. We need to start by rethinking how to accomplish objectives such as cooking a meal, maintaining clean clothes, saving energy, or keeping the home secure. Then we can determine what kinds of devices can best support those objectives, and how to use them in personalized contexts. We are seeing some manufacturers moving in this direction. A key challenge is to integrate voice assistants into these broader, more capable platforms.
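As a toy illustration of what raising the semantic level might mean, here is a hypothetical sketch in which the user states an objective (“cook pasta”) and a planner decomposes it into ordered appliance steps, personalized from locally stored preferences; none of this reflects any shipping product.

```python
# Toy sketch of objective-level orchestration: a goal is decomposed into
# appliance steps rather than issued as individual button-push commands.
# The objective names and household profile are invented for illustration.

def plan_objective(objective: str, household: dict) -> list:
    if objective == "cook pasta":
        servings = household.get("people_home", 2)
        doneness = household.get("preferred_doneness", "al dente")
        return [
            f"Fill the pot with water for {servings} servings",
            "Turn on the exhaust fan",              # sequence the steps safely
            "Heat the burner until the water boils",
            f"Cook the pasta until {doneness}",     # preference from local history
            "Turn the burner and exhaust fan off",
        ]
    return [f"No plan available for '{objective}'"]

for step in plan_objective("cook pasta", {"people_home": 4, "preferred_doneness": "al dente"}):
    print(step)
```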

This change of focus will help us to realize the promise of the smart home. Our next blog will look into how to accomplish this!

Martin G. Kienzle is the Electronics Industry Leader in IBM Research. His interests are Internet of Things (IoT) technology and business trends, IoT services, and business models. Follow him on Twitter @mg_kienzle for smart home news and perspective.

If you’ve gotten this far, you should definitely connect with Martin

Check out his latest thought leadership collaboration on 5G

See Martin’s work on use cases for hybrid cloud, on Life in the AI Age from August, and on edge computing in the spring edition of The Future of Electronics

End notes:

(1) Voice Assistant Consumer Adoption Report 2018, Voicebot.ai: voice-assistant-consumer-adoption-report-2018-voicebot.pdf
