The main reason this post is click bait is the title. There is a single flaw in a single product. Suggesting that leveraging cloud computing for deep learning is a bad idea because of this one instance is hyperbolic. It is like suggesting that because Ford had a flaw in their antilock brake system, having antilock brakes in any car is a bad idea. Mistakes will be made. It does not mean we stop trying to make improvements.
And doing what is being done with these virtual assistants is not possible on the device. The “hot word” is used to activate communication with the cloud, which converts speech to text and prepares a response. Doing that speech-to-text conversion on the device for a wide range of use cases, languages and regional accents is just not possible right now. Understanding the context of a request for a given user requires a lot of data and compute resources. A mobile device would have to use a lot of resources to make this possible, and that work is better offloaded to the cloud. And since users have multiple devices (iPhone, iPad, Apple TV or Google Pixel and Google Home), syncing this machine learning data across these devices would be costly in storage and bandwidth. And duplicating it would be wasteful. Having it collected in a single place in the cloud where it can continually improve makes more sense.
All that data is used to train the AI, which is why it needs to be there. When a request fails it can be analyzed. It may have failed because the speech-to-text system did not understand the user’s regional accent. When a collection of common requests in that regional accent can be used to better train the AI, it benefits all of the users.
Your chosen title suggests we should limit smart assistants to running only on the local device. That means the data amassed to train the AI would not be available on the backend. Thousands of requests with the same regional accent would not be available in one place. It also greatly increases the baseline requirements for the device handling the request, which increases the customer’s cost to get the device and makes it less affordable. This means fewer people would have access to a technology that should benefit everyone.
This post also relies on fear, which is another common trait of click bait. I have seen articles like this before, suggesting that these devices are letting people listen in on their owners. We have had landline phones in our homes for decades. Those are very easy to hack. Those phones have microphones connected to a network. And now we all commonly have smart phones which we carry with us everywhere we go. They have cameras and microphones and a constant connection to the Internet. Is that also a bad idea? Should we unplug our landlines and place our smart phones into Faraday bags? The answer is that the benefits are greater than the risks, and steps must be taken to ensure the technology we use is as safe as possible. But spreading fear about an entire category of new technology based on a single mistake is just not helping.
How to tap a home landline: https://youtu.be/LWdvpl8Q4fQ