Offline voice assistants — Can modern smart homes also be private?

Jon M
Published in Product AI
4 min read · May 20, 2021

Smart voice assistants like Google Assistant and Alexa have opened up exciting new worlds of home automation, even to those with very little technical ability. It’s as simple as saying “Hey Google” or “Alexa,” and a world of information, intuitive tools, and smart automation is at your beck and call.

Outside of the cost of the original device, it’s entirely free. But as the saying goes, if you aren’t paying for a product, you are the product. These devices function so well because of the data they harvest. Both Google and Amazon use what their many millions of users say to train their voice assistants. Those algorithms have become remarkably good at deciphering all manner of languages, dialects, turns of phrase, accents, and conversational habits, which is what makes the assistants intuitive, natural, and easy to use.

The price is your data and the things you say.

For many, that price isn’t worth paying, but refusing it leaves them behind the technological curve, unable to enjoy the benefits of smart voice assistants in the same way. This has led them to search for an offline voice assistant: one where the data processing happens locally, no voice recordings are saved, and no data is shared.

You have to DIY… for now

Unfortunately for those intrepid adventurers, options for effective virtual assistants that protect your data are quite lean and decidedly do-it-yourself. Snips was a commercial virtual assistant built with privacy in mind, but Sonos bought it and soon after shut down its user console for customization, and there’s been very little news on it since.

The Home Assistant hub has a combination of Almond and Ada (text processing and speech to text) tools, which are designed to work together to deliver the same kind of experience as Google Assistant or Alexa, but without all the data sharing. At least, that is the plan. They’re still early in development, and updates have been few and far between since their original announcement over a year ago.

Rhasspy is an open source project that shows great promise, operating entirely without an internet connection and supporting various small form-factor systems like the Raspberry Pi, but you’ll need to configure the commands yourself. It’s one of the more complex custom virtual assistant solutions, and is recommended more for experienced developers than novices.
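To give a rough sense of what "configuring the commands yourself" can involve, here is a minimal Python sketch of a hand-written command table matched against transcribed speech. It is not Rhasspy's actual configuration format or API; the `COMMANDS` table, the intent names, and the `match_intent` helper are all hypothetical, and it assumes some separate local speech-to-text step has already turned the audio into text.

```python
import re

# Hypothetical command table: each intent name maps to the phrases that trigger it.
# These names and phrases are illustrative only, not taken from Rhasspy.
COMMANDS = {
    "LightOn": ["turn on the light", "lights on"],
    "LightOff": ["turn off the light", "lights off"],
    "GetTime": ["what time is it", "tell me the time"],
}


def normalize(text):
    """Lowercase the text and strip punctuation so matching is forgiving."""
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()


def match_intent(transcript):
    """Return the first intent whose phrase appears in the transcript, or None."""
    cleaned = normalize(transcript)
    for intent, phrases in COMMANDS.items():
        if any(phrase in cleaned for phrase in phrases):
            return intent
    return None


if __name__ == "__main__":
    # The transcript would normally come from a local speech-to-text engine.
    print(match_intent("Turn on the light, please!"))
```

Everything here runs on the device, which is the appeal. The trade-off is that the assistant only understands the phrases you thought to list, which is a big part of why these projects feel less natural than their cloud-trained rivals.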

Jasper is the easiest to get up and running, requiring the least technical knowledge of the bunch. It runs on a Raspberry Pi and comes with extensive documentation to help you build your own apps and skills for it.

The most off-the-shelf solution, due sometime in the next few years, is Mycroft. This smart speaker and home display has all of the necessary hardware and software built in, is based on open source software, and handles most of its processing locally. It does have some cloud connectivity, though, and it has gone through a lengthy Kickstarter process that is only now coming to a conclusion after multiple years.

Making money in the market gap

The problem with all of these solutions is that they aren’t really adequate alternatives to Google Assistant or Alexa. They aren’t easy to use, customize, or take advantage of. You can’t give one to your grandma and expect her to know how to use it. They’re not intuitive in the way that mainstream voice assistants are.

The fact that these projects exist shows that there’s a gap in the market for someone to fill. The difficulty is in competing in it. Google, Amazon, and, to a lesser extent, Apple have extensive resources to operate monstrously powerful machine learning systems that make their voice assistants fantastic at what they do. Their devices can be affordable too, because the processing power doesn’t have to be local, and the data the devices capture is a revenue stream in its own right.

Any company that wants to develop a virtual assistant that can do what Alexa and Google Assistant devices do, but without data harvesting, will likely need to charge more for it, and the device may require extensive training with its users’ voices when first acquired. Such a company may also struggle to get support from smart accessory makers who don’t want to retool their devices for yet another niche voice assistant.

Smart digital assistants that don’t need heaps of user data to survive aren’t an impossible future, but they could be a distant one, especially if you aren’t willing to program much of their function yourself.
