Improving voice user interfaces while ensuring privacy

Published in Layhill L. Tech on Jan 27, 2018

Voice user interfaces are going to be one of the ways we interact with our devices as we go about our daily lives. Voice is a very intuitive medium for us because we communicate primarily by speaking, with text and images as a complement.

But there are still various problems that need work before the overall experience improves. One of them is how the AI behind a voice user interface can interact with us more naturally, the way we interact with fellow human beings.

A premium Medium article written by Cheryl Platz got me thinking about that. It also touched on privacy and why it is a contributing factor that makes it difficult for the current generation of AIs to speak more naturally and understand the context of what we say. Unless, of course, companies don’t give a shit about our privacy and start collecting even more data.

In this article, I am going to share what I think could help improve the AI while protecting user privacy.

Current Implementations and Limitations

To get better at understanding us and responding in the most useful ways, an AI needs three things: processing power, a good neural network that allows it to learn on its own, and a database to store whatever it has learnt.

The cloud is the easiest way for an AI to gain access to enough processing power and a large enough database. Companies like Amazon and Microsoft offer cloud computing and storage services through their AWS and Azure platforms respectively at relatively low cost. Google offers similar services through Google Cloud Platform, which includes Compute Engine.

The problem with the cloud is reduced confidence where privacy is concerned. Anything you store up there is vulnerable and can be retrieved through security flaws or misconfigurations. Companies could encrypt that data to help protect users’ privacy, but as long as the keys are held by those same companies, they could decrypt it whenever they want. Encryption is only truly end-to-end when the keys stay with the user.

Or you could do what Apple did with Siri: keep data on the device and use Differential Privacy to help ensure anonymity. This approach has two drawbacks. First, it reduces the AI’s capabilities because it doesn’t have access to a sufficient amount of personal data. Second, Siri runs on devices like the Apple Watch, iPhone and iPad, which have limited processing and compute capabilities.

Although those devices have more processing power than the room-sized mainframes of decades ago, it’s still not enough, in either energy efficiency or raw capability, to handle the highly complex neural networks needed for a better experience with voice user interfaces.

Apple did try to change that with its A11 Bionic SoC, which has a Neural Engine. Companies like Qualcomm, Imagination Technologies, and even NVIDIA are also working to increase energy-efficient local processing power for AI through their respective CPU and GPU products.

Possible Solution

Companies should continue their work on hardware so that there will be even more powerful and energy-efficient processors for AI to use.

In addition to that, what we need is a standard wireless protocol (maybe built on Bluetooth) that lets the AIs on our devices, irrespective of vendor, talk to each other when they are near each other and on our home network. This way, the AI on each of those devices can share information and perform distributed computing, thereby improving its accuracy and overall understanding of the user, and respond accordingly.

A common software kernel is also necessary to give different neural network implementations a standardized way of doing distributed computing efficiently and effectively.

So now, imagine Siri talking to Alexa, Google Assistant or even Cortana via this protocol and vice versa.
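
One way to picture assistants pooling what they have learnt is a weighted average of their model parameters, in the spirit of federated averaging. This is only a sketch of one possible approach, not something any of these assistants is known to do. The `PeerUpdate` message, its fields, and the `merge_updates` function are all hypothetical names made up for illustration, and the transport (Bluetooth or otherwise) is left out entirely.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PeerUpdate:
    """Hypothetical message one assistant shares with its peers."""
    assistant: str        # e.g. "siri" or "alexa" (illustrative labels)
    num_examples: int     # how many user interactions it learnt from
    weights: List[float]  # its locally trained model parameters


def merge_updates(updates: List[PeerUpdate]) -> List[float]:
    """Average peer weights, giving more say to peers with more examples."""
    if not updates:
        raise ValueError("no updates to merge")
    total = sum(u.num_examples for u in updates)
    if total <= 0:
        raise ValueError("updates carry no training examples")
    dim = len(updates[0].weights)
    if any(len(u.weights) != dim for u in updates):
        raise ValueError("all peers must share the same model shape")

    merged = [0.0] * dim
    for u in updates:
        share = u.num_examples / total
        for i, w in enumerate(u.weights):
            merged[i] += share * w
    return merged


if __name__ == "__main__":
    updates = [
        PeerUpdate("siri", 10, [1.0, 0.0]),
        PeerUpdate("alexa", 30, [0.0, 1.0]),
    ]
    print(merge_updates(updates))
```

The shape check is where a common software kernel would matter: the average only makes sense if every vendor agrees on what the parameters mean.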

Taking privacy into account, information exchanged via this protocol should be encrypted by default with keys owned only by the user. Any data created or stored should reside only on the device, also encrypted, and nowhere else. Taking a page out of Apple’s playbook, the keys should be generated by some kind of hardware-based “Secure Enclave”.

To further improve the neural network without giving up privacy, Differential Privacy should be applied to any query or information the AI sends to the cloud for processing.
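
To make "encrypted by default with keys owned only by the user" a little more concrete, here is a rough sketch of sealing a message between two of the user's devices. It is a toy built only from Python's standard library (an HMAC-SHA256 keystream plus an HMAC tag) so that it runs anywhere. A real protocol should use a vetted authenticated cipher from an audited library, and the keys would be generated and kept inside secure hardware rather than in program memory. `UserKeys`, `generate_user_keys`, `seal` and `open_sealed` are hypothetical names.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass

NONCE_LEN = 16
TAG_LEN = 32  # size of a SHA-256 digest


@dataclass(frozen=True)
class UserKeys:
    """Hypothetical keys that exist only on the user's own devices."""
    enc_key: bytes
    mac_key: bytes


def generate_user_keys() -> UserKeys:
    # Assumption: stands in for key generation inside secure hardware.
    return UserKeys(secrets.token_bytes(32), secrets.token_bytes(32))


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # HMAC-SHA256 in counter mode, used here as a toy stream cipher.
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def seal(keys: UserKeys, plaintext: bytes) -> bytes:
    """Encrypt, then authenticate: returns nonce + ciphertext + tag."""
    nonce = secrets.token_bytes(NONCE_LEN)
    stream = _keystream(keys.enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(keys.mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def open_sealed(keys: UserKeys, blob: bytes) -> bytes:
    """Verify the tag first and decrypt only if it matches."""
    if len(blob) < NONCE_LEN + TAG_LEN:
        raise ValueError("message too short")
    nonce = blob[:NONCE_LEN]
    ciphertext = blob[NONCE_LEN:-TAG_LEN]
    tag = blob[-TAG_LEN:]
    expected = hmac.new(keys.mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    stream = _keystream(keys.enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))
```

Because no vendor ever holds `UserKeys`, a device or server without them can neither read a sealed message nor alter it without being detected.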

Conclusion

The above is really just a thought on how the current AIs powering voice user interfaces can be improved.

In the end, it’s really up to the companies to decide whether they want to come together and improve all our lives while taking our privacy and security into account.

Written by Brandon Lim