Smart Speakers: A.I.’s Vehicle to Large Scale Acceptance

Razz Calin
ChasingProducts
9 min read · Mar 25, 2019


Smart speakers are wireless speakers that can be controlled through a computer or a smartphone. They consist of a microphone and speaker array on the hardware side, while a voice assistant provides the all-important brains of the operation. Over time, the power of these devices will come from the ever-evolving capabilities of A.I.-powered voice assistants and the functionality they bring to consumers’ lives. The physical devices used to access a voice assistant’s knowledge are just vessels, whose mission is to serve as tangible interfaces between man and A.I. while, optionally, also providing some basic, inherent functionality.

21% of US adults own a smart speaker, and 26% of those devices were purchased in 2018 alone. That amounts to a total of 119 million smart speakers sold in the US, a 78% increase over the year before. All of this is happening while the increasingly saturated smartphone market shrank 6% in the third quarter of 2018 alone.

If we overlay the popularity graph of the two leading smart speakers over the past 2 years with the adapted model of the technology adoption life cycle, we can assume with a high degree of certainty that the technology has crossed the dreaded Chasm and is on its way to mass-market adoption. And through the popularity of smart speakers, A.I.-powered voice assistants have become more accepted in our everyday lives.

Smart Speakers in the context of New Technology Adoption patterns

Amazon launched its Alexa-powered Echo devices to the public in June 2015, giving it a head start of more than a year over Google and 2.5 years over Apple’s product. The Echo’s biggest strength comes from what Amazon calls ‘skills’, a rapidly expanding list of third-party voice applications that essentially supercharge the capabilities of any Alexa-enabled device. Current skills cover anything from ordering pizza to controlling your lights or calling an Uber ride, and any company is free to develop its own ‘skill’ using the Amazon-provided Alexa Skills Kit API. To further ensure the popularity of its product, Amazon makes Alexa available for any product manufacturer to integrate, free of charge.
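To give a flavor of how lightweight a third-party skill can be: an Alexa skill is essentially a web service that receives a JSON request and answers with a JSON response envelope containing the speech to play back. The sketch below builds that envelope in plain Python; the field names (`outputSpeech`, `shouldEndSession`) follow the Alexa Skills Kit request/response format, but the helper functions themselves are hypothetical, not part of the official SDK.

```python
# Minimal sketch of an Alexa-style skill handler. The JSON field names
# follow the Alexa Skills Kit response format; build_response and
# handle_request are illustrative helpers, not official SDK functions.

def build_response(speech_text: str, end_session: bool = True) -> dict:
    """Wrap plain-text speech in the Alexa skill response envelope."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }

def handle_request(event: dict) -> dict:
    """Dispatch on the incoming request type (heavily simplified)."""
    request_type = event.get("request", {}).get("type", "")
    if request_type == "LaunchRequest":
        # The user opened the skill without asking for anything specific yet.
        return build_response("Welcome! What would you like to order?",
                              end_session=False)
    if request_type == "IntentRequest":
        # A real skill would branch per intent; here we just echo its name.
        intent = event["request"]["intent"]["name"]
        return build_response(f"Handling the {intent} intent.")
    return build_response("Goodbye.")
```

A real deployment would typically run this as an AWS Lambda function registered with the skill, with one branch per declared intent.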

Google released its first generation Google Home smart speakers for the holiday season in 2016, powered by its proprietary Google Assistant. Like Alexa, Google Assistant also gives developers the ability to create their own apps via ‘Actions’ and allows hardware manufacturers from any field to integrate it by way of an SDK.

While its competitors are focusing on the ‘smart’ aspects of their devices, Apple has chosen to stay close to its hardware design roots and lean heavily on the ‘speaker’ side of things. Apple boasts best-in-class sound quality on its HomePod compared to anything else in the space, better even than products from veteran manufacturers of connected speakers, like Sonos. Siri functionality in the HomePod, however, has been bound closely to a few Apple-proprietary apps and music services, along with very basic search duties. To make matters worse, any kind of audio streaming to the HomePod has to come via AirPlay¹, from another Apple device. Needless to say, this strategy didn’t pan out for Apple.

When the 2018 iterations of both the Google Home and the Amazon Echo launched, a lot of people were confused by the complete lack of backwards-compatibility support for the home automation protocols popular at the time. It soon became clear to anyone looking closely enough that these companies were planning to become the next home automation protocol. This would be achieved not with physical antennae communicating on different frequencies but by providing a platform upon which third-party developers can build their own products, services or indeed other platforms. Attracting these third parties creates the synergies that make platforms so attractive to everyone involved, where the success of one hinges on the success of the others. We can look at platforms as factories for services. In this sense, whichever voice-assistant maker ends up building the best factory, the one that enables its willing participants to do their best work, will win in the end.

Amazon looks to have taken full advantage of the time it had before the competition showed up. Alexa now has 56,750 skills versus Google Assistant’s 4,256 actions. However, Google is trying to close this capability gap fast, focusing on high-quality, high-impact actions from third parties and betting on its own knowledge base to serve its users the best experience.

At the moment, most consumers prefer to use their voice assistants through the medium of a smartphone, with smart speakers in second place and the car in third. This gives Apple and Google, who can already count on the billions of devices out in the wild, a major advantage.

Amazon has to move fast and make inroads toward expanding its ecosystem of supported hardware devices as far as possible. While it can’t force smartphone manufacturers to integrate its Alexa software, it can aim for currently under-served markets with plenty of growth potential.

Top 6 devices in which consumers want voice assistant integration next.

If these consumer requests are any indicator of developers’ future integration efforts, it looks like we’re heading toward a world where voice assistants are integrated into every appliance in the household, along with deeper integration in cars.

In the current environment, the case for choosing a HomePod is thin at best. If you’re not already invested in Apple’s ecosystem, on either the hardware or the services side, the HomePod is useless from the get-go. If you do own an Apple device, you need to be willing to restrict all your future smart-tech purchases to ones that integrate with either Siri or HomeKit. This is a big ask for anyone, considering the alternatives provided by the Echo and Google Home.

The near future will find each company in the voice assistant space focusing its product development in the direction that best serves its own customers, on one side, and the profits coming from its main business, on the other.

Google will optimize its voice assistant primarily in a direction that allows it to serve more ads to its users. This can be done by leveraging some of Google’s existing ad-driven services, like YouTube, or simply by selling voice advertising for certain keywords to whoever is buying. A study of more than 1,000 consumers found that two in every five find voice ads less intrusive and more engaging than other ads.

Amazon’s main goal with any hardware product it launches is to increase the number of orders its users place in its online store. To do this, the company needs to remove as much friction as possible from the process of placing an order by voice. In executing this strategy, it will try to 1) increase the number of Echo devices in any single household by making them cheap enough, 2) improve its speech-recognition A.I. and expand the number of languages it supports to match the number of countries Amazon.com operates in, and 3) offer price cuts on products ordered from Amazon.com via Echo to attract new users. Amazon has also been making big leaps in the online advertising business currently dominated by Google and Facebook, reporting revenue of $3.4 billion for its advertising arm in Q4 2018, a 95% increase over the year before.

Siri is the black sheep among the voice assistants; Apple knows this and isn’t sitting around hoping for the problem to fix itself. The next logical step is improving the sentence-parsing algorithm to a point where search results are on par with the competitors’ solutions. Next in order, if Apple’s ambitions for the HomePod go beyond its current condition (promoting Apple services) to competing with the rest of the pack, it will have to open the ecosystem to third parties. Assuming that audioOS is robust enough to allow all of these changes via over-the-air updates, Apple could suddenly become the one smart speaker to rule them all.

As a rule of thumb, a tech company can afford a closed ecosystem only if 1) all of its competition provides products that are ‘not good enough’ to satisfy the needs of most consumers, and 2) its own product is better than anyone else’s. In that situation, consumers will be willing to switch to the better product. The HomePod, with its confined voice assistant, is most certainly not in this position.

A.I., of every kind, not just the voice-based variety, will evolve to a point where it’s more of a utility than the proprietary piece of software we see today. Similar to electricity providers today, there will be massive A.I. providers that grant you access to their A.I. from anywhere in the world, in an effortless manner, via the internet. As soon as you plug in a dumb device that supports their platform, you’ll be able to select your A.I. provider of choice and start benefiting from the functionality it provides. AISPs (Artificial Intelligence Service Providers) will have to build their data-hungry A.I. algorithms over massive networks, supported by cloud computing, and this, in turn, will give them all the benefits that come with the network effect. As the value of each AISP increases the bigger its service gets, so do the benefits for end consumers, creating a loop of near-endless growth. This will make it increasingly difficult for new companies to enter the market and compete, resulting in 2–3 providers that will rule the A.I.-providing world.
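The AISP idea above is, at its core, an architectural claim: the device speaks one common interface, and the intelligence behind it is swappable. Here is a minimal sketch of that separation; every class and method name below is hypothetical, invented purely to illustrate the pattern.

```python
# Illustrative sketch of "A.I. as a utility": a dumb device that talks to
# whatever provider is plugged in through one shared contract.
# All names (AIProvider, DumbSpeaker, etc.) are hypothetical.
from abc import ABC, abstractmethod

class AIProvider(ABC):
    """The common contract every AISP would implement."""
    @abstractmethod
    def answer(self, query: str) -> str: ...

class ProviderA(AIProvider):
    def answer(self, query: str) -> str:
        return f"[provider-a] you asked: {query}"

class ProviderB(AIProvider):
    def answer(self, query: str) -> str:
        return f"[provider-b] you asked: {query}"

class DumbSpeaker:
    """The hardware is just a vessel; intelligence is whatever is plugged in."""
    def __init__(self, provider: AIProvider):
        self.provider = provider

    def switch_provider(self, provider: AIProvider) -> None:
        # Switching AISPs changes the brains without touching the hardware.
        self.provider = provider

    def ask(self, query: str) -> str:
        return self.provider.answer(query)
```

The point of the interface is exactly the utility analogy: consumers could change A.I. providers the way they change electricity suppliers, without replacing the socket.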

What intrigues me most is the high potential of the smart speaker plus A.I.-powered voice assistant combo. As manufacturers add more ways of interacting with the physical devices and as machine learning advances, we can expect some major innovations from third-party developers.

New sensors, like more sensitive microphones, could give the device superhuman hearing, potentially letting it detect the tone of your voice and, with the help of the A.I., deduce your emotional state. The same hardware could also detect early signs of illness, for instance figuring out that you have a cold before you do by listening to your breathing and the frequency of a creeping cough. Add a camera to the mix and the A.I. could evolve to identify falls, domestic abuse and other hazardous events like fire, smoke or flooding.
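To make the tone-of-voice idea concrete: pipelines like this usually start by reducing the raw waveform to simple acoustic features before any classifier gets involved. The toy sketch below computes two classic ones, loudness (RMS energy) and zero-crossing rate, in plain Python; it is a deliberately simplified illustration, and real emotion or illness detection systems use far richer feature sets and trained models.

```python
# Toy sketch of low-level audio features a smarter microphone pipeline
# might feed into an emotion or illness classifier. Illustration only.
import math

def rms_energy(samples):
    """Root-mean-square energy: a rough proxy for how loud the voice is."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs that change sign: loosely tracks
    how 'noisy' or high-frequency the signal is (e.g. a raspy cough)."""
    crossings = sum(1 for a, b in zip(samples, samples[1:])
                    if (a < 0) != (b < 0))
    return crossings / (len(samples) - 1)

# Two synthetic sine 'voices' at different volumes (period of 80 samples):
quiet = [0.1 * math.sin(2 * math.pi * i / 80) for i in range(800)]
loud = [0.9 * math.sin(2 * math.pi * i / 80) for i in range(800)]
```

A raised voice shows up as higher RMS energy, while a burst of coughing would spike the zero-crossing rate relative to smooth speech; deducing an emotional state from such features is the (much harder) machine learning step layered on top.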

Like any other tool, this one too can be used for nefarious purposes: an A.I. could be trained to recognize specific activities just by analyzing sound. Privacy will continue to be an issue going forward; governments will fill in the regulatory gap at their usual slow pace, but big companies, with proportionally big reputations to defend, will tread more carefully in the future.

The future looks bright…

¹While you can stream audio from your device while playing content from a different service, like Spotify or YouTube, you will not be able to use touch controls or Siri voice commands to control the content played there.
