How a growing category of voice-plus-screen devices can help overcome the difficulties of shopping on smart speakers. By Camille Bourdier, Peter Gasston, Yogi Patel, and Caroline Wilson

Peter Gasston
Sep 26, 2018 · 5 min read

A recent report by The Information (🔒) claimed that just 2% of Amazon Echo owners have made a purchase through the device, with only 10% of those making a subsequent order. The Voice Shopping Consumer Survey from and Voysis contradicted this, saying that around one quarter of smart speaker owners have tried voice shopping, with 16% shopping monthly — although consumer surveys tend to favour ‘early adopters’. Amazon itself has said ‘millions’ use Alexa to shop, but the majority of purchases seem to be low value everyday household items, like AmazonBasics.

In its defence, shopping on voice-only devices is a new behaviour, which can take time for consumers to pick up on — it took years for online shopping to take off (seven years after the dotcom crash, global eCommerce was worth $15bn, compared to $2tn in 2017) and then years more for mobile shopping to become popular. But even with that considered, it seems clear that there are certain drawbacks to buying on smart speakers.

The path to purchase can be broadly broken down into four stages: awareness (realising a want or need), consideration (evaluating different options), decision (making a choice from those options), and the purchase itself. The Information’s report showed that a greater number of users, around 20%, used their smart speakers for broader shopping-related queries that don’t involve a purchase, such as “what are my deals?” and “where is my order?”. This indicates where the greater opportunity for voice shopping lies.

The drawbacks of voice-only shopping

The awareness stage is the hardest for voice-only devices to crack. It often involves starting a search in Google or Amazon and receiving large amounts of information that needs to be evaluated. Voice-only devices aren’t suitable for this as the lack of visual interface makes them better at giving back shorter, more precise amounts of information. Alexa’s design guidelines recommend speaking no more than three options to a user, as they tend to forget the options available to them when more are provided.
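To make the three-option guideline concrete, here is a minimal sketch of how a voice response might cap what it reads out and offer the rest on request. The function name, the wording of the prompts, and the constant are all hypothetical; this is not Alexa's API, just an illustration of the idea.

```python
MAX_SPOKEN_OPTIONS = 3  # assumption: based on the three-option guideline mentioned above


def spoken_options(options, limit=MAX_SPOKEN_OPTIONS):
    """Build a short spoken prompt that lists at most `limit` options."""
    if not options:
        return "I couldn't find anything that matches."

    shortlist = options[:limit]
    if len(shortlist) == 1:
        listed = shortlist[0]
    else:
        # Join as "a, b or c" so the list sounds natural when read aloud.
        listed = ", ".join(shortlist[:-1]) + " or " + shortlist[-1]

    prompt = f"I found {listed}."
    remaining = len(options) - len(shortlist)
    if remaining > 0:
        # Offer the rest instead of reading everything out in one go.
        prompt += f" There are {remaining} more. Would you like to hear them?"
    return prompt


print(spoken_options(["red trainers", "blue trainers", "black boots", "sandals", "loafers"]))
```

The point of the sketch is the shape of the response, not the wording: a short list first, with anything beyond it held back until the user asks.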

The same problem happens at the consideration stage, where the user starts to filter their options, which again is hard when being presented with only a small amount of information at a time. And the decision stage is where the user turns their options into a choice, which often involves validation from customer reviews, information about discounts, delivery, and returns.

Finally, purchase is difficult on voice-only devices when the customer doesn’t have the reassurance of seeing their basket and chosen delivery options, or have the feeling of security provided by pressing a button in the final but critical step of completing the order.

What voice-only could offer

Rather than the ‘traditional’ shopping experience, what voice-only interfaces offer is convenience, a conversational style, and personalised recommendations. These could be used throughout the first three stages, by combining awareness and consideration using personalised recommendations. Rather than ‘here are all 50 of our shoes’, it would be better to say ‘here are 3 pairs of shoes similar to ones you’ve bought before, that your friends like, that are in your size, and that are in fashion’. More data and better understanding give the user more confidence to make a decision, and validation that the right choice has been made.

This still doesn’t overcome the drawbacks of the purchase stage on voice-only devices, but the recently introduced and fast-growing category of voice devices that include screens will help with that.
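As a rough illustration of the shoe example, the sketch below filters a catalogue to the shopper's size and ranks what is left by three signals: similarity to past purchases, friends' likes, and whether the item is in fashion. The `Shoe` fields, the `recommend` function, and the equal weighting of signals are assumptions made for the example; a real recommender would be considerably more involved.

```python
from dataclasses import dataclass


@dataclass
class Shoe:
    name: str
    style: str
    sizes: frozenset   # sizes currently in stock (hypothetical field)
    trending: bool     # stand-in for "in fashion"


def recommend(catalogue, bought_styles, friend_likes, size, limit=3):
    """Return up to `limit` shoes in the shopper's size, best match first."""

    def score(shoe):
        # Each signal counts equally here; that weighting is an assumption.
        return (
            (shoe.style in bought_styles)   # similar to a previous purchase
            + (shoe.name in friend_likes)   # liked by friends
            + shoe.trending                 # currently in fashion
        )

    in_size = [shoe for shoe in catalogue if size in shoe.sizes]
    return sorted(in_size, key=score, reverse=True)[:limit]


catalogue = [
    Shoe("Runner", "trainer", frozenset({7, 8}), True),
    Shoe("Hiker", "boot", frozenset({8}), False),
    Shoe("Loafer", "loafer", frozenset({9}), True),
    Shoe("Court", "trainer", frozenset({8}), False),
    Shoe("Slide", "sandal", frozenset({8}), False),
]
picks = recommend(catalogue, {"trainer"}, {"Hiker", "Runner"}, size=8)
print([shoe.name for shoe in picks])
```

Size is treated as a hard filter rather than a score, since recommending a shoe that doesn't fit would undermine the confidence the recommendation is meant to build.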


The voice market is bigger than just smart speakers; a recent report by Strategy Analytics said that half of all smartphones are likely to have a digital assistant by the end of the year (with some 50% of those having Google Assistant). And the smart speaker market is rapidly expanding to include screens — from the Echo Show and Echo Spot, to Alexa-enabled Fire TV and Fire Tablet devices, as well as the recently announced Google Assistant-powered Smart Displays from Lenovo, JBL, and Sony (and it’s a fair bet that Google will announce their own device in that market in the future).

Adding a screen to a digital assistant helps with awareness, as the screen can show more results than can usefully be remembered in a voice-only interface. It can also help with consideration, and — crucially — with purchase. Seeing the items in the shopping basket, along with the delivery details and costs, and having a button to push to confirm purchase will all aid in giving the user the feeling of security they need to make a purchase.

This doesn’t mean voice and screen devices will replace apps or websites — when you have a lot of options and choices to make, the ability to browse at leisure will always win out. But adding data-based recommendations to screens for validation and reassurance will certainly help the growth of voice shopping.

Choosing the right categories

The example used earlier in this article involved shoes, but in the short term it could be the case that apparel and certain other categories, like holidays, are a step too far for people new to voice shopping. Unlike household essentials, these categories are more than functional, involving a lot of emotion in the decision-making process. Until a system knows you well enough to give the perfect options (which may be never), it will be difficult to persuade customers to buy.

Where there is a stronger role to play, perhaps, is in services like cinema or concert tickets. These tend to have stronger data about your preferences, as well as calendar and location data to make choice easier. ‘Find a concert near me that I might like’ is an easier call if we have, for example, Spotify history and location data, and the decision-making process for a one-off event is less emotional than for clothing or other non-essentials.
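A request like ‘find a concert near me that I might like’ could be sketched as a distance filter followed by a ranking on listening history. Everything here is hypothetical: the data shapes, the 25 km radius, and the use of play counts as a proxy for preference. No real streaming or ticketing API is being called.

```python
import math


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, lon) points, in km."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))


def concerts_for(user_location, play_counts, concerts, max_km=25, limit=3):
    """Return nearby concerts by artists the user listens to, most played first."""
    nearby = [
        c for c in concerts
        if c["artist"] in play_counts
        and distance_km(user_location, c["venue"]) <= max_km
    ]
    nearby.sort(key=lambda c: play_counts[c["artist"]], reverse=True)
    return nearby[:limit]


london = (51.5074, -0.1278)
play_counts = {"Artist A": 10, "Artist B": 50, "Artist C": 40}
concerts = [
    {"artist": "Artist A", "venue": (51.5560, -0.2796)},  # north-west London
    {"artist": "Artist B", "venue": (53.4808, -2.2426)},  # Manchester, too far
    {"artist": "Artist C", "venue": (51.5033, 0.0032)},   # south-east London
    {"artist": "Artist D", "venue": (51.5100, -0.1300)},  # nearby, but never played
]
print([c["artist"] for c in concerts_for(london, play_counts, concerts)])
```

Because the inputs are factual (what you have played, where you are), the shortlist can be small and still feel relevant, which is what makes this kind of query a better fit for voice than browsing apparel.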

A path to voice shopping

While purchases through voice-only smart speakers may be low now, there are opportunities for voice to play a role in other stages of the shopping journey. It relies on a system being smart enough to know more about the customer, and to provide them with a few immediately relevant options rather than asking them to browse through ‘pages’ of options.

The introduction of smart speakers with screens, as well as digital assistants on phones and tablets, can help provide reassurance at the point of purchase and validation of choices, but should still be combined with data to provide recommended options for maximum impact rather than competing directly with other forms of online shopping.

And until customers are more used to shopping through voice or assistants, it’s likely that more success will be found with less emotional goods and services.

This article is based on the outcome of a recent rehab hackday, part of our regular programme of innovation sprints exploring the fit of new technologies with emerging user behaviour and business problems.


a creative technology company



