The Future of Conversational Experiences

John Bennett
May 24, 2017

Overview

This is the second in a series of linked posts exploring the future of Experience Design. It looks at projected technical developments and adoption patterns of conversational experiences over the next five years. The next post will look more closely at how to design conversational experiences, with emerging best practice guides and a list of resources.

These posts are a work in progress. The next five years are going to see a series of huge changes, many of which are just coming into view, and there’s no right answer, so please get involved: add comments, refine my thinking, disagree, etc.

Conversational Experiences

Conversational technologies moved centre stage in 2016, with continued growth in chat, the launch of new voice interfaces like Google Assistant and Amazon Alexa, and the launch of over 900 chatbots for platforms like Facebook Messenger, Kik and Telegram.

This has led to some very grand claims about the impact of new conversational technologies:

“Chatbots are the new apps.”

Satya Nadella, CEO Microsoft

But how true is this? What has adoption of conversational technology been like? And how will the technology progress over the next five years? Will chat and other conversational experiences really replace apps or become part of our everyday lives? Before we explore these questions, a little background.

Background

There are two main types of conversational experience:

  1. Human to Human
  2. Human to Machine

Human-to-human experiences, e.g. click-to-chat customer service applications, have been with us for over 15 years, though it’s only relatively recently that they’ve been widely adopted, as chat has become a more central form of interaction for mass audiences.

However, much of the interest in conversational interfaces over the last 18 months has been driven by the rise of human-to-machine conversation. The rise of AI voice and chat interfaces is the direct result of improvements in the accuracy and quality of the AI technologies that underpin both.

There are two main conversational interfaces:

  1. Voice; and
  2. Text-based chat.

They can be seen as two sides of the same coin, with voice the natural conversational interface and chat its visual counterpart. Like most of the technologies in this series, both chat and voice interfaces have been with us for some time. It is only recently, however, that both have reached mass mainstream audiences, with the rapid adoption of smartphones over the last 5–7 years and the more recent adoption of voice hubs like Google Home, Amazon Echo and now Apple’s HomePod.

Voice

Voice control of computers, homes, etc. has long been a staple of tech and futurist visions. The idea of being able to control and interact with your environment using just natural language is a compelling one, and voice recognition has been available in one form or another for many years. However, it’s only recently that the technology has reached mass mainstream audiences in the form of Siri, Cortana, Alexa and Google Assistant. Again, this has mainly been driven by improved reliability and performance.

Text

The continued rise in the popularity of chat has seen developers push and expand capabilities beyond the simple exchange of messages. In Asian markets, where chat is much more popular, driven in part by high carrier and SMS costs, chat applications have become the gateway to a huge range of services. For example, WeChat in China lets users book taxis, reserve doctor appointments, check in for flights, buy cinema tickets, manage bank accounts and use a host of other services. WeChat is no longer just a chat application; it’s also a payment giant. And the last year has seen similar developments in US and European markets, e.g. Uber’s integration with Facebook Messenger was the first of a range of payment and other services Facebook has been integrating with its Messenger app.

Chatbots are not a new phenomenon. However, the explosive rise of mass-market consumer chat applications like WhatsApp and WeChat, along with advances in AI and machine learning, is starting to move chatbots into the mainstream.

The Technology, Market and Adoption Over the Next Five Years

The next three-to-five years will see significant reliability improvements in underlying voice and chat technologies, with current barriers to voice adoption falling away. This will significantly grow the market, moving voice and chat from being applications to being platforms that cross a huge range of services and use cases and offer new forms of value exchange.

Improved Reliability of Technology

Poor reliability is still one of the main factors inhibiting mass adoption of conversational technology. It’s obvious to even casual users of Siri or Cortana, or to followers of Tay, the infamously racist Microsoft chatbot, that these systems still have significant limitations.

However, over the next three-to-five years the reliability of voice recognition and chatbots will improve significantly, driving adoption of conversational interfaces by mass mainstream audiences.

There are two different components to a conversation:

  1. Comprehension, i.e. understanding what is being said, and
  2. Response, i.e. the ability to process and answer a question or fulfill a request.

Of the two, comprehension is the more reliable. In 2013, Google’s voice platform had a word recognition accuracy rate of below 80 percent; by 2015 that had risen to 90 percent. Baidu now claims a 95 percent accuracy rate, and its Chief Scientist, Andrew Ng (formerly a lead at Google), believes that it will soon be at 99 percent. Ng believes this is a game changer, which will move voice from the early adopters into mainstream everyday use.

“Most people don’t understand the difference between 95% and 99% accuracy, but those extra 4% are game changing.”

Andrew Ng, Chief Scientist, Baidu, ex Google Brain

The extra four percentage points are the difference between adoption and non-adoption for mass audiences, who are a lot less tolerant of things going wrong than early adopters.
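To see why four percentage points matter so much, it helps to look at error rates rather than accuracy rates. The sketch below is illustrative only: the function names are made up for this post, and it assumes each word is recognised independently of the others, which real speech recognisers do not guarantee.

```python
# Illustrative only: assumes every word is recognised independently,
# which is a simplification of how real speech recognition behaves.

def word_error_rate(accuracy: float) -> float:
    """Share of words misrecognised at a given word accuracy."""
    return 1.0 - accuracy


def clean_utterance_probability(accuracy: float, words: int) -> float:
    """Chance that every word in an utterance is recognised correctly."""
    return accuracy ** words


for accuracy in (0.95, 0.99):
    errors_per_100 = word_error_rate(accuracy) * 100
    clean = clean_utterance_probability(accuracy, words=10)
    print(
        f"{accuracy:.0%} accuracy: {errors_per_100:.0f} errors per 100 words, "
        f"{clean:.0%} of 10-word requests fully correct"
    )
```

Under that assumption, moving from 95% to 99% cuts errors from five per hundred words to one, and the share of ten-word requests recognised without a single mistake rises from roughly 60% to roughly 90%.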

However, voice recognition is only half the story. Even if the application can accurately understand what you’re saying, it still has to process that information and act on it.

And it’s here that the real barriers to adoption lie. Many smart assistants have very high fail rates when it comes to response, with the best, Google Assistant, answering only around 70% of all questions completely and correctly, and others, including Siri and Alexa, completing only around 20%.

It’s a similar story with chatbots. They currently have average fail rates in response of around 40%; in other words, 4 out of 10 requests cannot be processed satisfactorily. AI and deep learning are expected to significantly improve their reliability over the next five years.

Adoption Patterns: Current Barriers to Voice Usage Will Fall Away

Voice is already a mainstream behaviour for surprisingly large, particularly young, demographics: in early 2016 Google revealed that 20% of mobile search queries are voice searches, and there are a range of other current use cases where voice plays a significant part, e.g. luxury and premium cars. However, mass mainstream audiences have still not adopted voice technology. There are two reasons for this:

  1. Technical: voice technology isn’t yet seamless enough for mass mainstream audiences; and
  2. Cultural: mass mainstream audiences find talking to a machine a little strange.

New, more reliable technologies will change this, with mass mainstream audiences adopting at-home and in-car voice technologies. This will in turn erode the cultural barriers to uptake, e.g. embarrassment at speaking to a computer.

Adoption will be driven by improved technology, which will increase the number of viable use cases, which in turn will make the applications more compelling to wider audiences.

In the next three-to-five years voice will become a much more mainstream technology. Indeed, it may well be that in certain settings, e.g. cars and possibly the home, it will become the predominant means of interacting with technology.

Markets Grow Rapidly

Core conversational platforms continue to grow rapidly. For example, Kik, one of the most successful chat apps in the US, reports that its 300 million users spend an average of 12.7 minutes in a chat session, with about six separate chats a day. This represents roughly an hour and a quarter a day spent with chat for the average Kik user. Many other chat platforms continue to experience similar growth.
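The daily total follows from multiplying the two reported averages. A minimal sketch of that arithmetic, using only the Kik figures quoted above (the variable names are illustrative):

```python
MINUTES_PER_SESSION = 12.7  # average chat session length reported by Kik
SESSIONS_PER_DAY = 6        # "about six separate chats a day"

# Total time per day, then split into whole hours and remaining minutes.
daily_minutes = MINUTES_PER_SESSION * SESSIONS_PER_DAY
hours, minutes = divmod(daily_minutes, 60)

print(f"{daily_minutes:.1f} minutes a day, or {int(hours)} h {minutes:.0f} min")
```

That works out at about 76 minutes of chat per user per day.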

Similarly, the global chatbot market is predicted to grow rapidly over the next three-to-five years, with growth running at over 30% a year, taking it from roughly a $100 million market in 2016 to a $1bn market in 2024.

And it’s the same with voice, with similarly significant market growth predicted over the next three-to-five years.

Amazon sold 8 million Echo units in 2016; by 2020 it is predicted to be selling over 40 million units per year. Google is predicted to sell over 15 million a year of its similar device by 2020.

The whole of the Intelligent Virtual Assistant market, which overlaps both voice and chat, is predicted to be worth $3bn by 2020.
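A quick way to sanity-check forecasts like these is to work out the compound annual growth rate they imply. The sketch below does that for the chatbot market and Echo figures quoted in this section. It assumes smooth compound growth between the two end points, which real markets rarely follow, and the function names are hypothetical helpers written for this post.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Compound annual growth rate needed to get from start to end."""
    return (end / start) ** (1 / years) - 1


def project(start: float, rate: float, years: int) -> float:
    """Value after compounding start at a fixed annual rate."""
    return start * (1 + rate) ** years


# Chatbot market: roughly $100 million in 2016 to $1bn in 2024.
chatbot_rate = implied_annual_growth(100e6, 1e9, 2024 - 2016)

# Echo sales: 8 million units in 2016 to over 40 million a year by 2020.
echo_rate = implied_annual_growth(8e6, 40e6, 2020 - 2016)

# What exactly 30% a year would give the chatbot market by 2024.
chatbot_at_30 = project(100e6, 0.30, 2024 - 2016)

print(f"Chatbot market implied growth: {chatbot_rate:.1%} a year")
print(f"Echo unit sales implied growth: {echo_rate:.1%} a year")
print(f"Chatbot market at exactly 30% a year: ${chatbot_at_30 / 1e6:.0f}m in 2024")
```

The chatbot forecast implies growth of about 33% a year, which is consistent with "over 30%", while the Echo forecast implies unit sales growing by roughly 50% a year.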

Impacts: Voice and Chat become Platforms

Until recently, voice and chat have been viewed primarily as applications, i.e. closed stand-alone experiences. In the last 18 months Facebook, Amazon, Google, WeChat and others have started promoting chat, and more recently voice, as much more broadly based platforms, opening up APIs and doing a range of deals with third parties to integrate both technologies into third-party applications.

Facebook announced a new open bot platform for Messenger at F8 in 2016. At CES 2017, Amazon unveiled what it calls an Alexa Everywhere strategy, with a push to integrate its voice platform into as many third-party devices as possible.

And a whole new slew of start-ups have appeared to build on the new chat ecosystem, with new technology platforms like Wit.ai for AI and natural language, Beep Boop for hosting and Slack integration, and dozens of others.

It’s too early to tell which provider or providers will become dominant in each space. However, the move from applications to platforms will be very helpful for experience designers looking to create more seamless experiences across different touchpoints.

A Pragmatic View: The Bot Backlash

As we’ve seen, conversational interfaces, bots and voice technologies were very hot in 2016, and as they rolled out there’s been a predictable bot backlash.

Most technologies go through a similar pattern, as described by the Gartner hype cycle, and while current capabilities and implementations have almost certainly been over-hyped and over-sold, it’s important to look beyond early implementations and hype. For many businesses looking at automated conversational applications, current error rates aren’t acceptable for deployment in front-line services. Errors would just add cost by pushing queries and interactions to other channels, as well as damaging brand perception and NPS.

That’s not to say that implementing automated conversational applications is not worth considering right now. There are specific use cases, failsafes and approaches which allow them to create a great deal of value (these will be examined in more detail in the next post in this series).

In particular, it looks likely that for the foreseeable future human-to-machine chat will complement human-to-human chat rather than replace it.

Conclusion

Issues with reliability of response will remain for several years at least, and experience designers will have to carefully consider use cases and failsafes when deploying conversational interfaces. In five years’ time reliability will have improved to the point that conversational experiences will be a significant part of the way mainstream audiences interact with digital technologies.

However, there are also a number of good reasons why chat will probably not kill apps or websites, any more than apps or social media killed websites. What seems more likely is that conversational experiences will improve and replace certain aspects of customer journeys, particularly in more private spaces like the home or car.

The next post in this series looks at the practice of designing conversational experiences in more detail, examining best practice and bringing together a range of resources for conversational design.
