Everything You Need to Know About Voice Tech

How far along is the technology today? And where should you be investing for tomorrow?

Myplanet
Sep 9 · 11 min read

Voice interaction is one of the most dynamic, rapidly evolving areas of tech right now. And with such an accelerated rate of change, staying on top of what’s happening in the field is essential.

As an Actions on Google partner and an Alexa for Business partner, and having built several custom voice applications (most recently in collaboration with one of the leading healthcare providers in North America), we’re deeply immersed in the field. We know, first-hand, what is and isn’t possible today, and where to expect things to shift, change, and grow in the near to medium future.

We recently attended Voice Summit, where we compared our knowledge and views of voice interfaces, and the technology that supports them, with some of the biggest and most exciting names in the voice tech space. Coming out of that, we wanted to take a closer look at the key elements of voice tech. Here are the five areas we’ve identified and what to expect from each in the coming months:

  1. Building Conversations: What are the best tools, approaches, and techniques for building conversational interfaces? And how is the market able to support the rapidly changing needs of creators and users alike?
  2. Managing Conversations: Content is king, as they say, but the kingdom is growing exponentially and managing content in voice-first experiences is an entirely new area. How do we craft content management experiences that don’t wind us back to the days of endless developer requests to fix or change things?
  3. Analytics and Testing: Voice is novel, certainly, but novelty wears off. How can we be certain it’s working to meet the business needs it’s intended for and the user needs it’s designed for? How do we measure the ROI? How do we learn how well our voice apps are performing and moreover, how can we make them better?
  4. Natural Language Tools: How do we enhance the voice-first user experience? Are there NL tools that can take the voice experience to a more meaningful, engaging conversational place?
  5. Security: Perhaps most pressingly, how can these interactions stay safe? How can we ensure that personal information isn’t accessed by the wrong people, shared in the wrong way, or otherwise exposed in our voice interactions?

In essence: How do we, as creators, handle all these moving parts? And how can we possibly stay ahead of the curve? Let’s dig in.

Building Conversations

Even a casual observer knows that at this stage, Amazon and Google own an enormous portion of the voice market. But as we saw at Voice Summit, there are plenty of upstarts ready to carve out their own niches. What that tells us is that we can expect to see some pendulum swings in terms of platform adoption.

In the earliest stages of voice entering consumer spaces, the major players dictated how to create the experiences. Few organizations could afford the time and overhead to experiment, so the big players with the exact right infrastructural underpinnings for innovation and trial-and-error testing gave it a go and forged the path — securing first-to-market rights and early expertise as they did so.

But as we shift from experimentation to early adoption, businesses looking to enter the voice market are starting to invest in creating their own options, casting off the tight parameters of the early major players. Now that voice has secured a steadier toehold, large enterprise organizations that want (and need) to stay ahead of the curve are eager to create an experience that meets their specific needs and lets them own the experience, end-to-end.

Voice started far off to one side, with a few major innovators controlling the creation and implementation process. As the technology matured, the pendulum swung far in the other direction, with individual orgs creating bespoke experiences so they could control every aspect of them. But we’re already starting to see the market swing back again, less dramatically this time. (We told you it moves fast.)

“It can be baffling for organizations to try and figure out which tech will actually add value to the org — and realistically be around tomorrow. We help gauge the field.” — Everett Zufelt, Director of Technology Services, Myplanet

Open-source options are already emerging that allow for an easy, significantly more cost-effective build, but still offer some flexibility for customization. This kind of hybrid creation will become the norm, just as we saw with content management systems over the last 20 years or so.

This will have a major impact on a few key industries in particular. Voice can be tough to tackle for any business, but add in the complex regulatory requirements found in finance or healthcare, and it can seem impossible. It makes a whole lot of sense, then, to partner with a smaller platform built specifically to address those needs. With pre-built parameters set around industry compliance, but flexibility elsewhere in the execution, these options can be harnessed with far less fear of legal issues.

That will be a crucial shift for the voice industry as a whole — empowering some of the biggest organizations with the largest consumer reach to connect with their customers in this way will help solidify voice as one of, if not the, go-to interactions.

Managing Conversations

Of course, building the voice app is only half the battle. Once it’s out in the wild, there are plenty of other considerations, like how to handle all the content that’s being created for this stream.

Many organizations have a vast amount of content they’d like to repurpose for voice, but it’s not as easy as having a voice app simply read out pre-existing articles. How we craft content for voice is fundamentally different. Even more complicated are the challenges around not just authoring, but implementation.

“There’s no Wordpress for voice, no CSS for voice. There’s no way of making something pleasant quickly.” — Ian Moss, Design Lead, Myplanet

Unlike most content management systems today, where a content manager can go in and, without much difficulty, adjust the content to update or finesse it, voice still requires quite a bit of developer involvement. Moreover, even small changes require heavy manual effort to make them sound natural. Updating a conversational string isn’t so bad. Updating the intonation, the pauses, the way the dialog comes across as a whole? That’s still fairly labour-intensive on the part of a developer.
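To make the point about pauses and intonation concrete, here is a minimal sketch of what that developer work can look like. It assumes the voice platform accepts SSML (Speech Synthesis Markup Language, a W3C standard), and it uses a hypothetical helper, `to_ssml`, that is our illustration only and not part of any platform SDK. The pause length and speaking rate are arbitrary example values.

```python
from xml.sax.saxutils import escape


def to_ssml(sentences, pause_ms=400, rate="medium"):
    """Wrap plain sentences in SSML with a fixed pause between each one.

    Hypothetical helper for illustration: real voice apps usually tune
    pauses and prosody per phrase rather than applying one global setting.
    """
    pause = f'<break time="{pause_ms}ms"/>'
    # Escape characters such as & and < so the markup stays well-formed.
    body = pause.join(escape(sentence) for sentence in sentences)
    return f'<speak><prosody rate="{rate}">{body}</prosody></speak>'


ssml = to_ssml(["Your appointment is at 3 p.m.", "Would you like a reminder?"])
print(ssml)
```

Even in this toy form, every change to pacing means touching markup rather than copy, which is why a content manager typically can’t make the change alone.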

Managing content for voice experiences is not easy. And if we think updating conversational flows is hard, the complications that multi-modal options (like Amazon’s Echo Show or Google’s Home Hub) bring are even more challenging. And the market for those options is growing.

The good news is, as we shift from the experimenters to the early adopters in terms of market saturation and readiness for the technology, we’ll have greater resources to draw from for conversational management. Learnings will start to emerge, we’ll start to get more relevant and steady feedback from users, and that will give us a better, more concrete path to how we can continuously improve these interactions. Management can feel like a bit of a wild, wild west at the moment, but as the technology matures, so will its oversight.

“People are mostly in a voice only scenario today, but it’s not a bad idea to be designing for a market to come — especially in such a rapidly changing marketplace.” —Everett Zufelt

Analytics & Testing

The creation and management of conversational experiences are both important things to consider, but they won’t mean anything if the value of conversational experiences can’t be proven. Which is why, as the market share for voice interactions continues to grow, so too will the need for meaningful ways of measuring the ROI on those interactions.

The massive scale of Voice Summit is indicative of something that’s gone beyond a trend. Voice isn’t some flash in the pan, not with the amount of interest and investment it’s garnering. There’s a reason experts and analysts are willing to bet big on this kind of interaction; voice can be a powerful tool to connect with users and will, inevitably, pay off for folks who make the right plays. But to determine which elements of it are working and which aren’t, we need effective testing and analytics tools.

Already we’re seeing companies like Dashbot emerge as leaders in this area. Providing actionable feedback on voice experiences will not only ensure businesses get the value they seek from the investments they make, but also ensure their customers are able to use the applications to their full potential.

Understanding where in a conversational string the system breaks down is a valuable insight, as valuable as understanding where in a checkout flow on a website a user struggles to complete a task.
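As a rough sketch of what that kind of insight might look like in practice, the snippet below counts where sessions end without reaching the final step of a flow. The step names, the session data, and the `drop_off_by_step` function are all hypothetical and made up for illustration; this is not how Dashbot or any other analytics product works internally.

```python
from collections import Counter


def drop_off_by_step(sessions, final_step):
    """Count, per step, how many sessions ended there without finishing.

    Each session is the ordered list of conversational steps the user reached.
    """
    exits = Counter()
    for steps in sessions:
        # A session that never reached final_step "dropped off" at its last step.
        if steps and steps[-1] != final_step:
            exits[steps[-1]] += 1
    return exits


# Hypothetical sessions from an imaginary ordering flow.
sessions = [
    ["welcome", "choose_item", "confirm_order"],
    ["welcome", "choose_item"],
    ["welcome", "choose_item"],
    ["welcome"],
]

exits = drop_off_by_step(sessions, final_step="confirm_order")
print(exits.most_common())
```

In this made-up data, most abandoned sessions end at the item-selection step, which is the conversational equivalent of spotting the page in a checkout flow where users give up.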

In some ways, a conversation can feel more constricted. But in reality it’s a much wider, more open experience. We can’t control the parameters as tightly, which means feedback is even more important for understanding our users’ needs and pain points.

“With visual interfaces — websites, mobile apps, etc. — we construct most of the mental model for the user. But with voice, users have to construct the mental model themselves. It creates a new set of challenges, both for users and for the businesses behind the experiences.” — Ian Moss

Developing these experiences costs money and time. Testing them even more so. New platforms are being developed to ease some of that burden, but ensuring enjoyable experiences for end users still requires an entirely new approach to testing. None of this comes cheaply, so figuring out how to measure success will be paramount going forward, and partnering with the organizations who are doing it right (like Dashbot) will be essential.

Natural Language Tools

Creating functional experiences that can help users accomplish their goals (whether it’s to purchase new sneakers, learn more about how to manage diabetes, or find the address of their lunch meeting) is obviously the first and most important step to successful voice integration. But once that hurdle has been cleared, there is infinite room for creating enjoyable experiences.

A key indicator of maturity in voice experiences will be how naturally we’re able to communicate with the applications, how seamlessly we can incorporate them into our already established routines and patterns.

Direct, didactic commands are already well in hand with voice apps. A user can ask “What is the weather in Toronto today?” and reliably get an answer. But processing at the pace humans are able to, and with the same ambiguity humans can handle, is a whole other ballgame.

At Myplanet, when we think of where voice can have the greatest impact, our minds naturally turn to workplace experiences. It’s our specialty and our passion. So for us, having natural language processing (NLP) that could facilitate smoother meetings, ease the overhead on menial tasks, and empower employees to do their jobs better will be the real harbinger of success.

Imagine if you could blend an interaction with a voice assistant into the already existing conversational flow, as though the app really were an active additional member of the team. You could call on it to pull up information from past meetings, to transcribe current meetings, to update status reports, and to automatically email notes and next steps. Advances in NLP will make this possible, and it will redefine how we get work done in almost every setting.

“We carry a lot of technical baggage into any new tech implementations in how we expect things to work and something like voice is different enough that it probably isn’t true. We need to expand our idea of what’s possible.” — Ian Moss

As the technology continues to be refined, it will also make global communication easier. A greater ability to parse different accents, speech styles, and tones; better recognition of directionality and the ability to filter background noise; automatic translation tools — these will all help power international business like never before.

There are so many areas in which natural language technology will grow (processing, understanding, and even generation) that it can be almost impossible to comprehend just how drastically it will change the field. Things we can barely fathom today will be normal in the not-so-distant future as true voice assistants empower us to work and live with ease we’ve never known before.

Security

We’ve saved the (arguably) most contentious factor for last. Security is no small issue. Hardly a day goes by without news of the latest data breach. And the complications voice brings are real and important to address: How do we verify a voice? How do we keep private information away from those who shouldn’t have it? But we’re further along than we tend to think.

Voice authentication is already a huge and rapidly advancing field. No doubt, this is at least in part because it’s a primary concern for any major business planning to bring voice applications into their environment. Before voice can be deployed in corporate settings, it’s important that the layers of security and oversight available in our traditional software models (multi-tenant structuring, for example) are available in our voice experiences as well. It’s a key component of the usability of the technology. And where businesses have needs, investments get made and advances occur.

Of course, it’s not just business contexts where this is a concern. As the technology moves into the financial and healthcare industries in particular, but also into a variety of consumer contexts, users need to be certain that their personal information isn’t being accessed or broadcast to strangers. (And if there is a reason this might need to happen, users need full knowledge of it and the ability to withdraw permission.)

As we saw at Voice Summit, the market is picking up for addressing these concerns. Once again, as we move away from experimentation into proper early adoption, security issues become more pressing and small companies with the ability to respond nimbly to specific concerns are starting to answer the call.

“Concepts like our Hello: Unbank voice financial app — with the various authentication requirements and security concerns — are easier now than they would have been a few years ago, because the tools are able to help support in a way they weren’t before.” — Jason Cottrell, CEO, Myplanet.

In spite of the concerns, and the growing pains, and the unknowns still lingering in the space, the advantages of voice are undeniable. For individuals with mobility challenges or those with low or no vision, being able to securely access information via voice will be an enormous boon. (Our work with CNIB to create a voice-first application for content creators is proof of how challenging and necessary this work is.) Elder care options open up significantly too, with fears around isolation minimized when an active, connected, voice-responsive device is available.

Across the board — from security to content management to testing — we’re seeing the voice tech field go from first steps to next leaps. Specialty providers are finding the gaps and filling them rapidly with tools and supports that ensure the technology is robust enough to live up to the hype of the last few years.

“We’re excited to see these next steps. First, because there will be more companies ready and willing to take on the challenges. And secondly, because the experiences not only get more common, they get better. The tooling to support them gets better, the security more evolved and capable of handling greater complexity. Voice tech just becomes more available and able to handle bigger, more robust use cases. And that’s going to be a very good thing.” — Jason Cottrell, CEO, Myplanet

The oldest millennials, already nearing their 40s and middle age, are the last cohort to recall a pre-digital world. Kids entering school today are unlikely to remember an era before voice apps were the norm. Understanding how this field is evolving isn’t just a good idea, it’s necessary to the growth and stability of every organization.


Wondering how you can bring voice into your workplace setting? Unsure if you even should? Talk to our team today and see how Myplanet can help you prepare for the changes on the horizon.


Myplanet Musings

Thoughts, ideas, insights, and more from the Myplanet team.

Written by Myplanet

We're a software studio. We make smarter interfaces for the workplace.