VOICE has arrived

Marina Piller
5 min read · Jul 30, 2018


#VoiceFirst Summit 2018 key takeaways and presentation highlights

The inaugural 2018 Voice Summit (https://www.voicesummit.ai/), sponsored by Amazon Alexa, took place in Newark, NJ from July 24 to 26. It was the largest voice technology event to date, with 2500+ attendees, 200+ speakers, 200 companies, and hundreds of presentations and workshops. I was honored to be sponsored by and represent the Women in Machine Learning and Data Science (http://wimlds.org) community.

Voice Summit 2018 Opening Remarks

Some key takeaways from the conference are:

  • We are in a post-mobile, #VoiceFirst world now
  • The new voice interface creates a brand new ecosystem, and with it, ample new opportunities to redefine customer experience and build new types of tools, apps, and businesses
  • Voice is a large and growing market, with 43 million smart speakers in the US alone as of spring 2018, and 3 million in Canada
  • Customers’ expectations have shifted in the AI/NLU-powered world; it makes sense to examine how your particular customers’ expectations (might) evolve and create a #VoiceFirst strategy now
  • There is a wealth of research opportunities as new voice data is becoming available
  • Privacy and confidentiality concerns are only beginning to be defined
  • New research finds that voice applications share security vulnerabilities similar to those of other modalities
  • Voice apps are easy to build. Great voice apps are harder to build
  • There are over 45K Amazon Alexa skills as of this conference, primarily in the consumer space, with still very few business applications
  • There are a lot of monetization opportunities in voice
  • Voice will become the de facto interface for user interaction

We are officially in the post-mobile-first world. With the rapidly growing adoption of voice assistants such as Amazon Alexa, Google Home, Microsoft Cortana, and Apple Siri, we have entered the #VoiceFirst era. This new voice user interface (VUI) creates an almost entirely uncharted ecosystem, presenting a wide-open field of opportunities, much in the same way the iPhone did years ago.

Regardless of your industry or career, it is worth beginning to learn about the voice revolution.

There were over 175 events at the conference. Here are some of the highlights from the presentations I was able to attend.

The opening keynote speaker, David Isbitski, Chief Alexa Evangelist from Amazon, highlighted the shift in customer expectations to ‘voice-first’ in a world of products powered by AI and Natural Language Processing (NLP). Amazon believes voice presents the next major disruption in computing. The worldwide best-sellers on Amazon Prime Day 2018 were the Fire TV Stick with Alexa Voice Remote and the Echo Dot. As of now, there are tens of millions of Alexa-enabled devices.

For beginners wanting to build voice applications, the barrier to entry is relatively low. There were several workshops focused on introducing voice apps.

Jeff Blankenburg, Alexa Evangelist, presented a hands-on ‘Build Your First Product’ workshop: https://alexa.design/isp-lab-1

Image courtesy of VoiceSummit.ai

Jeff also gave a talk on the 10 most successful patterns for a good voice application.

They are:

  1. Do One Thing Really Well
  2. Make Your Name Memorable
  3. Focus on Intents, not Commands
  4. Simplify Choices
  5. Use the One-Breath Test
  6. Include a Variety of Responses
  7. Handle the Unexpected Gracefully
  8. Make Enhancements based on Data
  9. Provide Contextual Help
  10. Beta Test with Real Users

You can find details and more Alexa developer resources, such as Alexa monetization information, on Jeff’s GitHub: https://github.com/jeffblankenburg
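To make two of these patterns a little more concrete, here is a minimal Python sketch of "Include a Variety of Responses" and "Handle the Unexpected Gracefully". It is not taken from Jeff's talk and does not use the Alexa SDK. The intent name, the phrasings, and the `respond` helper are all hypothetical and only illustrate the idea.

```python
import random

# Hypothetical intent name and phrasings, invented for illustration only.
RESPONSES = {
    "GetWeatherIntent": [
        "Here's today's forecast.",
        "Sure, let me check the weather.",
        "Today's weather, coming up.",
    ],
}

# Fallback prompt that tells the user what the app can actually do.
FALLBACK = "Sorry, I didn't catch that. You can ask me about the weather."


def respond(intent_name, rng=random):
    """Return a spoken response for the given intent name.

    Known intents get a randomly chosen phrasing so the app does not
    sound identical on every visit. Unknown intents get a helpful
    fallback instead of an error.
    """
    options = RESPONSES.get(intent_name)
    if not options:
        return FALLBACK
    return rng.choice(options)


if __name__ == "__main__":
    print(respond("GetWeatherIntent"))
    print(respond("SomethingUnexpected"))
```

In a real skill the platform's own request handlers would play the role of `respond`, but the same two ideas apply: rotate phrasings, and always have a useful answer for input you did not plan for.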

Angel Wong, a Technical Product Manager at Dexter, presented a voice skill building workshop using Dexter’s free tools. I have not had a chance to try this platform yet, but if you’d like to check it out, more details are at http://docs.rundexter.com/walkthroughs/overview/#toc

Juston Jeffress and Jedidiah Esposito from Amazon presented a fantastic 90-minute Alexa Design Workshop: Design for Conversation. The course is available for the first time! You can find it at alexa.design/cdw

You can find other tutorials and more great information at the following links:

alexa.design/dialog-management

alexa.design/guide

If you’d like to follow Amazon developers, they’re on Twitch as well as Twitter:

twitch.tv/amazonalexa

Twitter @AlexaDevs

Arizona State University (ASU) showcased their re-imagining of higher education through the use of Amazon’s Alexa Voice Services. U.S. News & World Report has named ASU the most innovative school for the last 3 years in a row.

Diana Mingles, of Capital One, discussed the state of the art in Natural Language Generation (NLG) and the opportunities for developing it further. While Natural Language Understanding (NLU) has advanced over the last several years, NLG has lagged behind. NLG, nevertheless, is an essential component of successful conversational agents. Diana feels the future of NLG in dialogue depends on:

  1. Integration of expertise and knowledge from Software Engineering, Discourse and Linguistics
  2. Definition of metrics and characteristics for a better, dynamic and contextual NLG by leveraging discourse and psychology research
  3. Availability of more fine-grained and detailed NLU
  4. Creation of out-of-the-box NLG solutions that can be integrated into chatbot development frameworks
  5. Formulation of design, development and testing strategies for a new generation of contextual and dynamic NLG components

Image courtesy of VoiceSummit.ai

Dave Witting, a Partner at RocketInsights.com, talked about the top mistakes his creative consulting agency made in the voice market. They are:

  1. Designing a big voice experience
  2. Trusting Alexa vs. sticking to your script
  3. You’re building on top of a moving target
  4. You’re building on top of a shifting ecosystem
  5. Content preparation takes forever
  6. You have to give users a reason to come back
  7. Thinking users will just discover your creation

Nicholas Carlini gave one of the most fascinating and chilling talks, on the security of voice systems built on machine learning. He demonstrated that the same kind of adversarial-example attacks made on image classifiers can be made on voice applications. Here is a demo: https://www.youtube.com/watch?v=HvZAZFztlO0

For more on this topic, check out:

http://www.hiddenvoicecommands.com/home

https://nicholas.carlini.com

Finally, Voice Summit was a wonderful reminder of how early we are in the evolution of this technology. As LEGO’s James Poulter said, ‘Think about iteration, not innovation.’ The best approach to designing and building voice applications right now is experimental and agile. So go ahead, give it an iteration. You can build your very own first skill in just a few minutes.

James Poulter’s presentation during Voice Summit 2018
