Where to start with creating voice apps

Charlotte Qazi
Published in Voice Tech Podcast
7 min read · Jul 23, 2019

I recently decided to see what I could do with Amazon’s Alexa Skills development platform. Here’s what I found — and how to get started yourself.

Photo by Rahul Chakraborty on Unsplash

You don’t have to look far to see how prevalent voice assistants are becoming. Voice assistants on smartphones, desktop computers, and home appliances have reached mainstream adoption, and other applications are set to follow. You’ll find voice recognition in your car and even in the lab, and the list of applications is growing rapidly. By 2021, voice assistants are projected to outnumber the world’s population, at around eight billion devices. By 2022, voice shopping is projected to account for $40 billion in combined U.S. and U.K. consumer spending, up from $5 billion today. This creates a big opportunity for developers: someone needs to build the applications behind this new technology.

I recently found myself with some spare time between projects, which gave me a great opportunity to learn something new. Given the increasing ubiquity of voice, I chose to learn Alexa Skills development: the process of creating new features for Amazon’s voice assistant.

Beyond the obvious industry trend towards voice, I wanted to learn Alexa development because I’m personally very excited by its potential to make life easier for people, rather than complicating it further as so much other technology inadvertently seems to.

Photo by Status Quack on Unsplash

I love being able to walk into a room, say “Alexa, turn on the lights” and voila! No more stumbling around in the dark to find the light switch, dropping handbag and groceries in the process.

I own an Alexa-enabled Echo Dot, meaning I could potentially create functions (known in Alexa-land as “Skills”) for my own use. (I don’t own a device with Google Assistant although, for those interested, it’s possible to develop voice commands for that too.) The potential for assisting people while their hands are busy is so powerful.

I’d heard rumors that voice assistant development was relatively straightforward from a Codebar talk I attended, so I thought I’d give it a try.

I’m excited to report that the rumors were true: Alexa Skills development is easy! To put my own technical ability into perspective, 18 months ago I worked in sales and marketing and I’d never written so much as a line of code. I was pleasantly surprised when I was able to build my first Alexa Skill in one day and have it distributed on the Skills Store in less than a week — just in time for Christmas!

My first Alexa skill!

Getting Started

If you’re like me you’ll want to do some research before jumping in. But after navigating my way through several different Alexa tutorials, I found that they weren’t very intuitive to use — and some were even out of date. So, in an attempt to provide others like myself something a little more readily useful, I wrote my own.

If you do choose to write code, it can be as simple or as sophisticated as you wish. In most cases, you can work with prewritten tutorial code and alter it for your own purposes. The user interaction of an Alexa Skill needs no code at all: you set it up by filling in a form on the Alexa Developer Console. Here’s what that looks like:

A breakdown of an Alexa user interaction

You simply give your Skill an “invocation name” in the Alexa Developer Console and provide a few sample utterances. All that is required is typing these into the correct boxes in the Console, with no code necessary! Here is a screenshot of the Alexa Developer Console at the time of writing, with pink boxes indicating the fields required for invocation name and sample utterances:

The Alexa Developer Console
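
For readers who are curious what the form produces behind the scenes, here is a minimal Python sketch of the kind of JSON interaction model the Console stores for a Skill. The invocation name "festive facts", the intent name "GetFactIntent", the sample utterances and the helper function are all made up for illustration, and the structure shown is a simplified assumption rather than the full schema.

```python
import json


def build_interaction_model(invocation_name, intents):
    """Build a dict shaped like a simple Alexa interaction model.

    `intents` maps an intent name to its list of sample utterances.
    This helper is hypothetical; the Console builds this for you.
    """
    return {
        "interactionModel": {
            "languageModel": {
                "invocationName": invocation_name,
                "intents": [
                    {"name": name, "slots": [], "samples": samples}
                    for name, samples in intents.items()
                ],
            }
        }
    }


# Hypothetical Skill: the names and utterances below are examples only.
model = build_interaction_model(
    "festive facts",
    {
        "GetFactIntent": [
            "tell me a fact",
            "give me a festive fact",
        ],
        # Built-in intents need no sample utterances of their own.
        "AMAZON.HelpIntent": [],
        "AMAZON.StopIntent": [],
    },
)

language_model = model["interactionModel"]["languageModel"]
intent_names = [intent["name"] for intent in language_model["intents"]]

print(json.dumps(model, indent=2))
```

With a model like this, a user would say “Alexa, open festive facts” and then one of the sample utterances to trigger the intent.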

Alexa Skills development is an excellent way to introduce yourself to the world of software development and build confidence. You can create a Skill without writing any code at all, so even rookie developers can get the hang of it. Because the coding is so straightforward, it also acts as an excellent introduction to other more advanced elements of software development: Lambda functions, test-driven development and distributing skills, for example. The Alexa Skills Kit and Alexa’s GitHub are great resources that offer code you can simply paste in. You can also use some code I wrote here.


Adding Your Skill to the Store

In my month learning Alexa development between projects, I created five Skills, most of which you can find in the Alexa Skills Store. To get your finished Skill in the store, all you have to do is fill out a form and submit it. Within a few days, you will hear back from Amazon as to whether your Skill has been accepted. If there are any issues blocking your Skill from being accepted, Amazon will offer advice and assistance on how to fix them.

Personally, I didn’t have any problems having my Skills accepted into the store, although I probably should have guessed that “Netflix and Quiz” wouldn’t be accepted by Amazon as my Skill’s invocation name (!). The Alexa testing and distribution process is very clear in helping you fix any bugs.

Once you’ve built your Skill, the Alexa Developer Console makes it very easy to test what you’ve created, either online through the console or on your own Alexa device. Hearing Alexa speak the words I’d programmed gave me a huge buzz, and this definitely spurred me on to build and learn more.

One of the biggest frustrations with working on Alexa Skills is the same one that users often cite: voice recognition isn’t always perfect. So when you’re choosing voice commands, it’s a good idea to make sure they’re easily recognized. “Serial Quizzer,” for example, wasn’t the best choice in hindsight. Currently, the biggest challenge for voice assistants is filtering out background noise, which leads to misunderstandings. Echo devices have up to seven microphones, and the signal is processed multiple times to filter out as much background noise as possible. What’s left is then broken down into the individual sounds that make up the phrase, which is passed on to the Skill.

How Alexa Works

The fact that Alexa can translate your words to JSON, run them through a function and send back the spoken word is pretty remarkable. It’s a huge task to teach a computer to “speak” and sound like a human. Imagine the trillions of different inferences we make in one sentence. Then, multiply that by the hundreds of different ways we could say the same thing, with another million different pronunciations, accents, voice pitches and tones. Now, imagine teaching a computer every single version!
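
To make the “words to JSON, through a function, back to speech” step more concrete, here is a rough Python sketch of a Lambda-style handler. It assumes a simplified version of the request JSON Alexa sends and the response JSON it expects back. The intent name "GetFactIntent", the helper `build_response` and the spoken text are hypothetical, and a real Skill would more likely use the Alexa Skills Kit SDK than hand-written dictionaries.

```python
def build_response(speech_text, end_session=True):
    """Wrap plain text in the JSON shape Alexa reads back to the user."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Entry point: receives the request JSON and returns response JSON."""
    request = event["request"]

    # The user opened the Skill without asking for anything specific.
    if request["type"] == "LaunchRequest":
        return build_response("Welcome. Ask me for a fact.", end_session=False)

    # The user's words matched one of the Skill's intents.
    if request["type"] == "IntentRequest":
        if request["intent"]["name"] == "GetFactIntent":  # hypothetical intent
            return build_response("Voice apps are fun to build.")

    return build_response("Sorry, I did not understand that.")


# A trimmed-down, made-up example of the JSON a spoken request becomes.
sample_event = {
    "version": "1.0",
    "request": {
        "type": "IntentRequest",
        "intent": {"name": "GetFactIntent"},
    },
}

reply = lambda_handler(sample_event, None)
print(reply["response"]["outputSpeech"]["text"])
```

The handler never touches audio: by the time it runs, the speech has already been turned into an intent name, and the text it returns is converted back to speech on the way out.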

Voice assistants use “Natural Language Processing,” which is a combination of artificial intelligence and computational linguistics. NLP trains the software to recognize how humans speak so it can respond more accurately over time. The more users speak with their voice assistants, the more data can be used to train them and the more accurate recognition becomes.

When the Alexa device hears the wake word (in most cases “Alexa”, although if your wife or housemate is called Alexa, you can change this), it records and cleans up your audio and then sends it to the cloud. There, the audio is deciphered by comparing each individual sound, along with your pitch, pronunciation, and frequency, with those in a giant database. These recordings are stored and used to improve and grow the language database, so as more users submit sample recordings, voice technology improves drastically.

The Future of Voice Assistants

I’ve found developing for Alexa to be fun and rewarding, and I’m excited to see what the future holds for voice assistants. The voice element adds an extra dimension to writing code. Current hardware has its limitations, such as microphone placement, but you can minimize their impact by thinking carefully about what you ask Alexa to say and respond to.

The potential for voice assistants is so much greater than “Alexa, turn on the lights.” There are so many situations where being able to get information without the use of your hands is genuinely useful, and for anyone who has spotted them, Alexa moves from a handy device that might supplement your smartphone to an integral part of everyday life in its own right. Adoption of voice assistants will only continue to grow in the future, and there are so many opportunities for developers, from smart home technology to mobility to retail and beyond.

With this in mind, I’d recommend at least dipping your toes in the water of voice assistant development. Getting on board early will not only help you to understand what’s possible but also influence the direction of development and play your part in defining the future of voice.


Charlotte Qazi: #WomanInTech, Senior Engineer at BCG Digital Ventures, General Assembly London Alumna