3(ish) Things I Learned Making My First Alexa Skill

My first attempt at a skill, a simple Arnie quote machine

Recently I attended an Alexa Dev Day run by Amazon. It was a really well organised event, and props to Amazon for providing free sessions for developers up and down the country. The day went from what Alexa is and why she exists, through how to develop a skill using the SDK, and ended with an open Q&A/hacking session.

By the end of the day I had created my very own Arnie quote machine which would give the user back classics ranging from “Get your ass to Mars!” to “Who is your daddy and what does he do?”

The following is a rundown of what I found developing an Alexa skill for the first time.

What killed the dinosaurs?… the Ice Age! — Arnold Schwarzenegger

3 Things

Slots

Slots are what Alexa uses to pass variables through to the skill functions. I found them straightforward to use apart from one thing: using a slot as a search term is counterintuitive. From the outset it’s put forward that when you create a slot you also give it a list of words that can be accepted. For example, if you’re creating a weather app you might have a “weatherType” slot and give Alexa a dictionary of “sunny”, “cloudy”, “rainy”, etc. But what do you do if you want to get the value of the slot regardless of whether it’s in the dictionary? Try this on for size: the user says “Alexa, ask Weather App if there’s going to be a rainbow today”, but “rainbow” isn’t in your dictionary.

Turns out the answer is very simple. Alexa doesn’t have to match a word in the dictionary to get to your function; she’ll just pass the spoken value through regardless, allowing you to go ahead and search for rainbows. So this whole point is moot. Except, no. If I came across this and banged my knee rolling my chair under the desk, there’s a possibility others have done the same. The fact that you have to give a slot values when you create it makes this a bit confusing. Making those values optional would solve it, as would making the behaviour super clear in the instructions surrounding slots.
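To make the rainbow case concrete, here’s a minimal sketch in Python (used purely for illustration). It assumes the request arrives shaped like the `request` object of an Alexa IntentRequest, with the spoken value under `intent.slots.<name>.value`. The intent name, the helper functions and the list of known values are all made up for this example.

```python
# Hypothetical list of values the "weatherType" slot was configured with.
KNOWN_WEATHER_TYPES = {"sunny", "cloudy", "rainy"}


def get_slot_value(request, slot_name):
    """Return the raw spoken value of a slot, or None if it was not filled."""
    slots = request.get("intent", {}).get("slots", {})
    return slots.get(slot_name, {}).get("value")


def describe_search(request):
    """Use the slot value as a search term whether or not it is a known value."""
    term = get_slot_value(request, "weatherType")
    if term is None:
        return "Which kind of weather are you asking about?"
    if term.lower() in KNOWN_WEATHER_TYPES:
        return f"Looking up the {term} forecast."
    # Not in the dictionary, but the value was passed through anyway.
    return f"I don't have {term} in my list, but I'll search for it anyway."


# Trimmed-down request, shaped like the "request" object of an IntentRequest.
sample_request = {
    "type": "IntentRequest",
    "intent": {
        "name": "WeatherIntent",  # hypothetical intent name
        "slots": {"weatherType": {"name": "weatherType", "value": "rainbow"}},
    },
}

print(describe_search(sample_request))
```

The point is simply that the dictionary is something your own code can check against if it wants to, not a gate the value has to pass before it reaches you.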

Another thing that caused me to groan about slots is the preference for just one word. If you try to tell her multiple things within a slot she won’t like it very much: given a phrase such as “rain clouds” she would only pick up on “rain” and pass that on to you. This could cause issues if you’re building a What Movies Do I Own skill and you need to differentiate between “Die Hard 2” and “Die Hard 3”. No one wants 3 without completing the Christmas duo first. A workaround would be to find all the titles that match the string and have Alexa read those back or, if you want to do something with the movie afterwards, have her ask the user which one they meant.
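The workaround could be sketched roughly like this in Python. The movie list and both function names are hypothetical, and the matching is a plain case-insensitive substring check rather than anything Alexa provides.

```python
# Hypothetical movie collection for a "What Movies Do I Own" skill.
MOVIES = ["Die Hard", "Die Hard 2", "Die Hard 3", "Total Recall", "Predator"]


def find_matches(heard, titles):
    """Return every title containing the heard text, ignoring case."""
    needle = heard.strip().lower()
    if not needle:
        return []
    return [title for title in titles if needle in title.lower()]


def build_reply(heard, titles):
    """Answer directly on a single match, otherwise ask the user to choose."""
    matches = find_matches(heard, titles)
    if not matches:
        return f"I couldn't find {heard} in your collection."
    if len(matches) == 1:
        return f"Yes, you own {matches[0]}."
    options = ", ".join(matches[:-1]) + " or " + matches[-1]
    return f"I found a few matches. Did you mean {options}?"


print(build_reply("die hard", MOVIES))
```

If the slot only delivers “die hard”, the reply lists all three films and lets the user pick, instead of guessing at the wrong sequel.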

Hey, Christmas tree — Arnold Schwarzenegger

Session

The session in Alexa is super useful for persisting data across multiple asks within a conversation or, through integration with a database, across multiple invocations. However, she can only remember a certain amount of data, which can become irritating if you’re trying to save on expensive API calls by storing what you get back. Her responses cannot exceed 24 kilobytes (that’s the whole response, mind), so the data you can store in the session will be slightly less than that. Storing larger datasets in the database could be a workaround, but if you’re building a skill that doesn’t persist anything between invocations this might not be ideal.
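One way to avoid being surprised by the limit is to measure the response before sending it. The following is a rough Python sketch, not part of any SDK: the helper names are made up, the 24 KB figure is the one quoted above (worth checking against the current documentation), and the measured size is only an estimate of what actually goes over the wire.

```python
import json

# Limit quoted in the text above; an assumption to verify against current docs.
MAX_RESPONSE_BYTES = 24 * 1024


def response_size(response):
    """Approximate size in bytes of the response as compact UTF-8 JSON."""
    body = json.dumps(response, separators=(",", ":"), ensure_ascii=False)
    return len(body.encode("utf-8"))


def fits_in_response(response, limit=MAX_RESPONSE_BYTES):
    """True if the whole response, session attributes included, is within the limit."""
    return response_size(response) <= limit


def build_response(speech, session_attributes):
    """Build a response envelope that carries the session attributes."""
    return {
        "version": "1.0",
        "sessionAttributes": session_attributes,
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": False,
        },
    }


small = build_response("Here you go.", {"cachedQuotes": ["Get your ass to Mars!"]})
large = build_response("Here you go.", {"cachedQuotes": ["x" * 100] * 300})

print(fits_in_response(small))
print(fits_in_response(large))
```

Because the session attributes travel inside the response, a cached API result that looks small on its own can still push the whole thing over the limit. A check like this gives you a chance to trim the cache or fall back to a database first.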

Publishing

Using trademarked assets in your skill, be that text or images, will cause issues when going through the publication process. So if you wanted to create a skill based on trademarked content, it’s a good idea to make it a bit more general. For example, “Google Facts” could instead become “Tech Business Facts” and include interesting things about other companies as well. You could of course avoid trademarked material altogether by creating skills using information in the public domain or drawing from the well of your mind.

Speaking of publishing… the documentation around this can be a bit confusing, mostly when it comes to inputting sample utterances. Some examples are provided, but we found that utterances which followed them were still sometimes queried by Amazon during review. Happily, Amazon will just tell you what to set these to in their feedback and, as long as they don’t contradict themselves (which happened with us once), you can set ’em and forget ’em.

Iced that guy — Arnold Schwarzenegger

(ish)

Invocations

Amazon doesn’t allow one-word invocations unless you have a super good case for one. It’s a good thing to keep in mind when designing skills.

Web

Alexa is a little disconnected at the moment. Different areas of the kit look drastically different, most notably the interaction builder which, while super slick, presents itself like another product. We were told that Amazon are halfway through updating the whole thing so it should get better soonish, but for now switching between these views and Lambda (if you’re using that) can be quite jarring and slows down the whole process until you get used to it.

As long as development remains in the browser the process isn’t going to be great. Happily there are command line tools, so you don’t have to be stuck flipping between tabs and logging in again every time your session expires. Working this way goes hand in hand with thinking about separation of concerns, and other common coding principles, and how they apply to skill development.

See you at the party, Richter! — Arnold Schwarzenegger

Once you’ve built one skill the next few will be a breeze, and this is one of Amazon’s strong points: it’s just SO easy to make skills. Seriously. The ecosystem they have in place means that running your NodeJS on Lambda for the Alexa SDK to talk to is seamless, and adding extras like a DynamoDB database to store user data is super easy. It’s the range, strength and easy integration of services within Amazon Web Services that pushes Alexa to pole position in the ambient computing race.