Adding Custom Alexa Skills With OpenWhisk and Cloudant

Set up your Alexa skill to cost you less and scale well.

Alexa, the humanoid persona of the Amazon Echo and Echo Dot conversation appliances, can be programmatically extended by adding new skills to your device. There is an app store for such skills, but even if you only want to add functionality to your own Dot, you can use the same APIs without publishing your code to the app store.

Echo Dot

Create a sample Alexa skill

A skill has a number of moving parts:

  • An invocation phrase that is used to identify your skill in a spoken sentence. For example, you say “Alexa ask Tarot bot if it’s my lucky day.” Alexa wakes up the device and parses the rest of the sentence to see if it matches any built-in or custom skills. In this case Tarot bot might be the invocation phrase of your code.
  • A number of intents can be configured to let a skill deal with multiple scenarios. For instance, a tarot bot might detect the intent to have a single card drawn, or a full daily reading. Intents can also be configured to pull out one or more slots from the sentence, such as a date or a person.
  • Some code is executed when a phrase is detected. Alexa calls your code with an HTTP POST and expects a JSON response, which Alexa speaks back to the user.
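To make the last two points concrete, here is a rough sketch of how code might read the intent and slots from the JSON that Alexa posts and build the JSON reply. The intent names (DrawCardIntent, DailyReadingIntent), the Date slot, the helper names and the spoken strings are all made up for illustration; only the general request and response envelope follows the Alexa custom skill JSON format.

```javascript
// Hypothetical helper: wraps a sentence in the JSON envelope Alexa expects back.
function buildAlexaResponse(text) {
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: text },
      shouldEndSession: true
    }
  };
}

// Hypothetical dispatcher: picks a reply based on the intent found in the request body.
// The intent and slot names below are assumptions, not part of any real skill.
function handleAlexaRequest(body) {
  const request = (body && body.request) || {};
  if (request.type !== 'IntentRequest') {
    // e.g. a LaunchRequest, where the user only said the invocation phrase
    return buildAlexaResponse('Welcome to Tarot bot.');
  }
  const intent = request.intent || {};
  const slots = intent.slots || {};
  switch (intent.name) {
    case 'DrawCardIntent':
      return buildAlexaResponse('I have drawn a single card for you.');
    case 'DailyReadingIntent': {
      // A slot carries a value pulled out of the spoken sentence, such as a date
      const day = (slots.Date && slots.Date.value) || 'today';
      return buildAlexaResponse('Here is your reading for ' + day + '.');
    }
    default:
      return buildAlexaResponse('Sorry, I did not understand that.');
  }
}
```

Calling handleAlexaRequest with a body whose request.intent.name is DailyReadingIntent and whose Date slot is 2017-03-01 would produce a reply that speaks the reading sentence for that date.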

Configure OpenWhisk

Amazon recommends its Lambda serverless platform for coding custom skills, but IBM’s OpenWhisk is much easier to build with. (If you don’t already have OpenWhisk, read how to get started.)

Create a file myskill.js that contains a main function:
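A minimal sketch of such a file might look like this. The spoken text is a placeholder, and the field name used for the HTTP status (statusCode here) and the base64 encoding of the body are assumptions based on the description in this article; check the web action documentation for your OpenWhisk version.

```javascript
// myskill.js: a minimal OpenWhisk web action that always replies with the same sentence.
function main(params) {
  // The JSON document Alexa expects in reply
  const response = {
    version: '1.0',
    response: {
      shouldEndSession: true,
      outputSpeech: {
        type: 'PlainText',
        text: 'Hello from OpenWhisk' // placeholder text
      }
    }
  };

  // The object returned to OpenWhisk describes the HTTP reply.
  // Assumption: the body is sent as a base64-encoded string.
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/json' },
    body: Buffer.from(JSON.stringify(response)).toString('base64')
  };
}
```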

The object your action returns contains the HTTP status code you want to reply with (200), any HTTP headers, and a ‘body’, which is the data returned by your action. In this case, it is your JSON response turned into a base64-encoded string.

Deploy it to OpenWhisk:

wsk package create alexa
wsk action update alexa/myskill myskill.js -a web-export true

You can then access this OpenWhisk action as a web-facing API call:

curl 'https://openwhisk.ng.bluemix.net/api/v1/web/MYNAMESPACE/alexa/myskill'

Here MYNAMESPACE is your OpenWhisk namespace. (To find yours, run wsk namespace list. It should look something like glynn.bird@uk.ibm.com_dev.) You’ll enter this URL in a minute when you configure your Alexa skill.

Add Alexa skill

Visit your Amazon Developer dashboard and click the Add a new skill button.

Under Skill Type, choose Custom Interaction Model.

Name the skill and enter an Invocation Name, which users will say to call your skill. (You’ll have a chance to expand on this in a minute.) Then click Next.

In Intent Schema, enter at least something like:

{
  "intents": [
    {
      "intent": "myskill"
    }
  ]
}

In Sample Utterances, enter phrases that trigger your skill and click Next. For example:

myskill my skill
myskill test my skill
myskill skill test

Under Endpoint, select HTTPS, enter your OpenWhisk URL, and click Next.

Under Certificate for NA Endpoint, select “My development endpoint is a sub-domain of a domain that has a wildcard certificate from a certificate authority” and click Next.

Test your skill. In Enter Utterance, type one of the sample utterances you entered earlier, then click Ask test.

The response appears on the right. Click Listen to hear Alexa say the output speech.

Serverless frameworks like OpenWhisk are great for this application because you don’t have to pay to stand up a permanent HTTP server to deal with the trickle of traffic that your custom skill will generate. If, however, you decide to publish your skill and it becomes popular, OpenWhisk can scale automatically to deal with the demand.

Returning dynamic responses

I have a temperature sensor in my house that periodically stores its readings in Cloudant. When I say “Alexa, my house temperature”, I want to look up the latest temperature reading and report it back to Alexa as a readable string. Here’s how I do it:

The code is pretty simple. It uses a Cloudant MapReduce view that has the temperature readings sorted by date and time. It fetches the newest reading and then forms a JSON object containing the response.
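A sketch of that lookup, under several assumptions, could look like the following. It queries the view over Cloudant's HTTP API rather than through a client library. The parameter names (host, username, password, dbname), the design document name (readings) and the view name (bydate) are hypothetical, and the view is assumed to emit the reading's timestamp as the key and the temperature in Celsius as the value. It also assumes a runtime with a global fetch (Node 18 or later); the second argument exists only so a stand-in can be supplied when testing.

```javascript
// Fetch the newest row from the (hypothetical) readings/bydate view.
// descending=true with limit=1 returns the row with the highest key, i.e. the latest timestamp.
// reduce=false asks for plain map rows in case the view also defines a reduce function.
async function latestTemperature(params, fetchImpl) {
  const url = 'https://' + params.host + '/' + encodeURIComponent(params.dbname) +
    '/_design/readings/_view/bydate?descending=true&limit=1&reduce=false';
  const auth = Buffer.from(params.username + ':' + params.password).toString('base64');
  const res = await fetchImpl(url, { headers: { Authorization: 'Basic ' + auth } });
  if (!res.ok) {
    throw new Error('Cloudant returned HTTP ' + res.status);
  }
  const data = await res.json();
  if (!data.rows || data.rows.length === 0) {
    return null; // no readings stored yet
  }
  return data.rows[0].value; // assumed: temperature in Celsius
}

// OpenWhisk entry point. params is assumed to carry the Cloudant settings
// bound to the package; fetchImpl defaults to the global fetch.
async function main(params, fetchImpl = fetch) {
  let text;
  try {
    const temperature = await latestTemperature(params, fetchImpl);
    text = temperature === null
      ? 'I have no temperature readings yet.'
      : 'The temperature is ' + temperature + ' degrees Celsius';
  } catch (err) {
    text = 'Sorry, I could not read the temperature.';
  }
  const response = {
    version: '1.0',
    response: {
      shouldEndSession: true,
      outputSpeech: { type: 'PlainText', text: text }
    }
  };
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/json' },
    body: Buffer.from(JSON.stringify(response)).toString('base64')
  };
}
```

Catching the error and speaking an apology keeps Alexa from reporting a generic skill failure when Cloudant is unreachable, which seemed the friendlier choice for a voice interface.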

I deploy it using a shell script, with my Cloudant credentials and database name stored in environment variables:

The ‘package’ encapsulates the credentials for the Cloudant service so that you don’t have to hard-code them or pass them in as run-time parameters. The second line adds our action to the package and exports it as a web-facing API call.

The result is: when I say “Alexa, my house temperature”, it replies:

“The temperature is 21.5 degrees Celsius”

Further work

I could extend the code to detect other intents:

  • Tell me the coldest overnight temperature
  • Give me the average temperature in the last week
  • Convert the units if I say “Alexa, my house temperature in Fahrenheit.”
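As a rough idea of how those extensions might fit together, the sketch below picks an answer based on the intent name and an optional unit slot. The intent names (ColdestIntent, AverageIntent), the Unit slot and the shape of the readings array are all hypothetical, and the caller is assumed to have already fetched the readings for the relevant period (overnight, last week, or just the latest one) in time order.

```javascript
// Standard Celsius to Fahrenheit conversion
function celsiusToFahrenheit(celsius) {
  return celsius * 9 / 5 + 32;
}

// Builds the spoken sentence, converting units when the user asked for Fahrenheit
function describe(label, celsius, unit) {
  const wantsFahrenheit = String(unit || '').toLowerCase() === 'fahrenheit';
  const value = wantsFahrenheit ? celsiusToFahrenheit(celsius) : celsius;
  return label + ' is ' + value.toFixed(1) + ' degrees ' +
    (wantsFahrenheit ? 'Fahrenheit' : 'Celsius');
}

// readings: array of { celsius: number }, oldest first (assumed shape)
function answerFor(intentName, slots, readings) {
  if (readings.length === 0) {
    return 'I have no temperature readings for that period.';
  }
  const unit = slots && slots.Unit && slots.Unit.value;
  const values = readings.map((r) => r.celsius);
  switch (intentName) {
    case 'ColdestIntent':
      return describe('The coldest temperature', Math.min(...values), unit);
    case 'AverageIntent': {
      const sum = values.reduce((total, v) => total + v, 0);
      return describe('The average temperature', sum / values.length, unit);
    }
    default:
      // Fall back to the most recent reading
      return describe('The temperature', values[values.length - 1], unit);
  }
}
```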

Build your own Alexa skill with OpenWhisk and let me know how it goes.