Making Alexa send an SMS message
This week we built a simple Alexa Skill that lets Alexa send text messages.
Once we have test-driven it a bit more, we intend to submit it to Amazon for approval. Until then, here are a few things we have learned so far.
The invocation sentence dilemma
While discussing this Skill concept with our Interaction Designers, a usability issue we kept hitting was the sentence structure that needs to be used. At first sight, you would expect a natural language interface to feel intuitive, but some barriers force users to learn how to interact with Alexa.
- First, one needs to remember the “Alexa” name. This is trivial for most, but it is already an impediment for some user personas, for example elderly people or people for whom English is not their first language.
- Then one needs to remember the invocation name of the Skill. As users of Alexa ourselves, we often forget the names of the Skills we installed, and that is assuming we remember which ones we installed in the first place. The lack of a visual home screen makes opening a Skill harder than launching a mobile app.
- The ask/tell sentence structure further complicates things. While one can say “Siri, send a message to Sarah, please pick up milk on the way home.”, with the Echo you need to say something like: “Alexa, ask text messages, to tell Sarah, please pick up milk on the way home.”
To keep the sentence structure simple to start with, we decided to support only one destination phone number. With a Skill name of My Friend, this allows a sentence like: “Alexa, tell My Friend to please pick up milk on the way home”. We’re giving this a try on the assumption that, even with a single contact, this type of interaction can be useful in some scenarios.
Understanding free form text
When we first looked at the Echo and the Alexa Skill Kit, we expected it to do speech-to-text recognition first, then text-to-intent mapping. It turns out Alexa does it all in one step.
While this makes recognition more reliable for deterministic cases, where you expect the user to say one of a fixed number of keywords, it makes things more complicated for the Skill designer when a more free-form interaction is needed. In this text message use case, or for a Skill that integrates with a third-party task list service, letting the user send free-form text requires the developer to list sample sentences with an even distribution of likely word counts. Amazon recommends providing several hundred samples or more to cover all the variations. This task is tedious in the general case and is not Skill-specific; it would be nice if the Skill Kit provided a general-purpose Slot type for this case.
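To illustrate what "an even distribution of likely word counts" could look like in practice, here is a minimal sketch that groups candidate phrases by word count and picks the same number from each group. The intent name `SendMessage`, the slot name `Message`, and the `{example words|SlotName}` line format are assumptions for illustration; check the Skill Kit documentation for the exact sample utterance syntax.

```python
import random
from collections import defaultdict


def build_sample_utterances(phrases, per_length, intent="SendMessage",
                            slot="Message", seed=0):
    """Pick up to `per_length` phrases for each word count and format
    them as sample utterance lines (format is an assumption)."""
    # Group the candidate phrases by how many words they contain.
    by_length = defaultdict(list)
    for phrase in phrases:
        by_length[len(phrase.split())].append(phrase)

    rng = random.Random(seed)  # fixed seed keeps the output reproducible
    lines = []
    for length in sorted(by_length):
        candidates = by_length[length]
        chosen = rng.sample(candidates, min(per_length, len(candidates)))
        for phrase in chosen:
            # Produces e.g. "SendMessage to {pick up milk|Message}"
            lines.append(f"{intent} to {{{phrase}|{slot}}}")
    return lines


if __name__ == "__main__":
    candidates = [
        "hello",
        "call me",
        "see you",
        "pick up milk",
        "running late today",
    ]
    for line in build_sample_utterances(candidates, per_length=1):
        print(line)
```

In a real Skill the candidate list would need to be far larger, since several hundred samples are recommended.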
This will become even more complex once Alexa supports more than one language.
Saying a phone number
The first time we made Alexa repeat a phone number, it went on to say: “6 million, one hundred and…”
This provided a good example of how a sentence should be structured and formatted differently depending on whether it is spoken or displayed. In the case of a phone number, we are displaying (613) 555-1234 on the card of the companion app, while we send “6, 1, 3, 5, 5, 5, 1, 2, 3, 4” to Alexa as the number to be spoken.
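The spoken/displayed split described above can be sketched as follows. This is a minimal illustration, not our actual Skill code: the helper names and the message wording are made up for the example, and it assumes a 10-digit North American number.

```python
import re


def digits_only(phone):
    """Strip everything except the digits."""
    return re.sub(r"\D", "", phone)


def spoken_phone(phone):
    """Separate the digits with commas so they are read one at a time."""
    return ", ".join(digits_only(phone))


def display_phone(phone):
    """Format a 10-digit number as (613) 555-1234 for the card."""
    d = digits_only(phone)
    if len(d) != 10:
        return phone  # leave unexpected formats untouched
    return f"({d[:3]}) {d[3:6]}-{d[6:]}"


def build_response(phone):
    """Build a Skill response with separate spoken and card text."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {
                "type": "PlainText",
                "text": f"Message sent to {spoken_phone(phone)}",
            },
            "card": {
                "type": "Simple",
                "title": "Message sent",
                "content": f"Sent to {display_phone(phone)}",
            },
            "shouldEndSession": True,
        },
    }


if __name__ == "__main__":
    print(spoken_phone("6135551234"))
    print(display_phone("6135551234"))
```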
To send the SMS messages, we used Twilio. The Twilio API makes it easy and relatively cheap to send text messages.
To get started, you need a Twilio number, which costs $1/month; after that it is only $0.0075 per message. As long as the Skill does not go viral, that will be manageable.
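As a rough back-of-the-envelope check, the monthly bill can be estimated from the two prices quoted above. The function name is ours, and the prices are the ones we saw at the time of writing; they may change.

```python
from decimal import Decimal

NUMBER_FEE = Decimal("1.00")     # monthly fee for the Twilio number, in USD
PER_MESSAGE = Decimal("0.0075")  # price per outgoing message, in USD


def monthly_cost(messages_sent):
    """Estimated monthly cost in USD for a given number of messages."""
    return NUMBER_FEE + PER_MESSAGE * messages_sent


if __name__ == "__main__":
    for count in (100, 1000, 100000):
        print(count, monthly_cost(count))
```

At 1,000 messages a month this comes to $8.50; at 100,000 messages it is $751, which is where going viral would start to hurt.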
Then, to send an SMS, the API is as simple as you would expect: specify the From number you purchased from Twilio, set the To phone number and the message, and you are done.
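For illustration, here is a minimal sketch of that call using only the Python standard library against Twilio's REST endpoint for messages. It is not our Skill's actual code: the function names are ours, and the account SID, auth token, and phone numbers are placeholders you would replace with your own.

```python
import base64
import urllib.parse
import urllib.request


def build_sms_request(account_sid, auth_token, from_number, to_number, body):
    """Prepare the HTTP POST that asks Twilio to send one text message."""
    url = (
        "https://api.twilio.com/2010-04-01/Accounts/"
        f"{account_sid}/Messages.json"
    )
    # The three form fields: sender, recipient, and message text.
    data = urllib.parse.urlencode(
        {"From": from_number, "To": to_number, "Body": body}
    ).encode("utf-8")
    # Twilio uses HTTP Basic auth with the account SID and auth token.
    credentials = base64.b64encode(
        f"{account_sid}:{auth_token}".encode("utf-8")
    ).decode("ascii")
    request = urllib.request.Request(url, data=data, method="POST")
    request.add_header("Authorization", f"Basic {credentials}")
    return request


def send_sms(account_sid, auth_token, from_number, to_number, body):
    """Send the message and return the HTTP status code."""
    request = build_sms_request(
        account_sid, auth_token, from_number, to_number, body
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Building the request separately from sending it makes it easy to inspect or test without touching the network. Twilio also publishes helper libraries that wrap this call.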
The other Skill we are working on will allow making phone calls, which gets more involved, but for now this one is pretty straightforward.
We explored sending text messages in a few scenarios where we think it could be useful, but not letting the receiver answer will likely be limiting.
If all goes well, we will submit our Skill for Amazon to review within the next week.