Testing Alexa Skills — Autogenerated tests

Diego Torres Milano
4 min read · Feb 22, 2018

You have almost finished your Amazon Alexa Skill and have now started the quest for the Holy Grail of Alexa Testing. You are desperately searching for a way to automate it, but even googling gave no obvious answer.
Fortunately, your search is over.

Alexa Skills Kit

ASK CLI provides a way of testing a Skill via the simulate sub-command. This command uses the text contained in the arguments to invoke the Skill.

$ ask simulate -s ${sid} -l en-US --text 'ask book my trip to reserve a car'

This returns a SUCCESSFUL simulation response, but all the Slots are empty because the lambda in this case returns a Dialog.Delegate response.

This is useful in a few cases, but most Skills will run into this situation, and at that point the command simply ends without continuing the dialog.
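If you want to drive this command from a script, a minimal Python sketch could look like the following. It uses only the -s, -l and --text options shown above; the function names are hypothetical, and the output is returned as raw text because its exact format is not covered here.

```python
import subprocess


def build_simulate_command(skill_id, text, locale="en-US"):
    """Build the argument list for `ask simulate` with the options shown above."""
    return ["ask", "simulate", "-s", skill_id, "-l", locale, "--text", text]


def run_simulation(skill_id, text, locale="en-US"):
    """Run the simulation and return whatever the CLI printed to stdout.

    Assumes the ASK CLI is installed and already configured.
    """
    completed = subprocess.run(
        build_simulate_command(skill_id, text, locale),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout
```

Keeping the command construction separate from its execution makes it possible to check the arguments without having the CLI installed.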

In “Testing Alexa Skills — The grail quest”, we discussed some of the alternatives you have to test your Skill. We mainly focused on how to automate the process of running tests and how a conversational test can be created manually.

Automatic code generation

Now we will look at how to automate the code generation of such tests. Because some of the details needed to create the tests are available in the Skill’s Interaction Model, we will leverage it to reduce the information that you have to provide to a bare minimum.
You can also use these tests as a foundation on top of which you can create variations. Remember that testing Voice UIs is not the same as checking other pieces of software. The user has an almost infinite range of utterances, and each user may say the same thing differently. But let’s focus on the most fundamental aspect of testing our Skill: assuming the correct utterance is detected, does the Skill behave as expected?
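To give an idea of what the Interaction Model already contains, here is a minimal sketch that pulls the sample utterances and the prompts of the required Slots for one intent. It is not urutu’s actual implementation. The model below is a made-up, trimmed example that follows the interaction model JSON layout as I understand it, and the intent name BookCar and both function names are hypothetical.

```python
# Made-up, trimmed interaction model used only for illustration.
SAMPLE_MODEL = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "book my trip",
            "intents": [
                {
                    "name": "BookCar",
                    "slots": [
                        {"name": "PickUpDate", "type": "AMAZON.DATE"},
                        {"name": "DriverAge", "type": "AMAZON.NUMBER"},
                    ],
                    "samples": ["reserve a car"],
                }
            ],
        },
        "dialog": {
            "intents": [
                {
                    "name": "BookCar",
                    "slots": [
                        {
                            "name": "PickUpDate",
                            "type": "AMAZON.DATE",
                            "elicitationRequired": True,
                            "prompts": {"elicitation": "Elicit.Slot.PickUpDate"},
                        },
                        {
                            "name": "DriverAge",
                            "type": "AMAZON.NUMBER",
                            "elicitationRequired": False,
                            "prompts": {},
                        },
                    ],
                }
            ]
        },
        "prompts": [
            {
                "id": "Elicit.Slot.PickUpDate",
                "variations": [
                    {
                        "type": "PlainText",
                        "value": "What day do you want to start your rental?",
                    }
                ],
            }
        ],
    }
}


def sample_utterances(model, intent_name):
    """Return the sample utterances declared for the given intent."""
    for intent in model["interactionModel"]["languageModel"]["intents"]:
        if intent["name"] == intent_name:
            return intent.get("samples", [])
    return []


def required_slot_prompts(model, intent_name):
    """Map each Slot that requires elicitation to its prompt texts."""
    interaction = model["interactionModel"]
    # Index the prompt variations by prompt id for quick lookup.
    prompts = {
        prompt["id"]: [v["value"] for v in prompt.get("variations", [])]
        for prompt in interaction.get("prompts", [])
    }
    result = {}
    for intent in interaction.get("dialog", {}).get("intents", []):
        if intent["name"] != intent_name:
            continue
        for slot in intent.get("slots", []):
            if slot.get("elicitationRequired"):
                prompt_id = slot.get("prompts", {}).get("elicitation")
                result[slot["name"]] = prompts.get(prompt_id, [])
    return result
```

With these two pieces of information, a tool only needs to ask you for the answer to each prompt in order to assemble a conversation.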

If you have read “Testing Alexa Skills — The grail quest”, you may now have a better understanding of the alternatives. You may also have experimented with lex-bot-tester and created some conversational tests for your Skill.
You may also wonder whether the process can be automated, as it involves several manual steps. I wondered the same myself, and decided to extend lex-bot-tester so that it not only lets you provide the conversation to test but also helps you create those conversations.

lex-bot-tester is an Open Source project and the repository and installation instructions can be found at https://github.com/dtmilano/lex-bot-tester/

Enter urutu

Following a tradition of naming this kind of script after snakes (see Culebra), this one is called urutu. The urutu, whose scientific name is “Bothrops alternatus”, is a venomous pit viper species found in Brazil, Paraguay, Uruguay, and Argentina.

Enough introduction already, let’s get to the action.

$ urutu --help
usage: urutu [--help | -H] {create-test [method-name [skill-name [intent-name]]]}

As the help shows, we can provide arguments such as the test method name, the skill name, and the intent name. Any that we omit are requested interactively.

Creating a test

urutu then asks a series of questions, and the answers you provide are used to create the conversational test.

The samples and prompts come from the Skill’s definition, so another advantage is that you don’t have to keep going back to the Skill model to look them up.

Right now, Python is the only supported language for the code generation (more coming), but your Skill implementation can be in any language.

Adding the test to a class

We need some boilerplate to run it as a unittest.
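As a rough sketch, the boilerplate amounts to a unittest.TestCase subclass that hosts the generated method. The class and method names below are hypothetical, and the placeholder body only stands in for the code generated by urutu. lex-bot-tester may offer its own base class for this purpose, so check the repository before settling on plain unittest.TestCase.

```python
import unittest


class BookMyTripSkillTests(unittest.TestCase):
    """Hypothetical class that hosts the test methods generated by urutu."""

    def test_book_car(self):
        # Replace this placeholder with the method body generated by urutu.
        self.skipTest("placeholder for the generated conversational test")
```

Each additional conversation you generate becomes another test_ method in the same class.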

Running the test

On your marks, get set, go…

Analyzing the results

The generated test doesn’t add any specific logic (yet) to check for any particular Slot value. It just verifies that all the Slots that require a value have one.
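To make that default check concrete, here is a minimal sketch of the idea. It is not the actual lex-bot-tester implementation: the function name is hypothetical, and the Slot values are passed in as a plain dictionary rather than taken from a real simulation result.

```python
def missing_required_slots(required_slots, slot_values):
    """Return the required Slot names that have no value, in the given order."""
    return [
        name
        for name in required_slots
        if slot_values.get(name) in (None, "")
    ]
```

A test would then pass only when the returned list is empty, for example when both PickUpDate and DriverAge were filled during the conversation.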
Of course, you can add some extra logic and add specific tests for your Skill.

self.assertGreaterEqual(int(simulation_result.get_slot_value('DriverAge')), 25)

In this case, we are checking that the driver’s age is greater than or equal to 25.

Perhaps the dates are the more interesting cases, where we can really perceive that we are dealing with Alexa. When we answer “tomorrow” to the “What day do you want to start your rental?” prompt, we actually receive a date, in this case 2018-02-22.

# Assumes "import datetime" at the top of the test module.
tomorrow = (datetime.date.today() +
            datetime.timedelta(days=1)).strftime('%Y-%m-%d')
self.assertEqual(simulation_result.get_slot_value('PickUpDate'), tomorrow)

Feedback and contributions

This is a very early version of the tool. There is still room for improvement so I would love to get some feedback, comments and stories (and why not, bug fixes and contributions as well) from Alexa Skill developers.

Please use the issues section of the GitHub repo if you want to report a problem or leave ideas or comments.

Diego Torres Milano

Geek, Android System Engineer, Linux advocate and published author