Creating Realistic Conversational Flows in Your Alexa Skills — Part 1
I happen to be in the midst of a very large Alexa Skill project. In fact, after a month, I am still toiling away. Writing code all day, typing at the speed of light, I am starting to enjoy the click-clack of the keys. The sound of the keystrokes is harmonizing perfectly with Chopin’s Nocturne in E-Flat Major.
It is with this experience that I have come to have a love/hate relationship with the Alexa Skills dialog methodologies. In a way, the dialog is the most important aspect of an Alexa Skill. As you build out your Alexa Skill’s conversational dialogs, be aware of the pitfalls of certain approaches to working with the dialog.
Auto Delegation is a Good Place to Start
If you have some Alexa Skills development experience, you have probably heard of the Alexa Skills Kit dialog auto-delegation features. If you have recently created a skill in the Alexa Skills Developer Console, you will notice a new(er) item in the Interface section called Auto Delegation. If that option is switched on, your entire skill is set to automatically delegate all dialog tasks to Alexa. Of course, you can also control the ability to delegate the dialog at the intent level. Depending on the skill, this can be an appealing option. However, there are some limitations to this approach. Up next, I will briefly cover some of the pros and cons (from my perspective).
Auto delegating the dialog can reduce the number of required handlers. For example, with auto delegation you only need to implement one handler per intent. Once you define your dialog in the interaction model, you can rely on Alexa to take care of the back-and-forth with the user. Let’s take a look at how a completely delegated dialog might be defined in our dialog model.
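A completely delegated dialog model for an intent like the FindADogIntent might look like the minimal sketch below. This is an illustrative reconstruction, not the exact gist: the languageModel section (sample utterances and the breedType slot type definition) is omitted, and the specific validation rule types and prompt wording are my assumptions.

```json
{
  "interactionModel": {
    "dialog": {
      "delegationStrategy": "ALWAYS",
      "intents": [
        {
          "name": "FindADogIntent",
          "slots": [
            {
              "name": "breed",
              "type": "breedType",
              "elicitationRequired": true,
              "confirmationRequired": false,
              "prompts": {
                "elicitation": "Elicit.Slot.Intent-FindADogIntent.IntentSlot-breed"
              },
              "validations": [
                {
                  "type": "isNotInSet",
                  "prompt": "Slot.Validation.0.Intent-FindADogIntent.IntentSlot-breed",
                  "values": ["Afghan Hound", "Airedale Terrier"]
                },
                {
                  "type": "hasEntityResolutionMatch",
                  "prompt": "Slot.Validation.1.Intent-FindADogIntent.IntentSlot-breed"
                }
              ]
            }
          ]
        }
      ]
    },
    "prompts": [
      {
        "id": "Elicit.Slot.Intent-FindADogIntent.IntentSlot-breed",
        "variations": [
          { "type": "PlainText", "value": "What breed of dog would you like to adopt?" }
        ]
      },
      {
        "id": "Slot.Validation.0.Intent-FindADogIntent.IntentSlot-breed",
        "variations": [
          { "type": "PlainText", "value": "Sorry, that breed is not available right now. Which other breed would you like?" }
        ]
      },
      {
        "id": "Slot.Validation.1.Intent-FindADogIntent.IntentSlot-breed",
        "variations": [
          { "type": "PlainText", "value": "I did not recognize that breed. Which breed would you like?" }
        ]
      }
    ]
  }
}
```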
I will quickly summarize what is happening in the Alexa Skill Interaction Model above. This interaction model belongs to a fictitious skill that enables customers to search for a dog to adopt, by breed. Notice the main custom intent is the FindADogIntent. This intent has several custom slots (all defined in the model). Only one slot, breed, is required. That is, Alexa will prompt for the breed slot only if it does not already have a value.
You will notice the prompts and validations properties on the breed slot. We have two validators for this slot. The first validator kicks in if the customer provides a value from the validator’s values array, for example “Afghan Hound” or “Airedale Terrier”. The second is the catch-all validator for all other values that do not match the values enumerated in the breedType custom slot type.
It is my preference to build out the dialog model by hand, resisting the urge to use the Alexa developer console. When you manually write your dialog model, you gain a better understanding of how the pieces connect, and you have more control over naming conventions. For example, consider the following slot validation prompt ids:
"prompt": "Slot.Validation.154616755422.1503388152026.257948657476"
"prompt": "Slot.Validation.0.Intent-FindADogIntent.IntentSlot-breed"
The first prompt reference is not descriptive, and can become a problem in large skills.
The second prompt reference makes a lot more sense, and can be described in plain language. For instance, “The first validation prompt for the breed slot of the FindADogIntent”. This is a much better approach, especially considering the ordinal, sequence-based nature of slot validation rules. That is, order matters when you list out validation rules for slots in the dialog model.
Assuming you implemented the interaction model above, you would only need one intent handler for the FindADogIntent in your code. That single handler waits for Alexa to collect all the required slots, then receives the completed request. It is important to understand that Alexa will not send the intent request until she has finished eliciting all of the required slots from the customer. Let’s look at a sample handler for a skill that implements the model above.
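A sketch of such a handler follows, using the ASK SDK v2 handler shape (a plain object with canHandle and handle). The address lookup uses the SDK’s device address service client, as described below; the exact response wording and the speech strings are my assumptions, not the original gist.

```javascript
// Sketch of a single auto-delegated intent handler (ASK SDK v2 style).
// With auto delegation, Alexa only sends the IntentRequest once the
// dialog is COMPLETED, i.e. every required slot has been filled.
const FindADogIntentHandler = {
  canHandle(handlerInput) {
    const request = handlerInput.requestEnvelope.request;
    // Handle the completed dialog, or a one-shot launch of the skill.
    return (request.type === 'IntentRequest'
        && request.intent.name === 'FindADogIntent'
        && request.dialogState === 'COMPLETED')
      || request.type === 'LaunchRequest';
  },
  async handle(handlerInput) {
    const request = handlerInput.requestEnvelope.request;
    const breed = request.type === 'IntentRequest'
      ? request.intent.slots.breed.value
      : undefined;

    // Look up the customer's address so we can find dogs nearby.
    // Requires the read::alexa:device:all:address permission.
    const { deviceId } = handlerInput.requestEnvelope.context.System.device;
    const client = handlerInput.serviceClientFactory.getDeviceAddressServiceClient();
    const address = await client.getFullAddress(deviceId);

    const speech = breed
      ? `Looking for ${breed} dogs available near ${address.city}.`
      : 'Welcome! What breed of dog would you like to adopt?';

    return handlerInput.responseBuilder
      .speak(speech)
      .getResponse();
  }
};
```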
This handler is quite involved. The fictitious skill allows customers to adopt dogs, so naturally we need the customer’s location. The handler above attempts to retrieve the address information associated with the customer’s device. In addition, this handler is configured as a one-shot handler: it will also be invoked on a launch request resulting from a one-shot utterance.
The most relevant method of the handler in the dialog context is the canHandle method, which is part of the handler interface. Here, we perform the logic that indicates whether this handler can indeed handle the incoming request. Notice that we require the dialogState property to be COMPLETED. Indeed, with auto-delegation, our skill will only receive the request from Alexa once all required slots have been filled, which sets the dialog to the COMPLETED state.
return ((request.type === 'IntentRequest'
    && request.intent.name === 'FindADogIntent'
    && request.dialogState === 'COMPLETED')
  || (request.type === 'LaunchRequest'));
To recap, all we had to do was write our model in the JSON file, and voila! Alexa handles the rest.
The auto-delegation method is great for certain types of skills. It saves us the hassle of writing code to either 1) handle the various requests involved when manually delegating the dialog, or 2) implement all the logic to completely control the dialog ourselves.
Unfortunately, there is a big problem with auto-delegation. If Alexa has no confidence in the response a customer provides for a specific slot, the skill session will automatically terminate, and your skill will not have an opportunity to provide a user-friendly response. That is obviously not the customer experience we want to provide in premium Alexa Skills. Thankfully, we can mitigate these problems by manually delegating the dialog to Alexa, and we can get truly sophisticated by handling the dialog entirely ourselves.
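As a preview of manual delegation, here is a minimal sketch. Instead of auto-delegating, the skill receives every turn while the dialog is STARTED or IN_PROGRESS and returns a Dialog.Delegate directive to hand the next elicitation back to Alexa. The buildDelegateResponse helper below is hypothetical (not an ASK SDK API); it builds the raw JSON response a skill would send, so the skill stays in the loop and can recover gracefully when a slot answer goes wrong.

```javascript
// Hypothetical helper: builds the raw skill response for one dialog turn.
// While the dialog is incomplete, delegate the next step back to Alexa;
// once COMPLETED, answer with real content.
function buildDelegateResponse(request) {
  if (request.dialogState !== 'COMPLETED') {
    return {
      version: '1.0',
      response: {
        // Dialog.Delegate tells Alexa to elicit the next required slot,
        // but the skill keeps receiving each turn of the conversation.
        directives: [{ type: 'Dialog.Delegate', updatedIntent: request.intent }],
        shouldEndSession: false
      }
    };
  }
  // All required slots are filled; respond normally.
  return {
    version: '1.0',
    response: {
      outputSpeech: {
        type: 'PlainText',
        text: `Searching for ${request.intent.slots.breed.value} dogs near you.`
      },
      shouldEndSession: true
    }
  };
}

// Example turn: the breed slot has not been filled yet.
const inProgress = {
  type: 'IntentRequest',
  dialogState: 'IN_PROGRESS',
  intent: { name: 'FindADogIntent', slots: { breed: {} } }
};
console.log(buildDelegateResponse(inProgress).response.directives[0].type);
```

More on this approach, and on fully controlling the dialog with ElicitSlot-style directives, in the next part.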
To be continued …