The hype around Artificial Intelligence is just that, for at least three reasons. First, we are very far from an artificial brain that interacts with us in a reliable and articulate manner. Second, few people have the skills to develop a useful business case, programme the device or cloud environment, and design an appealing conversational experience. (I am as big a fan as the next person of simple, funny skills such as the “cat” skills, but at some point Alexa has to provide solutions to hitherto unexploited gaps in the market.) Third, most companies will be tempted to rely on Amazon, Google et al. to launch AI products and services. In this blog series, I share MobiCycle’s experiences working with Amazon Alexa’s certification team.

In my previous blog post, I outlined difficulties we experienced when submitting updates to our skill for certification. The chief reason for our discontent is the lack of support. There simply is no human willing to engage with MobiCycle regarding the feedback we have received, despite our numerous requests. This lack of engagement translates into lost time and wasted resources, which is burdensome for very small companies.

Today’s problem: It is no longer possible to launch Electronic Advisor. The tester unilaterally withdrew our skill over sample phrases; i.e., “phrases to help users get started and access your skill’s core functionality.” The decision rests on three phrases that the tester chose to create, not on the ones we submitted. Here are MobiCycle’s actual sample phrases:

  1. Alexa, launch Electronic Advisor
  2. I have a toaster
  3. I want to sell

By contrast, the tester created and tested the following phrases:

1. User: "Alexa, launch/start/open electronic advisor" / "Alexa, ask electronic advisor that I want to sell" / "Alexa, tell electronic advisor that I have a toaster"

Amazon’s certification tester did not faithfully test MobiCycle’s sample phrases. Testers should not design their own sample phrases, because those phrases may not have been coded into the skill, which defeats the whole point of a sample phrase. This tester’s behaviour raises the question: where is the supervisor or quality assurance?

The tester seems to imply that each phrase produces the same failed outcome. “Alexa, launch Electronic Advisor” would not fail unless the Echo were faulty. From a programming perspective, “Alexa, tell electronic advisor that I have a toaster” is fundamentally different from MobiCycle’s “I have a toaster.” The tester’s “Alexa, ask electronic advisor that I want to sell” is not programmed into MobiCycle’s skill. We coded for “I want to sell.” Any competent developer should know these points. It is almost as if the tester wants to fail the skill.

Moreover, it is highly unusual for test results to combine multiple scenarios into one submission. Each phrase should be judged on its own merits and summarised accordingly. Combining multiple sentences into one failure report is unhelpful. Here I outline in more detail where the tester went wrong. We begin with the first phrase.

User: "Alexa, launch/start/open electronic advisor"

This phrase is so basic it could be considered Alexa 101. If someone says, “Alexa, launch Electronic Advisor,” the skill launches successfully. To suggest the skill would not launch not only contradicts what every previous tester has conceded, namely that the skill will launch; it also goes against Alexa’s basic functionality. Therefore, to fail Electronic Advisor for this reason borders on the disingenuous.

In the second phrase, the tester goes rogue and creates their own sample phrase. They do not use MobiCycle’s actual sample phrase of “I want to sell.” Instead, the tester says,

User: "Alexa, ask electronic advisor that I want to sell"

What does this statement mean, you wonder? And if you are even slightly confused, what chance does Electronic Advisor have of understanding the user’s intentions? I would offer that Electronic Advisor should fail with this tester’s phrase because it does not represent standard English.

As outlined in my previous post, Amazon’s certification team members have a tendency to submit test phrases that are neither grammatically correct nor user friendly. It is no wonder Electronic Advisor fails under the Amazon tester’s guidance.

This point is so important it bears repeating. MobiCycle would never instruct our users to say “Ask Electronic Advisor that I want to sell.” We provide the phrase, “I want to sell.”

The tester’s final phrase states,

User: "Alexa, tell electronic advisor that I have a toaster"

Although the tester’s ‘toaster’ phrase is grammatically correct, it is not our sample phrase and we have not coded a response for it. Therefore, it should not be tested.

Furthermore, Amazon allow just three slots for sample phrases. We chose our three phrases carefully. Testers who go off script should not have the final word on the skill’s future.

There are particular reasons why our phrases are the way they are. MobiCycle’s prompts guide the user along a particular journey:

Step 1: Install the skill, "Alexa, launch Electronic Advisor"
Step 2: Launch the conversation with "I have a toaster." Receive the reply, "What is the name of the manufacturer of your electronic or electrical item?"
Step 3: Tell Electronic Advisor what you hope to accomplish, "I want to sell". Begin the sales script.
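
The three steps above can be pictured as a small state machine. What follows is only a rough sketch in Python, not MobiCycle's actual code: the `Journey` class, the stage names and the bracketed placeholder replies are all assumptions made for illustration. Only the three phrases and the manufacturer prompt come from the journey described above.

```python
from dataclasses import dataclass

# The one reply quoted in the journey above.
MANUFACTURER_PROMPT = (
    "What is the name of the manufacturer of your electronic or electrical item?"
)


@dataclass
class Journey:
    """Toy model of the three-step journey. Stage names are hypothetical."""

    stage: str = "not_launched"

    def hear(self, utterance: str) -> str:
        # Normalise case and trailing full stop so that "I want to sell."
        # and "i want to sell" are treated alike.
        text = utterance.strip().rstrip(".").lower()

        # Step 1: the launch phrase opens the session.
        if text == "alexa, launch electronic advisor":
            self.stage = "launched"
            return "(welcome message)"  # placeholder, not the real reply

        # Nothing else is heard until the skill has been launched.
        if self.stage == "not_launched":
            return "(skill not running)"

        # Step 2: the user names the item.
        if text == "i have a toaster":
            self.stage = "item_known"
            return MANUFACTURER_PROMPT

        # Step 3: the user states what they want to do.
        if text == "i want to sell":
            self.stage = "sales_script"
            return "(sales script begins)"  # placeholder

        return "(no matching phrase)"
```

In this sketch the second and third phrases only make sense once the first has opened the session, which is the ordering the sample phrases were chosen to reflect.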

At this point, you may wonder what Amazon’s official documentation says. Amazon does provide guidance to developers. Per Amazon, all three sample phrases need not be launch phrases. Take, for example, Amazon’s sample phrase,

* Additional phrase example: "How do I make an egg sandwich?"

Per their guidance and in direct contradiction to the tester, Amazon include secondary phrases that lack a launch request. These phrases go beyond the launch to ask how to do something. In a similar vein, Electronic Advisor’s second and third sample phrases signal something needs to be done.

Sample phrases that lack a launch request represent the user’s intentions and are referred to as “intents.” Amazon defines intents as “an array that specifies the list of intents that can be sent to the skill.” MobiCycle have built Electronic Advisor with several intents such as the action intent and the electronic intent.

The ActionIntent is pretty straightforward in Electronic Advisor. It is programmed to listen for the following phrase, “I want to {ACTION}.” The {ACTION} is any action from a list of actions defined by MobiCycle. To “sell” is one of the available actions. So, if anyone says to Electronic Advisor, “I want to {SELL},” Electronic Advisor knows that you want to sell an electronic or electrical item and takes you down the sales path. It really is just that straightforward. Note: Our code does not have, “Alexa, tell {THE SKILL} that I want to {ACTION}.”
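
To make the template idea concrete, here is a minimal sketch of a literal comparison against “I want to {ACTION}”. It is a toy illustration in Python, not our production code and not a model of how Alexa's speech service resolves what it hears. The function name `match_action_intent` is hypothetical, and so is every entry in `ACTIONS` other than “sell”.

```python
import re

# Hypothetical action list. Only "sell" is confirmed in this post;
# the other entries are placeholders for illustration.
ACTIONS = ["sell", "recycle", "repair"]

# The single template described above: "I want to {ACTION}".
ACTION_TEMPLATE = re.compile(r"^i want to (?P<action>\w+)$")


def match_action_intent(utterance: str):
    """Return an intent dict if the utterance fits the template, else None."""
    text = utterance.strip().rstrip(".").lower()
    match = ACTION_TEMPLATE.match(text)
    if match and match.group("action") in ACTIONS:
        return {"intent": "ActionIntent", "slots": {"ACTION": match.group("action")}}
    return None


# The phrase we coded for fits the template.
print(match_action_intent("I want to sell"))
# A phrase with extra words in front does not fit this literal template.
print(match_action_intent("Alexa, ask electronic advisor that I want to sell"))
```

The point of the sketch is narrow: a template written as “I want to {ACTION}” describes exactly that wording, and nothing wrapped around it.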

To invoke the ElectronicIntent, a user might say, “I have a(n) {ELECTRONIC},” where {ELECTRONIC} represents a list of items MobiCycle have previously defined. So, when a user says “I have a {TOASTER},” Electronic Advisor recognises this phrase as a trigger for the ElectronicIntent. Electronic Advisor responds to the triggering of the ElectronicIntent with the question, “What is the name of your manufacturer?”
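
For readers who have not seen one, the two intents could be declared along the following lines. This is a sketch written as a Python dictionary that mirrors the general shape of an Alexa interaction model (invocation name, intents with slots and samples, custom slot types). It is not a copy of our real model: the slot type names `LIST_OF_ACTIONS` and `LIST_OF_ELECTRONICS` are assumptions, and each list shows only the one value mentioned in this post.

```python
import json

# Illustrative only: type names and the single-value lists are assumptions.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "electronic advisor",
            "intents": [
                {
                    "name": "ActionIntent",
                    "slots": [{"name": "ACTION", "type": "LIST_OF_ACTIONS"}],
                    "samples": ["I want to {ACTION}"],
                },
                {
                    "name": "ElectronicIntent",
                    "slots": [{"name": "ELECTRONIC", "type": "LIST_OF_ELECTRONICS"}],
                    "samples": ["I have a {ELECTRONIC}", "I have an {ELECTRONIC}"],
                },
            ],
            "types": [
                {"name": "LIST_OF_ACTIONS", "values": [{"name": {"value": "sell"}}]},
                {"name": "LIST_OF_ELECTRONICS", "values": [{"name": {"value": "toaster"}}]},
            ],
        }
    }
}


def samples_for(intent_name: str) -> list:
    """Look up the sample utterances declared for one intent."""
    intents = interaction_model["interactionModel"]["languageModel"]["intents"]
    for intent in intents:
        if intent["name"] == intent_name:
            return intent["samples"]
    return []


def to_json() -> str:
    """Serialise the sketch in the JSON form a developer console would expect."""
    return json.dumps(interaction_model, indent=2)
```

Calling `samples_for("ElectronicIntent")` returns the two “I have a(n) {ELECTRONIC}” templates, which are the only wordings this sketch declares for that intent.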

In conclusion, Amazon’s tester should not have failed Electronic Advisor. The tester and certification manager compounded their error by pulling Electronic Advisor from circulation. If you are a supervisor for Alexa, please reconsider the certification tester’s position for the following reasons.

Electronic Advisor launches with the launch phrase. It should not have to perform a custom request at the same time it launches. Some skill developers may choose to launch their skill while executing an action, but MobiCycle’s skill should not fail for not combining the launch phrase and action.

Moreover, the tester made no attempt to warn us before they pulled Electronic Advisor, nor did they offer or explore other options. They did not mention an easy solution to this dilemma. The tester would know that developers are not required to submit three sample phrases. One will suffice. The two sample phrases that seem to repeatedly confound Amazon’s certification team could be deleted. If only it were so simple.

Amazon’s certification department adds insult to injury. They pulled Electronic Advisor when their own system regularly fails to update. A routine task of deleting two sample phrases fails repeatedly. This error has nothing to do with MobiCycle. Amazon’s Alexa interface errors out regularly and goes down for hours, if not days at a time.

As of this blog post, we are unable to resubmit Electronic Advisor for the UK because of a “validation error for locale German (Germany).” We have not changed the code in Germany so there should not be any problems with the German skill. Yet, we are delayed (effectively punished) for circumstances outside of our control. We now have no public access to run tests on our skill in multiple locations. So, what will we do next? The tester’s email closes with the following ‘advice.’

Please do not reply to this e-mail. To share specific feedback or receive additional clarity on your skill's certification results, please use our contact form here. Please note that you will be directed to a login page before submitting your feedback. Providing your skill's name and application id will assist us in helping you as quickly as possible.

So, we are referred to the same communication channel through which I have spent the past four months trying, unsuccessfully, to reach a manager. This channel is also the one that told me they ‘could not help me any further’. I am beginning to think I would be better off taking my chances with today’s artificial brains, however immature.

