Alexa Development — Some Backend bottlenecks

Daniel Kremerov
Sep 1, 2017

The backend is hosted on Amazon’s serverless infrastructure using AWS Lambda. This post has some NodeJS specifics, but in general it should also apply to other languages such as Java or Python. One can start from one of the templates for Alexa skills, which support development in the browser. However, more sophisticated skills require additional packages, and that means uploading a complete zip file to Lambda. One therefore needs a good development environment that supports quick iterations. As described in the previous blog post, it is crucial to be able to deploy the application quickly, because the dialogue model currently requires testing on a device. For this reason I recommend Apex, which deploys your changes to Lambda with one terminal command. Note that when using Apex, you need to use “exports.handle” for execution, whereas the standard case is “exports.handler”. Make sure all your packages are installed with npm before deploying. Below is some standard code to get going.

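var Alexa = require('alexa-sdk');

var APP_ID = undefined; // TODO replace with your app ID

// Placeholder texts; replace them with your own wording
var welcomeOutput = 'Welcome. What would you like to do?';
var welcomeReprompt = 'Please tell me what you would like to do.';
var welcomeUsercardTitle = 'Welcome';
var welcomeUsercardContent = 'Tell the skill what you would like to do.';

var handlers = {
    // Sent when the user opens the skill without naming an intent
    'LaunchRequest': function () {
        this.emit(':askWithCard', welcomeOutput, welcomeReprompt, welcomeUsercardTitle, welcomeUsercardContent);
    },
    // Fallback for requests that no other handler matches
    'Unhandled': function () {
        this.emit(':ask', 'There was an error, please repeat what you want to do.');
    }
};

// With Apex the exported entry point is "handle" instead of "handler"
exports.handle = (event, context) => {
    var alexa = Alexa.handler(event, context);
    alexa.appId = APP_ID; // alexa-sdk reads the appId property
    alexa.registerHandlers(handlers);
    alexa.execute();
};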

Another interesting aspect of development relates to the separation of concerns. On the one hand, most of the Alexa prompts are handled in the dialogue model. On the other hand, the prompt at the end of an intent still has to be handled in the backend, which by definition is not a clean separation of concerns. The reason is that Alexa expects the developer to define whether the session continues or closes after the intent. This is done with the ask and tell keywords, whose names are a bit misleading: ask is not necessarily a question, but any prompt that keeps the skill running. Another UI element is also handled in the backend, namely skill cards. These are visual prompts that the user can see in the Alexa mobile app in support of the voice output. In my opinion, it would make more sense to handle them in the skill builder interface for a better separation of concerns. Nevertheless, skill cards are a good mechanism for showing the user possible next actions. In addition, special account linking skill cards are used for user authentication, which I introduce in the next post.
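To make the difference between ask and tell more concrete, here is a minimal sketch of three handlers in the alexa-sdk style used above. The intent name AddEntryIntent and all prompt texts are made up for illustration; only the emit events (':ask', ':tell', ':askWithCard') come from the SDK.

```javascript
// Hypothetical handlers; intent names and texts are assumptions for illustration.
const sessionHandlers = {
    // ':ask' speaks the prompt and keeps the session open for the next utterance.
    // The second string is the reprompt used if the user stays silent.
    'AddEntryIntent': function () {
        this.emit(':ask',
            'Your entry was saved. What else would you like to do?',
            'You can add another entry or say stop.');
    },
    // ':tell' speaks the prompt and then closes the session.
    'AMAZON.StopIntent': function () {
        this.emit(':tell', 'Goodbye.');
    },
    // ':askWithCard' keeps the session open and also sends a card to the Alexa app,
    // here listing possible next actions.
    'AMAZON.HelpIntent': function () {
        this.emit(':askWithCard',
            'You can add an entry or stop. What would you like to do?',
            'What would you like to do?',
            'Possible next actions',
            'Say "add an entry" or "stop".');
    }
};
```

These handlers would be passed to alexa.registerHandlers in the same way as the handlers object in the code further up. Note that even a plain statement such as "Your entry was saved" needs ':ask' if the skill should keep listening afterwards.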

Further, an important topic for more sophisticated skills is data persistence. Without a database connection, Alexa can only store data in session variables for the duration of one session. Given that Alexa is currently a reactive system (i.e. it only replies after the user asks it something), many use cases centre around inputting information that is stored and displayed somewhere else, which requires a database. A natural choice is Amazon’s DynamoDB, as a schema-less NoSQL database offers flexibility and one can assume that it works well. From my experience, however, even the Amazon services do not always work together as well as one might expect. Nevertheless, connecting it is quite easy, and a natural identity key is the unique device ID of the Echo. I recommend setting up a local copy of the database using Docker and the respective DynamoDB package; after testing several options, this one worked best for me. The last posts will describe some limitations.
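The persistence setup described above could look roughly like the following sketch. It assumes the aws-sdk (v2) package, a DynamoDB Local container listening on its default port 8000, and a hypothetical table called SkillEntries with deviceId as its partition key. The table name, region, dummy credentials and helper names are my assumptions here, not fixed conventions.

```javascript
// Sketch only: table name, key name, region and credentials are assumptions.
const TABLE_NAME = 'SkillEntries'; // hypothetical table, partition key "deviceId"

// Creates a DocumentClient that talks to DynamoDB Local instead of AWS.
// DynamoDB Local accepts any credentials, but the SDK still wants some.
function createLocalClient() {
    const AWS = require('aws-sdk'); // aws-sdk v2, installed with npm
    return new AWS.DynamoDB.DocumentClient({
        region: 'eu-west-1',
        endpoint: 'http://localhost:8000',
        accessKeyId: 'local',
        secretAccessKey: 'local'
    });
}

// Reads the device ID from the incoming Alexa request.
function getDeviceId(event) {
    return event.context.System.device.deviceId;
}

// Builds the put parameters, using the device ID as the identity key.
function buildPutParams(event, data) {
    return {
        TableName: TABLE_NAME,
        Item: Object.assign({ deviceId: getDeviceId(event) }, data)
    };
}

// Stores data for the device that sent the request; returns a promise.
function saveForDevice(docClient, event, data) {
    return docClient.put(buildPutParams(event, data)).promise();
}

// Loads the stored item for the device, or undefined if there is none.
function loadForDevice(docClient, event) {
    const params = { TableName: TABLE_NAME, Key: { deviceId: getDeviceId(event) } };
    return docClient.get(params).promise().then(result => result.Item);
}
```

Passing the client in as a parameter means the same functions can run against the local Docker copy during development and against the real table once deployed to Lambda. Inside a Lambda handler, one would call saveForDevice(createLocalClient(), event, { note: 'hello' }) while developing locally and swap in a default DocumentClient for production.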

About the author:

I am an entrepreneurially minded MSc Computer Science student at University College London. Previously, I studied Business and worked in consulting and in the startup sphere. This summer, I have the unique opportunity to dive very deep into the topic of Personal Assistants in Telehealth, fully supported by UCL and NHS Digital UK. I want to give back, so I strive to provide unique insights to my readers, from both a technical and a non-technical perspective.
