Look inside an Alexa Skill with Thundra

Alexa is Amazon’s virtual assistant and the brain behind tens of millions of Echo devices like the Echo Show and Echo Spot. Alexa provides capabilities, called skills, that give customers a more personalized experience. The Alexa Skills Store currently has more than 45,000 published skills, and this number is growing rapidly.

Amazon recommends using AWS Lambda to develop the backend APIs for custom skills. So right now, the fastest way to publish your cool custom skill idea is to leverage serverless technologies! This is great news for us at Thundra because we can have some fun showing you how you can use Thundra to monitor and observe your Alexa Skills.

In fact, today I’m going to share how I developed a custom Alexa Skill with the Node.js Alexa Skill Kit (ASK). I then used Thundra to manually instrument my Lambdas, allowing me to understand what happens while the function fetches information about the world of Ice and Fire.

Manual instrumentation is useful when you want to examine a specific piece of code that you suspect is causing problems. It’s a flexible and straightforward approach: you simply start and finish a span wherever you need one.

Let’s dive in to see how it helps us follow the flow throughout the life of an Alexa query.

Yet Another Game Of Thrones Wiki

Designing a natural Voice User Interface (VUI) is a difficult task as it requires mastery of your target language, so I thought I should make my demo app simple. To do that, I decided to make a simple Alexa Skill that answers some fundamental questions about a TV show. I searched for public APIs for some of my favorite TV shows and found “An API of Ice And Fire,” described as “The world’s greatest source for quantified and structured data from the universe of Ice and Fire.” Perhaps building a simple Game of Thrones wiki is not the most original idea, but I think it is a good starter project.

To design a Voice User Interface (VUI) for Alexa, you need to map the user’s spoken input to intents. An intent represents an action that fulfills a user’s spoken request. Intents can contain sample utterances and optionally have arguments called slots. I will explain these concepts with examples but you can also check out the Interaction Model docs for more details.

Game of Thrones Wiki will have 3 basic intents:

  • `WhoPlayedCharacterIntent`: This intent answers who plays a certain character on the TV show.
  • `BookInfoIntent`: This intent gives information about the Ice and Fire book series.
  • `HouseOverlordIntent`: This intent answers who the current overlord of a house is.

For each intent, we need to provide sample utterances and define slots for each utterance.

For example, if the user says “Who played Arya Stark?”, Alexa will match the speech input to `WhoPlayedCharacterIntent` and match `Arya Stark` to the `character` slot, because `WhoPlayedCharacterIntent` contains the sample utterance `who played {character}`. Once intent creation was complete, my console looked like the screenshot below.
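In the interaction model JSON, that mapping might look roughly like the fragment below. The invocation name and the `AMAZON.FictionalCharacter` slot type are assumptions for illustration; you would pick your own in the developer console.

```json
{
  "interactionModel": {
    "languageModel": {
      "invocationName": "game of thrones wiki",
      "intents": [
        {
          "name": "WhoPlayedCharacterIntent",
          "slots": [{ "name": "character", "type": "AMAZON.FictionalCharacter" }],
          "samples": ["who played {character}", "who plays {character}"]
        }
      ]
    }
  }
}
```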

Developer Console for Custom Alexa Skills

Trace your Lambda with Thundra

According to the 2018 Serverless Community Survey, the top 3 challenges with serverless development are debugging, monitoring, and testing. So, during development, I added Thundra’s manual instrumentation and log support to my Lambdas to debug my application.

Let’s start writing our Lambda function for the backend API and integrate it with Thundra.

First, initialize an empty Node.js project and grab the Alexa Skill Kit, Thundra core, and Thundra log plugins from the NPM registry with the following command.
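Assuming the SDK ships as `ask-sdk-core` and the Thundra agent as `@thundra/core` (package names are assumptions and may differ across agent versions), the setup would look like:

```shell
# Initialize an empty Node.js project and pull in the dependencies.
# Package names are assumptions -- check the current ASK and Thundra docs.
npm init -y
npm install ask-sdk-core @thundra/core
```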

Before writing our intent handlers, import the Alexa Skill Kit SDK and the Thundra agent. With the `skillBuilder`, build your Lambda function and wrap it with the Thundra agent. You don’t need to worry about `ErrorHandler` and `WhoPlayedCharacterIntentHandler` right now.
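Putting that together, the entry point might look like the sketch below. The module names and the `thundra(...)` wrapping call are assumptions based on the agent’s documented usage, and the two handlers are defined further down.

```javascript
// index.js -- hedged sketch of the skill entry point
const Alexa = require('ask-sdk-core');
const thundra = require('@thundra/core')(); // returns a wrapper function

const skillHandler = Alexa.SkillBuilders.custom()
  .addRequestHandlers(WhoPlayedCharacterIntentHandler)
  .addErrorHandlers(ErrorHandler)
  .lambda();

// Wrapping the handler lets Thundra trace every invocation.
exports.handler = thundra(skillHandler);
```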

Now we can write our first intent handler and add trace information. The Thundra agent is OpenTracing compliant. We can use the OpenTracing API and the Thundra tracer to measure execution times for specific pieces of code.

To write an `IntentHandler`, we need to implement two functions:

  • The first is the `canHandle()` function. In the Alexa Skill Kit SDK, request routing is based on a “Can you handle this?” concept: every handler lays out the specific condition(s) it is capable of handling and returns a `boolean` in response to each request. You set these conditions within the `canHandle()` function defined inside the handler. The `canHandle()` function is required in every handler.
  • The second is the `handle()` function. If the condition(s) set in the `canHandle()` function return `true`, the code in the `handle()` function is executed.
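A handler following that pattern could look like this sketch. The response-building calls assume the standard ASK SDK `responseBuilder`, and the speech text here is a placeholder rather than the real API lookup.

```javascript
// Hedged sketch of an ASK request handler object.
const WhoPlayedCharacterIntentHandler = {
  // "Can you handle this?": claim only IntentRequests for our intent.
  canHandle(handlerInput) {
    const request = handlerInput.requestEnvelope.request;
    return request.type === 'IntentRequest'
      && request.intent.name === 'WhoPlayedCharacterIntent';
  },
  // Runs only when canHandle() returned true.
  handle(handlerInput) {
    const character = handlerInput.requestEnvelope.request.intent
      .slots.character.value;
    const speechText = `Looking up who played ${character}.`; // placeholder
    return handlerInput.responseBuilder.speak(speechText).getResponse();
  },
};
```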

In order to instrument `WhoPlayedCharacterIntentHandler` with Thundra monitoring, we do the following:

  • At the beginning of the `handle()` function of `WhoPlayedCharacterIntentHandler`, grab the `ThundraTracer`.
  • Start a span named `WhoPlayedCharacterIntent`.
  • Make our API call.
  • Add `speechText` as a tag on the span before `handle()` returns.
  • Finally, finish the span.
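The steps above can be sketched as the function below. It takes the tracer and the API call as parameters so the span logic stands on its own; with Thundra you would obtain the tracer from the agent (the exact accessor is an assumption, so check the agent docs), and any OpenTracing-compatible tracer works the same way.

```javascript
// Hedged sketch: wrap the intent's work in an OpenTracing-style span.
async function answerWhoPlayed(tracer, fetchActor, character) {
  const span = tracer.startSpan('WhoPlayedCharacterIntent'); // start the span
  try {
    const actor = await fetchActor(character);               // the API call
    const speechText = `${character} is played by ${actor}.`;
    span.setTag('speechText', speechText);                   // tag before returning
    return speechText;
  } finally {
    span.finish();                                           // always close the span
  }
}
```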

Errors can happen while coding, so how we handle and report them matters. You can add an `ErrorHandler` to the Alexa Skill Kit SDK as a fallback mechanism for unhandled intents and errors. If any of the intent handlers throws an error, it is caught by `ErrorHandler`. In the error handler, we log the exception to CloudWatch, to the Thundra log plugin, and to the currently active span.
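A fallback handler along those lines might look like this sketch. The Thundra log call is left as a comment because the plugin’s exact API is an assumption; `console.log` ends up in CloudWatch either way.

```javascript
// Hedged sketch of the catch-all error handler.
const ErrorHandler = {
  // Claim everything: this makes it the fallback for unhandled requests.
  canHandle() {
    return true;
  },
  handle(handlerInput, error) {
    // console.log lands in CloudWatch; a similar call through the Thundra
    // log plugin's logger (API assumed) would reach Thundra as well.
    console.log(`Error handled: ${error.message}`);
    return handlerInput.responseBuilder
      .speak("Sorry, I couldn't do that. Please try again.")
      .reprompt('Please try again.')
      .getResponse();
  },
};
```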

Now we need to deploy our Lambda function and integrate it with Alexa. To deploy, you can use the serverless template in the thundra-examples-lambda-nodejs repo. You will need to make the necessary changes in the template before deploying your function with the Serverless Framework.

As you can see below, the `WhoPlayedCharacterIntentHandler` works as expected. Let’s look at Thundra to figure out what happened during the invocation.

Alexa Simulator for Testing

On the detailed invocation page of the Thundra Web Console, we can see metrics about our function and jump directly to its CloudWatch logs. Most importantly, we can look at the spans we created manually in the code. The first span is the Lambda invocation, the second is `WhoPlayedCharacterIntent`, and within it our intent makes an `API Call`, which takes the longest time. Additionally, when you click each span, you can see the tags we added in the code; the `speechText` tag holds the return value of our skill. Adding manual instrumentation during development helps developers detect problems earlier.

Thundra Invocation Detail with Manual Instrumentation

My full code is available on GitHub. You can try it yourself and play with it! Just sign up for our free beta and follow the instructions to deploy it into your AWS environment.

I am excited about Thundra’s new Node.js manual instrumentation functionality! I hope it helps you detect, debug, and solve problems more effectively, and learn almost anything you need to know about the invocation you are analyzing.

Try it out and give us your feedback via our Slack channel! And, as always, we welcome your ideas and requests and the opportunity to learn about your serverless observability use cases so we can continue to develop Thundra in ways which you find valuable.

Want to learn more about Thundra’s other features? Visit our website or try our interactive online demo.