Building a Google Action with JavaScript

It’s Time to Take Some Action on Google Action (…pun intended 😀)

At Google I/O 2017, the company showed off Actions on Google (or Google Action), and announced a Developer Challenge to create new Actions for Google Assistant on Google Home.
 
Since this is still relatively new tech, I’d like to share a short tutorial on how you can create your very own actions. I will use Dialogflow (formerly api.ai), a conversational user experience platform, to create an agent/bot that we can talk to in human language (voice or text), and Node.js to build the webhook/server side.

Our New Action: Math Trainer!

The idea for this action came to me when Google Home joined our family. The kids immediately adopted “her” and started communicating with “her” on a daily basis. Since we homeschool our children, many of the requests to Google Home were educational. I thought that if I could combine the fun the kids had playing with the Google Home with a way for them to practice what they were learning, it would be perfect. Thus the beginning of Math Trainer.
 
I believe that every product should have a clear goal to help define its development. So, the goal for Math Trainer is to create a Google Action that encourages children to practice math.

With that in mind, we can get started:

Defining our Action

The architecture of this solution is based on using Dialogflow to process the user’s natural language (input) and send it to our server as structured data (output). The server (Node.js) will then decide how to continue the conversation and reply with the right response.
 
Since we are dealing with conversation, let’s begin by defining the conversation flows and scenarios:

  • Welcome greeting: When a user asks Google Assistant to talk to an agent (our “action bot”), the agent needs to start the conversation. It should give the user a short introduction, explain what the agent is all about, and kick off the conversation. Our Math Trainer will start with a “Hello,” and present the first question.
  • Goodbye greeting: Users can only speak to one agent at a time, so when the user ends the conversation, control returns to Google Assistant. It would be nice to say goodbye to the user and perhaps encourage them to return in the future.
  • User provides answer: For now I will split this scenario into two options:
    1. Correct answer: The agent will respond with a positive reaction and provide the next question so the conversation can continue.
    2. Wrong answer: The agent will respond with a negative reaction and ask the user to try again so the conversation can continue.
  • Default Fallback Intent: When Dialogflow can’t match the user input to any of the scenarios, it will send it to the server as the Default Fallback. This handles the case where the user gets lost in the conversation and needs to be set back on track.

Later, I will convert these scenarios to Intents, which is the way to set them up in Dialogflow.
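
Before touching Dialogflow, it can help to see the scenarios above as plain code. The following is only a rough sketch: the reply function, the scenario names, and the wording of the responses are made up for illustration and are not part of Dialogflow or the Actions on Google library.

```javascript
/**
 * Decide what the agent says next (hypothetical helper).
 * @param {string} scenario - 'welcome', 'answer', 'quit', or anything else (fallback)
 * @param {{question: string, answer: number}} state - the current question
 * @param {number} [userAnswer] - only used for the 'answer' scenario
 * @returns {{speech: string, endConversation: boolean}}
 */
function reply(scenario, state, userAnswer) {
  switch (scenario) {
    case 'welcome':
      return { speech: `Hello! How much is ${state.question}?`, endConversation: false };
    case 'answer':
      if (userAnswer === state.answer) {
        return { speech: 'Correct! Here is the next question.', endConversation: false };
      }
      return { speech: `No, it's not ${userAnswer}. Try again.`, endConversation: false };
    case 'quit':
      // Saying goodbye is the only scenario that ends the conversation.
      return { speech: 'See you later.', endConversation: true };
    default:
      // Anything we did not understand: put the user back on track.
      return { speech: `Sorry, I asked how much is ${state.question}?`, endConversation: false };
  }
}

const state = { question: '2 plus 3', answer: 5 };
console.log(reply('welcome', state).speech);
console.log(reply('answer', state, 4).speech);
console.log(reply('quit', state).speech);
```

Notice that every scenario except “quit” keeps the conversation open, which is exactly the behavior we want from the intents.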

Now that we have defined our product, let’s set it up and write some code.

Building the Math Trainer

We’ll start by creating two accounts, one on https://console.actions.google.com, and the other at https://console.dialogflow.com. The Google account will allow us to create actions, test and submit them to Google Assistant, while the Dialogflow account will enable us to create an agent that will translate human language to a data structure that our server can handle.

Once we’ve got our accounts set up, we can start the process:

  1. In actions.google, create a new project. Give it a name and click the Create Project button.
  2. On “Add actions to your app”, choose Dialogflow.
  3. Click the Create actions on Dialogflow button to open a new tab with the Dialogflow console.
  4. There you can add details if you like, and click Save.

Creating Intents

Once the action is created on Dialogflow, the console will take you to the Intents window. This is the heart of your application (the soul will be in the JS code). Intents gather input from the user in human language and translate it into actions that code can handle.

Start by deleting the “Default Welcome Intent”, and create a new intent called “math_trainer” (you can find the “Create Intent” button at the top).

  • In Contexts, add “quiz” to the output contexts. Contexts help us keep track of which state of the conversation we are in.
  • In User says, we will add the text we expect to get from the user in order to start the action. For now, just add the sentence: “Math trainer”. I’ll explain more about User says in the next intent.
  • In Events, add “GOOGLE_ASSISTANT_WELCOME” so that Google Assistant will know that this is the first intent to start when referring a user to the app.
  • In Action, write “generate_question”. This will be the action the code will use for this intent.

Now it’s time to save the intent, using the Save button at the top right. (Saving is something you always need to remember while working in Dialogflow.)

Congratulations! You just made your first intent! Now we just need to create some more in order to handle a full conversation.

The next intent will be called “quit_trainer” and you can guess what it will do.

  • In Contexts add “quiz” to the input and output Contexts.
  • User says will be the following:
    I give up
    Stop
    Quit
    Here we are simply giving a few examples of what the user might say when they would like to end the conversation. Dialogflow will use its machine learning to “translate” more sentences into this intent.
  • The Action will be “quit”.

That takes care of that case (don’t forget to hit “Save”!). Now we need just one more intent before we can dive into some actual coding.

Next, we’ll create an intent called “provide_answer”.

  • Once again add “quiz” to the input and output Contexts.
  • (We will fill the User says in a moment)
  • In the Action section, write “check_answer” as the action name and fill the first row in the table below it with:
    REQUIRED: checked
    PARAMETER NAME: answer
    ENTITY: @sys.number
    VALUE: $answer

Now, let’s go back to User says and add the line “it’s 45”. Once you hit enter, the line will be added to the list, the platform will mark “45” in yellow, and a new popup line will appear to show you that it understands that the value is 45.

This is the real magic of Dialogflow: it can listen to human natural language and convert it to structured data that we can then play with in code.

Let’s give it some more examples. Add the following lines:
I guess it’s 12
27
My answer is 42
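
To make “structured data” a little more concrete, here is roughly what our webhook could receive for the sentence “My answer is 42”. The object below is a trimmed, illustrative stand-in for the request body: the real payload has many more fields, and its exact shape depends on the Dialogflow API version, so treat the field names as assumptions. The extractAnswer helper is invented for this example.

```javascript
// Illustrative, trimmed request body. Treat the field names as assumptions.
const sampleBody = {
  result: {
    resolvedQuery: 'My answer is 42', // what the user actually said
    action: 'check_answer',           // the Action name set on the intent
    parameters: { answer: '42' }      // the @sys.number value Dialogflow extracted
  }
};

// Hypothetical helper: pull the action and the numeric answer out of the body.
function extractAnswer(body) {
  const { action, parameters } = body.result;
  const value = Number.parseInt(parameters.answer, 10);
  return { action, answer: Number.isNaN(value) ? null : value };
}

console.log(extractAnswer(sampleBody)); // { action: 'check_answer', answer: 42 }
```

The important part is that our code never has to parse the sentence itself. It only reads the action name and the “answer” parameter.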

 
Note: If for some reason the system didn’t mark the number, you can do it manually by marking the number with the mouse and choosing @sys.number:answer from the popup list.

Again, don’t forget to save your new Intent.

Our completed provide_answer intent

We actually have one more intent in the list, the “Default Fallback Intent,” which handles all the “lost in the conversation” events, but since we get this intent from the system, we don’t need to do anything with it (for now).
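
As a quick recap, the intents we configured can be summarized like this. It is just a plain JavaScript object for reference, not a format that Dialogflow imports or exports, and the property names are my own. The “input.unknown” action for the fallback intent is taken from the webhook constants later in this post.

```javascript
// Recap of the intents as configured in the Dialogflow console.
// Plain reference data only; Dialogflow does not read this object.
const intents = {
  math_trainer: {
    inputContexts: [],
    outputContexts: ['quiz'],
    userSays: ['Math trainer'],
    events: ['GOOGLE_ASSISTANT_WELCOME'],
    action: 'generate_question'
  },
  quit_trainer: {
    inputContexts: ['quiz'],
    outputContexts: ['quiz'],
    userSays: ['I give up', 'Stop', 'Quit'],
    events: [],
    action: 'quit'
  },
  provide_answer: {
    inputContexts: ['quiz'],
    outputContexts: ['quiz'],
    userSays: ["it's 45", "I guess it's 12", '27', 'My answer is 42'],
    events: [],
    action: 'check_answer',
    parameters: [{ name: 'answer', entity: '@sys.number', required: true }]
  },
  'Default Fallback Intent': {
    action: 'input.unknown'
  }
};

// Every action name listed here needs a matching handler in the webhook.
const actionNames = Object.values(intents).map((intent) => intent.action);
console.log(actionNames);
```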

Writing the “Soul” of the Application (Coding in JS)

So now that we have an entry intent, an exit intent and an answer intent, we can start writing some code.

For the webhook, I will use Node.js, with only two files: an index.js and a package.json. You can download them from GitHub: bit.ly/MathTrainerV1.

(Note: an explanation about the package.json is not really in the scope of this post, but if you’d like me to explain it in more detail, let me know in the comments.)

Now let’s get to work on our main file: index.js. We’ll start with some environment declarations:

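'use strict';
// Turn on debug logging for the actions-on-google library
process.env.DEBUG = 'actions-on-google:*';
// The assistant class that talks to Dialogflow (api.ai)
const ApiAiAssistant = require('actions-on-google').ApiAiAssistant;
// String formatting with %s placeholders
const sprintf = require('sprintf-js').sprintf;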

Set the constants:

// The context and actions as we declare them in api.ai
const QUIZ_CONTEXT = 'quiz';
const GENERATE_QUESTION_ACTION = 'generate_question';
const CHECK_ANSWER_ACTION = 'check_answer';
const QUIT_ACTION = 'quit';
const DEFAULT_FALLBACK_ACTION = 'input.unknown';

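And now let’s set up some strings to handle the conversation:

const WRONG_ANSWER = 'No, it\'s not %s. Try again.';
const CORRECT_ANSWER = 'Correct!';
const WELCOME_MESSAGE = 'Welcome to the math trainer!';
const ASK_QUESTION = 'How much is %s';
// Used by getNextQuestion() below but missing from this listing;
// the exact wording here is an assumption
const PLUS_QUESTION = '%s plus %s';
const QUIT_MESSAGE = 'See you later.';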

And finally, let’s set the logic constants:

const min = 0;
const max = 10;

Next, let’s create a function to connect to Dialogflow:

exports.math_trainer = function (request, response) {
  // Uncomment to print the request to the log
  // console.log('headers: ' + JSON.stringify(request.headers));
  // console.log('body: ' + JSON.stringify(request.body));
  const assistant = new ApiAiAssistant({request: request, response: response});

Let’s keep this function open and add in some other functions that will handle the input/output of our action.

To keep it simple, I chose not to store any user or session data. In order to keep track of the conversation, I will send all the data that is needed through the assistant. In this case, that is the question and the answer, so when we get the user’s answer back, we can compare it to the one in assistant.data.
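
Here is a small sketch of that idea. The fake assistant below is a stand-in invented so the example can run on its own; it only mimics the data and ask members used in this post, and in the real webhook it is the library that carries data from one request to the next.

```javascript
// Hypothetical stand-in for the library's assistant object.
function createFakeAssistant(data) {
  return {
    data: data || {}, // session data that travels with the conversation
    spoken: [],       // everything the agent "said", kept for inspection
    ask(text) { this.spoken.push(text); }
  };
}

// Turn 1: ask a question and remember the expected answer.
function askQuestion(assistant) {
  assistant.data.question = '3 plus 4';
  assistant.data.answer = 7;
  assistant.ask('How much is ' + assistant.data.question);
}

// Turn 2: compare the user's answer with the one stored in turn 1.
function checkUserAnswer(assistant, userAnswer) {
  const isCorrect = userAnswer === assistant.data.answer;
  assistant.ask(isCorrect ? 'Correct!' : "No, it's not " + userAnswer + '. Try again.');
  return isCorrect;
}

const turn1 = createFakeAssistant();
askQuestion(turn1);

// The next request starts from the data saved by the previous turn.
const turn2 = createFakeAssistant(turn1.data);
console.log(checkUserAnswer(turn2, 7)); // true
console.log(turn2.spoken[0]);           // Correct!
```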

To make the Math Trainer able to have a proper “conversation,” we’ll create a function to handle each intent and close the math_trainer function:

  // Welcome: pick the first question, store it in the session, and ask it
  function generateQuestion (assistant) {
    console.log('generateQuestion');
    let newQuestion = getNextQuestion();
    assistant.data.answer = newQuestion.answer;
    assistant.data.question = newQuestion.question;
    assistant.setContext(QUIZ_CONTEXT);
    assistant.ask(printf(WELCOME_MESSAGE + ' ' +
      ASK_QUESTION, newQuestion.question));
  }

  // Compare the user's answer with the one stored in assistant.data
  function checkAnswer (assistant) {
    console.log('checkAnswer');
    let answer = assistant.data.answer;
    // "answer" is the parameter name we set on the provide_answer intent
    let rawAnswer = assistant.getArgument('answer');
    // A missing value becomes NaN, which never equals the stored answer
    let userAnswer = parseInt(rawAnswer, 10);
    if (answer !== userAnswer) {
      assistant.ask(printf(WRONG_ANSWER, userAnswer));
    } else {
      let newQuestion = getNextQuestion();
      assistant.data.answer = newQuestion.answer;
      assistant.data.question = newQuestion.question;
      assistant.ask(printf(CORRECT_ANSWER + ' ' + ASK_QUESTION, newQuestion.question));
    }
  }

  // Fallback: count the misses and repeat the current question
  function defaultFallback (assistant) {
    console.log('defaultFallback: ' + assistant.data.fallbackCount);
    if (assistant.data.fallbackCount === undefined) {
      assistant.data.fallbackCount = 0;
    }
    assistant.data.fallbackCount++;
    assistant.ask('WHAT? I asked how much is ' + assistant.data.question);
  }

  /**
   * Use tell to send the goodbye message and close the mic
   * @param assistant
   */
  function quit (assistant) {
    console.log('quit');
    assistant.tell(QUIT_MESSAGE);
  }

  /**
   * Use sprintf to format the string and fill in its params
   * @param line
   * @returns {*}
   */
  function printf (line) {
    console.log('printf: ' + line);
    return sprintf.apply(this, arguments);
  }

  // Map each action we created in Dialogflow (api.ai) to its function in this file
  let actionsMap = new Map();
  actionsMap.set(GENERATE_QUESTION_ACTION, generateQuestion);
  actionsMap.set(CHECK_ANSWER_ACTION, checkAnswer);
  actionsMap.set(DEFAULT_FALLBACK_ACTION, defaultFallback);
  actionsMap.set(QUIT_ACTION, quit);
  assistant.handleRequest(actionsMap);
};
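
If the actions map feels a bit magical, the idea behind it is a plain lookup table. The sketch below shows that idea only: dispatch and the toy handlers are made up for illustration, and this is not how handleRequest is actually implemented inside the library.

```javascript
// Hypothetical handlers that just describe what the real ones would do.
const handlers = new Map();
handlers.set('generate_question', () => 'ask a new question');
handlers.set('check_answer', () => 'compare the answer');
handlers.set('quit', () => 'say goodbye');
handlers.set('input.unknown', () => 'repeat the question');

// Look up the handler for the action name sent by Dialogflow and run it.
function dispatch(actionName) {
  const handler = handlers.get(actionName);
  if (!handler) {
    throw new Error('No handler registered for action: ' + actionName);
  }
  return handler();
}

console.log(dispatch('check_answer')); // compare the answer
```

This is also why the action names in the code must match the ones typed into the Dialogflow console exactly: a typo on either side means no handler is found.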

Finally, we only need one more function to generate the questions:

/**
 * Randomize the next question
 * @returns {{answer: number, question: string}}
 */
function getNextQuestion () {
  let value1 = Math.floor(Math.random() * (max - min + 1)) + min;
  let value2 = Math.floor(Math.random() * (max - min + 1)) + min;
  let res = {
    answer: (value1 + value2),
    question: sprintf(PLUS_QUESTION, value1, value2)
  };
  console.log(JSON.stringify(res));
  return res;
}
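
The Math.floor(Math.random() * (max - min + 1)) + min expression is worth a second look. Math.random() returns a value from 0 up to (but not including) 1, so the result is always a whole number from min to max, both included. The sketch below checks the boundaries by passing the random source in as a parameter, which is my own addition for the sake of the example.

```javascript
const min = 0;
const max = 10;

// Same formula as above, but with an injectable random source (an assumption
// made here so the boundaries can be checked deterministically).
function randomInRange(random = Math.random) {
  return Math.floor(random() * (max - min + 1)) + min;
}

console.log(randomInRange(() => 0));        // 0  (lowest possible value)
console.log(randomInRange(() => 0.999999)); // 10 (highest possible value)
console.log(randomInRange(() => 0.5));      // 5
```

Without the “+ 1”, the highest value (10) could never be picked.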

We are now ready to launch!

Running the Code

I usually run the code locally first, but to save time, let’s run the code on a public server.

You will need to run the Node.js instance on a public domain with SSL. If you don’t have one handy, you can use Google Cloud Functions. For a tutorial on how to do this, see: https://cloud.google.com/functions/docs/tutorials/http.

Once you have a working URL with the code running on it, go to your project in Dialogflow and select Fulfillment. Enable Webhook, set the URL in the URL field, and save it.

Now go to Intents and in each one of them, find the Fulfillment section at the bottom and enable Use webhook.

We have now made a connection between Dialogflow and our Node.js server, so we can move on to setting up the connection between Dialogflow and Actions on Google.

Open the Integrations section within the Dialogflow interface and click on Actions on Google. In the settings window, check that “Welcome Intent” is set to “math_trainer” and click Update.

Use the Visit Console button to go back to console.actions.google.com.

Fill in all the App information the system asks for. Then, under Surface capabilities, set all questions to “No”.

Click Test to make sure it’s doing what we want it to, then follow the instructions in the Actions Simulator and test your new action.

Congratulations, you have just created a new action!

But if you play with it a bit, you will agree with me that it does not really meet the definition of “Math Trainer is a Google Action that encourages children to practice math.” It’s boring and does not create a pleasant conversation. Repeating the same sentences over and over will not make anyone want to continue the conversation, especially not a child.

So now we need to design the agent’s personality and shape its conversation to feel more human, but we’ll work on all of that in the next post.


In case you were curious, my inspiration for this article came from Google’s article about the Number Genie action. You can try it by saying, “Talk to number genie” to Google Home.