Building Actions on Google: An Overview

Mai Truong
Atomic Robot

--

In the first blog post of the series, we learned about the Google Assistant, Actions on Google, and how they can benefit users as well as app developers. In this second article, we will discuss the technical details developers should consider when building conversations with the Google Assistant. Specifically, we are going to talk about two approaches: Dialogflow and Actions SDK.

Dialogflow

Dialogflow (formerly API.AI) is a platform that lets developers build conversational interfaces (voice apps or chatbots) so brands can connect with users on websites, mobile apps, Google Assistant, Facebook Messenger, other platforms, and devices in multiple languages. Dialogflow is powered by Google Machine Learning, runs on Google Cloud Platform, and is the most widely used tool to build Actions for the Google Assistant.

To create Actions on Google apps using Dialogflow, developers first need to create a Dialogflow agent or take advantage of the prebuilt agents Google provides. An agent contains, among other things, all the built-in or developer-defined intents against which user input is matched. Intents serve as entry points into the app because they map users’ requests to the actions the app takes. Note that an “action” means an activity that an Actions on Google app performs.

Developers can create the following types of actions:

  • Default actions: Every Dialogflow agent must have exactly one welcome action, which is invoked with an explicit intent when users ask for the app by name. For example, “Ok, Google. Talk to My Meal Planner.”
  • Additional actions to deep link into an app: Actions triggered when users invoke an app with implicit intents, either with its name and an action or with just an action. For example, “Ok, Google. Talk to My Meal Planner … I want to add pepperoni pizza for tomorrow’s breakfast.”

Once actions are defined, developers can build conversations for those actions by specifying what users need to say to trigger an intent and the corresponding responses, or optionally a fulfillment request. The fulfillment service that handles this request can be hosted on any cloud platform that supports HTTPS requests and responses. Once the Dialogflow agent is created and the fulfillment service is deployed, the app is ready to be submitted so it can become available on the more than 500 million devices with Google Assistant.
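
As a rough illustration of what a fulfillment service does, the sketch below handles one webhook call as a plain Python function. It assumes the Dialogflow v2 webhook JSON layout (`queryResult.intent.displayName`, `queryResult.parameters`, and a `fulfillmentText` reply). The intent name `add_meal`, the parameters `dish` and `meal`, and the function name are hypothetical, chosen to match the My Meal Planner example. A real service would expose this logic behind an HTTPS endpoint.

```python
import json


def handle_webhook(request_body: dict) -> dict:
    """Build a reply for one Dialogflow webhook call (v2 layout assumed)."""
    query_result = request_body.get("queryResult", {})
    intent_name = query_result.get("intent", {}).get("displayName", "")
    params = query_result.get("parameters", {})

    # "add_meal", "dish", and "meal" are hypothetical names for this sketch.
    if intent_name == "add_meal":
        dish = params.get("dish")
        meal = params.get("meal")
        if dish and meal:
            text = f"Okay, I added {dish} for {meal}."
        else:
            text = "Which dish should I add, and for which meal?"
    else:
        text = "Sorry, I can't help with that yet."

    # In the v2 layout, the reply text goes in the "fulfillmentText" field.
    return {"fulfillmentText": text}


if __name__ == "__main__":
    sample = {
        "queryResult": {
            "intent": {"displayName": "add_meal"},
            "parameters": {"dish": "pepperoni pizza", "meal": "breakfast"},
        }
    }
    print(json.dumps(handle_webhook(sample)))
```

Keeping the handler a pure function of the request body makes it easy to test without deploying anything.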

For example, if an app wants to ask users their favorite color and say something about it, it needs to parse the variations in input, extract a color name, and use that color in the response. With Dialogflow, this is as easy as creating a Favorite Color intent, providing different sentences containing a color as training phrases, creating prompts in case the user forgets to say a color, and using the parsed $color variable in the responses:

Training phrases for user input on their favorite color and responses with $color variable
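
To see what Dialogflow takes off the developer's hands here, the following sketch handles the same exchange by hand. The color list, the function names, and the wording of the prompt and reply are all assumptions made for illustration. Dialogflow's own matching is learned from training phrases and is far more flexible than this word lookup.

```python
import re
from typing import Optional

# Hypothetical, deliberately tiny vocabulary for this sketch.
KNOWN_COLORS = {"red", "green", "blue", "yellow", "purple", "orange"}


def extract_color(utterance: str) -> Optional[str]:
    """Return the first known color word in the utterance, if any."""
    for word in re.findall(r"[a-z]+", utterance.lower()):
        if word in KNOWN_COLORS:
            return word
    return None


def favorite_color_reply(utterance: str) -> str:
    """Answer with the color, or prompt again when none was given."""
    color = extract_color(utterance)
    if color is None:
        # Plays the role of the prompt for a missing required parameter.
        return "What is your favorite color?"
    # Plays the role of a response that uses the $color variable.
    return f"I like {color} too!"
```

Every new phrasing or color has to be added to code like this by hand, which is the work the training phrases and the $color parameter replace.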

Visit the documentation to learn more about building conversations with Dialogflow.

Actions SDK

With the Actions SDK, instead of building Dialogflow agents, developers declare actions in a JSON file called an action package, which is uploaded to the developer’s project. Developers create actions in an action package by mapping an intent to the fulfillment that processes the intent. As with Dialogflow, there are two types of actions: default actions and additional actions that deep-link into the app. The default action must be associated with the actions.intent.MAIN intent.

For example, if a shoe store has an app called My Shoe Store where users can place orders, check order status, and ask about daily deals, intents can be triggered by the user saying:

  • “Ok Google, talk to My Shoe Store.”
  • “Ok Google, talk to My Shoe Store to buy some shoes.”
  • “Ok Google, talk to My Shoe Store to check my order.”
  • “Ok Google, talk to My Shoe Store to show today’s deals.”

The action package might look something like this:

{
  "actions": [
    {
      "name": "MAIN",
      "intent": {
        "name": "actions.intent.MAIN"
      },
      "fulfillment": {
        "conversationName": "sekai-app"
      }
    },
    {
      "name": "BUY",
      "intent": {
        "name": "com.example.sekai.BUY",
        "parameters": [{
          "name": "color",
          "type": "SchemaOrg_Color"
        }],
        "trigger": {
          "queryPatterns": [
            "find some $SchemaOrg_Color:color sneakers",
            "buy some blue suede shoes",
            "get running shoes"
          ]
        }
      },
      "fulfillment": {
        "conversationName": "sekai-app"
      }
    },
    {
      "name": "ORDER_STATUS",
      "intent": {
        "name": "com.example.sekai.ORDER_STATUS",
        "trigger": {
          "queryPatterns": [
            "check on my order",
            "see order updates",
            "check where my order is"
          ]
        }
      },
      "fulfillment": {
        "conversationName": "sekai-app"
      }
    },
    {
      "name": "DAILY_DEALS",
      "intent": {
        "name": "com.example.sekai.DAILY_DEALS",
        "trigger": {
          "queryPatterns": [
            "hear about daily deals",
            "buying some daily deals",
            "get today's deals"
          ]
        }
      },
      "fulfillment": {
        "conversationName": "sekai-app"
      }
    }
  ],
  "conversations": {
    "sekai-app": {
      "name": "sekai-app",
      "url": "https://sekai.example.com/sekai-app"
    }
  }
}
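
Because the whole app is described by this one file, a quick sanity check before uploading can catch simple mistakes. The sketch below is a hypothetical helper, not part of any Google tooling. It checks only two things suggested by this article: that there is one action tied to actions.intent.MAIN (assuming exactly one default action is expected), and that every action's conversationName is declared under "conversations".

```python
def check_action_package(package: dict) -> list:
    """Return a list of human-readable problems found in the package."""
    problems = []
    actions = package.get("actions", [])
    conversations = package.get("conversations", {})

    # The default action is the one tied to actions.intent.MAIN.
    main_count = sum(
        1 for action in actions
        if action.get("intent", {}).get("name") == "actions.intent.MAIN"
    )
    if main_count != 1:
        problems.append(
            f"expected 1 actions.intent.MAIN action, found {main_count}"
        )

    # Every action should point at a conversation declared in the package.
    for action in actions:
        target = action.get("fulfillment", {}).get("conversationName")
        if target not in conversations:
            problems.append(
                f"action {action.get('name')!r} uses unknown conversation {target!r}"
            )
    return problems


# A trimmed-down version of the package above, used as sample input.
sample_package = {
    "actions": [
        {
            "name": "MAIN",
            "intent": {"name": "actions.intent.MAIN"},
            "fulfillment": {"conversationName": "sekai-app"},
        }
    ],
    "conversations": {
        "sekai-app": {
            "name": "sekai-app",
            "url": "https://sekai.example.com/sekai-app",
        }
    },
}
print(check_action_package(sample_package))
```

An empty list means neither check found a problem. It does not mean the package would pass Google's own validation.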

The remaining steps are similar to those for Dialogflow. To see more details, visit the guide.

Pros and Cons

Actions SDK has the following advantages:

  • All the actions are defined in a single file, which makes it an excellent choice if the app handles several simple requests that can each be completed on the first try. My Shoe Store is one example.
  • Developers can add more complex features with the action package, such as account linking, authorization, and more.

It also has the following disadvantages:

  • Actions SDK only gives access to user input, so it is up to app developers to provide a backend that processes the input and generates appropriate responses.
  • The action package can become overly complicated for large applications. Actions SDK is most useful when the app already has some natural language processing capabilities of its own.
  • It requires setup with command-line tools, which takes some time and can deter some people from trying.
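
The first drawback is easier to appreciate with a concrete case. Because the backend receives only the raw text, it has to work out what the user meant on its own. The sketch below routes My Shoe Store requests with plain keyword matching. The keyword table, the "UNKNOWN" fallback, and the function name are hypothetical, and a production backend would need something far more robust.

```python
import re

# Hypothetical keyword table: each intent name maps to words that hint at it.
INTENT_KEYWORDS = {
    "BUY": ("buy", "shoes", "sneakers"),
    "ORDER_STATUS": ("order", "status"),
    "DAILY_DEALS": ("deal", "deals"),
}


def route_raw_input(text: str) -> str:
    """Pick the intent whose keywords overlap most with the user's words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    best_intent, best_score = "UNKNOWN", 0
    for intent, keywords in INTENT_KEYWORDS.items():
        score = len(words.intersection(keywords))
        # Ties keep the earlier intent; zero matches fall through to UNKNOWN.
        if score > best_score:
            best_intent, best_score = intent, score
    return best_intent


print(route_raw_input("check where my order is"))
```

Even this small router needs decisions about ties, unknown input, and vocabulary, all of which stay with the developer when using the Actions SDK.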

Dialogflow is a wrapper around Actions SDK with additional features such as an easy-to-use IDE, natural language understanding (NLU), and more.

The good things about Dialogflow are:

  • It has a great user interface that, in some cases, even non-developers can use. The first tutorial in Google Codelabs doesn’t require writing any code to get an app running!
  • It uses AI to extract parameters from user input, which can then be used in responses or fulfillment requests.
  • It is intuitive to create follow-up responses, or responses that ask for missing required information. The Favorite Color intent above is an example.
  • The agents are platform-independent, so developers can integrate one agent with multiple platforms using different SDKs and integrations. The flexible integration is helpful if companies want to operate intelligent chatbots on their platforms.

The only downside I see is the need to learn the concept of agents and entities.

Choosing which platform to build an app on comes down to the app’s type. If the app is more Q&A oriented and expects to fulfill the user’s request on the first try, then Actions SDK is the ideal tool. On the other hand, if the app is more conversational and needs to handle different user inputs, gather enough information before completing a task, or hold a natural conversation, Dialogflow is the better tool. Future blog posts will talk more about the technical aspects of building an app.

Conclusion

Dialogflow and Actions SDK are two ways to create Actions on Google applications. They enable Google Assistant to recommend apps that can fulfill users’ requests, so users can get their work done faster and developers can get more people to try out their product. Before making an app for Google Assistant, developers need to consider the advantages and disadvantages of both approaches to make an informed decision based on their app’s features and requirements.
