Announcing the Java & Kotlin client library for Actions on Google

We have supported a Node.js client library for Actions on Google since the platform first launched, and we learned a lot from the developers who used it. We also heard loud and clear that you wanted better support for building Actions for consumer and enterprise use cases with the Java programming language. Today, we are excited to announce the Java client library for Actions on Google.

Understand how Actions work

Actions on Google supports a JSON-based protocol between the Assistant and your fulfillment. To participate in conversations with Actions on Google, your Action implements a fulfillment webhook that responds to HTTP requests from Actions on Google.

Figure 1: Diagram shows the interaction between various sub-systems that participate in a voice conversation with the Google Assistant. The part shaded in orange is the Action fulfillment webhook hosted on the Google Cloud Platform.

Figure 1 shows a sample interaction between the end user, the Assistant, Dialogflow, and an Action fulfillment. To summarize, the Assistant converts the voice input from the user to text. Next, the system must understand what was said. Dialogflow uses machine learning to accurately map the text to the intent of the user and to extract entities (such as places, things, and dates/times). Dialogflow then invokes the Action fulfillment webhook with a JSON payload that contains all the information it extracted from the text. The Action fulfillment (highlighted in orange in the figure) handles the JSON request and implements the logic to respond to the Assistant (or Dialogflow). Finally, the Assistant on the device converts the response to voice and, on devices with a screen, to a visual display.

Introducing the Java/Kotlin client library

As explained in the previous section, an Action needs to implement a fulfillment webhook that can receive JSON via an HTTP POST request from the Assistant and respond back with JSON as per the Actions on Google JSON specification. To implement such a webhook with the Java programming language, a developer can follow these steps:

  1. Implement a publicly reachable cloud endpoint: Implement a Servlet for Google Cloud Platform or similar cloud solution to receive the JSON payload via HTTP POST.
  2. Parse the request: Parse information in the JSON body into Java objects.
  3. Route the request: Extract the matched intent from the request JSON and execute the corresponding business logic.
  4. Assemble a response for the user.
  5. Serialize the response into JSON and send it as a response to the HTTP POST received from Google.
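To make these steps concrete, here is a minimal sketch of steps 2 through 5 written by hand, without any client library. It assumes a Dialogflow-style request in which the matched intent name appears in a `displayName` field, uses a regular expression in place of a real JSON parser, and replies with a single `fulfillmentText` field. The class name, method name, and the sample intents are hypothetical and exist only for illustration.

```java
import java.util.Map;
import java.util.function.Supplier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical hand-rolled webhook logic; none of these names come from the client library.
class ManualWebhook {
  // Step 2 (parse): pull the intent name out of the request body.
  // A regex stands in for a real JSON parser to keep the sketch dependency-free.
  private static final Pattern DISPLAY_NAME =
      Pattern.compile("\"displayName\"\\s*:\\s*\"([^\"]*)\"");

  // Step 3 (route): map intent names to business logic.
  private static final Map<String, Supplier<String>> HANDLERS = Map.of(
      "Default Welcome Intent", () -> "Welcome to my app",
      "bye", () -> "Goodbye");

  static String handle(String requestJson) {
    Matcher m = DISPLAY_NAME.matcher(requestJson);
    String intent = m.find() ? m.group(1) : "";
    // Step 4 (assemble): run the matching handler, or fall back to a default reply.
    String text = HANDLERS.getOrDefault(intent, () -> "Sorry, I did not get that.").get();
    // Step 5 (serialize): write the reply as JSON, escaping backslashes and quotes only.
    String escaped = text.replace("\\", "\\\\").replace("\"", "\\\"");
    return "{\"fulfillmentText\":\"" + escaped + "\"}";
  }

  public static void main(String[] args) {
    String request =
        "{\"queryResult\":{\"intent\":{\"displayName\":\"Default Welcome Intent\"}}}";
    System.out.println(handle(request));
  }
}
```

Even in this toy form, most of the code is parsing, routing, and serialization rather than Action logic, and a production version would also need a proper JSON parser and typed request objects.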

The above tasks are common to all Actions. We want to make it easier to build Actions by handling most of these common tasks in the library, so you can focus on the most important one: your Action’s logic. The Java/Kotlin library implements all of the above steps and provides a first-class API that shields you from the inner workings of the JSON protocol. Let’s see how this is achieved.

Boilerplate code for your Cloud endpoint

Cloud platforms require you to implement specific classes to handle HTTP requests. Examples include HttpServlet in the case of Google App Engine and RequestStreamHandler for AWS Lambda. If you are using the Spring framework, you may need to add the appropriate annotations to your handler class.

To help you get started quickly, we have provided boilerplate code that implements this plumbing. We recommend that you start your project by cloning this boilerplate code. The boilerplate code offers the simplest and quickest way to implement a webhook for your Action by reading the JSON from an HTTP request and delegating it to your App object.
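The plumbing itself is small. The sketch below is not the actual boilerplate; it is a dependency-free approximation of a stream-based entry point, similar in spirit to a handler that receives an input stream and an output stream. `StreamEntryPoint` and the `WebhookApp` interface are hypothetical stand-ins for the platform class and for your App object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical plumbing sketch; the real boilerplate targets servlets and AWS Lambda.
class StreamEntryPoint {
  // Stand-in for the app object that turns request JSON into response JSON.
  interface WebhookApp {
    String handleRequest(String requestJson);
  }

  // Reads the whole request body, delegates to the app, and writes back its reply.
  static void handle(InputStream in, OutputStream out, WebhookApp app) {
    try {
      String requestJson = new String(in.readAllBytes(), StandardCharsets.UTF_8);
      String responseJson = app.handleRequest(requestJson);
      out.write(responseJson.getBytes(StandardCharsets.UTF_8));
      out.flush();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // A trivial app that reports how many characters it received.
    WebhookApp app = json -> "{\"fulfillmentText\":\"Received " + json.length() + " characters\"}";
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    handle(new ByteArrayInputStream("{}".getBytes(StandardCharsets.UTF_8)), out, app);
    System.out.println(new String(out.toByteArray(), StandardCharsets.UTF_8));
  }
}
```

Because the entry point only moves bytes in and out, the same app object could sit behind a servlet, a Lambda handler, or a Spring controller without changes.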

Our boilerplate code currently supports Google Cloud Platform and AWS Lambda as entry points. If your webhook is hosted in a different cloud platform, you should be able to easily adapt the boilerplate code for your needs. Similarly, it should be easy to modify the boilerplate code for Spring and/or Kotlin.

Request processing

The library provides a top-level interface — App.handleRequest() — to handle requests from the Assistant. DefaultApp, a subclass of App, implements the request processing logic by first parsing the JSON into Java classes. To make this easier and less error-prone, we have provided Java classes (POJOs) that map 1:1 to the concepts defined in the JSON protocol. We call these “binding classes”. The strong typing provided by these binding classes allows IDEs such as IntelliJ and Visual Studio Code to auto-suggest class and method names within the code editor.

Once a request is successfully parsed into binding classes, DefaultApp extracts the intent name from the request and uses this to route the request to the specific method. The library implements routing elegantly for you through the @ForIntent Java annotation. As a developer, all you need to do is to implement your Action’s logic in intent handlers, which are explained in the next section.
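For readers curious how annotation-driven routing can work in general, here is a simplified sketch that uses Java reflection. It is an illustration of the technique, not the library's implementation: the `ForIntent` annotation is redeclared locally as a stand-in, handlers take and return plain strings to keep the example short, and `IntentRouter`, `DemoApp`, and the "echo" intent are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Simplified illustration of annotation-driven routing; not the library's implementation.
class IntentRouter {
  // Local stand-in for the @ForIntent annotation; kept at runtime so reflection can see it.
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface ForIntent {
    String value();
  }

  // Finds the public method annotated with the given intent name and invokes it.
  static String route(Object app, String intentName, String userText) {
    for (Method method : app.getClass().getMethods()) {
      ForIntent annotation = method.getAnnotation(ForIntent.class);
      if (annotation != null && annotation.value().equals(intentName)) {
        try {
          return (String) method.invoke(app, userText);
        } catch (ReflectiveOperationException e) {
          throw new IllegalStateException("Handler failed for intent: " + intentName, e);
        }
      }
    }
    throw new IllegalArgumentException("No handler for intent: " + intentName);
  }

  // Example app with two handlers.
  public static class DemoApp {
    @ForIntent("Default Welcome Intent")
    public String welcome(String userText) {
      return "Welcome to my app";
    }

    @ForIntent("echo")
    public String echo(String userText) {
      return "You said: " + userText;
    }
  }

  public static void main(String[] args) {
    System.out.println(route(new DemoApp(), "echo", "hello"));
  }
}
```

The lookup compares intent names with `equals`, so in this sketch a request for "default welcome intent" would not match the handler registered for "Default Welcome Intent".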

Intent handler

An intent is a goal or action that the user wants to do, such as listening to a song or ordering coffee. Actions on Google represents the intent as a unique identifier. Your Action webhook provides handlers for intents it wants to handle dynamically. In the Java client library, this is implemented as a Java class that extends either DialogflowApp or ActionsSdkApp. Intent handlers are public methods in this class that are marked with a special annotation — @ForIntent as shown below. Intent handlers accept an ActionRequest object as a parameter and return an ActionResponse object.
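A minimal sketch of what such a handler class can look like follows. So that it compiles on its own, ForIntent, ActionRequest, ActionResponse, and DialogflowApp are declared here as small local stand-ins that only borrow the names used in the text; the real library types carry much more. `IntentHandlerSketch`, `MyActionsApp`, and `welcome` are illustrative names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// All types below are local stand-ins that mirror names from the text, so the
// sketch compiles without the client library on the classpath.
class IntentHandlerSketch {
  @Retention(RetentionPolicy.RUNTIME)
  @Target(ElementType.METHOD)
  @interface ForIntent {
    String value();
  }

  // Stand-in request: carries only the matched intent name.
  static class ActionRequest {
    final String intent;

    ActionRequest(String intent) {
      this.intent = intent;
    }
  }

  // Stand-in response: carries only the text to speak.
  static class ActionResponse {
    final String text;

    ActionResponse(String text) {
      this.text = text;
    }
  }

  // Stand-in base class playing the role of DialogflowApp.
  static class DialogflowApp {}

  // The part you would write: one public, annotated method per intent.
  public static class MyActionsApp extends DialogflowApp {
    @ForIntent("Default Welcome Intent")
    public ActionResponse welcome(ActionRequest request) {
      return new ActionResponse("Welcome to my app");
    }
  }

  public static void main(String[] args) {
    // Calls the handler directly; in a real webhook the library would route to it.
    ActionResponse response =
        new MyActionsApp().welcome(new ActionRequest("Default Welcome Intent"));
    System.out.println(response.text);
  }
}
```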

In the above code, the method responds to the “Default Welcome Intent”, which is the case-sensitive name of the intent as defined in Dialogflow. The client library extracts the intent name from the JSON request and routes the request to the specific handler based on the @ForIntent annotation value.

Figure 2: The intent page in Dialogflow. In this case, intent name is “Default Welcome Intent”.

Assemble a response

Intent handlers return a fulfillment response that is sent back to the Assistant, which ultimately conveys it to the user as a voice and/or visual response. In its simplest form, a response is text spoken back to the user. Actions on Google also supports many other response formats, including immersive cards with images, carousels, lists, media, and SSML. Your Action may also respond with one of the helper intents supported by the Assistant. Examples include requesting confirmation from the user or getting permission from the user to get their location.

The ResponseBuilder class provides a variety of helper methods to assemble a response. In the simplest case, your Action responds back with text:
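The builder idea can be sketched in a few lines. The example below is a hypothetical, heavily reduced stand-in that borrows the names ResponseBuilder and ActionResponse from the text; it only collects text items, and its `add` and `build` methods are assumptions made for illustration rather than a description of the library's full API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical fluent builder, loosely modeled on the ResponseBuilder described above.
class TextResponseSketch {
  // Stand-in for the response object an intent handler returns.
  static class ActionResponse {
    final List<String> spokenItems;

    ActionResponse(List<String> spokenItems) {
      // Copy so later changes to the builder cannot alter a built response.
      this.spokenItems = List.copyOf(spokenItems);
    }
  }

  // Collects response items and produces an immutable response.
  static class ResponseBuilder {
    private final List<String> items = new ArrayList<>();

    ResponseBuilder add(String text) {
      items.add(text);
      return this; // returning this allows calls to be chained
    }

    ActionResponse build() {
      return new ActionResponse(items);
    }
  }

  public static void main(String[] args) {
    ActionResponse response = new ResponseBuilder().add("Welcome to my app").build();
    System.out.println(response.spokenItems);
  }
}
```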

Here is a response that uses BasicCard to render a visual response:
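As a rough illustration, the sketch below pairs a spoken sentence with a card and writes the result out as JSON by hand. The `BasicCard` class here is a local stand-in with three fluent setters, not the library's class, and the JSON field names are an assumption based on the Dialogflow webhook format for Actions on Google; check them against the JSON specification before relying on them.

```java
// Hand-written serialization sketch. BasicCard is a local stand-in, and the JSON
// field names are assumptions that should be verified against the specification.
class BasicCardSketch {
  // Stand-in card with fluent setters.
  static class BasicCard {
    private String title = "";
    private String formattedText = "";
    private String imageUrl = "";

    BasicCard setTitle(String title) {
      this.title = title;
      return this;
    }

    BasicCard setFormattedText(String formattedText) {
      this.formattedText = formattedText;
      return this;
    }

    BasicCard setImageUrl(String imageUrl) {
      this.imageUrl = imageUrl;
      return this;
    }
  }

  // Wraps a value in double quotes, escaping backslashes and double quotes only.
  static String quote(String s) {
    return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }

  // Builds a response with one spoken item followed by one card.
  static String toJson(String speech, BasicCard card) {
    return "{\"payload\":{\"google\":{\"expectUserResponse\":true,\"richResponse\":{\"items\":["
        + "{\"simpleResponse\":{\"textToSpeech\":" + quote(speech) + "}},"
        + "{\"basicCard\":{\"title\":" + quote(card.title)
        + ",\"formattedText\":" + quote(card.formattedText)
        + ",\"image\":{\"url\":" + quote(card.imageUrl) + "}}}"
        + "]}}}}";
  }

  public static void main(String[] args) {
    BasicCard card = new BasicCard()
        .setTitle("Title of the card")
        .setFormattedText("Body text of the card")
        .setImageUrl("https://example.com/image.png");
    System.out.println(toJson("Here is an example of a basic card", card));
  }
}
```

Writing this JSON by hand is error-prone, which is the kind of work that typed response classes and a builder are meant to take off your hands.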

The client library generates a JSON response from the ActionResponse object returned from the intent handler (see below). The JSON is eventually handled by the Google Assistant to render an audio/visual response to the end user.

Figure 3: Assistant response and JSON response for Basic card.

The following response uses a helper intent to request the Assistant to get the relevant information from the user:

You can see more examples of using helper methods in the API reference for ResponseBuilder and in the samples.

Serialize response to JSON

The library also handles the serialization of the response from Java objects to JSON as per the JSON specification and sends it as the response of the HTTP POST request.

Kotlin

We built the client library entirely in Kotlin. We found Kotlin to be an expressive language with many features that allow you to implement your Actions with safer code that is easier to maintain. “Creating a RESTful Web Service with Spring Boot” is a good resource to help you get started with Kotlin and Spring.

Next steps

As you can see from the above examples, the Java library provides an intuitive API and an idiomatic abstraction over the JSON protocol, making it easy to assemble all supported responses from your Action.

We are excited to see what you build with the Java library for Actions on Google.

Here are useful links to explore and learn more about the library:

Library on GitHub

Reference docs

Boilerplate

Java Samples:

Conversation components

Helper intents

Updates

Transactions

If you encounter issues in the library or would like to request a feature, please file an issue on our GitHub page or let us know on Stack Overflow.

Want more? Head over to the Actions on Google community to discuss Actions with other developers. Join the Actions on Google developer community program and you could earn a $200 monthly Google Cloud credit and a Google Assistant t-shirt when you publish your first Action.

Java™ is a registered trademark of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.
