IoT & Google Assistant
Getting started with smart home development, part 1
“Ok Google, set the temperature to 72 degrees and turn off the TV.”
It’s fun to ask the Google Assistant a question and have it answer, seemingly like magic. It’s a whole different feeling when you can ask the Assistant to do things for you in the physical world. Using the Actions on Google developer platform, you can integrate your IoT devices with the Assistant and control them by voice.
This is part one of a two-part series, where I cover fundamental concepts and APIs for smart home Actions on Google. These key elements include:
- Natural language understanding (NLU) for IoT
- Device to cloud API communication flow
- Key concepts: device types, device traits, and the home graph
- Incoming calls: SYNC, QUERY, EXECUTE, and DISCONNECT
- Outgoing calls: Report state and request sync
Natural language understanding for IoT
When it comes to IoT with voice control, an important question to ask is: Does it support natural language? With Google Assistant and the Actions on Google developer platform, you define the various capabilities, or “traits,” of your device and the Assistant automatically handles all the different phrases that users can say to control it. In other words, Google gives you natural language understanding out of the box.
For example, you define that your device can turn on/off and dim lights; then, Google handles phrases like “Ok Google, turn on all my lights” and “Hey Google, dim the living room a little bit.” Google abstracts away the NLU and sends a structured JSON request to your webhook.
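To make the "structured JSON request" concrete, here is a minimal sketch of what an EXECUTE request for "turn on the light" can look like once Google has resolved the phrase, and how a webhook might flatten it. The device id `light-1`, the request id, and the helper name `extract_commands` are illustrative assumptions, not part of any Google library.

```python
from typing import Any

# Illustrative EXECUTE request body; the ids are made up for this example.
SAMPLE_EXECUTE_REQUEST: dict[str, Any] = {
    "requestId": "example-request-id",
    "inputs": [{
        "intent": "action.devices.EXECUTE",
        "payload": {
            "commands": [{
                "devices": [{"id": "light-1"}],
                "execution": [{
                    "command": "action.devices.commands.OnOff",
                    "params": {"on": True},
                }],
            }],
        },
    }],
}


def extract_commands(body: dict[str, Any]) -> list[tuple[str, str, dict[str, Any]]]:
    """Flatten an EXECUTE request into (device_id, command, params) tuples."""
    flattened = []
    for request_input in body.get("inputs", []):
        if request_input.get("intent") != "action.devices.EXECUTE":
            continue
        for group in request_input["payload"]["commands"]:
            for device in group["devices"]:
                for execution in group["execution"]:
                    flattened.append(
                        (device["id"], execution["command"], execution.get("params", {}))
                    )
    return flattened
```

Note that the spoken phrase never reaches your code: the webhook only sees device ids, a command name, and parameters.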
API communication flow
A critical point to understand when developing IoT integrations for the Assistant is that Google does not communicate directly with your device. Instead, Google sends device commands and queries to your cloud service, which then handles the direct communication with the IoT device. The idea here is that you, as the device creator, develop your own cloud service, including its own dashboard, device registration, and device management, all of which function independently of the Assistant. The Assistant interfaces with and augments your cloud service to provide a new voice interface for the user’s devices. This gives you the freedom and flexibility to implement or re-use your own dashboard UI, device-to-cloud control logic, protocols, and software stacks.
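As a rough sketch of the cloud-service side of this flow, the webhook can be thought of as a small dispatcher that routes each request by its intent. The handler names and their stubbed return values below are assumptions for illustration; a real service would talk to its own device backend inside each handler.

```python
from typing import Any, Callable

Payload = dict[str, Any]


def handle_sync(payload: Payload) -> Payload:
    # A real service would list the linked user's devices here.
    return {"devices": []}


def handle_query(payload: Payload) -> Payload:
    # A real service would read current state from its own device backend.
    return {"devices": {}}


def handle_execute(payload: Payload) -> Payload:
    # A real service would forward each command to the physical device.
    return {"commands": []}


HANDLERS: dict[str, Callable[[Payload], Payload]] = {
    "action.devices.SYNC": handle_sync,
    "action.devices.QUERY": handle_query,
    "action.devices.EXECUTE": handle_execute,
}


def handle_fulfillment(body: Payload) -> Payload:
    """Route one webhook request to the handler for its intent."""
    request_input = body["inputs"][0]
    intent = request_input["intent"]
    if intent == "action.devices.DISCONNECT":
        # The user unlinked the account; this sketch just acknowledges it.
        return {}
    handler = HANDLERS.get(intent)
    if handler is None:
        raise ValueError(f"unsupported intent: {intent}")
    # Responses echo the requestId that Google sent.
    return {
        "requestId": body["requestId"],
        "payload": handler(request_input.get("payload", {})),
    }
```

Everything device-specific stays behind the handlers, which is what lets the cloud service keep its own protocols and software stack.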
The following diagram illustrates this flow:
The importance of OAuth
Since Google doesn’t communicate directly with your IoT device, it’s important to have OAuth implemented in your cloud service. OAuth links the user’s account on the Assistant with the user’s account on your cloud service, which in turn lets your service share the devices that user owns with the Assistant. To learn more about account linking with the Assistant, see the Account Linking with OAuth documentation.
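Once an account is linked, each request to your webhook carries the access token your OAuth server issued, sent as a bearer token in the Authorization header. The sketch below shows one way a service might map that token back to a user. The in-memory `ACCESS_TOKENS` table and the function name are hypothetical stand-ins for your own OAuth storage.

```python
from typing import Optional

# Hypothetical in-memory token store; a real service would query its OAuth database.
ACCESS_TOKENS = {"token-abc": "user-42"}


def user_for_request(headers: dict[str, str]) -> Optional[str]:
    """Return the user id that owns the bearer token, or None if it is unknown."""
    authorization = headers.get("Authorization", "")
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        return None
    return ACCESS_TOKENS.get(token.strip())
```

A request whose token does not resolve to a user should be rejected before any device lookup happens.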
To better visualize this linking process, check out the actual linking flow from within the Home app:
Key concepts: device types, device traits, and the Home Graph
Device types and device traits go hand-in-hand. Both elements provide the Assistant with necessary context about your device in order to control it via voice. Google’s NLU engine uses types, traits, device names, and other information from the Home Graph to provide the correct intent for your IoT cloud service.
A few example device types and suggested traits are listed below (for the full list of device types and traits, see Smart Home Device Types):
Device types
Every device is given a device type. Device types are used as suggestions for Google’s NLU engine, provide context for Google to use a specific icon within the Google Home app (for example, a lightbulb for a light device type), and give the developer a list of suggested device traits that Google can control.
Your cloud service declares this device type in its response to a SYNC request. In short, a SYNC request is Google asking your cloud service for a list of all devices that the user owns and controls. In Part 2 of this series, I’ll deep-dive into how the JSON payload for SYNC is defined.
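As a preview, here is a minimal sketch of how a cloud service might turn its own device records into a SYNC response that declares a type and traits per device. The `USER_DEVICES` registry, its field names, and the device names and rooms are assumptions made up for this example.

```python
from typing import Any

# Hypothetical device registry for one user; ids, names, and rooms are made up.
USER_DEVICES = [
    {"id": "light-1", "kind": "LIGHT", "traits": ["OnOff", "Brightness"],
     "name": "Ceiling light", "room": "Kitchen"},
    {"id": "fan-1", "kind": "FAN", "traits": ["OnOff", "FanSpeed"],
     "name": "Desk fan", "room": "Office"},
]


def build_sync_response(request_id: str, agent_user_id: str) -> dict[str, Any]:
    """Describe each of the user's devices with a device type and its traits."""
    devices = []
    for device in USER_DEVICES:
        devices.append({
            "id": device["id"],
            "type": f"action.devices.types.{device['kind']}",
            "traits": [f"action.devices.traits.{t}" for t in device["traits"]],
            "name": {"name": device["name"]},
            "roomHint": device["room"],
            "willReportState": True,
            # Trait attributes (such as a fan's available speeds) are
            # omitted here for brevity.
        })
    return {
        "requestId": request_id,
        "payload": {"agentUserId": agent_user_id, "devices": devices},
    }
```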
Since Google builds and trains its own NLU engine for IoT, there is a fixed set of device types, and we are always adding more types! If you’re looking for a specific device type or trait that is not yet listed, email firstname.lastname@example.org.
Device traits
Device traits define the capabilities of your device, and these capabilities are what users can actually control. A light can turn on and off (action.devices.traits.OnOff) and a fan can change speed (action.devices.traits.FanSpeed). You can implement some or all of the suggested traits for your device type.
Traits are used in both QUERY and EXECUTE requests. For a fan with the FanSpeed trait, for example, a QUERY call retrieves the current fan speed, while an EXECUTE call changes it. In Part 2 of this series, I’ll delve into how the JSON payload is defined for both QUERY and EXECUTE requests.
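The fan example can be sketched as two small functions, one that answers QUERY and one that applies EXECUTE, both working on the payload portion of the request. The in-memory `FAN_STATE` table and the function names are assumptions for illustration; a real service would read from and write to the physical fan.

```python
from typing import Any

# Hypothetical in-memory state for one fan; a real service would ask the device.
FAN_STATE: dict[str, dict[str, Any]] = {
    "fan-1": {"online": True, "on": True, "currentFanSpeedSetting": "low"},
}
NOT_FOUND = {"status": "ERROR", "errorCode": "deviceNotFound"}


def query_payload(payload: dict[str, Any]) -> dict[str, Any]:
    """Answer QUERY: report the stored state of each requested device."""
    devices = {}
    for device in payload["devices"]:
        state = FAN_STATE.get(device["id"])
        if state is None:
            devices[device["id"]] = dict(NOT_FOUND)
        else:
            devices[device["id"]] = {"status": "SUCCESS", **state}
    return {"devices": devices}


def execute_payload(payload: dict[str, Any]) -> dict[str, Any]:
    """Answer EXECUTE: apply SetFanSpeed commands and return the new state."""
    results = []
    for group in payload["commands"]:
        for device in group["devices"]:
            state = FAN_STATE.get(device["id"])
            if state is None:
                results.append({"ids": [device["id"]], **NOT_FOUND})
                continue
            for execution in group["execution"]:
                if execution["command"] == "action.devices.commands.SetFanSpeed":
                    state["currentFanSpeedSetting"] = execution["params"]["fanSpeed"]
            results.append(
                {"ids": [device["id"]], "status": "SUCCESS", "states": dict(state)}
            )
    return {"commands": results}
```

QUERY only reads state, while EXECUTE changes it and reports the resulting state back in the same response.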
Again, if you are looking for a specific device type or trait that is not yet listed, email email@example.com.
The Home Graph
The Home Graph is Google’s database that houses all the smart home device contextual information. Google stores the entire set of devices that the user controls, along with what room they are in, the name of the device, the device type and traits, and the pushed state of each device. This is how Google knows which device to control when a user says things like:
“Turn off the lights in the kitchen.”
- The user has a collection of lights that are associated with the kitchen.
“Turn off the fan.”
- The user may have multiple fans but only one is currently on, so Google turns off that particular fan.
“Turn off the vacuum.”
- The user did not specify a name or give a clue as to what room, but there is only one vacuum which can be controlled. Google knows to turn off that particular vacuum.
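The three cases above can be mimicked with a toy resolver. To be clear, this is not Google's algorithm, only an illustration of why stored room, type, and state information is enough to pick a target. The `Device` record, the `HOME` data, and the function name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Device:
    id: str
    type: str
    room: str
    on: bool


# Made-up home: three lights, two fans, one vacuum.
HOME = [
    Device("light-1", "LIGHT", "kitchen", True),
    Device("light-2", "LIGHT", "kitchen", False),
    Device("light-3", "LIGHT", "bedroom", True),
    Device("fan-1", "FAN", "office", False),
    Device("fan-2", "FAN", "bedroom", True),
    Device("vacuum-1", "VACUUM", "hallway", True),
]


def resolve_turn_off_targets(
    devices: list[Device], device_type: str, room: Optional[str] = None
) -> list[str]:
    """Pick which devices a 'turn off the <type>' phrase most plausibly means."""
    candidates = [d for d in devices if d.type == device_type]
    if room is not None:
        # "in the kitchen": every matching device in that room is a target.
        return [d.id for d in candidates if d.room == room]
    if len(candidates) > 1:
        # No room given: prefer the devices that are actually on.
        running = [d for d in candidates if d.on]
        if running:
            candidates = running
    return [d.id for d in candidates]
```

With this data, asking for lights in the kitchen selects both kitchen lights, asking for the fan selects only the one that is running, and asking for the vacuum selects the single vacuum.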
The Home Graph is used in some way with every call. Google uses it when querying device information (for instance, to pick the icon for a particular device) and for NLU when executing a command (mapping what the user said to a device trait). Your cloud service, in turn, uses the Home Graph API when it pushes device state with Report State and when it asks Google to refresh the device list with Request Sync.
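For a sense of what pushing state involves, here is a minimal sketch that builds the body of a Report State call. It only constructs the JSON; actually sending it requires an authenticated request to the Home Graph API, which is not shown. The function name and the example ids are assumptions.

```python
import uuid
from typing import Any


def build_report_state_body(
    agent_user_id: str, states: dict[str, dict[str, Any]]
) -> dict[str, Any]:
    """Build a Report State body mapping device ids to their current state."""
    return {
        # Any unique string works as a request id; a random UUID is one option.
        "requestId": str(uuid.uuid4()),
        "agentUserId": agent_user_id,
        "payload": {"devices": {"states": states}},
    }
```

For example, `build_report_state_body("user-42", {"light-1": {"on": True}})` describes a light that has just been switched on, perhaps from a wall switch rather than by voice.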
Wrap-up & TL;DR
The critical takeaways here can be summarized in two main points:
1. Google handles all NLU for you and your IoT device. Google communicates with your cloud service via a single webhook through JSON payloads.
2. Google does not communicate directly with your IoT device; your cloud service is responsible for communicating with the actual device and maintaining its state, as well as pushing this state to Google.
In part 2 of this series, I’ll cover specific API calls, review the JSON structure of each call, and look at some of our sample code from GitHub.
To learn more about the topics covered in this blog post, check out these links:
- Device types
- Device traits
- Home Graph
- Actions on Google codelab
- Request new trait or device type: firstname.lastname@example.org
- Account Linking with OAuth documentation