Hey Google! Talk to…

Create your own voice command for the Google Assistant with Python, Dialogflow and Google Cloud Functions (serverless fun!) [Part 1]

A couple of months ago, my dad bought me a Google Home Pod. I was skeptical at first about its usefulness, but it has become woven into my daily morning routine.

When I wake up, one of the first things I do is yell at it.

“Hey Google! Play the latest news from APM Marketplace!”

My morning vocal workout

After accosting the home pod for a few weeks, I became intrigued by its design and by how I could integrate my own command. As software developers, we usually ask the question “How does it work?” So, I set out to figure out a little piece of the puzzle.

In this tutorial we will dive into Python webhooks, Dialogflow, and Google Cloud Functions aka serverless functions. At the end, we will have an Alpha version of our voice command deployed.


How does this all fit together so I can yell things to my poor Google Home Pod?

Well…

Yell at Pod -> Dialogflow calls your webhook -> webhook returns response -> response is delivered to pod
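
As a rough sketch of the middle two arrows, here is what a webhook boils down to in plain Python with no framework: a JSON body comes in from Dialogflow and a JSON body goes back. The fulfillmentText field is part of the Dialogflow API v2 webhook response format. The function name handle_webhook and the trimmed sample body are made up for illustration, and this is not meant to show how Apprentice works internally.

```python
import json


def handle_webhook(raw_body: str) -> str:
    """Take the JSON text Dialogflow POSTs and return JSON text to send back."""
    request = json.loads(raw_body)
    # queryText holds what the user said (or typed).
    heard = request.get("queryResult", {}).get("queryText", "")
    # fulfillmentText is the text the Assistant will speak back.
    response = {"fulfillmentText": f"You said: {heard}"}
    return json.dumps(response)


# Hypothetical, heavily trimmed request body for illustration only.
sample = json.dumps({"queryResult": {"queryText": "tell me a joke"}})
print(handle_webhook(sample))
```

A real request from Dialogflow carries many more fields, but the round trip has the same shape.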

Our goal will be to develop an action that will tell us a Dad Joke via the Google Assistant.

We will touch on working with Apprentice, setting up Dialogflow, deploying with the Google Assistant console, and deploying a Google Cloud (serverless) function.

Pre-Requisites

You need to have certain things set up to get going. I will make this guide as beginner friendly as possible, but I won’t reinvent the wheel when it comes to describing how to set up virtualenvs, perform API calls, etc. Therefore, some Google-fu may be necessary on your part if you don’t understand some concept. Or leave a comment and I’ll try my best to expand and explain further.

You will need to have:

In addition, I strongly recommend you have a Python virtual environment set up for this project.

Step 1: Initializing the Project

We will be using a package I am developing called Apprentice.

Apprentice will take care of project initialization, local development, and basic Dialogflow API 2.0 responses.

Once we have our local environment set up, we can install it:

pip install apprentice

Then we can run:

apprentice init

This will create:

  • main.py: Our webhook logic goes here
  • requirements.txt: Requirements necessary for production

Note: If you are planning on using the gcloud CLI, it is necessary to keep these filenames. The CLI will look for the main.py and requirements.txt files. If they are not there, it can cause issues.

Apprentice uses Flask under the hood because the Cloud Functions platform uses it for the Python 3.7 runtime. As a result, we will have to

export FLASK_APP=main.py

to get a local development server to run.

This server is useful because it allows rapid iteration on our webhook function.

Step 2: Running locally

Run the development server with

apprentice run

The local dev server outputs something like this

A requirement of the Dialogflow webhook is that the endpoint must use HTTPS. We can use ngrok to expose localhost to the web over HTTPS.

./ngrok http 5000

Ngrok will output a UI in the terminal with the url. Note the forwarding https url, because we will need it soon.
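
Before wiring up Dialogflow, it can help to poke the webhook yourself. The sketch below builds a POST that loosely mimics a Dialogflow call, using only the standard library. The helper names, the trimmed body, and the "/" path are all assumptions on my part; check which route your local server actually serves. The URL is a placeholder for your own localhost or ngrok forwarding address.

```python
import json
import urllib.request


def build_test_request(url: str, query_text: str) -> urllib.request.Request:
    """Build (but do not send) a POST that loosely mimics a Dialogflow call."""
    body = json.dumps({"queryResult": {"queryText": query_text}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


# Placeholder URL: swap in your own forwarding https URL from ngrok.
req = build_test_request("http://localhost:5000/", "tell me a joke")
print(req.get_method(), req.full_url)
# print(send(req))  # uncomment once the local server is running
```

If the call goes through the ngrok URL, the request should also show up in the ngrok terminal UI.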

Step 3: Dialogflow Webhook

Our local code should be running properly, so we can move toward setting up Dialogflow to respond to our vocal/text queries.

Log into Dialogflow and add an action, “Dad Jokes”.

It’s not required, but it is easiest to create the Google Project here too. You can do that in the Google Project section.

We will be directed to the main dashboard for our action. On the left side, you will see a few tabs; navigate to Fulfillment. By default, our action is not configured for a webhook. To change that, we will add the ngrok tunnel https URL from earlier.

We don’t need any headers or authentication at the moment. Go to the bottom and click Save.

Step 4: Entities and Intents

Although I won’t go over them too much here, Intents and Entities are quite core to a Dialogflow action. In short, Intents are what the user is generally trying to convey, and Entities are the specific data that make that intent unique or personal.

For example, a user can ask,

“What is the weather in Valencia, Spain?”

The Intent in this case would be, “What is the weather”. And the Entity would be “Valencia, Spain.”

Valencia, Spain is the value we can plug into a query to answer the user’s question.

Both Intents and Entities are predefined by us in Dialogflow.
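
To make this concrete, here is a small sketch of how the matched Intent and its Entity values could be read out of the request that Dialogflow sends to a webhook. In the Dialogflow API v2 format, the intent name lives under queryResult.intent.displayName and the extracted entity values under queryResult.parameters. The intent name get-weather and the parameter name location are hypothetical, chosen only to match the weather example.

```python
def summarize(request: dict) -> tuple:
    """Return (intent name, entity parameters) from a Dialogflow v2 request."""
    query_result = request.get("queryResult", {})
    # The Intent Dialogflow matched for the user's sentence.
    intent_name = query_result.get("intent", {}).get("displayName", "")
    # The Entity values Dialogflow extracted, keyed by parameter name.
    parameters = query_result.get("parameters", {})
    return intent_name, parameters


# Hypothetical, trimmed request for the weather example.
sample_request = {
    "queryResult": {
        "queryText": "What is the weather in Valencia, Spain?",
        "intent": {"displayName": "get-weather"},
        "parameters": {"location": "Valencia, Spain"},
    }
}

intent_name, parameters = summarize(sample_request)
print(intent_name, parameters)
```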

In the Entities tab, define an entity joke. Then you will have to define a row called joke and Save.

In the Intents section we are going to add an Intent called get-joke. Add a Training phrase with a user expression defined as tell me a joke. Finally, enable the webhook fulfillment, and click Save.
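
On the webhook side, one way to react to the get-joke Intent is a small lookup table from intent names to handler functions. This is only a sketch under my own assumptions: the names HANDLERS, tell_joke, and fulfill are hypothetical, and Apprentice may organize this differently. The handler returns a placeholder string for now, since the real joke lookup comes later.

```python
def tell_joke(parameters: dict) -> str:
    """Placeholder handler; a real joke lookup would go here later."""
    return "Hello World"


# Map Dialogflow intent names to the functions that answer them.
HANDLERS = {"get-joke": tell_joke}


def fulfill(request: dict) -> dict:
    """Pick a handler by intent name and wrap its answer for Dialogflow."""
    query_result = request.get("queryResult", {})
    intent_name = query_result.get("intent", {}).get("displayName", "")
    handler = HANDLERS.get(intent_name)
    if handler is None:
        # No handler registered for this intent, so answer politely.
        return {"fulfillmentText": "Sorry, I don't know that one yet."}
    return {"fulfillmentText": handler(query_result.get("parameters", {}))}


joke_request = {"queryResult": {"intent": {"displayName": "get-joke"}}}
print(fulfill(joke_request))
```

Adding another Intent later then only means registering one more entry in the table.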

Step 5: Testing

Once all that is set up, we are ready to test it out!

On the right-hand side of the Dialogflow dashboard, we will find a “See how it works in Google Assistant” link. Click on it.

The new dashboard will have something that resembles a phone. In this space, we can directly test out all of our Intents, Entities, and webhook behaviors.

Give it a shot by invoking the action with “Talk to my test app” followed by the Intent we defined before, “tell me a joke”.

If all goes well you should see “Hello World”.

This means that everything worked perfectly. We can even confirm the result by looking at the ngrok UI, which shows all the HTTP calls coming in. You should see a 200 response.

Don’t pay attention to the 500 :(

Additionally, you should see a call in the local Flask server.

Note that if something goes wrong, you can debug from the stack trace found in the local Flask server. Also, if you try talking to your test app and it disconnects, that usually means that an error was raised in your webhook. Check the stack trace, fix the issue and try talking to your test app again. That’s iteration!
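
While iterating, one option is to wrap the webhook logic so that an unexpected error is logged with its stack trace and the Assistant still gets a spoken reply instead of disconnecting. This is a sketch of a generic pattern, not something Apprentice provides as far as I know; safe_fulfill and broken_handler are hypothetical names. Catching every exception like this is a development convenience and may hide bugs if left in place unexamined.

```python
import logging

logger = logging.getLogger("webhook")


def safe_fulfill(handler, request: dict) -> dict:
    """Run a handler; on any error, log the stack trace and reply anyway."""
    try:
        return handler(request)
    except Exception:
        # logger.exception records the message plus the full traceback.
        logger.exception("Webhook handler failed")
        return {"fulfillmentText": "Something went wrong on my end."}


def broken_handler(request: dict) -> dict:
    """Deliberately failing handler, used only to demonstrate the guard."""
    raise ValueError("boom")


print(safe_fulfill(broken_handler, {}))
```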

Conclusion

That’s it for the “Hello World” portion of this tutorial. We learned how to initialize a project with Apprentice, get a local development server running, start a Dialogflow project, and define Intents and Entities for our action.

The next steps will be implementing the desired API call so we can return a joke to the user.