In-Home Serverless Voice Processing

Chad Arimura
Published in Fn Project
Feb 13, 2019
Demo Architecture

This post is a guide to getting the Fn Project working with a Google Home device.

Home devices are becoming increasingly pervasive, so for a talk I gave at the NIC conference in Oslo last week, I combined a Google Home device with Fn and used Fn Flow to orchestrate a series of functions based on voice commands. All of the function processing happens locally to spite our evil data-hoarding overlords.

The demo is simple. I instruct my usually-loyal robot servant to:

“search Giphy for _____” (e.g., dogs)

…and it’ll post the Internet’s best memes to a Slack room of your choice.

I published all the code here, so this demo could be your training wheels if you want to automate your entire house using a voice device like Alexa or Home and handle the processing locally, whether on a machine in your house, a Raspberry Pi, or something similar. The rest of the process is described in more detail below!

Just make sure to unplug that device when not in use.

Here’s the demo in action:

“Mind blowing” —Hacker News, “OMG Fn is awesome” — Reddit

Prerequisites

  • Google Home device (although you can simulate w/o one)
  • Actions on Google account
  • Fn running on your machine (Start here if you need an Fn primer)
  • Fn Flow running on your machine (Start here for a Flow primer)
  • A path from the outside world to your Fn Server (I use ngrok for this)
  • Slack account (but you can also simulate this by just watching the ngrok logs or switching out the notification to go somewhere else)

Step 1: Create your first Google Home “Action”

  • Create a new Google Home project
  • Select one of the templates. I chose “Business and Finance” because this is serious business.
  • Set up your invocation with a Display Name (this is how you activate your app, i.e., “Hey Google, talk to Chad’s Home Demo”)
  • Add your first Action and give it a “Custom intent”.

This opens up Google’s Dialogflow interface, where most of the work will be done. Click “Save” to save your new “Agent”. It took me a while to figure all this out.

Now if you go back to the actions console and refresh the page, you’ll see you have your first Action called actions.intent.MAIN.

Step 2: Set up your Intent

Intents are the heart of the interaction between the device and Fn. We’ll set up an Intent that triggers our Fn function every time it hears a certain phrase. The function’s response can do many things, but we’ll keep it simple.

Go back to the Dialogflow interface, click “Intents”, then “Create Intent”, and give it a name.

Step 3: Create a Training Phrase

  • This is what will trigger this Intent. Put something like “search giffy for dogs” and click Save.
  • This is where it got confusing for me initially. We obviously want our function to search for more than just “dogs”, so let’s make it clear that this last word will be passed into our function as a parameter. To do this, highlight the word “dogs” and a new menu will pop up. Click “Create new”.

Step 4: Create your Entity

  • This will create a new Entity. Call it something like “searchPhrase”, uncheck “Define synonyms”, and make sure to check “Allow automated expansion”.
  • Now populate the list with an initial set of values you’d like to handle. The list does not need to be complete because the “Allow automated expansion” selection above lets unknown values through. Save the Entity. Here’s what mine looks like:

Step 5: Set up your Parameter

Now go back to your Intent and expand the “Action and parameters” section. We’ll need to create the parameter that is matched by our entity and passed through to our function. Set the “Parameter Name” to searchPhrase, the “Entity” to @searchPhrase (which should pull up a drop-down list), and the “Value” to $searchPhrase.
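
To make the parameter concrete, here is a minimal sketch of how a function could read it from the webhook body. It assumes the Dialogflow v2 request format, where matched parameters arrive under queryResult.parameters. The helper name extract_search_phrase and the sample payload are hypothetical and not taken from the demo repo.

```python
import json


def extract_search_phrase(body: bytes, default: str = "") -> str:
    """Return the searchPhrase parameter from a Dialogflow v2 webhook body."""
    payload = json.loads(body)
    # Assumption: Dialogflow v2 puts matched parameters under queryResult.parameters.
    params = payload.get("queryResult", {}).get("parameters", {})
    value = params.get("searchPhrase", default)
    # A parameter configured as a list arrives as a JSON array; join it into one phrase.
    if isinstance(value, list):
        value = " ".join(str(item) for item in value)
    return str(value).strip()


# Hypothetical, heavily trimmed request body for illustration only.
sample = json.dumps({
    "queryResult": {
        "queryText": "search giffy for dogs",
        "parameters": {"searchPhrase": "dogs"},
    }
}).encode("utf-8")

print(extract_search_phrase(sample))  # prints: dogs
```

Falling back to an empty string keeps the function from crashing if the Intent fires without a matched parameter.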

Important note: If all is done correctly, you will see the yellow highlight in a few places to show the linking of the training phrase to the @searchPhrase entity. Here’s what that looks like:

Step 6: Set up Fulfillment to point to our Fn Server

You need to set up Fulfillment in order to incorporate custom logic into your Intents. This is where we set up the Fn Server webhook. I use ngrok to provide a path to my Fn Server running locally, so my fulfillment webhook URL looks like this:

https://mysubdomain.ngrok.io/t/giphyfn/home_endpoint

Put your ngrok URL into the form and select “Enable webhook for all domains”. It should then look as follows:

Now we need to link our Intent to this fulfillment. Click back on Intents, select your Intent, and at the bottom in the “Fulfillment” section turn on “Enable webhook call for this intent.”
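
Once the webhook is enabled, Dialogflow expects a JSON reply from the function. Below is a rough sketch of the smallest reply I would expect to work, assuming the Dialogflow v2 response format and its fulfillmentText field. The function name build_fulfillment_response and the wording of the replies are made up for illustration.

```python
import json


def build_fulfillment_response(search_phrase: str) -> str:
    """Build a minimal Dialogflow v2 webhook response as a JSON string."""
    if search_phrase:
        text = f"Searching Giphy for {search_phrase}."
    else:
        text = "Sorry, I did not catch what to search for."
    # Assumption: fulfillmentText is the text the assistant speaks or displays.
    return json.dumps({"fulfillmentText": text})


print(build_fulfillment_response("dogs"))
```

Replying right away and doing the slower work (searching, posting to Slack) in the background keeps the voice interaction snappy.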

Step 7: Make sure Fn and Fn Flow are running

Fn is easy:

# fn start

Flow is a bit trickier:

# export FNSERVER_IP=$(docker inspect --type container -f '{{.NetworkSettings.IPAddress}}' fnserver)
# docker run --rm \
-p 8081:8081 \
-e API_URL="http://$FNSERVER_IP:8080/invoke" \
-e no_proxy=$FNSERVER_IP \
--name flowserver \
fnproject/flow:latest
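
Before moving on, it can help to confirm that both servers are listening. This is a small sketch that only checks whether the TCP ports accept connections, assuming Fn is on port 8080 and Flow on port 8081 of localhost, as in the commands above. The helper name port_is_open is my own.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed ports: 8080 for the Fn server, 8081 for the Flow server.
    for name, port in (("Fn server", 8080), ("Flow server", 8081)):
        status = "up" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {status}")
```

An open port does not prove the service is healthy, but it quickly catches the common case of a container that never started.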

Step 8: Set up the GiphyFn Application

fn deploy --app giphyfn --all --local
  • Configure your application by editing the setenv.sh file, entering your own values where applicable, and then running that script. (You can get the function IDs with a command such as fn inspect function giphyfn getgif.)
  • Run ngrok so there’s a path into your local Fn Server
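
For a feel of what the deployed functions do, here is an illustrative sketch of the two outbound calls: a Giphy search and a Slack notification. It is not the code from the repo. It assumes the public Giphy search endpoint and a Slack incoming webhook that accepts a JSON body with a text field, and all helper names are hypothetical. The sketch only builds the URL and payload, so it runs without any API keys.

```python
import json
from typing import Optional
from urllib.parse import urlencode

# Assumption: the public Giphy search endpoint.
GIPHY_SEARCH_URL = "https://api.giphy.com/v1/gifs/search"


def giphy_search_url(api_key: str, phrase: str, limit: int = 1) -> str:
    """Build the Giphy search URL for a phrase."""
    query = urlencode({"api_key": api_key, "q": phrase, "limit": limit})
    return f"{GIPHY_SEARCH_URL}?{query}"


def first_gif_url(giphy_response: dict) -> Optional[str]:
    """Pick the first GIF URL from a parsed Giphy search response, if any."""
    results = giphy_response.get("data", [])
    if not results:
        return None
    return results[0]["images"]["original"]["url"]


def slack_payload(phrase: str, gif_url: str) -> bytes:
    """Build the JSON body for a Slack incoming webhook."""
    message = {"text": f"Best of '{phrase}': {gif_url}"}
    return json.dumps(message).encode("utf-8")


print(giphy_search_url("YOUR_API_KEY", "dogs"))
```

Keeping URL building and response parsing in small pure functions makes each step easy to test before wiring it into Fn.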

Step 9: Test Everything using the Simulator then Publish!

Back in the Actions Console you’ll see a section called “Test” with a “Simulator”. Click it to effectively turn your application on, which lets you talk to it using the “Invocation Display Name” created in Step 1. If you don’t have a device, you can use your computer’s mic or just type in some simulated voice commands.
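
If you want to exercise the function without going through Google at all, you could also post a hand-built request straight to the Fn trigger. The sketch below only constructs the request. It assumes a Dialogflow v2 style body, and the local URL is my guess based on the ngrok path shown in Step 6 and Fn’s port 8080. The helper name build_test_request is hypothetical.

```python
import json
import urllib.request


def build_test_request(endpoint: str, phrase: str) -> urllib.request.Request:
    """Build a POST request that mimics a trimmed Dialogflow v2 webhook call."""
    body = json.dumps({
        "queryResult": {
            "queryText": f"search giffy for {phrase}",
            "parameters": {"searchPhrase": phrase},
        }
    }).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local trigger URL; adjust it to match your own deployment.
req = build_test_request("http://localhost:8080/t/giphyfn/home_endpoint", "dogs")

# Uncomment to send the request once Fn and the app are running:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status, resp.read().decode("utf-8"))
```

A real Dialogflow request carries many more fields, so treat this as a smoke test rather than a faithful replay.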

If everything works as designed, you’re ready to publish your app! This step is fairly straightforward: under the “Deploy” section, click “Release” and then “Submit for Alpha”. You’ll most likely need to fill in some more application information (small app icon, your name, email address, link to privacy policy, etc.) before submitting.

At this point (+/- some deploy time), your app can be accessed by up to 20 alpha testers, identified by their device-to-Google-ID pairing. This means you won’t have to open up the simulator for your new app to be discoverable using your Invocation Name. Pretty cool!

Happy meme’ing!

Final Thoughts

This just scratches the surface of what’s possible with “Internet-of-Things” patterns and serverless computing. The Fn Project tools, both Fn and Flow, are great for local (or cloud) processing using a simple Docker-native workflow.

I welcome any and all feedback and issues with this post. Or just say a simple hello and tell us where you are on your serverless journey!

I promise, we won’t bite! (unless you take our cake)

Chad Arimura
Fn Project

Former founder & CEO, Iron.io, now VP Serverless Advocacy at Oracle. Programmer, cover band keyboardist.